From 1960c01080a3294d5122355199fbdf1a29b7ae1b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E6=88=92=E9=85=92=E7=9A=84=E6=9D=8E=E7=99=BD?=
Date: Tue, 29 Jul 2025 23:53:33 +0800
Subject: [PATCH 1/2] Update README.md

---
 README.md | 153 +++++++++----------------------------------------------
 1 file changed, 25 insertions(+), 128 deletions(-)

diff --git a/README.md b/README.md
index 768d7f8..5fd37be 100644
--- a/README.md
+++ b/README.md
@@ -10,146 +10,43 @@
 [![GitHub Contributors](https://img.shields.io/github/contributors/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/graphs/contributors)
 [![GitHub License](https://img.shields.io/github/license/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/blob/main/LICENSE)
- [English](./README.md) | [中文文档](./README-CN.md)
-🚀The latest 2.1 version has fully upgraded AI modes———welcome to experience it!⬇️
+ ### **[Important Announcement] Refactoring Plan for Weibo_PublicOpinion_AnalysisSystem**
-
-
-
-
+Dear all contributors, users, and followers,
-**Weibo Public Opinion Analysis and Prediction System** is a **social network public opinion analysis system** designed to monitor, analyze, and predict public opinion trends on social media platforms such as Weibo. This system leverages deep learning, natural language processing (NLP), and machine learning technologies to extract valuable public opinion information from vast amounts of social media data, helping governments, enterprises, and other organizations promptly understand public attitudes, respond to emergencies, and optimize decision-making. 📈
+Hello everyone,
-Through powerful data collection and processing capabilities, the Weibo Public Opinion Analysis and Prediction System achieves real-time data collection, sentiment analysis, topic classification, and public opinion prediction, ensuring that users can obtain accurate and comprehensive insights into public opinion in the complex and changing social network environment. The system adopts a modular design, making it easy to maintain and expand, aiming to provide users with an efficient and reliable public opinion analysis tool, assisting various organizations in making informed decisions in the information age.
+I am the initiator and main developer of this project. First and foremost, I want to personally thank you for your continued attention, contributions, and enthusiasm for the `Weibo_PublicOpinion_AnalysisSystem` project.
-## ✨ Features
+Over the past period, as the project has expanded, I have noticed several challenges that require attention:
-- **Real-time Data Collection**: Utilize web scraping technologies to obtain user-generated content from social platforms like Weibo in real-time.
-- **Data Cleaning and Processing**: Preprocess collected data, including tokenization, removal of stop words, emojis, and URLs.
-- **Topic Classification**: Automatically classify posts and comments into topics using machine learning and natural language processing techniques.
-- **Sentiment Analysis**: Analyze the sentiment orientation (positive, neutral, negative) within texts to understand public emotions.
-- **Public Opinion Monitoring and Prediction**: Monitor changes in public opinion in real-time and predict future trends based on historical data.
-- **Data Visualization**: Display analysis results through charts and graphics for easy understanding and decision-making.
-- **User Management**: Provide user registration, login, and session management features to ensure system security and personalized services.
+1. **Architectural and Module Issues:** Through rapid iteration, many modules have been integrated. However, a lack of unified top-level design has led to some module conflicts and a need for structural optimization.
+2. **High Barrier to Entry:** A significant current challenge is that users need to configure their own crawlers and scrape data from scratch. This makes the deployment and startup process relatively complex, creating an inconvenience for many new users.
+3. **Development and Presentation Limitations:** The development progress of various functional modules has been uneven. Additionally, the existing dashboard paradigm has limitations in compatibility and scalability that hinder my future development goals.
+4. **Constraints of the Self-Trained Model:** Considering its size and maintenance costs, the previously trained model has become a constraint on the project's long-term development.
-## 🚀 Getting Started
+After a careful evaluation of these points, and in light of current technological trends (especially in LLMs and Agents), I have decided to initiate a **comprehensive, bottom-up architectural refactoring** of the project, with the goal of providing a more user-friendly tool for everyone.
-Follow the steps below to run the project on your system.
+**My next update plan will focus on:**
-### Prerequisites
+1. **Optimizing the Core Architecture:** I will be moving away from the current dashboard-centric presentation to design a more lightweight and flexible system framework.
+2. **Focusing on Core Competencies:** The new architecture will refocus my efforts on the crawling, processing, and in-depth analysis of Weibo data, aiming to build a stable and efficient data core.
+3. **Integrating Advanced Large Language Models (LLMs):** I plan to discontinue maintenance of the self-trained model and will instead utilize APIs to call mainstream large language models for analysis tasks, enhancing the system's analytical capabilities and flexibility.
+4. **The Ultimate Goal: A New Model of "Deployable Core + Online Service":**
+   - **For Developers:** I aim to refine the project into a **"minimal, user-friendly, low-cost, modular"** public opinion analysis **core engine** to facilitate secondary development and private deployment.
+   - **For General Users:** Leveraging the new architecture, I **plan to introduce a new "Online Service" version, designed to address the challenges of deployment and data acquisition.**
+   - **Providing a Shared Database:** I will begin building and maintaining a **continuously updated, shared database**. This will allow users to access our data source directly, **removing the need to configure and run their own crawlers.**
+   - **Simplifying the User Experience:** This will eliminate the need for a complex local setup, enabling a **click-to-use** experience.
+   - **Retaining Personalized Analysis:** Users will still be able to configure their own LLM API keys in the online service to perform personalized, in-depth analysis with our data core.
-- [Python](https://www.python.org/) 3.7 or higher
-- [MySQL](https://www.mysql.com/) Database
-- [Conda](https://docs.conda.io/en/latest/) (optional, for environment management)
-- A valid Weibo account (for data collection)
-- At least one of the following API keys for AI analysis features:
-  - OpenAI API key
-  - Anthropic (Claude) API key
-  - DeepSeek API key
+This refactoring is a necessary step in our development. I understand this will require adjusting and, in some cases, rewriting code to which many of you have contributed. However, for the long-term health of the project and to make it accessible to a broader audience, I believe this step is essential.
-### Installation Steps
+In the coming weeks, I will begin to outline the new project blueprint and will keep the community updated on my progress. I value your wisdom and support now more than ever.
-1. Clone the repository:
-   ```bash
-   git clone https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem.git
-   cd Weibo-Public-Opinion-Analysis-System
+Thank you once again for your understanding and support! Let's look forward to the next evolution of `Weibo_PublicOpinion_AnalysisSystem`.
-2. Create and activate a virtual environment (optional):
+Sincerely,
-
-   ```bash
-   conda create -n weibo_opinion_analysis python=3.8
-   conda activate weibo_opinion_analysis
-   ```
-
-3. Install dependencies:
-
-   ```bash
-   pip install -r requirements.txt
-   ```
-
-4. Configure the MySQL database:
-
-   - Run `createTables.sql` to create the necessary database tables.
-   - Modify the database connection settings in `config.py` to match your MySQL configuration.
-
-5. Configure AI Analysis (Optional):
-
-   Set up environment variables for AI analysis features:
-   ```bash
-   # For OpenAI API (Required for GPT models)
-   export OPENAI_API_KEY="your-openai-key"
-
-   # For Anthropic API (Required for Claude models)
-   export ANTHROPIC_API_KEY="your-anthropic-key"
-
-   # For DeepSeek API (Required for DeepSeek models)
-   export DEEPSEEK_API_KEY="your-deepseek-key"
-   ```
-
-   Note: At least one API key must be configured to use AI analysis features.
-
-   Supported AI Models:
-   - OpenAI: GPT-3.5-Turbo, GPT-4
-   - Anthropic: Claude-3 (Opus, Sonnet, Haiku)
-   - DeepSeek: DeepSeek-V3 (deepseek-chat), DeepSeek-R1 (deepseek-reasoner)
-
-6. Start the Flask application:
-
-   ```bash
-   python app.py
-   ```
-
-7. Access the application: Open your browser and navigate to http://localhost:5000 to use the system.
-
-## 🛠️ Technology Stack
-
-The Weibo Public Opinion Analysis and Prediction System employs a range of modern technologies to ensure efficiency and scalability:
-
-- **[Flask](https://flask.palletsprojects.com/en/stable/)** - A lightweight web application framework.
-- **[MySQL](https://www.mysql.com/)** - A relational database used to store collected and processed data.
-- **[Scrapy](https://scrapy.org/)** - A powerful web scraping framework used for data collection.
-- **[Jieba](https://github.com/fxsjy/jieba)** - A Chinese text segmentation tool used for text preprocessing.
-- **[SnowNLP](https://github.com/isnowfy/snownlp)** - A Chinese natural language processing library used for sentiment analysis.
-- **[BERT](https://github.com/google-research/bert)** - A pre-trained language model used for topic classification.
-- **[Pandas](https://pandas.pydata.org/)** - A data analysis and manipulation library.
-- **[Matplotlib](https://matplotlib.org/)** - A data visualization library.
-- **[Scikit-learn](https://scikit-learn.org/)** - A machine learning library used for model training and evaluation.
-- **[TensorFlow](https://www.tensorflow.org/)** or **[PyTorch](https://pytorch.org/)** - Deep learning frameworks used for advanced model development.
-- **[OpenAI GPT](https://openai.com/)** - Advanced language models for text analysis.
-- **[Anthropic Claude](https://www.anthropic.com/)** - AI models for sophisticated text analysis.
-- **[DeepSeek](https://deepseek.com/)** - Advanced Chinese-English bilingual AI models.
-
-## 🤝 Contribution
-
-We welcome your contributions! Follow the steps below to participate in the project:
-
-1. Fork this repository.
-2. Create your feature branch (`git checkout -b feature/your-feature`).
-3. Commit your changes (`git commit -m 'Add some feature'`).
-4. Push to the branch (`git push origin feature/your-feature`).
-5. Open a Pull Request.
-
-Please ensure that all tests pass before submitting and follow the project's coding standards.
-
-## 📜 License
-
-This project is licensed under the [GPL-2.0 License](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/blob/main/LICENSE) - see the [LICENSE](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/blob/main/LICENSE) file for details.
-
-## 🌟 Show Your Support
-
-If you like this project, please give it a star ⭐ on [GitHub](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem)!
-
-## 📫 Contact Us
-
-If you have any questions or suggestions, feel free to contact us through the following methods:
-
-- GitHub Issues: [Create a new issue](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/issues)
-- Email: 670939375@qq.com
-
-## ✨ Contributors
-
-Thanks to the following contributors:
-
-[![Contributors](https://contrib.rocks/image?repo=666ghj/Weibo_PublicOpinion_AnalysisSystem)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/graphs/contributors)
+Project Initiator

From 68f63c73e3a906f8dc53a9a3057c6a0f6f398443 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E6=88=92=E9=85=92=E7=9A=84=E6=9D=8E=E7=99=BD?=
Date: Tue, 29 Jul 2025 23:53:55 +0800
Subject: [PATCH 2/2] Delete README-CN.md

---
 README-CN.md | 156 ----------------------------------------------------
 1 file changed, 156 deletions(-)
 delete mode 100644 README-CN.md

diff --git a/README-CN.md b/README-CN.md
deleted file mode 100644
index c5b9274..0000000
--- a/README-CN.md
+++ /dev/null
@@ -1,156 +0,0 @@
-
-
-
-
- Weibo Public Opinion Analysis System Logo
-
- [![GitHub Stars](https://img.shields.io/github/stars/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/stargazers)
- [![GitHub Forks](https://img.shields.io/github/forks/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/network)
- [![GitHub Issues](https://img.shields.io/github/issues/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/issues)
- [![GitHub Contributors](https://img.shields.io/github/contributors/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/graphs/contributors)
- [![GitHub License](https://img.shields.io/github/license/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/blob/main/LICENSE)
-
-
- [English](./README.md) | [中文文档](./README-CN.md)
-
-
-🚀 The latest version 2.0 has fully upgraded AI modes. Welcome to try it! ⬇️
-
-
-
-
-
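-
-**Weibo Public Opinion Analysis and Prediction System** is a **social network public opinion analysis system** for monitoring, analyzing, and predicting public opinion trends on social media platforms such as Weibo. The system uses deep learning, natural language processing (NLP), and machine learning to extract valuable public opinion information from large volumes of social media data, helping governments, enterprises, and other organizations understand public attitudes promptly, respond to emergencies, and optimize decision-making. 📈
-
-With strong data collection and processing capabilities, the Weibo Public Opinion Analysis and Prediction System provides real-time data collection, sentiment analysis, topic classification, and public opinion prediction, so that users can obtain accurate and comprehensive insight into public opinion in a complex and fast-changing social network environment. The system has a modular design that is easy to maintain and extend, and aims to give users an efficient and reliable public opinion analysis tool that helps organizations make informed decisions in the information age.
-
-## ✨ Features
-
-- **Real-time Data Collection**: Use web crawler technology to obtain user-generated content from social platforms such as Weibo in real time.
-- **Data Cleaning and Processing**: Preprocess the collected data, including word segmentation and removal of stop words, emojis, and URLs.
-- **Topic Classification**: Automatically classify posts and comments by topic using machine learning and natural language processing.
-- **Sentiment Analysis**: Analyze the sentiment orientation of text (positive, neutral, negative) to help understand public emotions.
-- **Public Opinion Monitoring and Prediction**: Monitor changes in public opinion in real time and predict future trends based on historical data.
-- **Data Visualization**: Present analysis results through charts and graphics to support understanding and decision-making.
-- **User Management**: Provide user registration, login, and session management to ensure system security and personalized services.
-
-## 🚀 Getting Started
-
-Follow the steps below to run the project on your system.
-
-### Prerequisites
-
-- [Python](https://www.python.org/) 3.7 or higher
-- [MySQL](https://www.mysql.com/) database
-- [Conda](https://docs.conda.io/en/latest/) (optional, for environment management)
-- A valid Weibo account (for data collection)
-- At least one of the following API keys (for AI analysis features):
-  - OpenAI API key
-  - Anthropic (Claude) API key
-  - DeepSeek API key
-
-### Installation Steps
-
-1. Clone the repository:
-   ```bash
-   git clone https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem.git
-   cd Weibo-Public-Opinion-Analysis-System
-
-1. Create and activate a virtual environment (optional):
-
-   ```bash
-   conda create -n weibo_opinion_analysis python=3.8
-   conda activate weibo_opinion_analysis
-   ```
-
-2. Install dependencies:
-
-   ```bash
-   pip install -r requirements.txt
-   ```
-
-3. Configure the MySQL database:
-
-   - Run `createTables.sql` to create the required database tables.
-   - Modify the database connection settings in `config.py` to match your MySQL setup.
-
-4. Configure AI analysis features (optional):
-
-   Set the environment variables required for AI analysis features:
-   ```bash
-   # OpenAI API configuration (required for GPT models)
-   export OPENAI_API_KEY="your-openai-key"
-
-   # Anthropic API configuration (required for Claude models)
-   export ANTHROPIC_API_KEY="your-anthropic-key"
-
-   # DeepSeek API configuration (required for DeepSeek models)
-   export DEEPSEEK_API_KEY="your-deepseek-key"
-   ```
-
-   Note: At least one API key must be configured to use AI analysis features.
-
-   Supported AI models:
-   - OpenAI: GPT-3.5-Turbo, GPT-4
-   - Anthropic: Claude-3 (Opus, Sonnet, Haiku)
-   - DeepSeek: DeepSeek-V3 (deepseek-chat), DeepSeek-R1 (deepseek-reasoner)
-
-5. Start the Flask application:
-
-   ```bash
-   python app.py
-   ```
-
-5. Access the application: Open your browser and go to `http://localhost:5000` to use the system.
-
-## 🛠️ Technology Stack
-
-The Weibo Public Opinion Analysis and Prediction System uses a range of modern technologies to ensure efficiency and scalability:
-
-- **[Flask](https://flask.palletsprojects.com/en/stable/)** - A lightweight web application framework.
-- **[MySQL](https://www.mysql.com/)** - A relational database used to store collected and processed data.
-- **[Scrapy](https://scrapy.org/)** - A powerful web crawling framework used for data collection.
-- **[Jieba](https://github.com/fxsjy/jieba)** - A Chinese word segmentation tool used for text preprocessing.
-- **[SnowNLP](https://github.com/isnowfy/snownlp)** - A Chinese natural language processing library used for sentiment analysis.
-- **[BERT](https://github.com/google-research/bert)** - A pre-trained language model used for topic classification.
-- **Pandas** - A data analysis and processing library.
-- **[Matplotlib](https://matplotlib.org/)** - A data visualization library.
-- **[Scikit-learn](https://scikit-learn.org/)** - A machine learning library used for model training and evaluation.
-- **[TensorFlow](https://www.tensorflow.org/)** or **[PyTorch](https://pytorch.org/)** - Deep learning frameworks used for advanced model development.
-- **[OpenAI GPT](https://openai.com/)** - Advanced language models for text analysis.
-- **[Anthropic Claude](https://www.anthropic.com/)** - Intelligent AI models for complex text analysis.
-- **[DeepSeek](https://deepseek.com/)** - Advanced Chinese-English bilingual AI models.
-
-## 🤝 Contribution
-
-We welcome your contributions! Follow these steps to take part in the project:
-
-1. Fork this repository.
-2. Create your feature branch (`git checkout -b feature/新功能`).
-3. Commit your changes (`git commit -m '添加新功能'`).
-4. Push to the branch (`git push origin feature/新功能`).
-5. Open a Pull Request.
-
-Please make sure to run all tests before submitting, and follow the project's coding standards.
-
-## 📜 License
-
-This project is licensed under the [GPL-2.0 License](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/blob/main/LICENSE) - see the [LICENSE](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/blob/main/LICENSE) file for details.
-
-## 🌟 Show Your Support
-
-If you like this project, please give it a star ⭐ on [GitHub](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem)!
-
-## 📫 Contact Us
-
-If you have any questions or suggestions, feel free to contact us through the following channels:
-
-- GitHub Issues: [Create a new issue](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/issues)
-- Email: 670939375@qq.com
-
-## ✨ Contributors
-
-Thanks to the following outstanding contributors:
-
-[![Contributors](https://contrib.rocks/image?repo=666ghj/Weibo_PublicOpinion_AnalysisSystem)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/graphs/contributors)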