# πŸ“Š Weibo Public Opinion Multi-Agent Analysis System Weibo Public Opinion Analysis System Logo [![GitHub Stars](https://img.shields.io/github/stars/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/stargazers) [![GitHub Forks](https://img.shields.io/github/forks/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/network) [![GitHub Issues](https://img.shields.io/github/issues/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/issues) [![GitHub License](https://img.shields.io/github/license/666ghj/Weibo_PublicOpinion_AnalysisSystem?style=flat-square)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/blob/main/LICENSE) [English](./README-EN.md) | [δΈ­ζ–‡ζ–‡ζ‘£](./README.md)
banner
## πŸ“ Project Overview **Weibo Public Opinion Multi-Agent Analysis System** is an innovative public opinion analysis platform built from scratch, utilizing multi-agent collaborative architecture to provide accurate, real-time, and comprehensive Weibo public opinion monitoring and analysis services. The system achieves full-process automation from data collection and sentiment analysis to report generation through the collaboration of five specialized AI agents. ### πŸš€ Key Features - **Multi-Agent Collaborative Architecture**: 5 specialized agents working together to complete the full process of public opinion analysis - **Comprehensive Data Collection**: Integrating Weibo crawlers, news search, multimedia content, and other multi-dimensional data sources - **Deep Sentiment Analysis**: Precise multilingual sentiment recognition based on fine-tuned BERT/GPT-2/Qwen models - **Intelligent Report Generation**: Automatically generate structured HTML analysis reports with custom template support - **Agent Forum Communication**: ForumEngine provides information sharing and collaborative decision-making platform for agents - **High-Performance Asynchronous Processing**: Support concurrent processing of multiple public opinion tasks with real-time status monitoring - **Cloud Data Support**: Convenient cloud database service with 100,000+ daily real data ## πŸ—οΈ System Architecture ### Overall Architecture Diagram ```mermaid graph TB subgraph "Frontend Display Layer" UI[Web Interface
Flask + Streamlit] end subgraph "Multi-Agent Collaboration Layer" QE[QueryEngine
News Search Agent] ME[MediaEngine
Multimedia Search Agent] IE[InsightEngine
Deep Insight Agent] RE[ReportEngine
Report Generation Agent] Forum[ForumEngine
Agent Forum Communication Center] end subgraph "Data Processing Layer" MS[MindSpider
Weibo Crawler System] SA[SentimentAnalysis
Sentiment Analysis Model Collection] DB[(MySQL
Database)] end subgraph "External Service Layer" LLM[LLM API
DeepSeek/Kimi/Gemini] Search[Search API
Tavily/Bocha] end UI --> QE UI --> ME UI --> IE UI --> RE QE --> Search ME --> Search IE --> MS IE --> SA QE --> LLM ME --> LLM IE --> LLM RE --> LLM MS --> DB SA --> DB %% Agent Forum Communication Mechanism QE <--> Forum ME <--> Forum IE <--> Forum RE <--> Forum ``` ### Agent Collaboration Workflow The system's core workflow is based on multi-agent collaboration: 1. **QueryEngine (News Query Agent)**: Uses Tavily API to search authoritative news reports, providing official information sources 2. **MediaEngine (Multimedia Search Agent)**: Conducts multimodal content search through Bocha API to gather social media perspectives 3. **InsightEngine (Deep Insight Agent)**: Queries local Weibo database, combines multiple sentiment analysis models for deep analysis 4. **ForumEngine (Forum Monitoring Agent)**: Real-time monitoring of agent log outputs, extracts key information and promotes collaboration 5. **ReportEngine (Report Generation Agent)**: Based on analysis results from all agents, uses Gemini LLM to generate comprehensive HTML reports ### Project Code Structure ``` Weibo_PublicOpinion_AnalysisSystem/ β”œβ”€β”€ QueryEngine/ # News Query Engine Agent β”‚ β”œβ”€β”€ agent.py # Agent main logic β”‚ β”œβ”€β”€ llms/ # LLM interface wrapper β”‚ β”œβ”€β”€ nodes/ # Processing nodes β”‚ β”œβ”€β”€ tools/ # Search tools β”‚ └── utils/ # Utility functions β”œβ”€β”€ MediaEngine/ # Multimedia Search Engine Agent β”‚ β”œβ”€β”€ agent.py # Agent main logic β”‚ β”œβ”€β”€ llms/ # LLM interfaces β”‚ β”œβ”€β”€ tools/ # Search tools β”‚ └── ... # Other modules β”œβ”€β”€ InsightEngine/ # Data Insight Engine Agent β”‚ β”œβ”€β”€ agent.py # Agent main logic β”‚ β”œβ”€β”€ llms/ # LLM interface wrapper β”‚ β”‚ β”œβ”€β”€ deepseek.py # DeepSeek API β”‚ β”‚ β”œβ”€β”€ kimi.py # Kimi API β”‚ β”‚ β”œβ”€β”€ openai_llm.py # OpenAI format API β”‚ β”‚ └── base.py # LLM base class β”‚ β”œβ”€β”€ nodes/ # Processing nodes β”‚ β”‚ β”œβ”€β”€ first_search_node.py # First search node β”‚ β”‚ β”œβ”€β”€ reflection_node.py # Reflection node β”‚ β”‚ β”œβ”€β”€ summary_nodes.py # Summary nodes β”‚ β”‚ β”œβ”€β”€ search_node.py # Search node β”‚ β”‚ β”œβ”€β”€ sentiment_node.py # Sentiment analysis node β”‚ β”‚ └── insight_node.py # Insight generation node β”‚ β”œβ”€β”€ tools/ # Database query and analysis tools β”‚ β”‚ β”œβ”€β”€ media_crawler_db.py # Database query tool β”‚ β”‚ └── sentiment_analyzer.py # Sentiment analysis integration tool β”‚ β”œβ”€β”€ state/ # State management β”‚ β”‚ β”œβ”€β”€ __init__.py β”‚ β”‚ └── state.py # Agent state definition β”‚ β”œβ”€β”€ prompts/ # Prompt templates β”‚ β”‚ β”œβ”€β”€ __init__.py β”‚ β”‚ └── prompts.py # Various prompts β”‚ └── utils/ # Utility functions β”‚ β”œβ”€β”€ __init__.py β”‚ β”œβ”€β”€ config.py # Configuration management β”‚ └── helpers.py # Helper functions β”œβ”€β”€ ReportEngine/ # Report Generation Engine Agent β”‚ β”œβ”€β”€ agent.py # Agent main logic β”‚ β”œβ”€β”€ llms/ # LLM interfaces β”‚ β”‚ └── gemini.py # Gemini API dedicated β”‚ β”œβ”€β”€ nodes/ # Report generation nodes β”‚ β”‚ β”œβ”€β”€ template_selection.py # Template selection node β”‚ β”‚ └── html_generation.py # HTML generation node β”‚ β”œβ”€β”€ report_template/ # Report template library β”‚ β”‚ β”œβ”€β”€ η€ΎδΌšε…¬ε…±ηƒ­η‚ΉδΊ‹δ»Άεˆ†ζž.md β”‚ β”‚ β”œβ”€β”€ ε•†δΈšε“η‰Œθˆ†ζƒ…η›‘ζ΅‹.md β”‚ β”‚ └── ... # More templates β”‚ └── flask_interface.py # Flask API interface β”œβ”€β”€ ForumEngine/ # Forum Communication Engine Agent β”‚ └── monitor.py # Log monitoring and forum management β”œβ”€β”€ MindSpider/ # Weibo Crawler System β”‚ β”œβ”€β”€ main.py # Crawler main program β”‚ β”œβ”€β”€ BroadTopicExtraction/ # Topic extraction module β”‚ β”‚ β”œβ”€β”€ get_today_news.py # Today's news fetching β”‚ β”‚ └── topic_extractor.py # Topic extractor β”‚ β”œβ”€β”€ DeepSentimentCrawling/ # Deep sentiment crawling β”‚ β”‚ β”œβ”€β”€ MediaCrawler/ # Media crawler core β”‚ β”‚ └── platform_crawler.py # Platform crawler management β”‚ └── schema/ # Database schema β”‚ └── init_database.py # Database initialization β”œβ”€β”€ SentimentAnalysisModel/ # Sentiment Analysis Model Collection β”‚ β”œβ”€β”€ WeiboSentiment_Finetuned/ # Fine-tuned BERT/GPT-2 models β”‚ β”œβ”€β”€ WeiboMultilingualSentiment/ # Multilingual sentiment analysis β”‚ β”œβ”€β”€ WeiboSentiment_SmallQwen/ # Small Qwen model β”‚ └── WeiboSentiment_MachineLearning/ # Traditional machine learning methods β”œβ”€β”€ SingleEngineApp/ # Individual Agent Streamlit apps β”‚ β”œβ”€β”€ query_engine_streamlit_app.py β”‚ β”œβ”€β”€ media_engine_streamlit_app.py β”‚ └── insight_engine_streamlit_app.py β”œβ”€β”€ templates/ # Flask templates β”‚ └── index.html # Main interface template β”œβ”€β”€ static/ # Static resources β”œβ”€β”€ logs/ # Runtime log directory β”œβ”€β”€ app.py # Flask main application entry β”œβ”€β”€ config.py # Global configuration file └── requirements.txt # Python dependency list ``` ## πŸš€ Quick Start ### System Requirements - **Operating System**: Windows 10/11 (Linux/macOS also supported) - **Python Version**: 3.11+ - **Conda**: Anaconda or Miniconda - **Database**: MySQL 8.0+ (or choose our cloud database service) - **Memory**: 8GB+ recommended ### 1. Create Conda Environment ```bash # Create conda environment named pytorch_python11 conda create -n pytorch_python11 python=3.11 conda activate pytorch_python11 ``` ### 2. Install Dependencies ```bash # Install basic dependencies pip install -r requirements.txt # If you need local sentiment analysis functionality, install PyTorch # CPU version pip install torch torchvision torchaudio # CUDA 11.8 version (if you have GPU) pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 # Install transformers and other AI-related dependencies pip install transformers scikit-learn xgboost ``` ### 3. Install Playwright Browser Drivers ```bash # Install browser drivers (for crawler functionality) playwright install chromium ``` ### 4. System Configuration #### 4.1 Configure API Keys Edit the `config.py` file and fill in your API keys: ```python # MySQL Database Configuration DB_HOST = "localhost" DB_PORT = 3306 DB_USER = "your_username" DB_PASSWORD = "your_password" DB_NAME = "weibo_analysis" DB_CHARSET = "utf8mb4" # DeepSeek API (Apply at: https://www.deepseek.com/) DEEPSEEK_API_KEY = "your_deepseek_api_key" # Tavily Search API (Apply at: https://www.tavily.com/) TAVILY_API_KEY = "your_tavily_api_key" # Kimi API (Apply at: https://www.kimi.com/) KIMI_API_KEY = "your_kimi_api_key" # Gemini API (Apply at: https://api.chataiapi.com/) GEMINI_API_KEY = "your_gemini_api_key" # Bocha Search API (Apply at: https://open.bochaai.com/) BOCHA_Web_Search_API_KEY = "your_bocha_api_key" # Silicon Flow API (Apply at: https://siliconflow.cn/) GUIJI_QWEN3_API_KEY = "your_guiji_api_key" ``` #### 4.2 Database Initialization **Option 1: Use Local Database** ```bash # Local MySQL database initialization cd MindSpider python schema/init_database.py ``` **Option 2: Use Cloud Database Service (Recommended)** We provide convenient cloud database service with 100,000+ daily real Weibo data, currently **free application** during the promotion period! - Real Weibo data, updated in real-time - Pre-processed sentiment annotation data - Multi-dimensional tag classification - High-availability cloud service - Professional technical support **Contact us to apply for free cloud database access: πŸ“§ 670939375@qq.com** ### 5. Launch System #### 5.1 Complete System Launch (Recommended) ```bash # In project root directory, activate conda environment conda activate pytorch_python11 # Start main application (automatically starts all agents) python app.py ``` Visit http://localhost:5000 to use the complete system #### 5.2 Launch Individual Agents ```bash # Start QueryEngine streamlit run SingleEngineApp/query_engine_streamlit_app.py --server.port 8503 # Start MediaEngine streamlit run SingleEngineApp/media_engine_streamlit_app.py --server.port 8502 # Start InsightEngine streamlit run SingleEngineApp/insight_engine_streamlit_app.py --server.port 8501 ``` #### 5.3 Standalone Crawler System ```bash # Enter crawler directory cd MindSpider # Project initialization python main.py --setup # Run complete crawler workflow python main.py --complete --date 2024-01-20 # Run topic extraction only python main.py --broad-topic --date 2024-01-20 # Run deep crawling only python main.py --deep-sentiment --platforms xhs dy wb ``` ## πŸ’Ύ Database Configuration ### Local Database Configuration 1. **Install MySQL 8.0+** 2. **Create Database**: ```sql CREATE DATABASE weibo_analysis CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; ``` 3. **Run Initialization Script**: ```bash cd MindSpider python schema/init_database.py ``` ### Auto-Crawling Configuration Configure automatic crawling tasks for continuous data updates: ```python # Configure crawler parameters in MindSpider/config.py CRAWLER_CONFIG = { 'max_pages': 200, # Maximum pages to crawl 'delay': 1, # Request delay (seconds) 'timeout': 30, # Timeout (seconds) 'platforms': ['xhs', 'dy', 'wb', 'bili'], # Crawling platforms 'daily_keywords': 100, # Daily keywords count 'max_notes_per_keyword': 50, # Max content per keyword 'use_proxy': False, # Whether to use proxy } ``` ### Cloud Database Service (Recommended) **Why Choose Our Cloud Database Service?** - **Rich Data Sources**: 100,000+ daily real Weibo data covering hot topics across all industries - **High-Quality Annotations**: Professional team manually annotated sentiment data with 95%+ accuracy - **Multi-Dimensional Analysis**: Including topic classification, sentiment tendency, influence scoring and other multi-dimensional tags - **Real-Time Updates**: 24/7 continuous data collection ensuring timeliness - **Technical Support**: Professional team providing technical support and customization services **Application Method**: πŸ“§ Email Contact: 670939375@qq.com πŸ“ Email Subject: Apply for Weibo Public Opinion Cloud Database Access πŸ“ Email Content: Please describe your use case and expected data volume requirements **Promotion Period Benefits**: - Free basic cloud database access - Free technical support and deployment guidance - Priority access to new features ## βš™οΈ Advanced Configuration ### Modify Key Parameters #### Agent Configuration Parameters Each agent has dedicated configuration files that can be adjusted according to needs: ```python # QueryEngine/utils/config.py class Config: max_reflections = 2 # Reflection rounds max_search_results = 15 # Maximum search results max_content_length = 8000 # Maximum content length # MediaEngine/utils/config.py class Config: comprehensive_search_limit = 10 # Comprehensive search limit web_search_limit = 15 # Web search limit # InsightEngine/utils/config.py class Config: default_search_topic_globally_limit = 200 # Global search limit default_get_comments_limit = 500 # Comment retrieval limit max_search_results_for_llm = 50 # Max results for LLM ``` #### Sentiment Analysis Model Configuration ```python # InsightEngine/tools/sentiment_analyzer.py SENTIMENT_CONFIG = { 'model_type': 'multilingual', # Options: 'bert', 'multilingual', 'qwen' 'confidence_threshold': 0.8, # Confidence threshold 'batch_size': 32, # Batch size 'max_sequence_length': 512, # Max sequence length } ``` ### Integrate Different LLM Models The system supports multiple LLM providers, switchable in each agent's configuration: ```python # Configure in each Engine's utils/config.py class Config: default_llm_provider = "deepseek" # Options: "deepseek", "openai", "kimi", "gemini" # DeepSeek configuration deepseek_api_key = "your_api_key" deepseek_model = "deepseek-chat" # OpenAI compatible configuration openai_api_key = "your_api_key" openai_model = "gpt-3.5-turbo" openai_base_url = "https://api.openai.com/v1" # Kimi configuration kimi_api_key = "your_api_key" kimi_model = "moonshot-v1-8k" # Gemini configuration gemini_api_key = "your_api_key" gemini_model = "gemini-pro" ``` ### Change Sentiment Analysis Models The system integrates multiple sentiment analysis methods, selectable based on needs: #### 1. BERT-based Fine-tuned Model (Highest Accuracy) ```bash # Use BERT Chinese model cd SentimentAnalysisModel/WeiboSentiment_Finetuned/BertChinese-Lora python predict.py --text "This product is really great" ``` #### 2. GPT-2 LoRA Fine-tuned Model (Faster Speed) ```bash cd SentimentAnalysisModel/WeiboSentiment_Finetuned/GPT2-Lora python predict.py --text "I'm not feeling great today" ``` #### 3. Small Qwen Model (Balanced) ```bash cd SentimentAnalysisModel/WeiboSentiment_SmallQwen python predict_universal.py --text "This event was very successful" ``` #### 4. Traditional Machine Learning Methods (Lightweight) ```bash cd SentimentAnalysisModel/WeiboSentiment_MachineLearning python predict.py --model_type "svm" --text "Service attitude needs improvement" ``` #### 5. Multilingual Sentiment Analysis (Supports 22 Languages) ```bash cd SentimentAnalysisModel/WeiboMultilingualSentiment python predict.py --text "This product is amazing!" --lang "en" ``` ### Integrate Custom Business Database #### 1. Modify Database Connection Configuration ```python # Add your business database configuration in config.py BUSINESS_DB_HOST = "your_business_db_host" BUSINESS_DB_PORT = 3306 BUSINESS_DB_USER = "your_business_user" BUSINESS_DB_PASSWORD = "your_business_password" BUSINESS_DB_NAME = "your_business_database" ``` #### 2. Create Custom Data Access Tools ```python # InsightEngine/tools/custom_db_tool.py class CustomBusinessDBTool: """Custom business database query tool""" def __init__(self): self.connection_config = { 'host': config.BUSINESS_DB_HOST, 'port': config.BUSINESS_DB_PORT, 'user': config.BUSINESS_DB_USER, 'password': config.BUSINESS_DB_PASSWORD, 'database': config.BUSINESS_DB_NAME, } def search_business_data(self, query: str, table: str): """Query business data""" # Implement your business logic pass def get_customer_feedback(self, product_id: str): """Get customer feedback data""" # Implement customer feedback query logic pass ``` #### 3. Integrate into InsightEngine ```python # Integrate custom tools in InsightEngine/agent.py from .tools.custom_db_tool import CustomBusinessDBTool class DeepSearchAgent: def __init__(self, config=None): # ... other initialization code self.custom_db_tool = CustomBusinessDBTool() def execute_custom_search(self, query: str): """Execute custom business data search""" return self.custom_db_tool.search_business_data(query, "your_table") ``` ### Custom Report Templates #### 1. Create Template Files Create new Markdown templates in the `ReportEngine/report_template/` directory: ```markdown # Enterprise Brand Public Opinion Monitoring Report ## πŸ“Š Executive Summary {executive_summary} ## πŸ” Brand Mention Analysis ### Mention Volume Trends {mention_trend} ### Sentiment Distribution {sentiment_distribution} ## πŸ“ˆ Competitor Analysis {competitor_analysis} ## 🎯 Key Insights Summary {key_insights} ## ⚠️ Risk Alerts {risk_alerts} ## πŸ“‹ Improvement Recommendations {recommendations} --- *Report Type: Enterprise Brand Public Opinion Monitoring* *Generation Time: {generation_time}* *Data Sources: {data_sources}* ``` #### 2. Use in Web Interface The system supports uploading custom template files (.md or .txt format), selectable when generating reports. ## 🀝 Contributing Guide We welcome all forms of contributions! ### How to Contribute 1. **Fork the project** to your GitHub account 2. **Create Feature branch**: `git checkout -b feature/AmazingFeature` 3. **Commit changes**: `git commit -m 'Add some AmazingFeature'` 4. **Push to branch**: `git push origin feature/AmazingFeature` 5. **Open Pull Request** ### Contribution Types - πŸ› Bug fixes - ✨ New feature development - πŸ“š Documentation improvements - 🎨 UI/UX improvements - ⚑ Performance optimization - πŸ§ͺ Test case additions ### Development Standards - Code follows PEP8 standards - Commit messages use clear Chinese/English descriptions - New features need corresponding test cases - Update related documentation ## πŸ“„ License This project is licensed under the [MIT License](LICENSE). Please see the LICENSE file for details. ## πŸŽ‰ Support & Contact ### Get Help - **Project Homepage**: [GitHub Repository](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem) - **Issue Reporting**: [Issues Page](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/issues) - **Feature Requests**: [Discussions Page](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/discussions) ### Contact Information - πŸ“§ **Email**: 670939375@qq.com - πŸ’¬ **QQ Group**: [Join Technical Discussion Group] - 🐦 **WeChat**: [Scan QR Code for Technical Support] ### Business Cooperation - 🏒 **Enterprise Custom Development** - πŸ“Š **Big Data Services** - πŸŽ“ **Academic Collaboration** - πŸ’Ό **Technical Training** ### Cloud Service Application **Free Cloud Database Service Application**: πŸ“§ Send email to: 670939375@qq.com πŸ“ Subject: Weibo Public Opinion Cloud Database Application πŸ“ Description: Your use case and requirements ## πŸ‘₯ Contributors Thanks to these excellent contributors: [![Contributors](https://contrib.rocks/image?repo=666ghj/Weibo_PublicOpinion_AnalysisSystem)](https://github.com/666ghj/Weibo_PublicOpinion_AnalysisSystem/graphs/contributors) ---
**⭐ If this project helps you, please give us a star!** Made with ❀️ by [Weibo Public Opinion Analysis Team](https://github.com/666ghj)