feat: v4 multimodal chat input, multi-format support, and annotation detection
- Replace st.chat_input with st-multimodal-chatinput (Ctrl+V paste, drag-drop, file button)
- Extract _process_uploaded_file() shared handler (eliminates ~70 duplicated lines)
- Add XLSX (openpyxl), XLS (xlrd), DOC (olefile) parsers to file_parser.py
- Add backend/annotation_detector.py: circle detection (HoughCircles) + arrow detection (HoughLinesP clustering) + OCR correlation + LLM context formatting
- Add annotation_result field to AgentState with session persistence
- Wire annotation detection into process_input and _format_ocr_context
- Add 11 new tests: 7 annotation detector + 4 multi-format parser
- Update all docs: CLAUDE.md, README.md, CODE_GUIDE.md, ROADMAP.md
+14
-6
@@ -751,14 +751,20 @@ def parse_file(file_path, file_type="") -> dict:
# .png/.jpg/.jpeg/.bmp/.webp → _parse_image()
# .pdf → _parse_pdf()
# .docx → _parse_docx()
# .xlsx → _parse_xlsx()
# .xls → _parse_xls()
# .doc → _parse_doc()
# other → _parse_text() (UTF-8 / GBK)
```
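
The extension routing in the comments above can be sketched as a lookup table. This is a minimal illustration, not the project's actual `parse_file` body: the `_PARSER_BY_EXT` table and the `pick_parser` helper are hypothetical names, and treating `file_type` as an optional extension override is an assumption.

```python
from pathlib import Path

# Hypothetical table mirroring the routing comments; values are parser names only.
_PARSER_BY_EXT = {
    ".png": "_parse_image",
    ".jpg": "_parse_image",
    ".jpeg": "_parse_image",
    ".bmp": "_parse_image",
    ".webp": "_parse_image",
    ".pdf": "_parse_pdf",
    ".docx": "_parse_docx",
    ".xlsx": "_parse_xlsx",
    ".xls": "_parse_xls",
    ".doc": "_parse_doc",
}


def pick_parser(file_path: str, file_type: str = "") -> str:
    """Return the parser name for a file, falling back to the text parser."""
    # Assumption: an explicit file_type, when given, overrides the path suffix.
    ext = (file_type or Path(file_path).suffix).lower()
    if ext and not ext.startswith("."):
        ext = "." + ext
    # Any unknown extension is routed to the text parser (UTF-8 / GBK).
    return _PARSER_BY_EXT.get(ext, "_parse_text")
```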

### Fallback chain for each parser

- **Image**: EasyOCR (ch_sim+en) → PaddleOCR → return metadata only + installation hint
- **Image**: PaddleOCR (preferred for precise recognition) → EasyOCR (ch_sim+en) → return metadata only + installation hint
- **PDF**: pdfplumber → PyMuPDF → fail
- **DOCX**: python-docx (including table content extraction) → fail
- **XLSX**: openpyxl (with multi-sheet support) → fail
- **XLS**: xlrd (legacy Excel format) → fail
- **DOC**: olefile (binary format, best-effort extraction) → fail
- **Text**: UTF-8 → GBK → fail
---
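
The text fallback chain listed above (UTF-8 → GBK → fail) could look roughly like the sketch below. It uses only the standard library; the `decode_with_fallback` name and the shape of the returned dict are assumptions for illustration and may differ from what `_parse_text()` actually returns. UTF-8 is tried first because it rejects most GBK byte sequences, whereas GBK would accept many UTF-8 byte sequences and produce garbled text.

```python
def decode_with_fallback(data: bytes, encodings=("utf-8", "gbk")) -> dict:
    """Try each encoding in order and report which one succeeded."""
    for enc in encodings:
        try:
            return {"ok": True, "encoding": enc, "text": data.decode(enc)}
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; try the next one.
            continue
    # Every encoding in the chain failed.
    return {"ok": False, "encoding": None, "text": ""}


if __name__ == "__main__":
    # GBK-encoded bytes are invalid UTF-8, so the second link handles them.
    print(decode_with_fallback("中文".encode("gbk")))
```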
@@ -1158,20 +1164,22 @@ st.json(state) # print the full state (for debugging; remember to remove)
| File | Lines | Role |
|------|------|------|
| `app.py` | ~530 | Streamlit UI entry point |
| `agent/state.py` | ~40 | State type definitions |
| `agent/nodes.py` | ~523 | 14 workflow nodes |
| `app.py` | ~670 | Streamlit UI entry point (multimodal chat input) |
| `agent/state.py` | ~48 | State type definitions (26 fields) |
| `agent/nodes.py` | ~740 | 15 workflow nodes |
| `agent/graph.py` | ~232 | State graph compilation + routing |
| `backend/llm.py` | ~105 | LLM factory (3 backends) |
| `backend/rag_adapter.py` | ~156 | ChromaDB semantic search |
| `backend/error_kb.py` | ~226 | Error knowledge base |
| `backend/embeddings.py` | ~49 | Embedding model factory |
| `backend/file_parser.py` | ~194 | Multi-format file parsing |
| `backend/file_parser.py` | ~320 | Multi-format file parsing (7 formats) |
| `backend/layout_analyzer.py` | ~495 | A4 template layout analysis |
| `backend/ocr_extractor.py` | ~380 | Precise OCR field extraction |
| `backend/annotation_detector.py` | ~250 | Annotation detection (circles + arrows) |
| `backend/validation.py` | ~27 | Validation service HTTP client |
| `backend/session.py` | ~113 | Session JSON CRUD |
| `prompts/loader.py` | ~54 | Prompt hot reloading |
| `prompts/*.md` (7 files) | — | Prompt templates |
| `validation_service/main.py` | ~130 | FastAPI validation service |
| `.env.example` | ~62 | Configuration template |
| `requirements.txt` | ~32 | Python dependencies |
| `requirements.txt` | ~42 | Python dependencies |