feat: v4 multimodal chat input, multi-format support, and annotation detection

- Replace st.chat_input with st-multimodal-chatinput (Ctrl+V paste, drag-drop, file button)
- Extract _process_uploaded_file() shared handler (eliminates ~70 duplicated lines)
- Add XLSX (openpyxl), XLS (xlrd), DOC (olefile) parsers to file_parser.py
- Add backend/annotation_detector.py: circle detection (HoughCircles) + arrow detection (HoughLinesP clustering) + OCR correlation + LLM context formatting
- Add annotation_result field to AgentState with session persistence
- Wire annotation detection into process_input and _format_ocr_context
- Add 11 new tests: 7 annotation detector + 4 multi-format parser
- Update all docs: CLAUDE.md, README.md, CODE_GUIDE.md, ROADMAP.md
2026-05-20 23:43:16 +08:00
parent c9f003e1b7
commit 9bb011e429
16 changed files with 1257 additions and 164 deletions
+14 -6
@@ -751,14 +751,20 @@ def parse_file(file_path, file_type="") -> dict:
# .png/.jpg/.jpeg/.bmp/.webp → _parse_image()
# .pdf → _parse_pdf()
# .docx → _parse_docx()
# .xlsx → _parse_xlsx()
# .xls → _parse_xls()
# .doc → _parse_doc()
# other → _parse_text() (UTF-8 / GBK)
```
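The extension-to-parser mapping above can be sketched as a lookup table. This is a minimal illustration, not the project's `parse_file`: the `_stub` handlers and the `dispatch` name are hypothetical, and treating `file_type` as an extension hint that overrides the file's suffix is an assumption.

```python
from pathlib import Path


def _stub(kind):
    """Hypothetical placeholder standing in for a real _parse_*() function."""
    def handler(path):
        return {"type": kind, "path": str(path)}
    return handler


# Extension -> handler table mirroring the comments above.
_HANDLERS = {
    **{ext: _stub("image") for ext in (".png", ".jpg", ".jpeg", ".bmp", ".webp")},
    ".pdf": _stub("pdf"),
    ".docx": _stub("docx"),
    ".xlsx": _stub("xlsx"),
    ".xls": _stub("xls"),
    ".doc": _stub("doc"),
}


def dispatch(file_path, file_type=""):
    # Assumption: file_type, when given, is an extension hint such as "png" or ".png".
    ext = (file_type or Path(file_path).suffix).lower()
    if ext and not ext.startswith("."):
        ext = "." + ext
    # Anything unrecognized falls through to the plain-text handler.
    return _HANDLERS.get(ext, _stub("text"))(file_path)
```

A table keeps the routing in one place, so supporting a new format means adding one entry rather than another `elif` branch.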
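### Fallback chain for each parser
- **Image**: EasyOCR (ch_sim+en) → PaddleOCR → return metadata only + install hint
- **Image**: PaddleOCR (preferred for precise recognition) → EasyOCR (ch_sim+en) → return metadata only + install hint
- **PDF**: pdfplumber → PyMuPDF → fail
- **DOCX**: python-docx (including table content extraction) → fail
- **XLSX**: openpyxl (with multi-sheet support) → fail
- **XLS**: xlrd (legacy Excel format) → fail
- **DOC**: olefile (binary format, best-effort extraction) → fail
- **Text**: UTF-8 → GBK → fail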
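As an illustration of the fallback pattern listed above, here is a minimal sketch of the text chain (UTF-8 → GBK → fail). The function name `decode_with_fallback` and the shape of the returned dict are assumptions for this example, not the actual `_parse_text()` implementation.

```python
def decode_with_fallback(data: bytes, encodings=("utf-8", "gbk")):
    """Try each encoding in order and return the first successful decode."""
    errors = []
    for enc in encodings:
        try:
            return {"ok": True, "encoding": enc, "text": data.decode(enc)}
        except UnicodeDecodeError as exc:
            # Record why this step failed, then move on to the next one.
            errors.append(f"{enc}: {exc.reason}")
    # Every encoding failed: report the failure instead of raising.
    return {"ok": False, "encoding": None, "text": "", "error": "; ".join(errors)}
```

The other chains could follow the same shape, with an `ImportError` on a missing optional library treated as one more reason to try the next step.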
---
@@ -1158,20 +1164,22 @@ st.json(state) # print the full state (debugging only; remember to remove)
| File | Lines | Role |
|------|------|------|
| `app.py` | ~530 | Streamlit UI entry point |
| `agent/state.py` | ~40 | State type definitions |
| `agent/nodes.py` | ~523 | 14 workflow nodes |
| `app.py` | ~670 | Streamlit UI entry point (multimodal chat input) |
| `agent/state.py` | ~48 | State type definitions (26 fields) |
| `agent/nodes.py` | ~740 | 15 workflow nodes |
| `agent/graph.py` | ~232 | State graph compilation + routing |
| `backend/llm.py` | ~105 | LLM factory (3 backends) |
| `backend/rag_adapter.py` | ~156 | ChromaDB semantic search |
| `backend/error_kb.py` | ~226 | Error knowledge base |
| `backend/embeddings.py` | ~49 | Embedding model factory |
| `backend/file_parser.py` | ~194 | Multi-format file parsing |
| `backend/file_parser.py` | ~320 | Multi-format file parsing (7 formats) |
| `backend/layout_analyzer.py` | ~495 | A4 template layout analysis |
| `backend/ocr_extractor.py` | ~380 | Precise OCR field extraction |
| `backend/annotation_detector.py` | ~250 | Annotation detection (circles + arrows) |
| `backend/validation.py` | ~27 | Validation service HTTP client |
| `backend/session.py` | ~113 | Session JSON CRUD |
| `prompts/loader.py` | ~54 | Prompt hot reloading |
| `prompts/*.md` (7 files) | - | Prompt templates |
| `validation_service/main.py` | ~130 | FastAPI validation service |
| `.env.example` | ~62 | Configuration template |
| `requirements.txt` | ~32 | Python dependencies |
| `requirements.txt` | ~42 | Python dependencies |