feat: v4 multimodal chat input, multi-format support, and annotation detection

- Replace st.chat_input with st-multimodal-chatinput (Ctrl+V paste, drag-drop, file button)
- Extract _process_uploaded_file() shared handler (eliminates ~70 duplicated lines)
- Add XLSX (openpyxl), XLS (xlrd), DOC (olefile) parsers to file_parser.py
- Add backend/annotation_detector.py: circle detection (HoughCircles) + arrow detection (HoughLinesP clustering) + OCR correlation + LLM context formatting
- Add annotation_result field to AgentState with session persistence
- Wire annotation detection into process_input and _format_ocr_context
- Add 11 new tests: 7 annotation detector + 4 multi-format parser
- Update all docs: CLAUDE.md, README.md, CODE_GUIDE.md, ROADMAP.md
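
For readers unfamiliar with the "OCR correlation" step named above, here is a minimal sketch of one way such a step could work. It is not the code in backend/annotation_detector.py, which this page does not show. `Circle`, `OcrBox`, and `texts_inside_circle` are hypothetical names, and the rule used (an OCR box counts as circled when its center falls inside the detected circle) is an assumption made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import hypot


@dataclass
class Circle:
    """A detected circle in pixel coordinates (hypothetical shape)."""
    cx: float
    cy: float
    r: float


@dataclass
class OcrBox:
    """One OCR text box: top-left corner plus width and height (hypothetical shape)."""
    text: str
    x: float
    y: float
    w: float
    h: float

    @property
    def center(self) -> tuple[float, float]:
        return (self.x + self.w / 2, self.y + self.h / 2)


def texts_inside_circle(circle: Circle, boxes: list[OcrBox]) -> list[str]:
    """Return the text of every OCR box whose center lies inside the circle."""
    hits = []
    for box in boxes:
        bx, by = box.center
        # Assumed rule: compare the center-to-center distance with the radius.
        if hypot(bx - circle.cx, by - circle.cy) <= circle.r:
            hits.append(box.text)
    return hits
```

A center-in-circle test is cheap and tolerant of loosely drawn circles, but it can miss long text boxes that are only partly enclosed. An overlap-ratio test would be a possible alternative.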
@@ -44,3 +44,6 @@ class AgentState(TypedDict, total=False):
     # Requirement 7: precise field extraction results from OCR'd documents
     ocr_extraction_result: dict
     uploaded_file_path: str
+
+    # Requirement 8: image annotation detection (circled regions / arrow marks)
+    annotation_result: dict
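
Because `AgentState` is declared with `total=False`, every key is optional, so code reading the new field should not assume it is present. The sketch below shows one defensive way to check it. The class body repeats only the three fields visible in this hunk. `has_annotations` is a hypothetical helper, not part of the commit, and the internal layout of the `annotation_result` dict is not shown here, so the sketch does not rely on it.

```python
from typing import TypedDict


class AgentState(TypedDict, total=False):
    # Only the fields visible in the hunk above; the real class has more.
    ocr_extraction_result: dict
    uploaded_file_path: str
    annotation_result: dict


def has_annotations(state: AgentState) -> bool:
    """True when annotation detection stored a non-empty result (hypothetical helper)."""
    # .get() avoids a KeyError for states saved before this field existed.
    return bool(state.get("annotation_result"))
```

Using `.get()` also keeps sessions persisted before this commit loadable, since they lack the `annotation_result` key.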