agent_jrxml

Author	SHA1	Message	Date
panda	00f718fbda	fix: prompt adds mandatory rules for field declarations and font tag format	2026-05-26 08:23:39 +08:00
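panda	520c8b19d0	fix: root-cause fixes for five failed correction rounds - scoring formula drops the field_coverage weight, namespace check is unconditional, OCR auto-detects the document type	2026-05-24 22:44:37 +08:00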
panda	f25a93b539	WIP: baseline on fix/retry-failure-root-causes	2026-05-24 22:38:30 +08:00
panda	2d5183d2bd	fix: OCR fidelity scoring reform to prevent false fails from language-mismatched field names. Root cause (from review): field_coverage compared English JRXML field names against Chinese OCR field names with set intersection, which was always zero. Combined with the 0.5 weight in the score formula, this caused valid JRXML (XSD pass, 82% element coverage) to score 0.41 < 0.5 → fail → correction loop → progressive destruction. Changes: scoring weight: element_coverage 0.8 + field_coverage 0.2 (was 0.5 + 0.5); validate node: only fail on fidelity when BOTH score < 0.5 AND element_coverage < 0.4; field name regex: \w+ → [^"]+ to support non-ASCII field names; field matching: also try _sanitize_field_name conversion (Chinese → _uXXXX_); correction.md: namespace check always active, not conditional on error keywords.	2026-05-24 15:36:40 +08:00
panda	bd5bfbac2d	fix: band-level windowed refine_layout + programmatic map_fields to prevent 91.5% content loss. Root cause: an LLM receiving the full 34k-char JRXML would regenerate it from scratch instead of modifying coordinates in place, shrinking output to ~3k chars. Solution (programmatic node control, not prompt engineering): new agent/jrxml_windower.py: decompose JRXML into a header (never sent to the LLM) + individual bands, split bands >4000 chars at element boundaries, reassemble with element count validation (>10% change = rollback); rewrite refine_layout: per-band windowed LLM processing (~2-4k chars each), so the LLM cannot "reimagine" the entire report; rewrite map_fields: 100% programmatic regex $F{field_N} -> real name replacement, zero LLM calls, zero content loss; _sanitize_field_name: non-ASCII chars escaped to _uXXXX_ format for valid JRXML identifiers. Tests: 48 new unit tests (windower 28 + map_fields 20), all passing. Full suite 385 tests, zero regressions.	2026-05-24 08:55:38 +08:00
panda	bb6cc6e241	feat: add Java JRXML-to-PNG rendering pipeline with pixel-level SSIM comparison. lib/java/: Java renderer (JrxmlRenderer) using JasperReports 6.21.0; JrxmlDebug for diagnostics, JrxmlGen for format reference; download_jars.sh for one-time dependency setup; agent/nodes.py: _render_jrxml_to_png() and _compute_pixel_similarity(); pixel comparison integrates into the validate node (SSIM < 0.4 fails); pixel fidelity context injected into correct_jrxml for targeted fixes; tests/test_pixel_comparison.py: 15 unit tests (render, SSIM, integration); .gitignore: exclude lib/java/*.jar, lib/java/*.class, tmp/; CLAUDE.md: v11 changelog documenting the rendering pipeline. All non-LLM tests pass (97/97).	2026-05-23 15:09:55 +08:00
panda	9de75d2f25	fix: escape $F{field_N} in correction.md to prevent Python format KeyError. $F{field_N} was being parsed by str.format() as a replacement field, causing a KeyError and crashing the correct_jrxml node. Changed to $F{{field_N}} (double braces -> literal brace in output).	2026-05-23 11:27:31 +08:00
panda	23cdfa8c2b	fix: map_fields empty-retry + correction prompt field_N guidance. map_fields: retry with a simplified prompt on an empty LLM response; correction.md: add explicit guidance for undeclared field_N errors (add <field> declarations + try OCR name replacement); MAX_RETRY=5 now effective (was overridden by .env:3).	2026-05-23 11:15:09 +08:00
panda	1210b926c3	fix: MAX_RETRY 5 + rolling continuation + namespace-aware JRXML extraction. MAX_RETRY: 3→5 (graph.py:35, nodes.py:25) with env override; rolling continuation: _generate_with_continuation() auto-detects truncated JRXML and sends anchor-based continuation, max 3 rounds; JRXML extraction: regex/end-tag now namespace-prefix aware (ns0:jasperReport, ns:jasperReport, etc.); all 5 generation nodes refactored to use the continuation helper; tests updated: scenario1 accepts ns-prefixed root, max_retry verifies graph termination; stop_reason capture + WARNING log on max_tokens truncation; correction prompt now injects OCR context + layout schema.	2026-05-23 10:58:46 +08:00
panda	4dfc418fc5	fix: escape {field_N} braces in prompt templates to prevent .format() KeyError. The literal text $F{field_1} in the skeleton_generation/refine_layout/field_mapping prompts was being parsed as a Python .format() placeholder, causing a KeyError on every image-based initial_generation request. Escaped with double braces so .format() outputs a literal {field_1} for the LLM.	2026-05-22 08:12:56 +08:00
panda	43a0542a11	feat: layered precise generation for A4 report images. 3-phase pipeline to solve LLM prompt overflow from too many OCR elements: Phase 1 (generate_skeleton): compressed layout schema → skeleton JRXML; Phase 2 (refine_layout): sampled coordinates → pixel-level position tuning; Phase 3 (map_fields): OCR field names → replace $F{field_N} placeholders. Only triggered when layout_schema.total_rows > 0 on initial_generation intent. Text requests and all other intents are unaffected (zero behavior change).	2026-05-21 08:34:32 +08:00
panda	9bb011e429	feat: v4 multimodal chat input, multi-format support, and annotation detection. Replace st.chat_input with st-multimodal-chatinput (Ctrl+V paste, drag-drop, file button); extract _process_uploaded_file() shared handler (eliminates ~70 duplicated lines); add XLSX (openpyxl), XLS (xlrd), DOC (olefile) parsers to file_parser.py; add backend/annotation_detector.py: circle detection (HoughCircles) + arrow detection (HoughLinesP clustering) + OCR correlation + LLM context formatting; add annotation_result field to AgentState with session persistence; wire annotation detection into process_input and _format_ocr_context; add 11 new tests: 7 annotation detector + 4 multi-format parser; update all docs: CLAUDE.md, README.md, CODE_GUIDE.md, ROADMAP.md.	2026-05-20 23:43:16 +08:00
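panda	70614dff5e	feat: comprehensive v2 upgrade: streaming, error KB, file upload, layout analysis. Major changes: Streaming: unified _BaseLLM interface for LLMs (invoke + stream); the generate/modify/correct nodes use get_stream_writer() for character-by-character output; UI nodes are laid out flat, expanded, and collapse automatically. Prompt externalization: 7 prompts split into prompts/*.md; loader.py supports hot reload. Self-growing error KB: backend/error_kb.py with fingerprint deduplication + ChromaDB persistence; entries are stored automatically when correct_jrxml→validate passes; retrieve also searches the error KB. File upload: backend/file_parser.py parses PDF/DOCX/image/text; multi-file upload in the sidebar; extracted text is injected automatically into the next message. A4 template recognition: backend/layout_analyzer.py with three modes (full A4 / row-fragment modification / row-fragment creation), PaddleOCR element extraction + row grouping + JRXML section matching. Session history download: jrxml_versions version tracking + sidebar download buttons for historical versions. Preview fix: route_after_save skips the validation loop for preview/export intents. Ctrl+C fix: JS injection intercepts Streamlit's bare "c" key cache clear. Docs: CLAUDE.md (full project documentation), ROADMAP.md (improvement roadmap). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-19 15:02:53 +08:00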
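
Commits 9de75d2f25 and 4dfc418fc5 both address the same class of bug: a literal $F{field_N} inside a prompt template being read by str.format() as a replacement field. The sketch below reproduces the failure and the double-brace fix. The template strings and the `errors` placeholder are hypothetical and are not taken from the repository's prompt files.

```python
# Hypothetical prompt templates; only the brace handling mirrors the commits.
template_broken = "Replace each $F{field_N} placeholder. Errors: {errors}"
template_fixed = "Replace each $F{{field_N}} placeholder. Errors: {errors}"

# Unescaped braces: str.format() looks up a keyword argument named "field_N".
try:
    template_broken.format(errors="none")
    raised = False
    missing_key = None
except KeyError as exc:
    raised = True
    missing_key = exc.args[0]

# Doubled braces are emitted as single literal braces.
rendered = template_fixed.format(errors="none")

print(raised, missing_key)
print(rendered)
```

Only templates that actually pass through str.format() need the doubled braces; a template that is sent to the LLM without formatting would show the doubled braces literally.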

13 Commits
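
The scoring change described in commit 2d5183d2bd can be checked with a little arithmetic. This is a minimal sketch, assuming the fidelity score is a plain weighted sum of the two coverage values and that the old rule failed on score < 0.5 alone, as the commit message implies. The function names and signatures are illustrative and are not taken from agent/nodes.py.

```python
def fidelity_score(element_coverage: float, field_coverage: float,
                   w_element: float = 0.8, w_field: float = 0.2) -> float:
    """Weighted sum of coverages (assumed form of the score formula)."""
    return w_element * element_coverage + w_field * field_coverage


def fails_fidelity(score: float, element_coverage: float,
                   score_floor: float = 0.5, element_floor: float = 0.4) -> bool:
    """New rule: fail only when BOTH values are below their thresholds."""
    return score < score_floor and element_coverage < element_floor


# Numbers quoted in the commit: 82% element coverage, and field coverage
# that was always zero because English and Chinese names never intersected.
element_coverage, field_coverage = 0.82, 0.0

old_score = fidelity_score(element_coverage, field_coverage, 0.5, 0.5)
old_fails = old_score < 0.5  # old rule, as implied by "0.41 < 0.5 -> fail"

new_score = fidelity_score(element_coverage, field_coverage)
new_fails = fails_fidelity(new_score, element_coverage)

print(round(old_score, 3), old_fails)
print(round(new_score, 3), new_fails)
```

With these inputs the old weights give 0.41 and a failure, while the new weights give 0.656 and a pass. The later commit 520c8b19d0 reports dropping the field_coverage weight entirely; how the formula looks after that change is not stated in the log.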