agent_jrxml

Author	SHA1	Message	Date
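panda	520c8b19d0	fix: root-cause fix for failures after five correction rounds - remove field_coverage weight from the scoring formula, check namespace unconditionally, auto-detect document type via OCR	2026-05-24 22:44:37 +08:00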
panda	bd5bfbac2d	fix: band-level windowed refine_layout + programmatic map_fields to prevent 91.5% content loss Root cause: LLM receiving full 34k-char JRXML would regenerate from scratch instead of modifying coordinates in-place, shrinking output to ~3k chars. Solution (programmatic node control, not prompt engineering): - New agent/jrxml_windower.py: decompose JRXML into header (never sent to LLM) + individual bands. Split bands >4000 chars at element boundaries. Reassemble with element count validation (>10% change = rollback). - Rewrite refine_layout: per-band windowed LLM processing (~2-4k chars each). LLM cannot "reimagine" the entire report. - Rewrite map_fields: 100% programmatic regex $F{field_N} -> real name replacement. Zero LLM calls, zero content loss. - _sanitize_field_name: non-ASCII chars escaped to _uXXXX_ format for valid JRXML identifiers. - Tests: 48 new unit tests (windower 28 + map_fields 20). All passing. Full suite 385 tests, zero regressions.	2026-05-24 08:55:38 +08:00
panda	bb6cc6e241	feat: add Java JRXML-to-PNG rendering pipeline with pixel-level SSIM comparison - lib/java/: Java renderer (JrxmlRenderer) using JasperReports 6.21.0 - JrxmlDebug for diagnostics, JrxmlGen for format reference - download_jars.sh for one-time dependency setup - agent/nodes.py: _render_jrxml_to_png() and _compute_pixel_similarity() - Pixel comparison integrates into validate node (SSIM < 0.4 fails) - Pixel fidelity context injected into correct_jrxml for targeted fixes - tests/test_pixel_comparison.py: 15 unit tests (render, SSIM, integration) - .gitignore: exclude lib/java/*.jar, lib/java/*.class, tmp/ - CLAUDE.md: v11 changelog documenting the rendering pipeline - All non-LLM tests pass (97/97)	2026-05-23 15:09:55 +08:00
panda	4dfc418fc5	fix: escape {field_N} braces in prompt templates to prevent .format() KeyError $F{field_1} literal text in skeleton_generation/refine_layout/field_mapping prompts was being parsed as Python .format() placeholder, causing KeyError on every image-based initial_generation request. Escaped with double braces so .format() outputs literal {field_1} for the LLM.	2026-05-22 08:12:56 +08:00
panda	43a0542a11	feat: layered precise generation for A4 report images 3-phase pipeline to solve LLM prompt overflow from too many OCR elements: Phase 1 (generate_skeleton): compressed layout schema → skeleton JRXML Phase 2 (refine_layout): sampled coordinates → pixel-level position tuning Phase 3 (map_fields): OCR field names → replace $F{field_N} placeholders Only triggered when layout_schema.total_rows > 0 on initial_generation intent. Text requests and all other intents are unaffected (zero behavior change).	2026-05-21 08:34:32 +08:00
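
The brace-escaping fix described in 4dfc418fc5 can be illustrated with a minimal sketch, assuming the prompts are rendered with Python's `str.format()`. The template text, the `request` parameter, and the `render`/`demo` helpers are hypothetical; only the `$F{field_1}` placeholder syntax comes from the commit message.

```python
# Hypothetical prompt templates (not taken from the repository).
# In the broken one, {field_1} is parsed by str.format() as a replacement field.
BROKEN_TEMPLATE = "Use $F{field_1} as a placeholder. Request: {request}"
# Doubling the braces makes str.format() emit a literal {field_1}.
FIXED_TEMPLATE = "Use $F{{field_1}} as a placeholder. Request: {request}"


def render(template: str, request: str) -> str:
    """Fill the single real placeholder, {request}."""
    return template.format(request=request)


def demo():
    """Return (name of the missing key from the broken template, fixed output)."""
    try:
        render(BROKEN_TEMPLATE, "invoice")
        missing_key = None
    except KeyError as exc:
        missing_key = exc.args[0]
    return missing_key, render(FIXED_TEMPLATE, "invoice")
```

Calling `demo()` shows the unescaped template raising `KeyError` for `field_1`, while the escaped one passes `$F{field_1}` through to the LLM unchanged.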

5 Commits
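
The programmatic `map_fields` rewrite in bd5bfbac2d (regex replacement of `$F{field_N}` plus `_uXXXX_` escaping of non-ASCII characters) might look roughly like the sketch below. This is an illustration under assumptions, not the repository's code: the function signatures, the `names` mapping, and the choice to turn other invalid ASCII characters into `_` are all hypothetical.

```python
import re
from typing import Dict

# Matches the numbered placeholders named in the commit message, e.g. $F{field_3}.
_PLACEHOLDER = re.compile(r"\$F\{field_(\d+)\}")


def sanitize_field_name(name: str) -> str:
    """Escape non-ASCII characters as _uXXXX_ (format from the commit message).

    Assumption: other ASCII characters that are not letters, digits, or
    underscores are replaced with a single underscore.
    """
    parts = []
    for ch in name:
        if ch.isascii() and (ch.isalnum() or ch == "_"):
            parts.append(ch)
        elif ch.isascii():
            parts.append("_")
        else:
            parts.append(f"_u{ord(ch):04X}_")
    return "".join(parts)


def map_fields(jrxml: str, names: Dict[int, str]) -> str:
    """Replace $F{field_N} with the sanitized real name; no LLM call involved.

    Placeholders whose index is missing from `names` are left untouched.
    """
    def _replace(match):
        index = int(match.group(1))
        if index not in names:
            return match.group(0)
        return "$F{" + sanitize_field_name(names[index]) + "}"

    return _PLACEHOLDER.sub(_replace, jrxml)
```

Because the substitution only touches the matched placeholders, the rest of the JRXML passes through byte for byte, which is the property the commit relies on to avoid content loss. A real implementation would presumably also rename the matching `<field name="...">` declarations, which this sketch leaves out.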