agent_jrxml/kb_data/d198ae3b32cd49f09736c4290dd1223a/db5e3844c382439dba91c29f4f29eeca/meta.json
panda bd5bfbac2d fix: band-level windowed refine_layout + programmatic map_fields to prevent 91.5% content loss
Root cause: when given the full 34k-char JRXML, the LLM regenerated the report
from scratch instead of modifying coordinates in place, shrinking output to ~3k chars.

Solution (programmatic node control, not prompt engineering):

- New agent/jrxml_windower.py: decompose JRXML into header (never sent to
  LLM) + individual bands. Split bands >4000 chars at element boundaries.
  Reassemble with element-count validation (a change of more than 10%
  triggers a rollback).

- Rewrite refine_layout: per-band windowed LLM processing (~2-4k chars
  each). LLM cannot "reimagine" the entire report.

- Rewrite map_fields: 100% programmatic regex $F{field_N} -> real name
  replacement. Zero LLM calls, zero content loss.

- _sanitize_field_name: non-ASCII chars escaped to _uXXXX_ format for
  valid JRXML identifiers.

- Tests: 48 new unit tests (windower 28 + map_fields 20). All passing.
  Full suite 385 tests, zero regressions.
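
The programmatic pieces described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function names and signatures, the `real_names` mapping, the lowercase hex in the `_uXXXX_` escape, and the way the 10% threshold is computed are all assumptions.

```python
import re

# Matches placeholder references such as $F{field_3}.
_FIELD_REF = re.compile(r"\$F\{field_(\d+)\}")


def sanitize_field_name(name: str) -> str:
    """Escape each non-ASCII character as _uXXXX_ (hex code point).

    Assumption: ASCII characters pass through unchanged; the real helper
    may also handle spaces or other characters that are invalid in names.
    """
    return "".join(
        ch if ord(ch) < 128 else "_u{:04x}_".format(ord(ch))
        for ch in name
    )


def map_fields(jrxml: str, real_names: dict) -> str:
    """Replace $F{field_N} with $F{<real name>} using regex only, no LLM.

    `real_names` is a hypothetical mapping from N (int) to the real name.
    """
    def _replace(match):
        index = int(match.group(1))
        if index not in real_names:
            return match.group(0)  # leave unknown placeholders untouched
        return "$F{" + sanitize_field_name(real_names[index]) + "}"

    return _FIELD_REF.sub(_replace, jrxml)


def element_count_ok(before: int, after: int, tolerance: float = 0.10) -> bool:
    """Return False when the element count changed by more than `tolerance`,
    which is the case where the caller would roll back to the original."""
    if before == 0:
        return after == 0
    return abs(after - before) / before <= tolerance
```

For example, `map_fields("$F{field_1}", {1: "total"})` yields `"$F{total}"`, and `element_count_ok(100, 80)` is false, so a band that lost 20% of its elements would be rolled back under this sketch.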
2026-05-24 08:55:38 +08:00

13 lines
350 B
JSON

{
  "kb_id": "db5e3844c382439dba91c29f4f29eeca",
  "user_id": "d198ae3b32cd49f09736c4290dd1223a",
  "name": "测试csv",
  "description": "csv",
  "created_at": "2026-05-23T15:31:44.119922+00:00",
  "updated_at": "2026-05-23T15:31:44.119922+00:00",
  "fields": [],
  "templates": [],
  "file_count": 0,
  "chunk_count": 0,
  "parse_status": "empty"
}