Root cause: LLM receiving full 34k-char JRXML would regenerate from scratch
instead of modifying coordinates in-place, shrinking output to ~3k chars.
Solution (programmatic node control, not prompt engineering):
- New agent/jrxml_windower.py: decompose JRXML into header (never sent to
LLM) + individual bands. Split bands >4000 chars at element boundaries.
Reassemble with element count validation (>10% change = rollback).
- Rewrite refine_layout: per-band windowed LLM processing (~2-4k chars
each). LLM cannot "reimagine" the entire report.
- Rewrite map_fields: 100% programmatic regex $F{field_N} -> real name
replacement. Zero LLM calls, zero content loss.
- _sanitize_field_name: non-ASCII chars escaped to _uXXXX_ format for
valid JRXML identifiers.
- Tests: 48 new unit tests (windower 28 + map_fields 20). All passing.
Full suite 385 tests, zero regressions.
- Replace truncated 12-char UUID with full 32-char UUID (128-bit entropy)
- Add validate_session_id() regex check to prevent path traversal
- Add _check_session_id() guard on all 6 API endpoints
- Change _step_counter from module global to contextvars.ContextVar
- Filter None values from node_state before merging into agent_state
- Log save_session failures instead of silently swallowing them
- Add finishStreaming() in catch/finally blocks to prevent UI lockup
- Fix broken multiline docstring in chat() endpoint
- api_server.py: rename 'filename' to 'file_name' in upload_file log extra
dict to avoid collision with Python logging's reserved LogRecord attribute
- test_e2e_ocr.py: replace return statements with assert in test functions
to fix PytestReturnNotNoneWarning
3-phase pipeline to solve LLM prompt overflow from too many OCR elements:
Phase 1 (generate_skeleton): compressed layout schema → skeleton JRXML
Phase 2 (refine_layout): sampled coordinates → pixel-level position tuning
Phase 3 (map_fields): OCR field names → replace $F{field_N} placeholders
Only triggered when layout_schema.total_rows > 0 on initial_generation intent.
Text requests and all other intents are unaffected (zero behavior change).