Root cause: layout_schema.regions is a list of region dicts, not a dict.
_log_ocr_layers() was calling .keys() on it, causing agent_error.
Also fixed: ProcessSection now stays visible after streaming ends (error or
completion), so generated content is not lost. Header shows ✓/✕/pulse indicators.
Error handler now refreshes session state for partial JRXML download.
Sidebar now has 快捷操作 section matching Streamlit app functionality:
- 预览 — sends "预览报表" to preview current JRXML
- 撤销 — sends "撤销上一步修改" to revert last change
- 重置 — sends "重新来,清空当前报表" to reset session
Session store now tracks history_states for undo availability check.
api_server.py passed "jpg" (no dot) from rsplit, but file_parser.py
parser dict keys all have dots (".jpg"), causing image files to fall
through to _parse_text() which fails on binary data, skipping ALL OCR
and layout analysis. Every image upload was affected.
- file_parser.py: normalize file_type to always have leading dot
- api_server.py: use Path.suffix instead of manual rsplit
- api_server.py: rename 'filename' to 'file_name' in upload_file log extra
dict to avoid collision with Python logging's reserved LogRecord attribute
- test_e2e_ocr.py: replace return statements with assert in test functions
to fix PytestReturnNotNoneWarning
cmd /k "cd /d "%~dp0" && ..." breaks because inner quotes around
%~dp0 close the outer quoted string prematurely when the path
contains spaces (D:\Idea Project\...). Fix: remove outer quotes,
escape && as ^&^& so it passes through to the new cmd instance.
3-phase pipeline to solve LLM prompt overflow from too many OCR elements:
Phase 1 (generate_skeleton): compressed layout schema → skeleton JRXML
Phase 2 (refine_layout): sampled coordinates → pixel-level position tuning
Phase 3 (map_fields): OCR field names → replace $F{field_N} placeholders
Only triggered when layout_schema.total_rows > 0 on initial_generation intent.
Text requests and all other intents are unaffected (zero behavior change).
- OCR: EasyOCR (primary, ch_sim+en) with PaddleOCR fallback for Windows compatibility
- Validation: _check_minimum_content() rejects empty-shell JRXML (no band/textField)
- Retry: MAX_RETRY 3→5, exhaustion records pending_failure_context for next-turn auto-injection
- Finalize: only saves jrxml_versions on pass, preserves last good final_jrxml on fail
- Extract JRXML: improved empty markdown block handling and XML fragment fallback
- UI: real-time node progress via placeholder updates, initial "analyzing" feedback
- UI: use agent_state (full) instead of node_state (partial) for summary card routing
- UI: unknown template_type now gives LLM meaningful image context instead of metadata
- Docs: updated CLAUDE.md and CODE_GUIDE.md to reflect all v3 changes
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add rag submodule for semantic JRXML chunk retrieval, refactor
retrieve node to use RAGSearcher, and fix missing api_key in
Anthropic SDK client initialization.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The langchain-anthropic wrapper fails auth with MiniMax because
it sends an api_key that conflicts with ANTHROPIC_AUTH_TOKEN at
the SDK level, causing the request to be sent with incorrect
auth headers. Use raw Anthropic SDK directly with a simple
MiniMaxLLM wrapper class instead.
Root cause: MiniMax requires the API key ONLY via ANTHROPIC_AUTH_TOKEN
(system env), not via api_key parameter or OPENAI_API_KEY. Setting
os.environ["NO_PROXY"]="*" is also needed to prevent httpx from
using a proxy that interferes with the auth header.
Note: E2E testing with streamlit run app.py still pending.
- Add LLM_PROVIDER env var (openai/anthropic) to switch cloud backend
- Use ChatAnthropic for anthropic provider with custom base_url
- Add CONTEXT_MAX_TOKENS, CONTEXT_KEEP_RECENT, SESSIONS_DIR,
HISTORY_MAX_SNAPSHOTS to .env and .env.example
- Add langchain-anthropic dependency to requirements.txt
Note: E2E testing blocked — the configured MiniMax API key
(sk-cp-...) returns 401 across all endpoints (Anthropic and OpenAI).
The API key may be expired or lack text-generation model access.