Commit Graph

41 Commits

Author SHA1 Message Date
panda 0af774ae9d fix: failure recovery forces modify_report intent bypassing LLM classify
- process_input sets _failure_recovery flag when injecting pending_failure_context
- classify_intent skips LLM classification when flag is set, directly routes to modify_jrxml
- Smart truncation for intent classify: keep head 200 + tail 300 chars instead of head 500
  (prevents user's actual message from being truncated away by long injected context)
- This fixes the bug where "retry" or pasted error messages were misclassified as
  consult_question or initial_generation after max retry exhaustion
2026-05-23 11:18:02 +08:00
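The head-plus-tail truncation described in commit 0af774ae9d can be sketched as follows. This is only an illustration, not the project's code: the function name `truncate_for_classify` and the marker string are assumptions, and only the 200/300 split comes from the commit message.

```python
def truncate_for_classify(text: str, head: int = 200, tail: int = 300) -> str:
    """Keep the first `head` and the last `tail` characters of `text`.

    Hypothetical helper: the name and the marker string are illustrative.
    """
    if len(text) <= head + tail:
        return text  # short enough, nothing to cut
    marker = "\n...[truncated]...\n"
    # Slice from an explicit index so that tail=0 yields an empty tail.
    return text[:head] + marker + text[len(text) - tail:]


# A long injected context followed by the user's short message.
prompt = "previous failure context " * 100 + "USER: retry"
kept = truncate_for_classify(prompt)
print(kept.endswith("USER: retry"))         # the user's message survives
print(prompt[:500].endswith("USER: retry"))  # head-only truncation drops it
```

Keeping the tail matters here because the injected failure context is prepended, so the user's actual message sits at the end of the string.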
panda 23cdfa8c2b fix: map_fields empty-retry + correction prompt field_N guidance
- map_fields: retry with simplified prompt on empty LLM response
- correction.md: add explicit guidance for undeclared field_N errors
  (add <field> declarations + try OCR name replacement)
- MAX_RETRY=5 now effective (was overridden by .env:3)
2026-05-23 11:15:09 +08:00
panda 1210b926c3 fix: MAX_RETRY 5 + rolling continuation + namespace-aware JRXML extraction
- MAX_RETRY: 3→5 (graph.py:35, nodes.py:25) with env override
- Rolling continuation: _generate_with_continuation() auto-detects
  truncated JRXML and sends anchor-based continuation, max 3 rounds
- JRXML extraction: regex/end-tag now namespace-prefix aware
  (ns0:jasperReport, ns:jasperReport, etc.)
- All 5 generation nodes refactored to use continuation helper
- Tests updated: scenario1 accepts ns-prefixed root, max_retry
  verifies graph termination
- stop_reason capture + WARNING log on max_tokens truncation
- Correction prompt now injects OCR context + layout schema
2026-05-23 10:58:46 +08:00
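A namespace-prefix-aware extraction like the one named in commit 1210b926c3 might look like the sketch below. The pattern and the function name `extract_jrxml` are assumptions made for illustration; the commit only states that the regex and end-tag handling accept prefixes such as `ns0:`.

```python
import re
from typing import Optional

# The optional prefix (e.g. "ns0:") is captured once and must reappear,
# unchanged, on the closing tag via the (?P=prefix) backreference.
_JRXML_RE = re.compile(
    r"<(?P<prefix>(?:[A-Za-z_][\w.-]*:)?)jasperReport\b.*?"
    r"</(?P=prefix)jasperReport\s*>",
    re.DOTALL,
)


def extract_jrxml(text: str) -> Optional[str]:
    """Return the first complete jasperReport element in `text`, or None.

    Hypothetical helper: returns None when the closing tag is missing,
    which a caller could treat as a sign of truncated output.
    """
    match = _JRXML_RE.search(text)
    return match.group(0) if match else None


reply = 'Here you go:\n<ns0:jasperReport xmlns:ns0="urn:x"><ns0:title/></ns0:jasperReport>\nDone.'
print(extract_jrxml(reply))
print(extract_jrxml("<ns0:jasperReport><ns0:title>"))  # no end tag
```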
panda 83e801a0b8 fix: auto-inject JasperReports namespace before XSD validation
AI-generated JRXML often omits the xmlns declaration on the root element.
The XSD schema requires targetNamespace, so validation would fail with
"Element 'jasperReport': No matching global declaration available".

_ensure_jr_namespace() detects missing xmlns and injects it before
schema validation, making the validator tolerant of namespace-free JRXML.
2026-05-23 09:44:08 +08:00
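The namespace injection from commit 83e801a0b8 could be approximated as below. This is a sketch under assumptions: the body is not taken from the repository, the public name `ensure_jr_namespace` is illustrative, and the root-tag regex assumes no attribute value contains a `>` character.

```python
import re
import xml.etree.ElementTree as ET

# Default namespace used by JasperReports JRXML files.
JR_NS = "http://jasperreports.sourceforge.net/jasperreports"
_ROOT_RE = re.compile(r"<jasperReport\b([^>]*)>")


def ensure_jr_namespace(jrxml: str) -> str:
    """Add a default xmlns to an unprefixed <jasperReport> root if missing."""
    match = _ROOT_RE.search(jrxml)
    if match is None:
        return jrxml  # no unprefixed root element found, leave untouched
    if re.search(r"\bxmlns\s*=", match.group(1)):
        return jrxml  # a default namespace is already declared
    insert_at = match.start() + len("<jasperReport")
    return jrxml[:insert_at] + f' xmlns="{JR_NS}"' + jrxml[insert_at:]


raw = '<jasperReport name="demo" pageWidth="595"></jasperReport>'
fixed = ensure_jr_namespace(raw)
print(ET.fromstring(fixed).tag)
```

The check looks for `xmlns=` specifically, so a root that only declares prefixed namespaces such as `xmlns:xsi` still receives the default namespace.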
panda c2cae5665e fix: replace complex bat scripts with Python launcher + minimal bat wrappers
Root cause: Windows batch files written with LF endings caused cmd.exe to
misparse labels and Chinese characters, producing garbled "not a command"
errors. The Python launcher avoids encoding issues entirely.

- start.py: reliable cross-platform launcher (kill ports, start 3 services,
  wait for health, print status)
- start.bat / start_all.bat: minimal 4-line ASCII wrappers
- stop.bat: inline Python for port-based process killing
2026-05-23 09:32:32 +08:00
panda c8924c625c fix: rewrite startup scripts with reliable helpers, stderr logging, visible windows
- Replace /MIN (hidden window) with normal windows so errors are visible
- Redirect stderr to logs/*.log for post-mortem
- Extract killport/wait_health/wait_port into callable helpers
- Use !N! (delayed expansion) for retry counters
- stop.bat now shows which PIDs it kills with port labels
- Remove nested-quote issue by cd'ing before npm start
2026-05-23 09:25:45 +08:00
panda 9a4f51d378 fix: add retry limit to startup wait loops to prevent infinite hang
Each service wait loop now fails after 30 retries (~60s) instead of
spinning forever when a port is occupied by a stuck process.
Also added cleanup label that kills partially-started services on failure.
2026-05-23 09:20:55 +08:00
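The bounded wait loop from commit 9a4f51d378 was written for batch scripts; the same idea in Python (the language of the later start.py launcher) might look like this. The names `wait_until` and `port_is_open` are hypothetical, and only the 30-retry limit at roughly two seconds each comes from the commit message.

```python
import socket
import time
from typing import Callable


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until(
    check: Callable[[], bool],
    retries: int = 30,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` up to `retries` times; give up instead of spinning forever."""
    for attempt in range(retries):
        if check():
            return True
        if attempt < retries - 1:
            sleep(delay)  # no sleep after the final failed attempt
    return False
```

A caller would pass something like `lambda: port_is_open("127.0.0.1", 8000)` and run its cleanup path when the function returns `False`.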
panda 40adf50702 fix: add chcp 65001 and .venv check to startup scripts 2026-05-23 09:15:44 +08:00
panda 751df5c4a9 fix: resolve quoting issue in start_all.bat frontend launch, add node_modules check 2026-05-23 09:11:53 +08:00
panda 93ad5e8876 fix: address audit findings — session_id validation, streaming reset, state isolation
- Replace truncated 12-char UUID with full 32-char UUID (128-bit entropy)
- Add validate_session_id() regex check to prevent path traversal
- Add _check_session_id() guard on all 6 API endpoints
- Change _step_counter from module global to contextvars.ContextVar
- Filter None values from node_state before merging into agent_state
- Log save_session failures instead of silently swallowing them
- Add finishStreaming() in catch/finally blocks to prevent UI lockup
- Fix broken multiline docstring in chat() endpoint
2026-05-23 09:08:53 +08:00
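A session ID check in the spirit of commit 93ad5e8876 can be sketched as follows. `validate_session_id` is named in the commit, but its body here is an assumption, as is the use of `uuid.uuid4().hex` to produce the 32-character ID.

```python
import re
import uuid

# Exactly 32 lowercase hex characters: no dots, slashes, or separators,
# so the value cannot escape a sessions directory when used in a file name.
_SESSION_ID_RE = re.compile(r"[0-9a-f]{32}")


def new_session_id() -> str:
    """Generate a full 32-character hex ID (assumed to be uuid4-based)."""
    return uuid.uuid4().hex


def validate_session_id(session_id: object) -> bool:
    """Return True only for a well-formed 32-character hex session ID."""
    return (
        isinstance(session_id, str)
        and _SESSION_ID_RE.fullmatch(session_id) is not None
    )


print(validate_session_id(new_session_id()))
print(validate_session_id("../../etc/passwd"))
```

`fullmatch` is used rather than `match` with a `$` anchor because `$` would also accept a trailing newline.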
panda 1952d75f13 test: add unit/integration/E2E test suites, fix create_session bug, update docs
- Unit tests: test_session.py (27), test_error_kb.py (24), test_agent.py hardened
- Integration tests: test_api_integration.py (25) with FastAPI TestClient
- E2E tests: main-flows.spec.ts (8) with Playwright + API mocking
- Bug fix: backend/session.py create_session() missing session_id parameter
- Config: frontend/playwright.config.ts, npm run test:e2e
- Docs: update CLAUDE.md v9, .gitignore for test artifacts/eval reports
2026-05-23 08:38:29 +08:00
panda b444303055 docs: CLAUDE.md v8 — prompt escape fix + installed plugins/skills reference 2026-05-22 23:01:59 +08:00
panda 1e5ce9725b feat: FastAPI+SSE API server, JRXML auto-reorder, session integrity fixes 2026-05-22 17:53:59 +08:00
panda 1144a86d02 fix: session persistence, multi-turn memory, OCR pipeline, download UX (v7)
- graph.stream() state fix: agent_state now properly accumulates node updates
- atomic session save (tempfile + os.replace)
- uploaded_file_path injection for OcrExtractor + annotation_detector
- download section always visible; refreshFromApi auto-reloads after generation
- node_start/complete unfiltered for full progress visibility
- modification_request without status=='pass' check
2026-05-22 11:13:25 +08:00
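The "atomic session save (tempfile + os.replace)" item in commit 1144a86d02 refers to a common pattern, sketched below. The function name `save_session_atomic` and the JSON payload are assumptions for illustration.

```python
import json
import os
import tempfile


def save_session_atomic(path: str, data: dict) -> None:
    """Write `data` as JSON so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be in the same directory so os.replace stays
    # on one filesystem and can swap the file in a single step.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, ensure_ascii=False)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the swap
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process dies mid-write, the previous session file is still intact because only the temporary file was touched.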
panda 4dfc418fc5 fix: escape {field_N} braces in prompt templates to prevent .format() KeyError
$F{field_1} literal text in skeleton_generation/refine_layout/field_mapping
prompts was being parsed as Python .format() placeholder, causing KeyError
on every image-based initial_generation request. Escaped with double braces
so .format() outputs literal {field_1} for the LLM.
2026-05-22 08:12:56 +08:00
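The failure and fix in commit 4dfc418fc5 can be reproduced in a few lines. The template strings and the `report_name` placeholder are made up for the example; only the `$F{field_1}` literal and the double-brace escape come from the commit message.

```python
UNESCAPED = "Reference the first column as $F{field_1} in report {report_name}."
ESCAPED = "Reference the first column as $F{{field_1}} in report {report_name}."


def render(template: str, **values: str) -> str:
    """Fill a prompt template with str.format (illustrative helper)."""
    return template.format(**values)


try:
    render(UNESCAPED, report_name="invoice")
except KeyError as exc:
    # str.format treats {field_1} as a placeholder with no matching value.
    print("KeyError:", exc)

# Doubled braces are emitted as literal single braces.
print(render(ESCAPED, report_name="invoice"))
```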
panda 339d415322 fix: crash 'list' object has no attribute 'keys' on image upload, output disappearing on error
Root cause: layout_schema.regions is a list of region dicts, not a dict.
_log_ocr_layers() was calling .keys() on it, causing agent_error.

Also fixed: ProcessSection now stays visible after streaming ends (error or
completion), so generated content is not lost. Header shows ✓/✕/pulse indicators.
Error handler now refreshes session state for partial JRXML download.
2026-05-22 00:01:54 +08:00
panda d600cbf285 feat: add quick action buttons (preview/undo/reset) to sidebar
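Sidebar now has a 快捷操作 (quick actions) section matching Streamlit app functionality:
- 预览 (preview): sends "预览报表" ("preview the report") to preview the current JRXML
- 撤销 (undo): sends "撤销上一步修改" ("undo the last change") to revert the last change
- 重置 (reset): sends "重新来,清空当前报表" ("start over, clear the current report") to reset the session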

Session store now tracks history_states for undo availability check.
2026-05-21 23:54:57 +08:00
panda a364e1de81 feat: 5-issue fix — OCR image parse bug + Vue frontend feature parity + streaming UX
Fix 1 (CRITICAL): file_parser.py suffix normalization ".jpg", api_server.py Path.suffix
Fix 2: Sidebar version history download, ProcessSection replaces old components
Fix 3: OCR content/position layer structured logging in agent/nodes.py
Fix 4: collapsible process sections with per-section stream routing + auto-fold
Fix 5: agent_complete total_duration_ms, SummaryCard duration display

- backend/file_parser.py: normalize suffix to always include leading dot
- api_server.py: step_index in node_start, total_duration_ms in agent_complete
- agent/nodes.py: _log_ocr_layers() for [内容层]/[位置层]/[合并] (content layer / position layer / merged) logging
- frontend: ProcessSection.vue (NEW), chat.ts sections model, Sidebar versions
- CLAUDE.md: updated component list and v6 changelog
2026-05-21 23:43:21 +08:00
panda 60e2f520ba fix: image files silently falling to text parser due to suffix dot mismatch
api_server.py passed "jpg" (no dot) from rsplit, but file_parser.py
parser dict keys all have dots (".jpg"), causing image files to fall
through to _parse_text() which fails on binary data, skipping ALL OCR
and layout analysis. Every image upload was affected.

- file_parser.py: normalize file_type to always have leading dot
- api_server.py: use Path.suffix instead of manual rsplit
2026-05-21 23:05:27 +08:00
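The suffix mismatch fixed in commit 60e2f520ba comes down to `"jpg"` versus `".jpg"`. A minimal sketch of the normalization follows; the `PARSERS` table and both function names are stand-ins, not the project's actual parser registry.

```python
from pathlib import Path

# Stand-in for the parser table: keys carry a leading dot.
PARSERS = {".jpg": "image", ".png": "image", ".pdf": "pdf", ".txt": "text"}


def normalize_suffix(file_type: str) -> str:
    """Lowercase the type and make sure it starts with exactly one dot."""
    cleaned = file_type.strip().lower().lstrip(".")
    return "." + cleaned if cleaned else ""


def parser_kind(file_type: str) -> str:
    """Look up the parser, falling back to the text parser when unknown."""
    return PARSERS.get(normalize_suffix(file_type), "text")


manual = "scan.JPG".rsplit(".", 1)[-1]   # "JPG": no dot, wrong case
print(PARSERS.get(manual, "text"))       # silently falls back to text
print(parser_kind(manual))               # normalized lookup finds the image parser
print(parser_kind(Path("scan.JPG").suffix))
```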
panda 83c7da7517 fix: system env vars silently overriding .env — load_dotenv(override=True)
Root cause: load_dotenv() default override=False meant system-level
ANTHROPIC_BASE_URL (https://api.deepseek.com/anthropic) took precedence
over .env's OPENAI_BASE_URL (https://api.minimaxi.com/anthropic). All
Anthropic API calls went to DeepSeek with a MiniMax key, causing 401.

Changes:
- backend/llm.py: load_dotenv(override=True) — .env always wins
- .env.example: add explicit ANTHROPIC_API_KEY + ANTHROPIC_BASE_URL
- CLAUDE.md: document env var priority pitfall
2026-05-21 22:36:43 +08:00
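The precedence problem in commit 83c7da7517 can be illustrated without python-dotenv by a small stand-in that mimics the documented meaning of its `override` flag. `apply_env_file` is a hypothetical function operating on a plain dict, and using the same key on both sides is an assumption made to keep the example short.

```python
from typing import Dict, Mapping


def apply_env_file(
    environ: Dict[str, str],
    file_values: Mapping[str, str],
    override: bool = False,
) -> None:
    """Merge .env-style values into `environ`.

    With override=False an existing variable wins; with override=True
    the value from the file wins. This is a stand-in, not python-dotenv.
    """
    for key, value in file_values.items():
        if override or key not in environ:
            environ[key] = value


system_env = {"ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic"}
dotenv_values = {"ANTHROPIC_BASE_URL": "https://api.minimaxi.com/anthropic"}

default_mode = dict(system_env)
apply_env_file(default_mode, dotenv_values)
print(default_mode["ANTHROPIC_BASE_URL"])   # system-level value is kept

override_mode = dict(system_env)
apply_env_file(override_mode, dotenv_values, override=True)
print(override_mode["ANTHROPIC_BASE_URL"])  # .env value wins
```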
panda aa1d8a6c52 fix: logging KeyError with reserved 'filename' key, pytest return-not-none warnings
- api_server.py: rename 'filename' to 'file_name' in upload_file log extra
  dict to avoid collision with Python logging's reserved LogRecord attribute
- test_e2e_ocr.py: replace return statements with assert in test functions
  to fix PytestReturnNotNoneWarning
2026-05-21 22:28:07 +08:00
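The logging collision fixed in commit aa1d8a6c52 is standard-library behaviour and can be shown directly. The logger name and the helper `log_upload` are illustrative.

```python
import logging

logger = logging.getLogger("upload_demo")
logger.setLevel(logging.INFO)
logger.addHandler(logging.NullHandler())  # keep the demo silent
logger.propagate = False


def log_upload(extra_key: str, name: str) -> bool:
    """Return True if the record was created, False if the key was rejected."""
    try:
        logger.info("file uploaded", extra={extra_key: name})
        return True
    except KeyError:
        # LogRecord already has a `filename` attribute (the source file of
        # the logging call), so logging refuses to overwrite it.
        return False


print(log_upload("filename", "scan.jpg"))   # reserved attribute name
print(log_upload("file_name", "scan.jpg"))  # renamed key is accepted
```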
panda 960312b088 fix: start.bat nested quote parsing with path containing spaces
cmd /k "cd /d "%~dp0" && ..." breaks because inner quotes around
%~dp0 close the outer quoted string prematurely when the path
contains spaces (D:\Idea Project\...). Fix: remove outer quotes,
escape && as ^&^& so it passes through to the new cmd instance.
2026-05-21 22:14:17 +08:00
panda 7c1aa7d934 docs: update architecture docs for Vue 3 + FastAPI separation, add one-click start.bat
- CLAUDE.md: remove duplicate architecture section, fix MAX_RETRY 5→3
- README.md: update architecture diagram to 3-tier, add start.bat instructions
- ROADMAP.md: add Phase 6 (阶段六), layered generation v5 (items 16-20)
- start.bat: one-click startup with auto port-kill and path-with-spaces fix
- package-lock.json: updated from npm install
2026-05-21 22:10:22 +08:00
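panda 74f3f03d2c feat: frontend/backend separation architecture: FastAPI SSE backend + Vue 3 frontend
Split the monolithic Streamlit app into a three-tier architecture:
- api_server.py: FastAPI SSE streaming backend (port 8000)
- frontend/: Vue 3 + Vite + Pinia chat frontend (port 5173)
- agent/graph.py: add node_start callback support
- Update startup scripts to three-service mode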

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 20:04:27 +08:00
panda 2befd44430 Merge remote v4/v5 features (multimodal chat input, layered generation, annotation detection) with local v3 features (dialog file upload, XLSX support, session fix)
Key resolutions:
- agent/nodes.py: Merged session_id exclusion fix with new persistable fields (ocr_extraction_result, annotation_result, layout_schema, ocr_elements)
- app.py: Adopted st-multimodal-chatinput for unified paste/drop/upload, removed custom JS paste bridge
- backend/file_parser.py: Kept local XLSX parser, added remote XLS/DOC parsers
- CLAUDE.md + CODE_GUIDE.md: Merged documentation from both branches

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 10:05:43 +08:00
panda 43a0542a11 feat: layered precise generation for A4 report images
3-phase pipeline to solve LLM prompt overflow from too many OCR elements:
Phase 1 (generate_skeleton): compressed layout schema → skeleton JRXML
Phase 2 (refine_layout): sampled coordinates → pixel-level position tuning
Phase 3 (map_fields): OCR field names → replace $F{field_N} placeholders

Only triggered when layout_schema.total_rows > 0 on initial_generation intent.
Text requests and all other intents are unaffected (zero behavior change).
2026-05-21 08:34:32 +08:00
panda 9bb011e429 feat: v4 multimodal chat input, multi-format support, and annotation detection
- Replace st.chat_input with st-multimodal-chatinput (Ctrl+V paste, drag-drop, file button)
- Extract _process_uploaded_file() shared handler (eliminates ~70 duplicated lines)
- Add XLSX (openpyxl), XLS (xlrd), DOC (olefile) parsers to file_parser.py
- Add backend/annotation_detector.py: circle detection (HoughCircles) + arrow detection (HoughLinesP clustering) + OCR correlation + LLM context formatting
- Add annotation_result field to AgentState with session persistence
- Wire annotation detection into process_input and _format_ocr_context
- Add 11 new tests: 7 annotation detector + 4 multi-format parser
- Update all docs: CLAUDE.md, README.md, CODE_GUIDE.md, ROADMAP.md
2026-05-20 23:43:16 +08:00
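panda 87ead4fa6a feat: file upload in the chat area (paste/drag-and-drop) + XLSX support + fix for infinite loop on session switch
- Chat area: st.file_uploader + global paste/drop event listeners + sessionStorage bridge
- File preview chips: shown in the chat area after upload; removable one file at a time
- OCR two-layer parsing fully wired in: file_parser (text) + ocr_extractor (field extraction)
- XLSX parsing: openpyxl reads sheet by sheet, row by row
- Fix: create_session always writes agent_state.session_id
- Fix: load_session_node no longer overwrites session_id from disk
- Fix: _last_switched_to sentinel on session switch prevents infinite rerun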

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 12:04:02 +08:00
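panda da79640259 fix: OCR field-extraction integration fix + session-switch infinite-loop fix + one-click startup scripts
- process_input passes 17 default Chinese field names (fixes zero fields being extracted because of an empty list)
- OCR extraction results are automatically injected into the LLM context
- save_session_node/load_session_node persist session_id (fixes infinite rerun on session switch)
- app.py explicitly sets session_id after a session switch (defense in depth)
- Add start.bat / stop.bat one-click start/stop scripts
- Update CLAUDE.md + CODE_GUIDE.md docs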

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 10:17:05 +08:00
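panda c9f003e1b7 feat: add OCR module for precise document field extraction
- Add backend/ocr_extractor.py: two-stage extraction pipeline (document analysis + field extraction)
- Four extraction strategies: exact KV match / fuzzy KV match / regex patterns / table-structure match
- agent/state.py: add ocr_extraction_result and uploaded_file_path fields
- agent/nodes.py: automatically trigger the OCR extraction hook in process_input()
- app.py: keep the image path on file upload; show extraction results in the summary card
- .env.example: add OCR_USE_GPU / OCR_CONFIDENCE_THRESHOLD settings
- tests/test_ocr_extraction.py: 48 unit tests, all passing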
2026-05-20 08:06:55 +08:00
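panda 067880bf2e feat: add structured logging system; update LLM config and all docs
Added:
- backend/logger.py: centralized logging module (JSON format + trace_id + separate llm.log)
- @log_node / @_log_route decorators cover 17 nodes and 8 routes

Improved:
- backend/llm.py: _LLMLoggingWrapper automatically logs LLM inputs and outputs
- backend/llm.py: API key is read from ANTHROPIC_API_KEY first; model name changed to MiniMax-M2.7
- backend/llm.py: get_llm() gains a caller parameter identifying the call site
- backend/validation.py: add logs for validation results / connection failures
- backend/session.py: add logs for session creation/deletion
- app.py: add user-interaction logs (input / execution / exceptions / session operations)
- app.py: import torchvision early to suppress transformers lazy-loading errors
- .env.example: add LOG_DIR/LOG_LEVEL/ANTHROPIC_API_KEY and other settings
- .gitignore: add ignore rules for logs/ and db/

Docs:
- ROADMAP.md: add Phase 4, observability
- README.md: add logging architecture / LLM config / project structure
- CLAUDE.md: sync latest config / logging / MAX_RETRY(3)
- CODE_GUIDE.md: add chapter 15 (logging system); update architecture diagram / LLM / config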
2026-05-19 23:40:01 +08:00
panda 6467fd4ae5 feat: v3 robustness upgrade — EasyOCR, failure recovery, minimum content check
- OCR: EasyOCR (primary, ch_sim+en) with PaddleOCR fallback for Windows compatibility
- Validation: _check_minimum_content() rejects empty-shell JRXML (no band/textField)
- Retry: MAX_RETRY 3→5, exhaustion records pending_failure_context for next-turn auto-injection
- Finalize: only saves jrxml_versions on pass, preserves last good final_jrxml on fail
- Extract JRXML: improved empty markdown block handling and XML fragment fallback
- UI: real-time node progress via placeholder updates, initial "analyzing" feedback
- UI: use agent_state (full) instead of node_state (partial) for summary card routing
- UI: unknown template_type now gives LLM meaningful image context instead of metadata
- Docs: updated CLAUDE.md and CODE_GUIDE.md to reflect all v3 changes

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 19:15:30 +08:00
panda 70614dff5e feat: comprehensive v2 upgrade — streaming, error KB, file upload, layout analysis
Major changes:
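- Streaming: LLMs unified behind the _BaseLLM interface (invoke + stream); the generate/modify/correct
  nodes use get_stream_writer() for character-by-character output; UI nodes are laid out flat, expanded, and auto-collapse
- Prompt externalization: 7 prompts split into prompts/*.md; loader.py supports hot reload
- Self-growing error KB: backend/error_kb.py, with fingerprint deduplication + ChromaDB persistence;
  entries are added automatically when correct_jrxml→validate passes; retrieve also searches the error KB
- File upload: backend/file_parser.py parses PDF/DOCX/images/text;
  multi-file upload in the sidebar; extracted text is automatically injected into the next message
- A4 template recognition: backend/layout_analyzer.py with three modes (full A4 / modify a row fragment / create from a row fragment);
  PaddleOCR element extraction + row grouping + JRXML section matching
- Session history download: jrxml_versions version tracking + sidebar download buttons for past versions
- Preview fix: route_after_save skips the validation loop for preview/export intents
- Ctrl+C fix: injected JS intercepts Streamlit's bare `c` key clear-cache shortcut

Docs: CLAUDE.md (full project documentation), ROADMAP.md (improvement roadmap)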

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 15:02:53 +08:00
panda b280c2b453 feat: integrate RAG rag_jrxml submodule and fix Anthropic API key
Add rag submodule for semantic JRXML chunk retrieval, refactor
retrieve node to use RAGSearcher, and fix missing api_key in
Anthropic SDK client initialization.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 09:42:57 +08:00
panda 4416c20b77 feat: update init_kb script 2026-05-15 08:29:01 +08:00
panda 664de945f1 fix: use raw Anthropic SDK for MiniMax with NO_PROXY workaround
The langchain-anthropic wrapper fails auth with MiniMax because
it sends an api_key that conflicts with ANTHROPIC_AUTH_TOKEN at
the SDK level, causing the request to be sent with incorrect
auth headers. Use raw Anthropic SDK directly with a simple
MiniMaxLLM wrapper class instead.

Root cause: MiniMax requires the API key ONLY via ANTHROPIC_AUTH_TOKEN
(system env), not via api_key parameter or OPENAI_API_KEY. Setting
os.environ["NO_PROXY"]="*" is also needed to prevent httpx from
using a proxy that interferes with the auth header.

Note: E2E testing with streamlit run app.py still pending.
2026-05-15 00:35:41 +08:00
panda 76f98a7aeb feat: add Anthropic API provider support and missing env vars
- Add LLM_PROVIDER env var (openai/anthropic) to switch cloud backend
- Use ChatAnthropic for anthropic provider with custom base_url
- Add CONTEXT_MAX_TOKENS, CONTEXT_KEEP_RECENT, SESSIONS_DIR,
  HISTORY_MAX_SNAPSHOTS to .env and .env.example
- Add langchain-anthropic dependency to requirements.txt

Note: E2E testing blocked — the configured MiniMax API key
(sk-cp-...) returns 401 across all endpoints (Anthropic and OpenAI).
The API key may be expired or lack text-generation model access.
2026-05-14 23:39:00 +08:00
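panda d0f5d05316 docs: project overview: architecture, quick start, environment variables, project structure
README.md covers:
  feature overview, architecture diagram (Streamlit→LangGraph→FastAPI), prerequisites,
  5-step quick-start guide, usage examples, notes on validation-service limitations, project structure,
  full list of environment variables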
2026-05-14 23:21:31 +08:00
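panda e113374682 feat: Streamlit multi-turn chat UI + integration tests
app.py:
  Sidebar: session management (create/switch/delete), quick actions (preview/undo/reset),
         configuration info, JRXML download
  Main area: multi-turn chat, intent-specific rendering for 8 intents (JRXML code / consultation answers /
          error explanations / success messages)
  URL parameter: ?session_id= for session sharing

tests/:
  test_validation.py: 6 unit tests for the validation service (health check / empty content /
                      invalid XML / missing dimensions / valid JRXML / field references)
  test_agent.py: 5 integration acceptance scenarios (simple generation / auto-correction /
                  multi-turn modification / context-aware modification / max-retry handling)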
2026-05-14 23:21:22 +08:00
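panda 4b43c5d3e4 feat: LangGraph workflow core: agent state/nodes/graph + validation service + knowledge base
agent/
  state.py: AgentState TypedDict (20 fields, including intent/compression/session/undo)
  nodes.py: 17 node functions (generate/modify/validate/correct/intent classification/compression/undo/reset)
  graph.py: 17-node state graph with routing across 8 intents

Validation service validation_service/
  main.py: FastAPI service; lxml XSD validation + structural checks (field references/SQL/dimensions)

Data data/
  sample_templates/: 4 sample JRXML templates
  corrections/: 3 error-correction cases

Scripts scripts/
  init_kb.py: Chroma knowledge-base initialization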
2026-05-14 23:21:10 +08:00
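panda 21a5fdf930 feat: backend infrastructure: LLM factory / embedding factory / validation client / session persistence
- backend/llm.py: supports switching between OpenAI-compatible APIs and local Ollama models
- backend/embeddings.py: supports cloud and local embedding models (sentence-transformers)
- backend/validation.py: HTTP client for the FastAPI validation service
- backend/session.py: JSON-file session management (create/load/save/list/delete)
- .env.example: full environment-variable template
- requirements.txt: declares all Python dependencies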
2026-05-14 23:20:56 +08:00