Commit Graph

7 Commits

Author SHA1 Message Date
panda 1e5ce9725b feat: FastAPI+SSE API server, JRXML auto-reorder, session integrity fixes 2026-05-22 17:53:59 +08:00
panda 83c7da7517 fix: system env vars silently overriding .env — load_dotenv(override=True)
Root cause: load_dotenv() default override=False meant system-level
ANTHROPIC_BASE_URL (https://api.deepseek.com/anthropic) took precedence
over .env's OPENAI_BASE_URL (https://api.minimaxi.com/anthropic). All
Anthropic API calls went to DeepSeek with a MiniMax key, causing 401.

Changes:
- backend/llm.py: load_dotenv(override=True) — .env always wins
- .env.example: add explicit ANTHROPIC_API_KEY + ANTHROPIC_BASE_URL
- CLAUDE.md: document env var priority pitfall
2026-05-21 22:36:43 +08:00
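
The precedence pitfall described in this commit can be modelled without any third-party package. The sketch below is a simplified stand-in for the merge step, not python-dotenv's actual code. The helper name `apply_dotenv` is hypothetical, and it assumes the .env file defines the same key as the system environment.

```python
from typing import Mapping, MutableMapping


def apply_dotenv(
    file_values: Mapping[str, str],
    environ: MutableMapping[str, str],
    override: bool = False,
) -> None:
    """Simplified model of merging .env values into an environment mapping."""
    for key, value in file_values.items():
        # With override=False an existing variable is left untouched,
        # so a system-level value silently wins over the .env file.
        if override or key not in environ:
            environ[key] = value


# Illustrative values taken from the commit message above.
system_env = {"ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic"}
dotenv_file = {"ANTHROPIC_BASE_URL": "https://api.minimaxi.com/anthropic"}

default_env = dict(system_env)
apply_dotenv(dotenv_file, default_env)  # default: system value survives

fixed_env = dict(system_env)
apply_dotenv(dotenv_file, fixed_env, override=True)  # .env value wins
```

In this model the default call keeps the DeepSeek URL, while `override=True` replaces it with the value from the file, which matches the behaviour the fix relies on.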
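panda c9f003e1b7 feat: add OCR module for precise document field extraction
- Add backend/ocr_extractor.py: two-stage extraction pipeline (document analysis + field extraction)
- Four extraction strategies: exact KV match, fuzzy KV match, regex patterns, table structure match
- agent/state.py: add ocr_extraction_result and uploaded_file_path fields
- agent/nodes.py: automatically trigger the OCR extraction hook in process_input()
- app.py: keep the image path on file upload; show extraction results in the summary card
- .env.example: add OCR_USE_GPU / OCR_CONFIDENCE_THRESHOLD settings
- tests/test_ocr_extraction.py: all 48 unit tests pass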
2026-05-20 08:06:55 +08:00
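
One way such a strategy cascade could look is sketched below. It is not the code in backend/ocr_extractor.py: the function `extract_field`, its parameters, and the sample data are hypothetical. It covers only the exact, fuzzy, and regex strategies and leaves table structure matching out.

```python
import difflib
import re
from typing import Dict, Optional


def extract_field(
    field: str,
    ocr_pairs: Dict[str, str],
    raw_text: str,
    pattern: Optional[str] = None,
    fuzzy_cutoff: float = 0.8,
) -> Optional[str]:
    """Try exact key match, then fuzzy key match, then a regex over raw text."""
    # 1. Exact key/value match.
    if field in ocr_pairs:
        return ocr_pairs[field]
    # 2. Fuzzy key match tolerates small OCR misreads in the key.
    close = difflib.get_close_matches(
        field, list(ocr_pairs), n=1, cutoff=fuzzy_cutoff
    )
    if close:
        return ocr_pairs[close[0]]
    # 3. Regex fallback over the raw OCR text (first capture group).
    if pattern:
        match = re.search(pattern, raw_text)
        if match:
            return match.group(1)
    return None


# Hypothetical OCR output with one misread key ("Totl Amount").
pairs = {"Invoice No": "INV-001", "Totl Amount": "42.50"}
text = "Invoice No: INV-001\nTotl Amount: 42.50\nDate 2026-05-20"

exact = extract_field("Invoice No", pairs, text)
fuzzy = extract_field("Total Amount", pairs, text)
by_regex = extract_field("Date", pairs, text, pattern=r"Date\s+(\d{4}-\d{2}-\d{2})")
```

Ordering the strategies from strictest to loosest means a cheap, reliable match is preferred whenever it exists, and the looser ones only run as fallbacks.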
panda 067880bf2e feat: add structured logging system, update LLM config and all docs
Added:
- backend/logger.py: centralized logging module (JSON format + trace_id + separate llm.log)
- @log_node / @_log_route decorators cover 17 nodes and 8 routes

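Improved:
- backend/llm.py: _LLMLoggingWrapper automatically logs LLM inputs and outputs
- backend/llm.py: API key is read from ANTHROPIC_API_KEY first; model name changed to MiniMax-M2.7
- backend/llm.py: get_llm() gains a caller parameter identifying the call site
- backend/validation.py: add logging for validation results and connection failures
- backend/session.py: add logging for session creation and deletion
- app.py: add user interaction logging (input, execution, exceptions, session operations)
- app.py: import torchvision early to suppress transformers lazy-loading errors
- .env.example: add LOG_DIR, LOG_LEVEL, ANTHROPIC_API_KEY and other settings
- .gitignore: add ignore rules for logs/ and db/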

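Docs:
- ROADMAP.md: add Phase 4: Observability
- README.md: add logging architecture, LLM configuration, project structure
- CLAUDE.md: sync latest configuration, logging, MAX_RETRY(3)
- CODE_GUIDE.md: add Chapter 15 (logging system); update architecture diagram, LLM, configuration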
2026-05-19 23:40:01 +08:00
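
A minimal sketch of JSON logging with a trace id and a node decorator follows, using only the standard library. It is an illustration of the idea, not the contents of backend/logger.py. `JsonFormatter`, `trace_id_var`, `demo_node`, and the payload fields are assumptions, and the `log_node` here only mirrors the decorator name mentioned in the commit.

```python
import contextvars
import functools
import io
import json
import logging
import uuid

# Hypothetical context variable holding the current trace id.
trace_id_var = contextvars.ContextVar("trace_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "trace_id": trace_id_var.get(),
        }
        return json.dumps(payload, ensure_ascii=False)


def log_node(func):
    """Log entry and exit of a node function (illustrative decorator)."""
    logger = logging.getLogger("nodes")

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("enter %s", func.__name__)
        result = func(*args, **kwargs)
        logger.info("exit %s", func.__name__)
        return result

    return wrapper


# Send output to an in-memory stream so the example has no side effects.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
node_logger = logging.getLogger("nodes")
node_logger.addHandler(handler)
node_logger.setLevel(logging.INFO)
node_logger.propagate = False


@log_node
def demo_node(state: dict) -> dict:
    return {**state, "processed": True}


trace_id_var.set(uuid.uuid4().hex)
result = demo_node({"text": "hello"})
records = [json.loads(line) for line in stream.getvalue().splitlines()]
```

Keeping the trace id in a context variable lets every log line carry it without threading an extra argument through each node.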
panda b280c2b453 feat: integrate RAG rag_jrxml submodule and fix Anthropic API key
Add rag submodule for semantic JRXML chunk retrieval, refactor
retrieve node to use RAGSearcher, and fix missing api_key in
Anthropic SDK client initialization.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 09:42:57 +08:00
panda 76f98a7aeb feat: add Anthropic API provider support and missing env vars
- Add LLM_PROVIDER env var (openai/anthropic) to switch cloud backend
- Use ChatAnthropic for anthropic provider with custom base_url
- Add CONTEXT_MAX_TOKENS, CONTEXT_KEEP_RECENT, SESSIONS_DIR,
  HISTORY_MAX_SNAPSHOTS to .env and .env.example
- Add langchain-anthropic dependency to requirements.txt

Note: E2E testing blocked — the configured MiniMax API key
(sk-cp-...) returns 401 across all endpoints (Anthropic and OpenAI).
The API key may be expired or lack text-generation model access.
2026-05-14 23:39:00 +08:00
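
The provider switch could be reduced to a small lookup like the one below. This is a sketch under assumptions, not the code in backend/llm.py: `resolve_provider` is a hypothetical helper, the default of `openai` and the `OPENAI_API_KEY` variable name are guesses, and no client object such as ChatAnthropic is constructed.

```python
from typing import Mapping, Optional, TypedDict


class ProviderConfig(TypedDict):
    provider: str
    api_key: Optional[str]
    base_url: Optional[str]


def resolve_provider(env: Mapping[str, str]) -> ProviderConfig:
    """Pick credentials and base URL according to LLM_PROVIDER."""
    # Assumption: "openai" is the default when the variable is unset.
    provider = env.get("LLM_PROVIDER", "openai").lower()
    if provider == "anthropic":
        return {
            "provider": "anthropic",
            "api_key": env.get("ANTHROPIC_API_KEY"),
            "base_url": env.get("ANTHROPIC_BASE_URL"),
        }
    if provider == "openai":
        return {
            "provider": "openai",
            "api_key": env.get("OPENAI_API_KEY"),  # assumed variable name
            "base_url": env.get("OPENAI_BASE_URL"),
        }
    raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")


# Example environment; the key is a placeholder, not a real credential.
config = resolve_provider(
    {
        "LLM_PROVIDER": "anthropic",
        "ANTHROPIC_API_KEY": "placeholder-key",
        "ANTHROPIC_BASE_URL": "https://api.minimaxi.com/anthropic",
    }
)
```

Failing fast on an unknown provider avoids falling through to a default backend with the wrong credentials.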
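panda 21a5fdf930 feat: backend infrastructure (LLM factory, embedding factory, validation client, session persistence)
- backend/llm.py: support switching between OpenAI-compatible APIs and local Ollama models
- backend/embeddings.py: support cloud and local embedding models (sentence-transformers)
- backend/validation.py: HTTP client for the FastAPI validation service
- backend/session.py: JSON file session management (create/load/save/list/delete)
- .env.example: complete environment variable template
- requirements.txt: all Python dependency declarations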
2026-05-14 23:20:56 +08:00
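
JSON file session management with the five operations listed above might be sketched as follows. This is not the code in backend/session.py: the class `SessionStore`, its method names, the one-file-per-session layout, and the session payload shape are all assumptions.

```python
import json
import tempfile
import uuid
from pathlib import Path
from typing import List


class SessionStore:
    """Store each session as one JSON file inside a root directory."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, session_id: str) -> Path:
        return self.root / f"{session_id}.json"

    def create(self) -> str:
        session_id = uuid.uuid4().hex
        self.save(session_id, {"messages": []})
        return session_id

    def save(self, session_id: str, data: dict) -> None:
        text = json.dumps(data, ensure_ascii=False)
        self._path(session_id).write_text(text, encoding="utf-8")

    def load(self, session_id: str) -> dict:
        return json.loads(self._path(session_id).read_text(encoding="utf-8"))

    def list_ids(self) -> List[str]:
        return sorted(p.stem for p in self.root.glob("*.json"))

    def delete(self, session_id: str) -> None:
        self._path(session_id).unlink()


# Exercise the store in a temporary directory.
store = SessionStore(Path(tempfile.mkdtemp()))
sid = store.create()
store.save(sid, {"messages": ["hi"]})
loaded = store.load(sid)
ids_before = store.list_ids()
store.delete(sid)
ids_after = store.list_ids()
```

One file per session keeps deletes and listings simple, at the cost of rewriting the whole file on every save.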