fix: MAX_RETRY 5 + rolling continuation + namespace-aware JRXML extraction

- MAX_RETRY: 3→5 (graph.py:35, nodes.py:25) with env override - Rolling continuation: _generate_with_continuation() auto-detects truncated JRXML and sends anchor-based continuation, max 3 rounds - JRXML extraction: regex/end-tag now namespace-prefix aware (ns0:jasperReport, ns:jasperReport, etc.) - All 5 generation nodes refactored to use continuation helper - Tests updated: scenario1 accepts ns-prefixed root, max_retry verifies graph termination - stop_reason capture + WARNING log on max_tokens truncation - Correction prompt now injects OCR context + layout schema
fix: auto-inject JasperReports namespace before XSD validation
2026-05-23 10:58:46 +08:00 · 2026-05-23 09:44:08 +08:00 · 2026-05-23 09:32:32 +08:00 · 2026-05-23 09:25:45 +08:00 · 2026-05-23 09:20:55 +08:00 · 2026-05-23 09:15:44 +08:00
21 changed files with 1436 additions and 171 deletions
@@ -13,6 +13,19 @@ db/chroma/
 sessions/
 logs/
 db/
+# 自动评测 (Mavis AI)
+.mavis/
+EVALUATION_REPORT.md
+
+# 上传文件
+uploads/
+
+# OCR 临时输出
+ocr_raw_positions.json
+
+# Playwright E2E 测试产物
+frontend/test-results/
+
 # RAG 管线中间产物 (rag 子模块内)
 rag/jrxml_chunker_output/
 rag/embeddings/
@@ -4,7 +4,7 @@

 一个**本地桌面应用**，通过自然语言多轮对话帮助非技术用户创建 JasperReports 模板（JRXML 文件）。核心技术栈：Vue 3 前端 + FastAPI SSE 后端 + LangGraph 状态机 + LLM 生成/修改 + 自动验证修正循环。

-**一句话**：用户用中文描述报表需求 → LLM 生成 JRXML → 自动验证 → 失败则自动修正(最多3次) → 重试耗尽后失败上下文自动注入下一轮 → 返回可编译的 JRXML 文件。
+**一句话**：用户用中文描述报表需求 → LLM 生成 JRXML → 自动验证 → 失败则自动修正(最多5次) → 重试耗尽后失败上下文自动注入下一轮 → 返回可编译的 JRXML 文件。

 ## 启动命令

@@ -34,7 +34,7 @@ cd frontend && npm run dev
 - **向量库**: ChromaDB 持久化在 `./db/chroma`
 - **验证服务**: FastAPI `localhost:8001`
 - **日志**: JSON 格式化，`logs/app.log` + `logs/llm.log`，中国时区 (UTC+8)
- **MAX_RETRY**: 3
+- **MAX_RETRY**: 5

 ## 架构

@@ -233,7 +233,7 @@ validation_service/ (FastAPI, 端口 8001) — 不变
 - **OCR 引擎**: 优先 PaddleOCR 2.9.x（精确识别，`pip install paddleocr`），回退 EasyOCR 1.7+。两者均未安装时仅返回图片元信息。PaddlePaddle 3.x 在 Windows 上有 ONEDNN bug，固定在 2.6.x。
 - **OCR 字段提取**: `process_input` 自动检测上传图片，调用 `OcrExtractor` 提取常见中文字段（发票代码/号码/金额/日期等），提取结果自动注入 LLM 上下文。
 - **会话持久化**: `session_id` 现已包含在 `save_session_node` 的持久化字段中，避免切换会话时因 `session_id` 丢失导致的无限 rerun bug。`create_session` 存盘前强制写入 `agent_state["session_id"] = sid`。`load_session_node` 不从磁盘覆盖 `session_id`。切换会话增加 `_last_switched_to` 哨兵防止重复触发。
- **MAX_RETRY**: 默认 3 次。重试耗尽后 `pending_failure_context` 记录失败信息，下次用户输入时自动注入。
+- **MAX_RETRY**: 默认 5 次。重试耗尽后 `pending_failure_context` 记录失败信息，下次用户输入时自动注入。
 - **验证最小内容检查**: 验证服务额外检查至少 1 个 `<band>` + 1 个 `<textField>` 或 `<staticText>`，拦截空壳 JRXML。
 - **XLSX 支持 (v3)**: 需要 `openpyxl>=3.1.0`（已加入 requirements.txt）。表格按工作表逐行读取，单元格用 `|` 分隔。
 - **粘贴功能限制**: 文件以 base64 编码在 sessionStorage 中传递，单文件上限 20MB。大文件建议使用 file_uploader 按钮。
@@ -327,3 +327,61 @@ validation_service/ (FastAPI, 端口 8001) — 不变
 - `prompts/refine_layout.md` — 1 处

 Python 将 `{{` 输出为字面量 `{`，LLM 看到的内容不变。
+
+## 更新 (v9 — 2026-05-22)
+
+### 测试基础设施全面补齐
+
+**单元测试** (76 测试):
+- `tests/test_session.py` — 27 测试：会话 CRUD、原子写入、唯一 ID、损坏 JSON 跳过
+- `tests/test_error_kb.py` — 24 测试：指纹去重、关键词提取（中/英/JRXML）、ErrorKB CRUD、搜索、统计
+- `tests/test_agent.py` — 5 个软断言强化为严格断言（`status`/`current_jrxml` 存在性检查）
+- 已有测试：`test_ocr_extraction.py`（49）、`test_layered_generation.py`（19）、`test_validation.py`（6）、`test_file_parser_formats.py`（4）、`test_annotation_detector.py`（7）、`test_e2e_ocr.py`（3）
+
+**集成测试** (25 测试, `tests/test_api_integration.py`):
+- FastAPI TestClient 全覆盖：健康检查、配置、会话 CRUD、文件上传、下载、Chat SSE、安全边界（路径穿越/非法 JSON/大 payload）
+- Mock LangGraph graph 避免真实 LLM 调用
+
+**E2E 测试** (8 测试, `frontend/tests/e2e/main-flows.spec.ts`):
+- Playwright 浏览器自动化：页面加载、侧边栏、会话管理、聊天流程、输入 UX
+- 全量 API Mock（`page.route`）无需后端运行
+- 配置: `frontend/playwright.config.ts`, `npm run test:e2e`
+
+**运行测试**:
+```bash
+# 全部单元+集成测试
+cd D:\Idea Project\jaspersoft && python -m pytest tests/ -v
+
+# 仅 E2E（需要前端 dev server）
+cd frontend && npx playwright test
+```
+
+### Bug 修复: create_session 参数缺失
+
+`backend/session.py` — `create_session()` 新增可选参数 `session_id: Optional[str] = None`。
+`api_server.py:507` 调用 `create_session(session_id=session_id)` 时之前会抛出 `TypeError`。
+
+## 更新 (v10 — 2026-05-23)
+
+### 5-Fix — 生成可靠性全面加固
+
+**问题诊断**: 上传车辆历史卡片图片后，`map_fields` 节点 LLM 返回 0 字符，导致 ~11,500 字符的骨架 JRXML 被空字符串覆盖，修正循环无法恢复，最终输出 934 字符的占位桩（与原始图片内容完全不符）。
+
+**Fix 1 — 空响应保护**: 所有 5 个生成节点（`generate_skeleton`, `refine_layout`, `map_fields`, `modify_jrxml`, `correct_jrxml`）增加空响应守卫。LLM 返回空字符串时拒绝更新 `current_jrxml`，保留前一有效版本。
+
+**Fix 2 — max_tokens 扩容**: `backend/llm.py` — `max_tokens` 从 4096 → 8192。MiniMax-M2.7 支持最大 131K 输出 token，8192 在生成复杂 JRXML（通常 5000-15000 字符）时提供充裕空间。
+
+**Fix 3 — 快照回退**: 5 个生成节点在 LLM 输出 JRXML 短于 200 字符时，回退到生成前的 `prev_jrxml` 版本，防止 LLM 输出无意义短文本污染状态。
+
+**Fix 4 — 修正循环注入 OCR 上下文**: `correct_jrxml` 节点将 OCR 提取结果（`ocr_context`）和布局 schema（`layout_schema_text`）注入修正 prompt。此前修正节点"盲修"——只看到 JRXML 和编译错误，不理解原始单据的字段结构和布局意图。
+
+**Fix 5 — 滚动续写机制**: 当 LLM 输出因 `max_tokens` 限制被截断（JRXML 不以 `</jasperReport>` 结尾），自动发送续写请求（附最后 800 字符锚点），最多 3 轮（1 正常 + 2 续写）。
+
+- `backend/llm.py` — `MiniMaxLLM.stream()` 捕获 `stop_reason`，`_LLMLoggingWrapper` 在 `max_tokens` 截断时记录 WARNING
+- `agent/nodes.py` — 新增 `_generate_with_continuation()` 辅助函数，5 个生成节点全部重构使用
+- `_extract_jrxml()` — 正则表达式支持命名空间前缀 JRXML（`<\w+:jasperReport`）
+- 内容去重：续写文本直接拼接，依赖 `_extract_jrxml` 提取完整 XML
+
+**MAX_RETRY 调整**: 默认值从 3 → 5（环境变量 `MAX_RETRY`），配合续写机制确保复杂报表有充分修正机会。
+
+**JRXML 提取命名空间兼容**: `_extract_jrxml()` 和 `_generate_with_continuation()` 的完整性检查统一支持 `</ns0:jasperReport>` 等命名空间前缀闭合标签。
@@ -673,11 +673,15 @@ def generate_skeleton(state: AgentState) -> Dict:
        context=state.get("retrieved_context", ""),
        user_request=user_request,
    )
-    full = []
-    for chunk in llm.stream(prompt):
-        full.append(chunk)
-        writer({"type": "stream", "node": "generate_skeleton", "text": chunk})
-    jrxml = _extract_jrxml("".join(full))
+    prev_jrxml = state.get("current_jrxml", "")
+    full_text = _generate_with_continuation(llm, prompt, writer, "generate_skeleton")
+    if not full_text.strip():
+        _node_log.error("generate_skeleton LLM 返回空响应")
+        return state
+    jrxml = _extract_jrxml(full_text)
+    if len(jrxml.strip()) < 200:
+        _node_log.warning(f"generate_skeleton 输出过短（{len(jrxml)} 字符），回退到前一版本")
+        jrxml = prev_jrxml
    state["current_jrxml"] = jrxml
    state["conversation_history"].append({"role": "assistant", "content": jrxml})
    return state
@@ -705,11 +709,15 @@ def refine_layout(state: AgentState) -> Dict:
        current_jrxml=state.get("current_jrxml", ""),
        sampled_coordinates=sampled_text,
    )
-    full = []
-    for chunk in llm.stream(prompt):
-        full.append(chunk)
-        writer({"type": "stream", "node": "refine_layout", "text": chunk})
-    jrxml = _extract_jrxml("".join(full))
+    prev_jrxml = state.get("current_jrxml", "")
+    full_text = _generate_with_continuation(llm, prompt, writer, "refine_layout")
+    if not full_text.strip():
+        _node_log.error("refine_layout LLM 返回空响应，保留前一版本")
+        return state
+    jrxml = _extract_jrxml(full_text)
+    if len(jrxml.strip()) < 200:
+        _node_log.warning(f"refine_layout 输出过短（{len(jrxml)} 字符），回退到前一版本")
+        jrxml = prev_jrxml
    state["current_jrxml"] = jrxml
    state["conversation_history"].append({"role": "assistant", "content": jrxml})
    return state
@@ -744,11 +752,15 @@ def map_fields(state: AgentState) -> Dict:
        current_jrxml=state.get("current_jrxml", ""),
        ocr_fields=fields_text,
    )
-    full = []
-    for chunk in llm.stream(prompt):
-        full.append(chunk)
-        writer({"type": "stream", "node": "map_fields", "text": chunk})
-    jrxml = _extract_jrxml("".join(full))
+    prev_jrxml = state.get("current_jrxml", "")
+    full_text = _generate_with_continuation(llm, prompt, writer, "map_fields")
+    if not full_text.strip():
+        _node_log.error("map_fields LLM 返回空响应，保留占位字段版本")
+        return state
+    jrxml = _extract_jrxml(full_text)
+    if len(jrxml.strip()) < 200:
+        _node_log.warning(f"map_fields 输出过短（{len(jrxml)} 字符），回退到前一版本")
+        jrxml = prev_jrxml
    state["current_jrxml"] = jrxml
    state["conversation_history"].append({"role": "assistant", "content": jrxml})
    return state
@@ -776,11 +788,15 @@ def modify_jrxml(state: AgentState) -> Dict:
        modification_request=state.get("user_modification_request", ""),
        ocr_context=_format_ocr_context(state),
    )
-    full = []
-    for chunk in llm.stream(prompt):
-        full.append(chunk)
-        writer({"type": "stream", "node": "modify_jrxml", "text": chunk})
-    jrxml = _extract_jrxml("".join(full))
+    prev_jrxml = state.get("current_jrxml", "")
+    full_text = _generate_with_continuation(llm, prompt, writer, "modify_jrxml")
+    if not full_text.strip():
+        _node_log.error("modify_jrxml LLM 返回空响应，保留原版本")
+        return state
+    jrxml = _extract_jrxml(full_text)
+    if len(jrxml.strip()) < 200:
+        _node_log.warning(f"modify_jrxml 输出过短（{len(jrxml)} 字符），回退到前一版本")
+        jrxml = prev_jrxml
    state["current_jrxml"] = jrxml
    state["conversation_history"].append(
        {
@@ -876,10 +892,17 @@ def correct_jrxml(state: AgentState) -> Dict:

    writer = get_stream_writer()
    llm = get_llm(caller="correct_jrxml")
+    ocr_context = _format_ocr_context(state)
+    layout_schema = state.get("layout_schema", {})
+    layout_text = ""
+    if isinstance(layout_schema, dict):
+        layout_text = layout_schema.get("schema_text", "")
    prompt = load_prompt("correction").format(
        current_jrxml=state.get("current_jrxml", ""),
        error_msg=state.get("error_msg", ""),
        explanation=state.get("natural_explanation", ""),
+        ocr_context=ocr_context,
+        layout_schema_text=layout_text,
    )
    # 保存修正前状态（供 validate 判断是否写入错误知识库）
    state["last_error_case"] = {
@@ -888,11 +911,16 @@ def correct_jrxml(state: AgentState) -> Dict:
        "correction_prompt": prompt,
    }

-    full = []
-    for chunk in llm.stream(prompt):
-        full.append(chunk)
-        writer({"type": "stream", "node": "correct_jrxml", "text": chunk})
-    jrxml = _extract_jrxml("".join(full))
+    prev_jrxml = state.get("current_jrxml", "")
+    full_text = _generate_with_continuation(llm, prompt, writer, "correct_jrxml")
+    if not full_text.strip():
+        _node_log.error("correct_jrxml LLM 返回空响应，保留原版本")
+        state["retry_count"] = state.get("retry_count", 0) + 1
+        return state
+    jrxml = _extract_jrxml(full_text)
+    if len(jrxml.strip()) < 200:
+        _node_log.warning(f"correct_jrxml 输出过短（{len(jrxml)} 字符），回退到前一版本")
+        jrxml = prev_jrxml
    state["current_jrxml"] = jrxml
    state["retry_count"] = state.get("retry_count", 0) + 1
    state["conversation_history"].append(
@@ -963,6 +991,49 @@ def finalize(state: AgentState) -> Dict:
    return state


+def _generate_with_continuation(llm, prompt, writer, node_name, max_rounds=3) -> str:
+    """Stream LLM generation with automatic truncation recovery.
+
+    After each stream round, checks if the extracted JRXML ends with
+    </jasperReport>. If truncated, sends a continuation request with
+    the last 800 chars as anchor context.
+
+    Returns combined full text from all rounds.
+    """
+    full_text = ""
+
+    for round_num in range(max_rounds):
+        if round_num == 0:
+            current_prompt = prompt
+        else:
+            tail = full_text[-800:] if len(full_text) > 800 else full_text
+            current_prompt = (
+                f"[系统指令] 你正在生成的 JRXML 在上一次响应中被截断。\n"
+                f"已生成内容的最后部分（请从此处继续）：\n...{tail}\n\n"
+                f"请从截断点继续输出剩余内容，不要重复已输出的部分。"
+            )
+
+        new_chunks = []
+        for chunk in llm.stream(current_prompt):
+            new_chunks.append(chunk)
+            writer({"type": "stream", "node": node_name, "text": chunk})
+
+        new_text = "".join(new_chunks)
+        full_text += new_text
+
+        jrxml = _extract_jrxml(full_text)
+        if re.search(r"</(?:[\w:]+:)?jasperReport>\s*$", jrxml, re.IGNORECASE):
+            break
+
+        if not new_text.strip():
+            _node_log.warning(f"{node_name} 第{round_num+1}轮续写无输出，停止")
+            break
+    else:
+        _node_log.warning(f"{node_name} 经{max_rounds}轮续写仍未完整")
+
+    return full_text
+
+
 def _extract_jrxml(text: str) -> str:
    """从 LLM 响应中提取 JRXML 内容，如有 markdown 标记则去除。"""
    text = text.strip()
@@ -974,7 +1045,8 @@ def _extract_jrxml(text: str) -> str:
            return content
        # markdown 代码块存在但内容为空 — 回退到直接匹配

-    jasper_tag = re.search(r"(<\?xml[\s\S]*?</jasperReport>)", text, re.IGNORECASE)
+    _jrxml_close = r"</(?:[\w:]+:)?jasperReport>"
+    jasper_tag = re.search(rf"(<\?xml[\s\S]*?{_jrxml_close})", text, re.IGNORECASE)
    if jasper_tag:
        return jasper_tag.group(1).strip()

@@ -984,8 +1056,9 @@ def _extract_jrxml(text: str) -> str:
    # 最终回退：如果文本中包含 XML 片段但没有被捕获到，尝试直接提取
    # 这处理 LLM 在代码块外用自然语言"包裹"JRXML 的情况
    xml_start = text.find("<?xml")
-    jr_end = text.lower().rfind("</jasperreport>")
-    if xml_start >= 0 and jr_end > xml_start:
-        return text[xml_start:jr_end + len("</jasperreport>")].strip()
+    jr_close = re.search(_jrxml_close, text, re.IGNORECASE)
+    if xml_start >= 0 and jr_close:
+        jr_end = jr_close.end()
+        return text[xml_start:jr_end].strip()

    return text
@@ -16,6 +16,7 @@ Usage:

 import asyncio
 import base64
+import contextvars
 import json
 import mimetypes
 import os
@@ -97,25 +98,30 @@ SKIP_NODES = {"load_session", "process_input", "manage_context",
 _api_log = get_logger("api")
 UPLOADS_DIR = Path(os.getenv("UPLOADS_DIR", "./uploads"))

+def _check_session_id(session_id: str) -> None:
+    """校验 session_id 合法性（防路径穿越），非法时抛出 HTTPException(400)。"""
+    from backend.session import validate_session_id
+    if not validate_session_id(session_id):
+        raise HTTPException(status_code=400, detail=f"Invalid session_id: {session_id!r}")
+
 # ─────────────────────────────────────────────
 # 图编译（全局单例，带 node_start 回调）
 # ─────────────────────────────────────────────

-# 当前请求的事件队列（单个用户桌面应用，无并发问题）
+# 当前请求的事件队列（单个用户桌面应用）
 _current_event_queue: Optional[queue.Queue] = None
-_step_counter: int = 0
+_step_counter: contextvars.ContextVar[int] = contextvars.ContextVar('_step_counter', default=0)


 def _on_node_start(node_name: str):
    """全局 node_start 回调 — 将事件推入当前请求的事件队列。"""
-    global _step_counter
    q = _current_event_queue
    if q is not None:
-        _step_counter += 1
+        _step_counter.set(_step_counter.get() + 1)
        q.put(("node_start", {
            "node": node_name,
            "label": NODE_LABELS.get(node_name, node_name),
-            "step_index": _step_counter,
+            "step_index": _step_counter.get(),
        }))


@@ -180,14 +186,18 @@ def _run_graph_sync(agent_state: AgentState, event_q: queue.Queue):
                if mode == "updates" and isinstance(data, dict):
                    for node_state in data.values():
                        if isinstance(node_state, dict):
-                            agent_state.update(node_state)
+                            agent_state.update({k: v for k, v in node_state.items() if v is not None})
        # 在 graph 完成后立即保存 session，防止 SSE 流中断导致数据丢失
        sid = agent_state.get("session_id", "")
        if sid:
            try:
                save_session(sid, agent_state)
-            except Exception:
-                pass  # 静默失败，SSE 流中还有一次保存机会
+            except Exception as exc:
+                _api_log.error("图运行中保存会话失败", extra={
+                    "session_id": sid,
+                    "error": str(exc),
+                    "traceback": traceback.format_exc(),
+                })
        event_q.put(("done", {"reason": "graph_completed"}))
    except Exception as exc:
        event_q.put(("error", {
@@ -198,9 +208,9 @@ def _run_graph_sync(agent_state: AgentState, event_q: queue.Queue):

 async def _sse_generator(agent_state: AgentState, session_id: str = "") -> str:
    """SSE 事件生成器 —— 在后台线程运行图，异步产出 SSE 字符串。"""
-    global _current_event_queue, _step_counter
+    global _current_event_queue

-    _step_counter = 0
+    _step_counter.set(0)
    t_start = time.time()
    event_q: queue.Queue = queue.Queue()
    _current_event_queue = event_q
@@ -347,6 +357,7 @@ async def list_sessions():

@app.get("/api/sessions/{session_id}")
 async def get_session(session_id: str):
+    _check_session_id(session_id)
    data = get_session_state(session_id)
    if data is None:
        raise HTTPException(status_code=404, detail="会话不存在")
@@ -361,6 +372,7 @@ async def get_session(session_id: str):

@app.delete("/api/sessions/{session_id}")
 async def remove_session(session_id: str):
+    _check_session_id(session_id)
    ok = delete_session(session_id)
    if not ok:
        raise HTTPException(status_code=404, detail="会话不存在或已删除")
@@ -373,6 +385,8 @@ async def remove_session(session_id: str):

@app.post("/api/upload")
 async def upload_file(file: UploadFile = File(...), session_id: str = ""):
+    if session_id:
+        _check_session_id(session_id)
    file_id = uuid.uuid4().hex[:12]
    _ensure_upload_dir(session_id)

@@ -492,6 +506,7 @@ async def chat(session_id: str, payload: dict):
    Returns:
        text/event-stream (SSE)
    """
+    _check_session_id(session_id)
    text = payload.get("text", "").strip()
    file_ids = payload.get("file_ids", [])

@@ -577,6 +592,7 @@ async def chat(session_id: str, payload: dict):
@app.get("/api/sessions/{session_id}/download/latest")
 async def download_latest(session_id: str):
    """下载最新 JRXML 文件。"""
+    _check_session_id(session_id)
    data = load_session(session_id)
    if data is None:
        raise HTTPException(status_code=404, detail="会话不存在")
@@ -601,6 +617,7 @@ async def download_latest(session_id: str):
@app.get("/api/sessions/{session_id}/download/{version}")
 async def download_version(session_id: str, version: int):
    """下载指定版本的 JRXML 文件。"""
+    _check_session_id(session_id)
    data = load_session(session_id)
    if data is None:
        raise HTTPException(status_code=404, detail="会话不存在")
@@ -109,19 +109,36 @@ class _LLMLoggingWrapper(_BaseLLM):
            resp_text = "".join(full)
            resp_len = len(resp_text)
            resp_preview = resp_text[:500]
-            _llm_log.info(
-                "LLM stream 完成",
-                extra={
-                    "direction": "response",
-                    "model": self._model,
-                    "backend": self._backend,
-                    "caller": self._caller,
-                    "duration_ms": elapsed,
-                    "response_length": resp_len,
-                    "response_preview": resp_preview,
-                    "response": resp_text[:10000],
-                },
-            )
+            stop_reason = getattr(self._inner, '_last_stop_reason', None)
+            self._last_stop_reason = stop_reason
+            if stop_reason == "max_tokens":
+                _llm_log.warning(
+                    "LLM stream 截断 (max_tokens)，输出可能不完整",
+                    extra={
+                        "direction": "response",
+                        "model": self._model,
+                        "backend": self._backend,
+                        "caller": self._caller,
+                        "duration_ms": elapsed,
+                        "response_length": resp_len,
+                        "stop_reason": stop_reason,
+                    },
+                )
+            else:
+                _llm_log.info(
+                    "LLM stream 完成",
+                    extra={
+                        "direction": "response",
+                        "model": self._model,
+                        "backend": self._backend,
+                        "caller": self._caller,
+                        "duration_ms": elapsed,
+                        "response_length": resp_len,
+                        "response_preview": resp_preview,
+                        "response": resp_text[:10000],
+                        "stop_reason": stop_reason,
+                    },
+                )
        except Exception as e:
            elapsed = round((time.time() - t0) * 1000)
            _llm_log.error(
@@ -166,11 +183,14 @@ def _build_raw_llm(caller: str = "") -> tuple[_BaseLLM, str, str]:
        base_url = os.getenv("ANTHROPIC_BASE_URL") or os.getenv("OPENAI_BASE_URL", "https://api.minimaxi.com/anthropic")
        model = os.getenv("LLM_MODEL", "MiniMax-M2.7")
        temperature = 0.1
-        max_tokens = 4096
+        max_tokens = 8192

        client = Anthropic(api_key=api_key, base_url=base_url, timeout=120)

        class MiniMaxLLM(_BaseLLM):
+            def __init__(self):
+                self._last_stop_reason = None
+
            def invoke(self, prompt: str) -> Any:
                resp = client.messages.create(
                    model=model,
@@ -185,6 +205,7 @@ def _build_raw_llm(caller: str = "") -> tuple[_BaseLLM, str, str]:
                return type("Response", (), {"content": ""})()

            def stream(self, prompt: str):
+                self._last_stop_reason = None
                with client.messages.stream(
                    model=model,
                    max_tokens=max_tokens,
@@ -193,6 +214,11 @@ def _build_raw_llm(caller: str = "") -> tuple[_BaseLLM, str, str]:
                ) as s:
                    for text in s.text_stream:
                        yield text
+                    try:
+                        final_msg = s.get_final_message()
+                        self._last_stop_reason = getattr(final_msg, 'stop_reason', None)
+                    except Exception:
+                        pass

            def get_num_tokens(self, text: str) -> int:
                resp = client.messages.count_tokens(
@@ -5,6 +5,7 @@

 import json
 import os
+import re
 import uuid
 import tempfile
 from datetime import datetime, timezone
@@ -26,18 +27,27 @@ def _ensure_dir():
    SESSIONS_DIR.mkdir(parents=True, exist_ok=True)


+_VALID_SESSION_ID_RE = re.compile(r'^[a-fA-F0-9]{12,}$')
+
+def validate_session_id(session_id: str) -> bool:
+    """校验 session_id 仅含合法 hex 字符（防路径穿越）。"""
+    return bool(_VALID_SESSION_ID_RE.match(session_id))
+
 def _session_path(session_id: str) -> Path:
+    if not validate_session_id(session_id):
+        raise ValueError(f"Invalid session_id: {session_id!r}")
    return SESSIONS_DIR / f"{session_id}.json"


 def generate_session_id() -> str:
-    return uuid.uuid4().hex[:12]
+    return uuid.uuid4().hex


-def create_session(name: str = "", agent_state: Optional[dict] = None) -> dict:
-    """创建新会话，返回会话元数据。"""
+def create_session(name: str = "", agent_state: Optional[dict] = None,
+                   session_id: Optional[str] = None) -> dict:
+    """创建新会话，返回会话元数据。session_id 可选——传入时使用指定 ID。"""
    _ensure_dir()
-    sid = generate_session_id()
+    sid = session_id or generate_session_id()
    now = _now_iso()
    agent_state = agent_state or {}
    agent_state["session_id"] = sid
@@ -57,7 +67,10 @@ def create_session(name: str = "", agent_state: Optional[dict] = None) -> dict:
 def load_session(session_id: str) -> Optional[dict]:
    """按 ID 加载会话数据。未找到则返回 None。"""
    _ensure_dir()
-    fp = _session_path(session_id)
+    try:
+        fp = _session_path(session_id)
+    except ValueError:
+        return None
    if not fp.exists():
        return None
    with open(fp, "r", encoding="utf-8") as f:
@@ -131,7 +144,10 @@ def list_all_sessions() -> list[dict]:
 def delete_session(session_id: str) -> bool:
    """按 ID 删除会话文件。"""
    _ensure_dir()
-    fp = _session_path(session_id)
+    try:
+        fp = _session_path(session_id)
+    except ValueError:
+        return False
    if fp.exists():
        fp.unlink()
        _session_log.info("删除会话", extra={"session_id": session_id})
@@ -12,6 +12,7 @@
        "vue": "^3.5.34"
      },
      "devDependencies": {
+        "@playwright/test": "^1.60.0",
        "@types/node": "^24.12.3",
        "@vitejs/plugin-vue": "^6.0.6",
        "@vue/tsconfig": "^0.9.1",
@@ -135,6 +136,22 @@
        "url": "https://github.com/sponsors/Boshen"
      }
    },
+    "node_modules/@playwright/test": {
+      "version": "1.60.0",
+      "resolved": "https://registry.npmjs.org/@playwright/test/-/test-1.60.0.tgz",
+      "integrity": "sha512-O71yZIbAh/PxDMNGns37GHBIfrVkEVyn+AXyIa5dOTfb4/xNvRWV+Vv/NMbNCtODB/pO7vLlF2OTmMVLhmr7Ag==",
+      "dev": true,
+      "license": "Apache-2.0",
+      "dependencies": {
+        "playwright": "1.60.0"
+      },
+      "bin": {
+        "playwright": "cli.js"
+      },
+      "engines": {
+        "node": ">=18"
+      }
+    },
    "node_modules/@rolldown/binding-android-arm64": {
      "version": "1.0.2",
      "resolved": "https://registry.npmjs.org/@rolldown/binding-android-arm64/-/binding-android-arm64-1.0.2.tgz",
@@ -1104,6 +1121,53 @@
        }
      }
    },
+    "node_modules/playwright": {
+      "version": "1.60.0",
+      "resolved": "https://registry.npmjs.org/playwright/-/playwright-1.60.0.tgz",
+      "integrity": "sha512-hheHdokM8cdqCb0lcE3s+zT4t4W+vvjpGxsZlDnikarzx8tSzMebh3UiFtgqwFwnTnjYQcsyMF8ei2mCO/tpeA==",
+      "dev": true,
+      "license": "Apache-2.0",
+      "dependencies": {
+        "playwright-core": "1.60.0"
+      },
+      "bin": {
+        "playwright": "cli.js"
+      },
+      "engines": {
+        "node": ">=18"
+      },
+      "optionalDependencies": {
+        "fsevents": "2.3.2"
+      }
+    },
+    "node_modules/playwright-core": {
+      "version": "1.60.0",
+      "resolved": "https://registry.npmjs.org/playwright-core/-/playwright-core-1.60.0.tgz",
+      "integrity": "sha512-9bW6zvX/m0lEbgTKJ6YppOKx8H3VOPBMOCFh2irXFOT4BbHgrx5hPjwJYLT40Lu+4qtD36qKc/Hn56StUW57IA==",
+      "dev": true,
+      "license": "Apache-2.0",
+      "bin": {
+        "playwright-core": "cli.js"
+      },
+      "engines": {
+        "node": ">=18"
+      }
+    },
+    "node_modules/playwright/node_modules/fsevents": {
+      "version": "2.3.2",
+      "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.2.tgz",
+      "integrity": "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==",
+      "dev": true,
+      "hasInstallScript": true,
+      "license": "MIT",
+      "optional": true,
+      "os": [
+        "darwin"
+      ],
+      "engines": {
+        "node": "^8.16.0 || ^10.6.0 || >=11.0.0"
+      }
+    },
    "node_modules/postcss": {
      "version": "8.5.15",
      "resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.15.tgz",
@@ -6,13 +6,15 @@
  "scripts": {
    "dev": "vite",
    "build": "vue-tsc -b && vite build",
-    "preview": "vite preview"
+    "preview": "vite preview",
+    "test:e2e": "playwright test"
  },
  "dependencies": {
    "pinia": "^3.0.4",
    "vue": "^3.5.34"
  },
  "devDependencies": {
+    "@playwright/test": "^1.60.0",
    "@types/node": "^24.12.3",
    "@vitejs/plugin-vue": "^6.0.6",
    "@vue/tsconfig": "^0.9.1",
@@ -0,0 +1,20 @@
+import { defineConfig } from "@playwright/test";
+
+export default defineConfig({
+  testDir: "./tests/e2e",
+  timeout: 60000,
+  expect: { timeout: 10000 },
+  retries: 0,
+  use: {
+    baseURL: "http://localhost:5173",
+    headless: true,
+    screenshot: "only-on-failure",
+    trace: "retain-on-failure",
+  },
+  webServer: {
+    command: "npm run dev",
+    url: "http://localhost:5173",
+    reuseExistingServer: true,
+    timeout: 30000,
+  },
+});
@@ -109,6 +109,11 @@ async function handleSend(text: string, files: File[]) {
  } catch (e: any) {
    chat.setError(e.message || '网络请求失败')
    chat.addMessage({ role: 'assistant', content: `请求失败: ${e.message}`, type: 'error' })
+    chat.finishStreaming({ status: '' })
+  } finally {
+    if (chat.streaming) {
+      chat.finishStreaming({ status: '' })
+    }
  }
 }
 </script>
@@ -0,0 +1,168 @@
+/**
+ * E2E tests: key user flows for the JRXML Agent frontend.
+ *
+ * Pre-requisites: npm run dev (reuseExistingServer in playwright.config).
+ * API calls are intercepted by page.route() — no real backend needed.
+ */
+
+import { test, expect } from "@playwright/test";
+
+// ── helpers ────────────────────────────────────────────────────
+
+function mockApi(page: any) {
+  page.route("**/api/health", (route: any) =>
+    route.fulfill({ json: { status: "ok", version: "5.0" } })
+  );
+
+  page.route("**/api/sessions", (route: any) => {
+    if (route.request().method() === "POST") {
+      return route.fulfill({
+        json: {
+          session_id: "test12345678",
+          session_name: "新建报表 2026-05-22",
+          created_at: "2026-05-22T10:00:00.000Z",
+          updated_at: "2026-05-22T10:00:00.000Z",
+        },
+      });
+    }
+    return route.fulfill({ json: { sessions: [] } });
+  });
+
+  page.route("**/api/sessions/*/chat", (route: any) => {
+    const sseBody = [
+      "event: node_start",
+      'data: {"node":"classify_intent","label":"识别意图","step_index":1}',
+      "",
+      "event: node_complete",
+      'data: {"node":"classify_intent","label":"识别意图","detail":"意图: 新建报表"}',
+      "",
+      "event: agent_complete",
+      'data: {"reason":"done","intent":"initial_generation","status":"pass","jrxml_length":42,"versions":1,"total_duration_ms":1200}',
+      "",
+      "",
+    ].join("\n");
+    return route.fulfill({
+      status: 200,
+      headers: { "content-type": "text/event-stream" },
+      body: sseBody,
+    });
+  });
+
+  page.route("**/api/upload", (route: any) =>
+    route.fulfill({
+      json: { file_id: "f001122334455", filename: "test.png", size: 1024 },
+    })
+  );
+
+  // Catch-all for GET/DELETE /api/sessions/:id (must fallback for POST to let chat route match)
+  page.route("**/api/sessions/**", (route: any) => {
+    if (route.request().method() === "DELETE") {
+      return route.fulfill({ json: { status: "deleted" } });
+    }
+    if (route.request().method() === "GET") {
+      return route.fulfill({
+        json: {
+          session_id: "test12345678",
+          session_name: "测试会话",
+          agent_state: { current_jrxml: "<jasperReport/>" },
+        },
+      });
+    }
+    return route.fallback();
+  });
+}
+
+// ── tests ──────────────────────────────────────────────────────
+
+test.describe("Page load", () => {
+  test("renders sidebar and input area", async ({ page }) => {
+    await mockApi(page);
+    await page.goto("/");
+
+    await expect(page.locator("aside.sidebar")).toBeVisible();
+    await expect(page.locator("h2")).toContainText("JRXML");
+    await expect(page.locator(".unified-input")).toBeVisible();
+  });
+
+  test("sidebar shows session list header and new button", async ({ page }) => {
+    await mockApi(page);
+    await page.goto("/");
+
+    await expect(page.getByText("会话列表")).toBeVisible();
+    await expect(page.locator('button[title="新建会话"]')).toBeVisible();
+  });
+});
+
+test.describe("Session management", () => {
+  test("creates new session on button click", async ({ page }) => {
+    await mockApi(page);
+    await page.goto("/");
+
+    await page.locator('button[title="新建会话"]').click();
+    await expect(page.locator(".session-item")).toBeVisible({ timeout: 5000 });
+    await expect(page.locator(".session-item.active")).toBeVisible();
+  });
+
+  test("can delete current session", async ({ page }) => {
+    await mockApi(page);
+    await page.goto("/");
+
+    await page.locator('button[title="新建会话"]').click();
+    await expect(page.locator(".session-item")).toBeVisible({ timeout: 5000 });
+
+    page.on("dialog", (dialog) => dialog.accept());
+    await page.locator(".btn-delete").click();
+
+    await expect(page.locator(".session-item")).toHaveCount(0, { timeout: 5000 });
+  });
+});
+
+test.describe("Chat flow", () => {
+  test("sends text and displays user message + process section", async ({ page }) => {
+    await mockApi(page);
+    await page.goto("/");
+
+    await page.locator('button[title="新建会话"]').click();
+    await expect(page.locator(".session-item")).toBeVisible({ timeout: 5000 });
+
+    const textarea = page.locator(".unified-input textarea");
+    await textarea.fill("生成一个员工名册报表");
+    await page.locator(".send-btn").click();
+
+    await expect(
+      page.locator(".chat-messages .message.msg-user").filter({ hasText: "员工名册" })
+    ).toBeVisible({ timeout: 10000 });
+
+    await expect(page.locator(".process-section")).toBeVisible({ timeout: 10000 });
+  });
+
+  test("summary card appears after stream complete", async ({ page }) => {
+    await mockApi(page);
+    await page.goto("/");
+
+    await page.locator('button[title="新建会话"]').click();
+    await expect(page.locator(".session-item")).toBeVisible({ timeout: 5000 });
+
+    await page.locator(".unified-input textarea").fill("生成报表");
+    await page.locator(".send-btn").click();
+
+    await expect(page.locator(".summary-card")).toBeVisible({ timeout: 15000 });
+  });
+});
+
+test.describe("Input UX", () => {
+  test("send button disabled when input empty", async ({ page }) => {
+    await mockApi(page);
+    await page.goto("/");
+
+    await expect(page.locator(".send-btn")).toBeDisabled();
+  });
+
+  test("send button enabled when text entered", async ({ page }) => {
+    await mockApi(page);
+    await page.goto("/");
+
+    await page.locator(".unified-input textarea").fill("Hi");
+    await expect(page.locator(".send-btn")).toBeEnabled();
+  });
+});
@@ -4,6 +4,7 @@
 - 只输出完整修复后的 JRXML 代码，不要解释，不要 markdown 标记。
 - JRXML 必须与 JasperReports 7.0.6 兼容。
 - 解决下面列出的特定错误。
+- 如果当前 JRXML 内容为空或过短（<200 字符），请根据下方提供的 OCR 识别数据和布局 schema 重新生成完整的 JRXML，而非输出一个占位桩。

 当前 JRXML（带错误）：
 {current_jrxml}
@@ -14,4 +15,8 @@
 错误的自然语言解释：
 {explanation}

+{ocr_context}
+
+{layout_schema_text}
+
 立即生成修正后的 JRXML：
@@ -1,47 +1,4 @@
@echo off
-setlocal enabledelayedexpansion
-echo ================================================
-echo   agent_jrxml 启动 (API + 验证)
-echo ================================================
 cd /d "%~dp0"
-
-:: 清理残留进程
-echo [清理] 检查残留进程...
-for /f "tokens=5" %%a in ('netstat -ano ^| findstr ":8000.*LISTENING"') do (
-    taskkill /F /PID %%a >nul 2>&1 && echo   已清理 PID %%a
-)
-for /f "tokens=5" %%a in ('netstat -ano ^| findstr ":8001.*LISTENING"') do (
-    taskkill /F /PID %%a >nul 2>&1 && echo   已清理 PID %%a
-)
-echo.
-
-:: 启动验证服务 (后台最小化)
-echo [启动] 验证服务 :8001
-start "jrxml-validator" /MIN .venv\Scripts\python.exe -c "import uvicorn; uvicorn.run('validation_service.main:app',host='0.0.0.0',port=8001,reload=False)"
-
-:: 等待验证服务就绪 (用 PowerShell 检测)
-echo [等待] 验证服务就绪...
-:wait_val
-ping -n 2 127.0.0.1 >nul
-powershell -Command "try{$r=Invoke-WebRequest -Uri http://localhost:8001/health -TimeoutSec 2 -UseBasicParsing;exit 0}catch{exit 1}" >nul 2>&1
-if errorlevel 1 goto wait_val
-echo    :8001 就绪
-
-:: 启动 API 服务 (前台，Ctrl+C 退出)
-echo [启动] API 服务 :8000
-echo ================================================
-echo   服务已就绪:
-echo     API:    http://localhost:8000/docs
-echo     验证:   http://localhost:8001/health
-echo   按 Ctrl+C 停止 API 服务
-echo   关闭窗口后会自动清理验证服务
-echo ================================================
-.venv\Scripts\python.exe -c "import uvicorn; uvicorn.run('api_server:app',host='0.0.0.0',port=8000,reload=False)"
-
-:: API 进程退出后自动清理
-echo.
-echo [清理] 停止验证服务...
-taskkill /F /FI "WINDOWTITLE eq jrxml-validator*" >nul 2>&1
-for /f "tokens=5" %%a in ('netstat -ano ^| findstr ":8001.*LISTENING"') do taskkill /F /PID %%a >nul 2>&1
-echo 已停止所有服务
+.venv\Scripts\python.exe start.py
 pause
@@ -0,0 +1,132 @@
+"""Start all jrxml-agent services (validator, API, frontend)."""
+import subprocess
+import sys
+import time
+import urllib.request
+from pathlib import Path
+
+ROOT = Path(__file__).parent
+VENV_PYTHON = ROOT / ".venv" / "Scripts" / "python.exe"
+FRONTEND_DIR = ROOT / "frontend"
+
+
+def kill_port(port: int):
+    """Kill any process listening on the given port."""
+    import os
+    import signal
+    try:
+        result = subprocess.run(
+            ["netstat", "-ano"], capture_output=True, text=True
+        )
+        for line in result.stdout.splitlines():
+            if f":{port}" in line and "LISTENING" in line:
+                parts = line.split()
+                pid = int(parts[-1])
+                print(f"  Killing PID {pid} on port {port}")
+                os.kill(pid, signal.SIGTERM)
+    except Exception as e:
+        print(f"  (cleanup note: {e})")
+
+
+def wait_http(url: str, timeout: int = 60, interval: float = 2.0) -> bool:
+    """Wait until an HTTP endpoint responds 200, or timeout."""
+    deadline = time.time() + timeout
+    while time.time() < deadline:
+        try:
+            urllib.request.urlopen(url, timeout=2)
+            return True
+        except Exception:
+            time.sleep(interval)
+    return False
+
+
+def wait_port(port: int, timeout: int = 60, interval: float = 3.0) -> bool:
+    """Wait until a TCP port is listening."""
+    deadline = time.time() + timeout
+    while time.time() < deadline:
+        try:
+            result = subprocess.run(
+                ["netstat", "-ano"], capture_output=True, text=True
+            )
+            for line in result.stdout.splitlines():
+                if f":{port}" in line and "LISTENING" in line:
+                    return True
+        except Exception:
+            pass
+        time.sleep(interval)
+    return False
+
+
+def main():
+    if not VENV_PYTHON.exists():
+        print("[ERROR] .venv not found. Create a virtual environment first.")
+        sys.exit(1)
+
+    print("=" * 48)
+    print("  jrxml-agent launcher (full stack)")
+    print("=" * 48)
+
+    # -- cleanup --
+    print("\n[Cleanup] Checking residual processes...")
+    for port in (8000, 8001, 5173):
+        kill_port(port)
+
+    # -- 1. Validator --
+    print("\n[1/3] Starting validator on :8001 ...")
+    subprocess.Popen(
+        [str(VENV_PYTHON), "-c",
+         "import uvicorn; uvicorn.run('validation_service.main:app',host='0.0.0.0',port=8001,reload=False)"],
+        cwd=str(ROOT),
+        creationflags=subprocess.CREATE_NO_WINDOW,
+    )
+    if not wait_http("http://localhost:8001/health"):
+        print("[FAIL] Validator did not start in time.")
+        sys.exit(1)
+    print("  :8001 ready")
+
+    # -- 2. API --
+    print("[2/3] Starting API on :8000 ...")
+    subprocess.Popen(
+        [str(VENV_PYTHON), "-c",
+         "import uvicorn; uvicorn.run('api_server:app',host='0.0.0.0',port=8000,reload=False)"],
+        cwd=str(ROOT),
+        creationflags=subprocess.CREATE_NO_WINDOW,
+    )
+    if not wait_http("http://localhost:8000/api/health"):
+        print("[FAIL] API server did not start in time.")
+        sys.exit(1)
+    print("  :8000 ready")
+
+    # -- 3. Frontend --
+    print("[3/3] Starting frontend on :5173 ...")
+    if not (FRONTEND_DIR / "node_modules").exists():
+        print("  Installing npm dependencies...")
+        subprocess.run(["npm", "install"], cwd=str(FRONTEND_DIR), check=True)
+
+    subprocess.Popen(
+        ["npm", "run", "dev"],
+        cwd=str(FRONTEND_DIR),
+        creationflags=subprocess.CREATE_NO_WINDOW,
+    )
+    if not wait_port(5173):
+        print("[FAIL] Frontend did not start in time.")
+        sys.exit(1)
+    print("  :5173 ready")
+
+    print("\n" + "=" * 48)
+    print("  All services ready:")
+    print("    Frontend:  http://localhost:5173")
+    print("    API:       http://localhost:8000/docs")
+    print("    Validator: http://localhost:8001/health")
+    print("  Press Ctrl+C to stop all services")
+    print("=" * 48)
+
+    try:
+        while True:
+            time.sleep(1)
+    except KeyboardInterrupt:
+        print("\nShutting down...")
+
+
+if __name__ == "__main__":
+    main()
@@ -1,50 +1,4 @@
@echo off
-setlocal enabledelayedexpansion
-echo ================================================
-echo   agent_jrxml 启动 (全栈)
-echo ================================================
 cd /d "%~dp0"
-
-:: 清理残留进程
-echo [清理] 检查残留进程...
-for /f "tokens=5" %%a in ('netstat -ano ^| findstr ":8000.*LISTENING"') do taskkill /F /PID %%a >nul 2>&1
-for /f "tokens=5" %%a in ('netstat -ano ^| findstr ":8001.*LISTENING"') do taskkill /F /PID %%a >nul 2>&1
-for /f "tokens=5" %%a in ('netstat -ano ^| findstr ":5173.*LISTENING"') do taskkill /F /PID %%a >nul 2>&1
-echo.
-
-:: 1. 验证服务
-echo [1/3] 验证服务 :8001
-start "jrxml-validator" /MIN .venv\Scripts\python.exe -c "import uvicorn; uvicorn.run('validation_service.main:app',host='0.0.0.0',port=8001,reload=False)"
-:wait_val
-ping -n 2 127.0.0.1 >nul
-powershell -Command "try{$r=Invoke-WebRequest -Uri http://localhost:8001/health -TimeoutSec 2 -UseBasicParsing;exit 0}catch{exit 1}" >nul 2>&1
-if errorlevel 1 goto wait_val
-echo    :8001 就绪
-
-:: 2. API 服务
-echo [2/3] API 服务 :8000
-start "jrxml-api" /MIN .venv\Scripts\python.exe -c "import uvicorn; uvicorn.run('api_server:app',host='0.0.0.0',port=8000,reload=False)"
-:wait_api
-ping -n 2 127.0.0.1 >nul
-powershell -Command "try{$r=Invoke-WebRequest -Uri http://localhost:8000/api/health -TimeoutSec 2 -UseBasicParsing;exit 0}catch{exit 1}" >nul 2>&1
-if errorlevel 1 goto wait_api
-echo    :8000 就绪
-
-:: 3. 前端
-echo [3/3] 前端 :5173
-start "jrxml-frontend" /MIN cmd /c "cd /d "%~dp0frontend" && npm run dev"
-:wait_fe
-ping -n 3 127.0.0.1 >nul
-powershell -Command "try{$r=Invoke-WebRequest -Uri http://localhost:5173 -TimeoutSec 3 -UseBasicParsing;exit 0}catch{exit 1}" >nul 2>&1
-if errorlevel 1 goto wait_fe
-echo    :5173 就绪
-
-echo.
-echo ================================================
-echo   全部就绪:
-echo     前端:   http://localhost:5173
-echo     API:    http://localhost:8000/docs
-echo     验证:   http://localhost:8001/health
-echo   运行 stop.bat 停止所有服务
-echo ================================================
+.venv\Scripts\python.exe start.py
 pause
@@ -1,8 +1,18 @@
@echo off
-chcp 65001 >nul
-echo [清理] 停止所有 agent_jrxml 服务...
-for /f "tokens=5" %%a in ('netstat -ano ^| findstr ":8000.*LISTENING"') do taskkill /F /PID %%a 2>nul
-for /f "tokens=5" %%a in ('netstat -ano ^| findstr ":8001.*LISTENING"') do taskkill /F /PID %%a 2>nul
-for /f "tokens=5" %%a in ('netstat -ano ^| findstr ":5173.*LISTENING"') do taskkill /F /PID %%a 2>nul
-echo 已停止
+cd /d "%~dp0"
+.venv\Scripts\python.exe -c "
+import os, signal, subprocess
+ports = (8000, 8001, 5173)
+for port in ports:
+    try:
+        r = subprocess.run(['netstat', '-ano'], capture_output=True, text=True)
+        for line in r.stdout.splitlines():
+            if f':{port}' in line and 'LISTENING' in line:
+                pid = int(line.split()[-1])
+                print(f'Killing PID {pid} on port {port}')
+                os.kill(pid, signal.SIGTERM)
+    except Exception as e:
+        print(f'Port {port}: {e}')
+print('Done')
+"
 pause
@@ -44,8 +44,10 @@ class TestAcceptanceScenarios:

        final = run_graph(graph, state)
        assert final.get("current_jrxml"), "应该已生成 JRXML"
-        # 注意：通过/失败取决于 LLM 输出质量；我们检查是否得到了结果
-        print(f"场景 1 状态: {final.get('status')}, 错误: {final.get('error_msg', '')[:100]}")
+        assert final.get("status") in ("pass", "fail"), f"意外状态: {final.get('status')}"
+        import re
+        assert re.search(r"<[\w:]*jasperReport", final["current_jrxml"]), \
+            "输出应包含合法 JRXML 根元素（支持命名空间前缀如 ns0:jasperReport）"

    def test_scenario2_auto_correction(self, graph):
        """场景 2：故意提出一个可能初次失败的需求。"""
@@ -58,7 +60,8 @@ class TestAcceptanceScenarios:

        final = run_graph(graph, state)
        assert final.get("retry_count", 0) <= 5, "不应超过最大重试次数"
-        print(f"场景 2 状态: {final.get('status')}, 重试次数: {final.get('retry_count', 0)}")
+        assert "status" in final, "最终状态应包含 status 字段"
+        assert final.get("current_jrxml") or final.get("error_msg"), "应有输出或错误消息"

    def test_scenario3_multi_turn_modification(self, graph):
        """场景 3：多轮对话 - 先生成，再修改两次。"""
@@ -71,8 +74,8 @@ class TestAcceptanceScenarios:
        state["stage"] = "initial_generation"

        final = run_graph(graph, state)
-        print(f"第 1 轮状态: {final.get('status')}, 错误: {final.get('error_msg', '')[:100]}")
        assert final.get("current_jrxml"), "第 1 轮应该已生成 JRXML"
+        assert final.get("status") in ("pass", "fail")

        # 第 2 轮：添加月度销售汇总
        state2 = final.copy()
@@ -82,8 +85,8 @@ class TestAcceptanceScenarios:
        state2["retry_count"] = 0

        final2 = run_graph(graph, state2)
-        print(f"第 2 轮状态: {final2.get('status')}")
        assert final2.get("current_jrxml"), "第 2 轮应该已修改 JRXML"
+        assert final2.get("status") in ("pass", "fail")

        # 第 3 轮：修改标题
        state3 = final2.copy()
@@ -93,9 +96,9 @@ class TestAcceptanceScenarios:
        state3["retry_count"] = 0

        final3 = run_graph(graph, state3)
-        print(f"第 3 轮状态: {final3.get('status')}")
        jrxml = final3.get("current_jrxml", "")
        assert "2024" in jrxml or "Annual" in jrxml, "标题修改应该体现在 JRXML 中"
+        assert final3.get("status") in ("pass", "fail")

    def test_scenario4_context_aware_modification(self, graph):
        """场景 4：基于对话上下文的修改。"""
@@ -109,7 +112,7 @@ class TestAcceptanceScenarios:
        state["stage"] = "initial_generation"

        final = run_graph(graph, state)
-        print(f"第 1 轮状态: {final.get('status')}")
+        assert final.get("current_jrxml"), "第 1 轮应该已生成 JRXML"

        # 第 2 轮：上下文感知修改
        state2 = final.copy()
@@ -119,17 +122,23 @@ class TestAcceptanceScenarios:
        state2["retry_count"] = 0

        final2 = run_graph(graph, state2)
-        print(f"第 2 轮状态: {final2.get('status')}")
        jrxml = final2.get("current_jrxml", "")
        assert "isBold" in jrxml or "size=" in jrxml, "字体修改应该体现在结果中"
+        assert final2.get("status") in ("pass", "fail")

    def test_max_retry_handling(self, graph):
-        """测试在 MAX_RETRY 次失败后，图能否正常终止。"""
+        """测试在 MAX_RETRY 次失败后，图能否正常终止。
+
+        process_input 会重置 retry_count 为 0，因此不依赖初始值。
+        实际验证：图在多次修正后终止（不挂死），renry_count 至少为 1。
+        MAX_RETRY 配置为 5（环境变量），图在达到上限后路由到 finalize。
+        """
        state = create_initial_state()
        state["current_jrxml"] = "<invalid>xml<<<"
        state["user_input"] = "Fix this"
-        state["retry_count"] = 5  # 已达到最大重试次数
        state["status"] = "fail"

        final = run_graph(graph, state)
-        assert final.get("retry_count", 0) >= 5 or final.get("status") == "pass"
+        # 图应正常终止：status=pass（LLM修复成功）或 retry_count>=1（至少尝试了修正）
+        assert final.get("retry_count", 0) >= 1 or final.get("status") == "pass", \
+            f"图应在至少1次修正后终止，实际 retry_count={final.get('retry_count')} status={final.get('status')}"
@@ -0,0 +1,267 @@
+"""api_server.py 集成测试 — REST 端点 + SSE 流 + 文件上传/下载。
+
+使用 FastAPI TestClient，不需要启动真实服务器。
+"""
+
+import io
+import json
+import os
+import sys
+import tempfile
+from pathlib import Path
+from unittest.mock import patch, MagicMock
+
+import pytest
+from fastapi.testclient import TestClient
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from api_server import app
+
+
+@pytest.fixture
+def client():
+    return TestClient(app)
+
+
+@pytest.fixture
+def temp_sessions(monkeypatch):
+    """重定向上传目录到临时目录，隔离测试数据。"""
+    with tempfile.TemporaryDirectory(prefix="test_api_") as tmpdir:
+        monkeypatch.setattr("api_server.UPLOADS_DIR", Path(tmpdir) / "uploads")
+        monkeypatch.setattr("backend.session.SESSIONS_DIR", Path(tmpdir) / "sessions")
+        yield Path(tmpdir)
+
+
+# ── 健康检查 & 配置 ────────────────────────────────────────────
+
+class TestHealthAndConfig:
+    def test_health_returns_ok(self, client):
+        resp = client.get("/api/health")
+        assert resp.status_code == 200
+        data = resp.json()
+        assert data["status"] == "ok"
+        assert data["version"] == "5.0"
+        assert "timestamp" in data
+
+    def test_config_returns_env_keys(self, client):
+        resp = client.get("/api/config")
+        assert resp.status_code == 200
+        cfg = resp.json()["config"]
+        for key in ("LLM_PROVIDER", "OCR_ENGINE", "MAX_RETRY"):
+            assert key in cfg
+
+
+# ── 会话 CRUD ──────────────────────────────────────────────────
+
+class TestSessionCRUD:
+    def test_create_session(self, client, temp_sessions):
+        resp = client.post("/api/sessions")
+        assert resp.status_code == 200
+        data = resp.json()
+        assert len(data["session_id"]) == 32
+        assert "session_name" in data
+        assert "created_at" in data
+
+    def test_list_sessions_empty(self, client, temp_sessions):
+        assert client.get("/api/sessions").json()["sessions"] == []
+
+    def test_list_sessions_populated(self, client, temp_sessions):
+        client.post("/api/sessions")
+        client.post("/api/sessions")
+        assert len(client.get("/api/sessions").json()["sessions"]) == 2
+
+    def test_get_session_found(self, client, temp_sessions):
+        created = client.post("/api/sessions").json()
+        resp = client.get(f"/api/sessions/{created['session_id']}")
+        assert resp.status_code == 200
+        assert resp.json()["session_id"] == created["session_id"]
+        assert "agent_state" in resp.json()
+
+    def test_get_session_invalid_id(self, client, temp_sessions):
+        assert client.get("/api/sessions/nonexistent").status_code == 400
+
+    def test_get_session_not_found(self, client, temp_sessions):
+        assert client.get("/api/sessions/aabbccddeeff0011223344").status_code == 404
+
+    def test_delete_session(self, client, temp_sessions):
+        sid = client.post("/api/sessions").json()["session_id"]
+        resp = client.delete(f"/api/sessions/{sid}")
+        assert resp.status_code == 200
+        assert resp.json()["status"] == "deleted"
+        assert client.get(f"/api/sessions/{sid}").status_code == 404
+
+    def test_delete_nonexistent(self, client, temp_sessions):
+        assert client.delete("/api/sessions/aabbccddeeff0011223344").status_code == 404
+
+    def test_full_crud_lifecycle(self, client, temp_sessions):
+        sid = client.post("/api/sessions").json()["session_id"]
+        assert client.get(f"/api/sessions/{sid}").status_code == 200
+        assert len(client.get("/api/sessions").json()["sessions"]) == 1
+        client.delete(f"/api/sessions/{sid}")
+        assert client.get("/api/sessions").json()["sessions"] == []
+
+
+# ── 文件上传 ───────────────────────────────────────────────────
+
+class TestFileUpload:
+    def test_upload_text_file(self, client, temp_sessions):
+        content = b"Hello, JRXML!"
+        resp = client.post(
+            "/api/upload",
+            files={"file": ("test.txt", io.BytesIO(content), "text/plain")},
+        )
+        assert resp.status_code == 200
+        data = resp.json()
+        assert data["filename"] == "test.txt"
+        assert data["size"] == len(content)
+        assert len(data["file_id"]) == 12
+
+    def test_upload_with_session_id_in_query(self, client, temp_sessions):
+        resp = client.post(
+            "/api/upload?session_id=aabbccddeeff0011223344",
+            files={"file": ("data.csv", io.BytesIO(b"a,b,c"), "text/csv")},
+        )
+        assert resp.status_code == 200
+
+    def test_upload_png_gets_correct_content_type(self, client, temp_sessions):
+        png_minimal = b"\x89PNG\r\n\x1a\n" + b"\x00" * 100
+        resp = client.post(
+            "/api/upload",
+            files={"file": ("chart.png", io.BytesIO(png_minimal), "image/png")},
+        )
+        assert resp.status_code == 200
+        assert resp.json()["content_type"] == "image/png"
+
+    def test_upload_writes_file_to_disk(self, client, temp_sessions):
+        data = b"persisted content"
+        file_id = client.post(
+            "/api/upload",
+            files={"file": ("note.txt", io.BytesIO(data), "text/plain")},
+        ).json()["file_id"]
+
+        matches = list(temp_sessions.rglob(f"{file_id}_*"))
+        assert len(matches) == 1
+        assert matches[0].read_bytes() == data
+
+
+# ── 下载 ───────────────────────────────────────────────────────
+
+class TestDownload:
+    def test_download_missing_session_returns_404(self, client, temp_sessions):
+        assert client.get("/api/sessions/aabbccddeeff0011223344/download/latest").status_code == 404
+
+    def test_download_no_jrxml_returns_404(self, client, temp_sessions):
+        sid = client.post("/api/sessions").json()["session_id"]
+        resp = client.get(f"/api/sessions/{sid}/download/latest")
+        assert resp.status_code == 404
+
+    def test_download_with_jrxml_returns_file(self, client, temp_sessions):
+        import backend.session as sess
+
+        sess.create_session(name="测试下载")
+        # 需要手动写入 JRXML 到会话
+        sessions = sess.list_all_sessions()
+        sid = sessions[0]["session_id"]
+        sess.save_session(sid, {"current_jrxml": "<jasperReport name='rpt'/>"})
+
+        resp = client.get(f"/api/sessions/{sid}/download/latest")
+        assert resp.status_code == 200
+        assert "<jasperReport" in resp.text
+        assert "attachment" in resp.headers.get("content-disposition", "")
+
+
+# ── 聊天 SSE ───────────────────────────────────────────────────
+
+class TestChatSSE:
+    @pytest.fixture(autouse=True)
+    def mock_graph(self, monkeypatch):
+        """Mock LangGraph 的 build_graph 和 stream，避免真实 LLM 调用。"""
+        mock_graph = MagicMock()
+        mock_graph.stream.return_value = [
+            ("updates", {"classify_intent": {"intent": "initial_generation"}}),
+            ("updates", {"generate": {"current_jrxml": "<jasperReport name='test'/>", "status": "pass"}}),
+            ("updates", {"validate": {"status": "pass"}}),
+            ("updates", {"finalize": {}}),
+            ("done", {"reason": "graph_completed"}),
+        ]
+        # 注意：_graph 是模块级变量，在导入时就编译了。需要直接替换。
+        monkeypatch.setattr("api_server._graph", mock_graph)
+        # 同时替换 agent.graph.build_graph 以防后续重新编译
+        monkeypatch.setattr("agent.graph.build_graph", lambda on_node_start=None: mock_graph)
+        return mock_graph
+
+    def test_empty_payload_rejected(self, client, temp_sessions):
+        sid = client.post("/api/sessions").json()["session_id"]
+        resp = client.post(
+            f"/api/sessions/{sid}/chat",
+            json={"text": "", "file_ids": []},
+        )
+        assert resp.status_code == 400
+
+    def test_sse_stream_returns_valid_events(self, client, temp_sessions):
+        sid = client.post("/api/sessions").json()["session_id"]
+        with client.stream(
+            "POST",
+            f"/api/sessions/{sid}/chat",
+            json={"text": "生成一个简单的员工名册报表", "file_ids": []},
+        ) as resp:
+            assert resp.status_code == 200
+            assert resp.headers["content-type"].startswith("text/event-stream")
+            body = resp.read().decode("utf-8", errors="replace")
+            assert "event: node_complete" in body
+            assert "event: agent_complete" in body
+
+    def test_auto_creates_session_on_chat(self, client, temp_sessions):
+        with client.stream(
+            "POST",
+            "/api/sessions/aabbccddeeff0011223344/chat",
+            json={"text": "生成报表", "file_ids": []},
+        ) as resp:
+            assert resp.status_code == 200
+            assert b"event:" in resp.read()
+
+    def test_unknown_file_ids_not_crash(self, client, temp_sessions):
+        sid = client.post("/api/sessions").json()["session_id"]
+        with client.stream(
+            "POST",
+            f"/api/sessions/{sid}/chat",
+            json={"text": "测试", "file_ids": ["fake_id_xyz"]},
+        ) as resp:
+            assert resp.status_code == 200
+
+
+# ── 边界 & 安全测试 ────────────────────────────────────────────
+
+class TestBoundaries:
+    def test_session_id_invalid_format_returns_400(self, client, temp_sessions):
+        """非 hex 字符的 session_id 应被拒绝。"""
+        assert client.get("/api/sessions/not_valid_hex_id").status_code == 400
+
+    def test_upload_with_path_traversal_session_id(self, client, temp_sessions):
+        """路径穿越 session_id 被拒绝。"""
+        resp = client.post(
+            "/api/upload?session_id=../malicious",
+            files={"file": ("t.txt", io.BytesIO(b"x"), "text/plain")},
+        )
+        assert resp.status_code == 400
+
+    def test_invalid_json_body_rejected(self, client, temp_sessions):
+        sid = client.post("/api/sessions").json()["session_id"]
+        resp = client.post(
+            f"/api/sessions/{sid}/chat",
+            content=b"{not valid json",
+            headers={"Content-Type": "application/json"},
+        )
+        assert resp.status_code == 422
+
+    def test_large_payload_survives(self, client, temp_sessions):
+        """大文本（100KB）不应崩溃。"""
+        sid = client.post("/api/sessions").json()["session_id"]
+        large_text = "生成报表包含字段: " + ", ".join(f"field_{i}" for i in range(5000))
+        with client.stream(
+            "POST",
+            f"/api/sessions/{sid}/chat",
+            json={"text": large_text, "file_ids": []},
+        ) as resp:
+            assert resp.status_code == 200
@@ -0,0 +1,242 @@
+"""backend/error_kb.py 单元测试 — 指纹去重 + 关键词提取 + CRUD。
+
+覆盖:
+  - _make_fingerprint 标准化与去重
+  - _extract_keywords 中英文混合提取
+  - ErrorKB.record / exists / search / search_as_context（mock ChromaDB）
+  - 全局便捷函数 record_error / search_error_cases
+"""
+
+import os
+import sys
+import json
+from pathlib import Path
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from backend.error_kb import (
+    _make_fingerprint,
+    _extract_keywords,
+    ErrorKB,
+    get_error_kb,
+    record_error,
+    search_error_cases,
+)
+
+
+# ── _make_fingerprint ───────────────────────────────────────────
+
+class TestMakeFingerprint:
+    def test_same_structure_same_fingerprint(self):
+        e1 = "Field $F{customer_name} is not declared in the report"
+        e2 = "Field $F{order_total} is not declared in the report"
+        assert _make_fingerprint(e1) == _make_fingerprint(e2)
+
+    def test_different_errors_different_fingerprint(self):
+        e1 = "Missing required attribute pageWidth"
+        e2 = "Query returned 0 results"
+        assert _make_fingerprint(e1) != _make_fingerprint(e2)
+
+    def test_normalizes_variable_names(self):
+        fp1 = _make_fingerprint("Field $F{amount} not found")
+        fp2 = _make_fingerprint("Field $F{total_price} not found")
+        assert fp1 == fp2
+
+    def test_normalizes_string_literals_single_quote(self):
+        fp1 = _make_fingerprint("Value 'abc123' is invalid")
+        fp2 = _make_fingerprint("Value 'xyz789' is invalid")
+        assert fp1 == fp2
+
+    def test_normalizes_string_literals_double_quote(self):
+        fp1 = _make_fingerprint('Name "test_table" not found')
+        fp2 = _make_fingerprint('Name "prod_table" not found')
+        assert fp1 == fp2
+
+    def test_normalizes_numbers(self):
+        fp1 = _make_fingerprint("Line 42 has 100 errors")
+        fp2 = _make_fingerprint("Line 7 has 3 errors")
+        assert fp1 == fp2
+
+    def test_case_insensitive(self):
+        assert _make_fingerprint("ERROR: Missing Field") == _make_fingerprint("error: missing field")
+
+    def test_whitespace_insensitive(self):
+        e1 = "missing   field\n\ndeclaration"
+        e2 = "missing field declaration"
+        assert _make_fingerprint(e1) == _make_fingerprint(e2)
+
+    def test_output_is_16_char_hex(self):
+        fp = _make_fingerprint("some error message")
+        assert len(fp) == 16
+        assert all(c in "0123456789abcdef" for c in fp)
+
+
+# ── _extract_keywords ───────────────────────────────────────────
+
+class TestExtractKeywords:
+    def test_extracts_chinese_words(self):
+        kw = _extract_keywords("未声明的字段引用和语法错误")
+        has_cn = any(len(k) >= 2 and "一" <= k[0] <= "鿿" for k in kw)
+        assert has_cn
+
+    def test_extracts_english_tokens(self):
+        kw = _extract_keywords("missing field declaration in report")
+        assert "missing" in kw
+        assert "field" in kw
+        assert "report" in kw
+
+    def test_extracts_jrxml_patterns(self):
+        kw = _extract_keywords("Field $F{customer_name} not declared")
+        assert "$F{customer_name}" in kw
+
+    def test_short_tokens_ignored(self):
+        kw = _extract_keywords("a b c ab cd")
+        assert "ab" not in kw
+        assert "cd" not in kw
+
+    def test_empty_input_returns_empty_list(self):
+        assert _extract_keywords("") == []
+
+    def test_mixed_cn_en_jrxml(self):
+        kw = _extract_keywords("字段 $F{amount} 在 report 中未声明")
+        assert "$F{amount}" in kw
+        assert "report" in kw
+
+
+# ── ErrorKB class (mock ChromaDB) ───────────────────────────────
+
+def _make_patched_kb(client_override=None, collection_override=None):
+    """创建一个 ErrorKB 实例，其 ChromaDB 依赖已被 mock。
+
+    因为 chromadb 是懒加载的（在 client/collection property 中导入），
+    直接设置 _client/_collection 实例属性即可绕过真实 ChromaDB。
+    """
+    kb = ErrorKB()
+    kb._client = client_override or MagicMock()
+    kb._collection = collection_override or MagicMock()
+    if not client_override and not collection_override:
+        # 默认：client.get_collection 返回 mock collection
+        kb._client.get_collection.return_value = kb._collection
+    return kb
+
+
+class TestErrorKBRecord:
+    def test_exists_returns_true_when_found(self):
+        col = MagicMock()
+        col.get.return_value = {"ids": ["abc123"]}
+        kb = _make_patched_kb(collection_override=col)
+        assert kb.exists("some error") is True
+
+    def test_exists_returns_false_when_not_found(self):
+        col = MagicMock()
+        col.get.return_value = {"ids": []}
+        kb = _make_patched_kb(collection_override=col)
+        assert kb.exists("some error") is False
+
+    def test_exists_survives_exception(self):
+        col = MagicMock()
+        col.get.side_effect = RuntimeError("db down")
+        kb = _make_patched_kb(collection_override=col)
+        assert kb.exists("some error") is False
+
+    def test_record_skips_duplicate(self):
+        col = MagicMock()
+        col.get.return_value = {"ids": ["existing_fp"]}
+        kb = _make_patched_kb(collection_override=col)
+        assert kb.record("error", "<bad/>", "<good/>", "fix prompt") is False
+        col.add.assert_not_called()
+
+    def test_record_adds_new_case(self):
+        col = MagicMock()
+        col.get.return_value = {"ids": []}
+        kb = _make_patched_kb(collection_override=col)
+        assert kb.record(
+            "Field $F{x} not declared",
+            "<bad_jrxml>", "<good_jrxml>",
+            "prompt content", model="test-model", retry_count=2,
+        ) is True
+        col.add.assert_called_once()
+        meta = col.add.call_args[1]["metadatas"][0]
+        assert meta["retry_success"] == 3
+
+
+class TestErrorKBSearch:
+    @pytest.fixture
+    def col(self):
+        return MagicMock()
+
+    @pytest.fixture
+    def kb(self, col):
+        return _make_patched_kb(collection_override=col)
+
+    def test_search_returns_formatted_results(self, kb, col):
+        col.get.return_value = {"ids": []}
+        col.query.return_value = {
+            "ids": [["fp1"]],
+            "documents": [[json.dumps({
+                "error": "test error",
+                "good_jrxml_snippet": "<good/>",
+                "correction_prompt": "fix it",
+                "recorded_at": "2026-01-01T00:00:00",
+            })]],
+            "metadatas": [[{}]],
+            "distances": [[0.05]],
+        }
+        results = kb.search("some error", k=3)
+        assert len(results) == 1
+        assert results[0]["error"] == "test error"
+        assert results[0]["distance"] == 0.05
+
+    def test_search_returns_empty_on_exception(self, kb, col):
+        col.query.side_effect = RuntimeError("fail")
+        assert kb.search("error") == []
+
+    def test_search_as_context_formats_output(self, kb, col):
+        col.get.return_value = {"ids": []}
+        col.query.return_value = {
+            "ids": [["fp1", "fp2"]],
+            "documents": [[
+                json.dumps({"error": "e1", "good_jrxml_snippet": "<g1/>", "correction_prompt": "p1", "recorded_at": ""}),
+                json.dumps({"error": "e2", "good_jrxml_snippet": "<g2/>", "correction_prompt": "p2", "recorded_at": ""}),
+            ]],
+            "metadatas": [[{}, {}]],
+            "distances": [[0.1, 0.2]],
+        }
+        ctx = kb.search_as_context("error", k=2)
+        assert "[历史错误案例]" in ctx
+        assert "---" in ctx
+
+    def test_search_as_context_empty_for_no_results(self, kb, col):
+        col.get.return_value = {"ids": []}
+        col.query.return_value = {"ids": [[]], "documents": [[]], "distances": [[]]}
+        assert kb.search_as_context("error") == ""
+
+    def test_stats_returns_count(self, kb, col):
+        col.count.return_value = 42
+        assert kb.stats()["total_cases"] == 42
+
+    def test_stats_zero_on_exception(self, kb, col):
+        col.count.side_effect = RuntimeError("down")
+        assert kb.stats()["total_cases"] == 0
+
+
+# ── 全局便捷函数 ───────────────────────────────────────────────
+
+class TestConvenienceFunctions:
+    def test_get_error_kb_is_singleton(self, monkeypatch):
+        import backend.error_kb as mod
+        monkeypatch.setattr(mod, "_kb", None)
+        assert get_error_kb() is get_error_kb()
+
+    def test_record_error_delegates(self):
+        with patch.object(ErrorKB, "record", return_value=True) as mock_r:
+            assert record_error("e", "<b>", "<g>", "p") is True
+            mock_r.assert_called_once()
+
+    def test_search_error_cases_delegates(self):
+        with patch.object(ErrorKB, "search_as_context", return_value="ctx") as mock_s:
+            assert search_error_cases("err", k=5) == "ctx"
+            mock_s.assert_called_once_with("err", k=5)
@@ -0,0 +1,210 @@
+"""backend/session.py 单元测试 — 会话 CRUD + 原子写入。
+
+覆盖:
+  - 创建/加载/保存/删除/列出
+  - 原子写入（tempfile + os.replace）
+  - 边界情况（不存在会话、损坏 JSON、空名称自动填充）
+"""
+
+import json
+import os
+import sys
+import tempfile
+import time
+from pathlib import Path
+
+import pytest
+
+sys.path.insert(0, str(Path(__file__).parent.parent))
+
+from backend.session import (
+    create_session,
+    load_session,
+    save_session,
+    get_session_state,
+    list_all_sessions,
+    delete_session,
+    generate_session_id,
+    SESSIONS_DIR,
+)
+
+
+@pytest.fixture
+def temp_sessions_dir(monkeypatch):
+    with tempfile.TemporaryDirectory(prefix="test_sessions_") as tmpdir:
+        monkeypatch.setattr("backend.session.SESSIONS_DIR", Path(tmpdir))
+        yield Path(tmpdir)
+
+
+# ── create_session ──────────────────────────────────────────────
+
+class TestCreateSession:
+    def test_creates_with_defaults(self, temp_sessions_dir):
+        s = create_session()
+        assert len(s["session_id"]) == 32
+        assert "新建报表" in s["session_name"]
+        assert s["created_at"]
+        assert s["updated_at"]
+
+    def test_custom_name(self, temp_sessions_dir):
+        s = create_session(name="测试报表")
+        assert s["session_name"] == "测试报表"
+
+    def test_agent_state_preserved(self, temp_sessions_dir):
+        s = create_session(agent_state={"current_jrxml": "<x/>"})
+        assert s["agent_state"]["current_jrxml"] == "<x/>"
+
+    def test_session_id_injected_into_agent_state(self, temp_sessions_dir):
+        s = create_session()
+        assert s["agent_state"]["session_id"] == s["session_id"]
+
+    def test_persists_json_to_disk(self, temp_sessions_dir):
+        s = create_session(name="磁盘测试")
+        fp = temp_sessions_dir / f"{s['session_id']}.json"
+        assert fp.exists()
+        loaded = json.loads(fp.read_text("utf-8"))
+        assert loaded["session_name"] == "磁盘测试"
+
+    def test_unique_ids_no_collision(self, temp_sessions_dir):
+        ids = {generate_session_id() for _ in range(100)}
+        assert len(ids) == 100
+
+    def test_creates_sessions_dir_if_missing(self, temp_sessions_dir):
+        nested = temp_sessions_dir / "nested" / "sub"
+        import backend.session as mod
+
+        monkeypatch = pytest.MonkeyPatch()
+        monkeypatch.setattr(mod, "SESSIONS_DIR", nested)
+        s = create_session()
+        assert nested.exists()
+        assert (nested / f"{s['session_id']}.json").exists()
+
+
+# ── load_session ────────────────────────────────────────────────
+
+class TestLoadSession:
+    def test_returns_none_for_missing(self, temp_sessions_dir):
+        assert load_session("nonexistent_id") is None
+
+    def test_loads_existing(self, temp_sessions_dir):
+        created = create_session(name="加载测试")
+        loaded = load_session(created["session_id"])
+        assert loaded["session_name"] == "加载测试"
+        assert loaded["session_id"] == created["session_id"]
+
+    def test_load_includes_agent_state(self, temp_sessions_dir):
+        created = create_session(agent_state={"field_count": 5})
+        loaded = load_session(created["session_id"])
+        assert loaded["agent_state"]["field_count"] == 5
+
+
+# ── save_session ────────────────────────────────────────────────
+
+class TestSaveSession:
+    def test_updates_name_and_state(self, temp_sessions_dir):
+        created = create_session(name="原始")
+        save_session(created["session_id"], {"new_key": True}, session_name="更新")
+        loaded = load_session(created["session_id"])
+        assert loaded["session_name"] == "更新"
+        assert loaded["agent_state"]["new_key"] is True
+
+    def test_preserves_created_at(self, temp_sessions_dir):
+        created = create_session()
+        original = created["created_at"]
+        save_session(created["session_id"], {"x": 1})
+        assert load_session(created["session_id"])["created_at"] == original
+
+    def test_updates_updated_at(self, temp_sessions_dir):
+        created = create_session()
+        time.sleep(0.01)
+        save_session(created["session_id"], {"x": 1})
+        loaded = load_session(created["session_id"])
+        assert loaded["updated_at"] != created["updated_at"]
+
+    def test_atomic_write_produces_valid_json(self, temp_sessions_dir):
+        created = create_session()
+        save_session(created["session_id"], {"data": "x" * 1000})
+        fp = temp_sessions_dir / f"{created['session_id']}.json"
+        data = json.loads(fp.read_text("utf-8"))
+        assert data["agent_state"]["data"] == "x" * 1000
+
+    def test_auto_generates_name_when_empty(self, temp_sessions_dir):
+        created = create_session(name="")
+        save_session(created["session_id"], {"x": 1})
+        assert load_session(created["session_id"])["session_name"]
+
+    def test_keeps_existing_name_when_not_provided(self, temp_sessions_dir):
+        created = create_session(name="原名")
+        save_session(created["session_id"], {"x": 1})
+        assert load_session(created["session_id"])["session_name"] == "原名"
+
+    def test_fills_missing_created_at(self, temp_sessions_dir):
+        sid = "aaaabbbbccccddddeeeeffff"
+        fp = temp_sessions_dir / f"{sid}.json"
+        fp.write_text(
+            json.dumps({"session_id": sid, "session_name": "旧数据"}), "utf-8"
+        )
+        save_session(sid, {"x": 1})
+        assert load_session(sid)["created_at"]
+
+
+# ── get_session_state ───────────────────────────────────────────
+
+class TestGetSessionState:
+    def test_none_for_missing(self, temp_sessions_dir):
+        assert get_session_state("missing") is None
+
+    def test_returns_all_keys(self, temp_sessions_dir):
+        created = create_session(name="状态测试")
+        state = get_session_state(created["session_id"])
+        for key in ("session_id", "session_name", "agent_state", "created_at", "updated_at"):
+            assert key in state
+
+
+# ── list_all_sessions ───────────────────────────────────────────
+
+class TestListAllSessions:
+    def test_empty_when_no_sessions(self, temp_sessions_dir):
+        assert list_all_sessions() == []
+
+    def test_lists_all_created(self, temp_sessions_dir):
+        s1 = create_session(name="A")
+        s2 = create_session(name="B")
+        ids = {s["session_id"] for s in list_all_sessions()}
+        assert s1["session_id"] in ids
+        assert s2["session_id"] in ids
+
+    def test_summary_excludes_agent_state(self, temp_sessions_dir):
+        create_session(agent_state={"secret": True})
+        result = list_all_sessions()
+        assert "agent_state" not in result[0]
+
+    def test_sorted_by_mtime_desc(self, temp_sessions_dir):
+        s1 = create_session(name="先")
+        time.sleep(0.02)
+        s2 = create_session(name="后")
+        assert list_all_sessions()[0]["session_id"] == s2["session_id"]
+
+    def test_skips_corrupt_json(self, temp_sessions_dir):
+        (temp_sessions_dir / "bad.json").write_text("{not json}", "utf-8")
+        create_session(name="正常")
+        assert len(list_all_sessions()) == 1
+
+
+# ── delete_session ──────────────────────────────────────────────
+
+class TestDeleteSession:
+    def test_returns_false_for_missing(self, temp_sessions_dir):
+        assert delete_session("ghost_id") is False
+
+    def test_returns_true_and_removes(self, temp_sessions_dir):
+        created = create_session()
+        assert delete_session(created["session_id"]) is True
+        assert load_session(created["session_id"]) is None
+
+    def test_file_is_removed_from_disk(self, temp_sessions_dir):
+        created = create_session()
+        fp = temp_sessions_dir / f"{created['session_id']}.json"
+        assert fp.exists()
+        delete_session(created["session_id"])
+        assert not fp.exists()
@@ -111,11 +111,28 @@ def _check_minimum_content(jrxml: str) -> list[str]:
    return issues


+JR_NAMESPACE = "http://jasperreports.sourceforge.net/jasperreports"
+
+
+def _ensure_jr_namespace(jrxml: str) -> str:
+    """如果 JRXML 根元素缺少命名空间声明，自动补上。"""
+    import re
+    if 'xmlns=' not in jrxml[:500]:
+        return re.sub(
+            r'(<jasperReport)\b',
+            r'\1 xmlns="' + JR_NAMESPACE + '"',
+            jrxml, count=1,
+        )
+    return jrxml
+
+
 def _validate_xsd(jrxml: str) -> tuple[bool, str]:
    """根据 JasperReports XSD schema 验证 JRXML。"""
    if not SCHEMA_FILE.exists():
        return True, ""

+    jrxml = _ensure_jr_namespace(jrxml)
+
    try:
        schema_doc = etree.parse(str(SCHEMA_FILE))
        xmlschema = etree.XMLSchema(schema_doc)
Author	SHA1	Message	Date
panda	1210b926c3	fix: MAX_RETRY 5 + rolling continuation + namespace-aware JRXML extraction - MAX_RETRY: 3→5 (graph.py:35, nodes.py:25) with env override - Rolling continuation: _generate_with_continuation() auto-detects truncated JRXML and sends anchor-based continuation, max 3 rounds - JRXML extraction: regex/end-tag now namespace-prefix aware (ns0:jasperReport, ns:jasperReport, etc.) - All 5 generation nodes refactored to use continuation helper - Tests updated: scenario1 accepts ns-prefixed root, max_retry verifies graph termination - stop_reason capture + WARNING log on max_tokens truncation - Correction prompt now injects OCR context + layout schema	2026-05-23 10:58:46 +08:00
panda	83e801a0b8	fix: auto-inject JasperReports namespace before XSD validation AI-generated JRXML often omits the xmlns declaration on the root element. The XSD schema requires targetNamespace, so validation would fail with "Element 'jasperReport': No matching global declaration available". _ensure_jr_namespace() detects missing xmlns and injects it before schema validation, making the validator tolerant of namespace-free JRXML.	2026-05-23 09:44:08 +08:00
panda	c2cae5665e	fix: replace complex bat scripts with Python launcher + minimal bat wrappers Root cause: Windows batch files written with LF endings caused cmd.exe to misparse labels and Chinese characters, producing garbled "not a command" errors. The Python launcher avoids encoding issues entirely. - start.py: reliable cross-platform launcher (kill ports, start 3 services, wait for health, print status) - start.bat / start_all.bat: minimal 4-line ASCII wrappers - stop.bat: inline Python for port-based process killing	2026-05-23 09:32:32 +08:00
panda	c8924c625c	fix: rewrite startup scripts with reliable helpers, stderr logging, visible windows - Replace /MIN (hidden window) with normal windows so errors are visible - Redirect stderr to logs/*.log for post-mortem - Extract killport/wait_health/wait_port into callable helpers - Use !N! (delayed expansion) for retry counters - stop.bat now shows which PIDs it kills with port labels - Remove nested-quote issue by cd'ing before npm start	2026-05-23 09:25:45 +08:00
panda	9a4f51d378	fix: add retry limit to startup wait loops to prevent infinite hang Each service wait loop now fails after 30 retries (~60s) instead of spinning forever when a port is occupied by a stuck process. Also added cleanup label that kills partially-started services on failure.	2026-05-23 09:20:55 +08:00
panda	40adf50702	fix: add chcp 65001 and .venv check to startup scripts	2026-05-23 09:15:44 +08:00
panda	751df5c4a9	fix: resolve quoting issue in start_all.bat frontend launch, add node_modules check	2026-05-23 09:11:53 +08:00
panda	93ad5e8876	fix: address audit findings — session_id validation, streaming reset, state isolation - Replace truncated 12-char UUID with full 32-char UUID (128-bit entropy) - Add validate_session_id() regex check to prevent path traversal - Add _check_session_id() guard on all 6 API endpoints - Change _step_counter from module global to contextvars.ContextVar - Filter None values from node_state before merging into agent_state - Log save_session failures instead of silently swallowing them - Add finishStreaming() in catch/finally blocks to prevent UI lockup - Fix broken multiline docstring in chat() endpoint	2026-05-23 09:08:53 +08:00
panda	1952d75f13	test: add unit/integration/E2E test suites, fix create_session bug, update docs - Unit tests: test_session.py (27), test_error_kb.py (24), test_agent.py hardened - Integration tests: test_api_integration.py (25) with FastAPI TestClient - E2E tests: main-flows.spec.ts (8) with Playwright + API mocking - Bug fix: backend/session.py create_session() missing session_id parameter - Config: frontend/playwright.config.ts, npm run test:e2e - Docs: update CLAUDE.md v9, .gitignore for test artifacts/eval reports	2026-05-23 08:38:29 +08:00