Compare commits: 573ce012e7..master (10 commits)

| Author | SHA1 | Date |
|---|---|---|
| | 65898478ea | |
| | 7e3a90a2b8 | |
| | 00f718fbda | |
| | 6e6199bd26 | |
| | cacff6f63a | |
| | 963c5e41c8 | |
| | c9344a2715 | |
| | 6d5cfaf29a | |
| | 0839ba92da | |
| | 0adae3e06d | |
@@ -0,0 +1,114 @@
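# jaspersoft-fix Evaluation Report

**Project path**: `D:\Idea Project\jaspersoft-fix`
**Evaluation date**: 2026-05-25
**Dimensions evaluated**: code quality · security and stability · engineering practice · product design

---

## Overall Scores

| Dimension | Score | Main issues |
|------|------|----------|
| Code quality and architecture | 7.3/10 | `nodes.py` is a 1,709-line God Module; no file locking, so concurrent writes are at risk |
| Security and stability | P0×1 + P1×2 + P2×4 | `llm.log` records full prompts; concurrent session writes overwrite each other; no magic-bytes validation |
| Engineering practice | 3.5/5 | Excellent atomic writes and good trace_id propagation; no E2E tests |
| Product design | 4.2/5 | `natural_explanation` is shown transparently; the non-fix report's "opaque progress" finding was a false positive |
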
---

## 1. Code Quality and Architecture (7.3/10)

Highlights: **atomic writes** (tempfile + fsync + replace) are well designed; the v5 architecture generates output precisely, layered at the Band level; the Vue 3 + Pinia front end is clearly structured.

Main issues:

| Issue | Severity | Notes |
|------|--------|------|
| `nodes.py` is too large | P1 | 1,709 lines and 14 workflow nodes; should be split out into `agent/utils.py` |
| `session.py` has no file lock | P0 | Multiple users writing the same session concurrently overwrite each other (no flock/fcntl) |
| Obsolete Vue components | P1 | `StreamingMessage.vue`/`NodeProgress.vue` are still in frontend/components |

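
The atomic-write pattern credited above (tempfile + fsync + replace) can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the helper name `atomic_write_json` is hypothetical, and it uses `tempfile.mkstemp` where the project may use a different temp-file API.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, data: dict) -> None:
    """Write `data` to `path` so readers never see a half-written file.

    Hypothetical helper for illustration; the name is not from the project.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the target directory: the final rename is only
    # reliable when source and destination are on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp, ensure_ascii=False, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # swap the new file in over the old one
    except BaseException:
        # Leave no stray temp file behind if anything failed.
        Path(tmp_name).unlink(missing_ok=True)
        raise
```
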
---
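
## 2. Security and Stability

| Level | Count | Issues |
|------|------|------|
| **P0** | 1 | `llm.log` records the full prompt (`prompt[:10000]`), which may leak API keys |
| **P1** | 2 | Concurrent session writes are unlocked (the P0 item in Section 1); file uploads are not validated against magic bytes |
| **P2** | 4 | LLM prompt-injection risk; ChromaDB has no authentication; permissive CORS; no API authentication |

Already handled well: `.env` isolation, `sessions/` in gitignore, SQL-injection protection (parameterized queries), and hex `session_id` validation to prevent path traversal.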
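
The missing magic-bytes check listed under P1 could look something like the sketch below. The set of accepted types and the function names are assumptions made for illustration; the report does not say which upload formats the project accepts.

```python
from pathlib import Path
from typing import Optional

# Leading byte signatures for a few common upload types. The allowed set is
# an assumption; the project's accepted formats are not stated in the report.
MAGIC_SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "pdf": b"%PDF-",
}


def sniff_file_type(header: bytes) -> Optional[str]:
    """Return the detected type name, or None if no signature matches."""
    for name, magic in MAGIC_SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None


def validate_upload(path: Path, claimed_type: str) -> bool:
    """Accept the file only if its leading bytes match the claimed type."""
    with open(path, "rb") as f:
        header = f.read(16)
    return sniff_file_type(header) == claimed_type
```

Checking content rather than the file extension or the client-supplied MIME type means a renamed script is rejected even when its name ends in `.jpg`.
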
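
**⚠️ llm.log leak risk**:
Lines 47-49 of `backend/llm.py` write `prompt[:10000]` to the log, and lines 66-67 write `response[:10000]`. If a prompt contains content from user-uploaded documents (including sensitive field names) or API call context, that content may end up in the log. It needs to be redacted.
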
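
One way to redact before logging is a `logging.Filter` that rewrites the preview fields on each record. The sketch below is an assumption-laden illustration: the key patterns are examples only (real formats depend on the provider), and `RedactingFilter` and `redact` are hypothetical names. The field names `prompt_preview` and `response_preview` are the ones that appear in the logging calls of this diff.

```python
import logging
import re

# Illustrative patterns only; real credential formats depend on the provider.
_SK_KEY = re.compile(r"sk-[A-Za-z0-9_-]{16,}")               # "sk-..." style keys
_BEARER = re.compile(r"(?i)(bearer\s+)[A-Za-z0-9._-]{16,}")  # Authorization headers


def redact(text: str, max_len: int = 500) -> str:
    """Mask anything that looks like a credential, then truncate."""
    text = _SK_KEY.sub("[REDACTED]", text)
    text = _BEARER.sub(r"\1[REDACTED]", text)
    return text[:max_len]


class RedactingFilter(logging.Filter):
    """Scrub selected `extra` fields on every record before it is emitted."""

    FIELDS = ("prompt_preview", "response_preview", "error")

    def filter(self, record: logging.LogRecord) -> bool:
        for name in self.FIELDS:
            value = getattr(record, name, None)
            if isinstance(value, str):
                setattr(record, name, redact(value))
        return True  # never drop the record, only rewrite it
```
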
---

## 3. Engineering Practice (3.5/5)

Highlights: excellent atomic writes (tempfile + fsync + replace); trace_id propagation in logs (contextvars); structured logging via JSONFormatter; the namespace-check fix in `nodes.py` (the root cause of the five failed correction rounds).

Main issues:

| Issue | Severity |
|------|--------|
| Concurrent session writes have no file lock | P0: multiple users writing the same session concurrently overwrite each other |
| No E2E tests | P1: no Playwright tests |
| Obsolete Vue components not removed | P1: `StreamingMessage.vue`/`NodeProgress.vue` |
| Slow cold start (`llm.py` initialization) | P2 |

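
The report credits contextvars-based trace_id propagation and a JSON formatter but does not show them. A minimal sketch of how the two typically fit together is given here; every name in it (`trace_id_var`, `TraceIdFilter`, `JsonFormatter`) is hypothetical and not taken from the project.

```python
import contextvars
import json
import logging

# Hypothetical names: the project's actual variable and class names are
# not shown in the report.
trace_id_var = contextvars.ContextVar("trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the current context's trace_id onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = trace_id_var.get()
        return True


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "trace_id": getattr(record, "trace_id", "-"),
                "message": record.getMessage(),
            },
            ensure_ascii=False,
        )
```
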
---
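
## 4. Product Design (4.2/5)

Highlights: a well-designed error-correction loop; five rounds of automatic correction plus failure-context injection; `SummaryCard.vue` correctly displays `natural_explanation` (the non-fix report's "opaque progress" finding was wrong).

Main issues:

| Priority | Issue |
|--------|------|
| P0 | Concurrent session writes have no file lock (affects stability) |
| P1 | `export_pdf` is not implemented (should be labeled "coming soon") |
| P1 | No user-confirmation step for intent classification |
| P2 | Streaming output has no XML syntax highlighting |
| P2 | Empty state has no guided examples |
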
---

## Prioritized Fix Roadmap

### P0 (fix immediately)

1. **File lock for concurrent session writes**: wrap the read-then-write in `save_session()` with `fcntl.flock()`
2. **LLM log redaction**: truncate the prompt/response, or replace API keys in them with `[REDACTED]`
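
A cross-process version of item 1 might look like the sketch below. Two caveats apply. First, `fcntl` is POSIX-only, and the project path in this report is a Windows path, so this would need a different primitive there. Second, the function names and the sidecar `.lock` file are assumptions for illustration, not the project's design. By contrast, a `threading.Lock` only serializes writers inside a single process.

```python
import fcntl  # POSIX only; not available on Windows
import json
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def session_file_lock(session_path: Path):
    """Hold an exclusive advisory lock for one session across processes."""
    lock_path = session_path.with_name(session_path.name + ".lock")
    # Lock a sidecar file: the session file itself is replaced on every
    # save, so a lock held on its old inode would not protect the new one.
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX)  # blocks until free
        try:
            yield
        finally:
            fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN)


def update_session(session_path: Path, **changes) -> dict:
    """Read-modify-write under the lock so concurrent updates do not clobber."""
    with session_file_lock(session_path):
        data = {}
        if session_path.exists():
            data = json.loads(session_path.read_text(encoding="utf-8"))
        data.update(changes)
        tmp = session_path.with_name(session_path.name + ".tmp")
        tmp.write_text(json.dumps(data, ensure_ascii=False), encoding="utf-8")
        os.replace(tmp, session_path)
        return data
```
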

### P1 (near term)

3. Delete the obsolete Vue components (`StreamingMessage.vue`/`NodeProgress.vue`)
4. Implement `export_pdf` or label it "coming soon"
5. Surface the intent-classification result as a label for the user to confirm
6. Add Playwright E2E tests

### P2 (when time permits)

7. Syntax-highlight XML in streaming output
8. Add guided examples to the empty state

---

## Key Differences from jaspersoft (non-fix)

| Item | jaspersoft (non-fix) | jaspersoft-fix |
|------|---------------------|----------------|
| commit | `2d5183d` OCR fidelity reform | `0839ba9` WIP(rag + test image) |
| Namespace prefix | Not handled | Fixed in `_extract_jrxml()` |
| Root cause of the five failed correction rounds | Old scoring formula | Fixed (`field_coverage` weight removed) |
| OCR auto-detection of document type | Manual | Implemented |
| Progress transparency | Non-fix report wrongly flagged it as "opaque" | Actually displays `natural_explanation` ✅ |

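
For the namespace-prefix row, one alternative to prefix-aware regular expressions is to parse the JRXML and count elements by local name. The sketch below is an assumption, not the project's approach: `count_elements` is a hypothetical helper, and it only works on well-formed XML, which raw LLM output may not be.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(jrxml: str) -> Counter:
    """Count elements by local name, ignoring any namespace or prefix."""
    root = ET.fromstring(jrxml)
    counts = Counter()
    for elem in root.iter():
        # ElementTree expands a namespaced tag to "{uri}local";
        # keep only the local part so prefixed and bare tags count alike.
        local = elem.tag.rsplit("}", 1)[-1]
        counts[local] += 1
    return counts
```
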
---

*Evaluation date: 2026-05-25 (Asia/Hong_Kong)*
*Evaluation tool: Mavis AI Agent*

+30
-10
@@ -151,11 +151,22 @@ def process_input(state: AgentState) -> Dict:
     # Also update the last entry in the working conversation history
     conv_history[-1]["content"] = user_input
     # Annotation detection (circled regions / arrow marks)
-    elements = ocr_result.get("elements", [])
+    elements = ocr_result.get("all_elements", [])
     if elements:
         try:
             from backend.annotation_detector import detect_annotations
-            ann_result = detect_annotations(uploaded_path, elements)
+            elem_dicts = []
+            for e in elements:
+                d = e.to_dict() if hasattr(e, "to_dict") else (e if isinstance(e, dict) else {"text": str(e), "bbox": [], "confidence": 0})
+                # annotation_detector expects bbox as {x,y,w,h}, but OcrTextElement.to_dict() returns [x_min,y_min,x_max,y_max]
+                b = d.get("bbox", [])
+                if isinstance(b, (list, tuple)) and len(b) == 4:
+                    d["bbox"] = {"x": b[0], "y": b[1], "w": b[2] - b[0], "h": b[3] - b[1]}
+                elif isinstance(b, dict) and "x" not in b:
+                    # Case where a list already in [x,y,w,h] form was treated as a dict
+                    d["bbox"] = {"x": b.get(0, 0), "y": b.get(1, 0), "w": b.get(2, 0) - b.get(0, 0), "h": b.get(3, 0) - b.get(1, 0)}
+                elem_dicts.append(d)
+            ann_result = detect_annotations(uploaded_path, elem_dicts)
             if ann_result.get("total", 0) > 0:
                 state["annotation_result"] = ann_result
                 _node_log.info(
@@ -663,14 +674,18 @@ def _format_ocr_context(state: AgentState) -> str:
         )
 
     # All raw text (for cases that need the full text, such as table matching)
-    elements = ocr_result.get("elements", [])
+    elements = ocr_result.get("all_elements", [])
     if elements:
         parts.append("\n全部文本元素(含坐标):")
         for e in elements:
-            bbox = e.get("bbox", {})
-            x, y, w, h = bbox.get("x", 0), bbox.get("y", 0), bbox.get("w", 0), bbox.get("h", 0)
+            bbox = e.get("bbox", [])
+            if isinstance(bbox, list) and len(bbox) >= 4:
+                x_min, y_min, x_max, y_max = bbox[0], bbox[1], bbox[2], bbox[3]
+                x, y, w, h = x_min, y_min, x_max - x_min, y_max - y_min
+            else:
+                x, y, w, h = 0, 0, 0, 0
             parts.append(
-                f" [{x},{y} {w}×{h}] {e['text']} "
+                f" [{x},{y} {w}×{h}] {e.get('text','')} "
                 f"(置信度={e.get('confidence',0):.2f})"
             )
 
@@ -1251,9 +1266,9 @@ def _check_ocr_fidelity(jrxml: str, state: dict) -> dict:
 
     issues = []
 
-    # 1. Element count comparison
-    text_fields = len(re.findall(r"<textField", jrxml))
-    static_texts = len(re.findall(r"<staticText", jrxml))
+    # 1. Element count comparison (supports namespace prefixes, e.g. <jrxml:textField>)
+    text_fields = len(re.findall(r"<[a-zA-Z0-9_-]+:textField|<textField", jrxml))
+    static_texts = len(re.findall(r"<[a-zA-Z0-9_-]+:staticText|<staticText", jrxml))
     total_jrxml_elements = text_fields + static_texts
 
     ocr_text_count = 0
@@ -1273,7 +1288,9 @@ def _check_ocr_fidelity(jrxml: str, state: dict) -> dict:
         element_coverage = 1.0
 
     # 2. Field-name coverage (English field names never match the Chinese OCR field names, so the weight is reduced)
-    jrxml_fields = set(re.findall(r'<field name="([^"]+)"', jrxml))
+    # Support field declarations with a namespace prefix (e.g. <jrxml:field>)
+    raw_fields = re.findall(r'(?:<[a-zA-Z0-9_-]+:)?field\s+name="([^"]+)"', jrxml)
+    jrxml_fields = set(raw_fields)
     ocr_field_names = set()
     ocr_fields = ocr_result.get("fields", []) if isinstance(ocr_result, dict) else []
     for f in ocr_fields:
@@ -1677,6 +1694,9 @@ def _extract_jrxml(text: str) -> str:
     3. Bare JRXML with no wrapper
     """
     text = text.strip()
+    # Strip the ns0: namespace prefix and declaration from LLM output
+    text = text.replace("ns0:", "")
+    text = re.sub(r'\s+xmlns:ns0="[^"]*"', "", text)
     # Detect and extract the content of a markdown code block
     # If the first code block looks like complete JRXML (starts with <?xml or <jasperReport),
     # return it; otherwise skip that block and fall back to the other extraction methods.

+4
-9
@@ -35,7 +35,6 @@ class _LLMLoggingWrapper(_BaseLLM):
     def invoke(self, prompt: str) -> Any:
         t0 = time.time()
         prompt_len = len(prompt)
-        prompt_preview = prompt[:500]
         _llm_log.debug(
             "LLM invoke 请求",
             extra={
@@ -44,8 +43,7 @@ class _LLMLoggingWrapper(_BaseLLM):
                 "backend": self._backend,
                 "caller": self._caller,
                 "prompt_length": prompt_len,
-                "prompt_preview": prompt_preview,
-                "prompt": prompt[:10000],
+                "prompt_preview": prompt[:500],
             },
         )
         try:
@@ -64,7 +62,6 @@ class _LLMLoggingWrapper(_BaseLLM):
                     "duration_ms": elapsed,
                     "response_length": resp_len,
                     "response_preview": resp_preview,
-                    "response": content[:10000],
                 },
             )
             return result
@@ -79,7 +76,7 @@ class _LLMLoggingWrapper(_BaseLLM):
                     "caller": self._caller,
                     "duration_ms": elapsed,
                     "error": str(e),
-                    "prompt": prompt[:10000],
+                    "prompt_preview": prompt[:500],
                 },
             )
             raise
@@ -96,8 +93,7 @@ class _LLMLoggingWrapper(_BaseLLM):
                 "backend": self._backend,
                 "caller": self._caller,
                 "prompt_length": prompt_len,
-                "prompt_preview": prompt_preview,
-                "prompt": prompt[:10000],
+                "prompt_preview": prompt[:500],
             },
         )
         full = []
@@ -135,7 +131,6 @@ class _LLMLoggingWrapper(_BaseLLM):
                     "duration_ms": elapsed,
                     "response_length": resp_len,
                     "response_preview": resp_preview,
-                    "response": resp_text[:10000],
                     "stop_reason": stop_reason,
                 },
             )
@@ -150,7 +145,7 @@ class _LLMLoggingWrapper(_BaseLLM):
                     "caller": self._caller,
                     "duration_ms": elapsed,
                     "error": str(e),
-                    "prompt": prompt[:10000],
+                    "prompt_preview": prompt[:500],
                 },
             )
             raise

@@ -98,6 +98,14 @@ class ExtractionResult:
                 }
                 for f in self.fields
             ],
+            "all_elements": [
+                {
+                    "text": e.text,
+                    "bbox": e.bbox,
+                    "confidence": e.confidence,
+                }
+                for e in self.all_elements
+            ],
             "total_elements": len(self.all_elements),
             "errors": self.errors,
         }

+86
-32
@@ -6,11 +6,50 @@
 import json
 import os
 import re
+import threading
 import uuid
 import tempfile
 from datetime import datetime, timezone
 from pathlib import Path
-from typing import Optional
+from typing import Optional, Any
 
+# Per-session-file locks to prevent concurrent writes from corrupting JSON
+_session_locks: dict[str, threading.Lock] = {}
+_locks_lock = threading.Lock()
+
+def _get_lock(session_id: str) -> threading.Lock:
+    with _locks_lock:
+        if session_id not in _session_locks:
+            _session_locks[session_id] = threading.Lock()
+        return _session_locks[session_id]
+
+
+class _SafeEncoder(json.JSONEncoder):
+    """Handle JSON serialization of non-standard types such as numpy / lxml values"""
+
+    def default(self, o: Any) -> Any:
+        try:
+            # numpy scalars
+            import numpy as np
+            if isinstance(o, np.integer):
+                return int(o)
+            if isinstance(o, np.floating):
+                return float(o)
+            if isinstance(o, np.ndarray):
+                return o.tolist()
+            if isinstance(o, np.bool_):
+                return bool(o)
+        except ImportError:
+            pass
+        # lxml intc / other C types
+        try:
+            return int(o)
+        except Exception:
+            pass
+        # bytes
+        if isinstance(o, bytes):
+            return o.decode("utf-8", errors="replace")
+        return super().default(o)
+
 from dotenv import load_dotenv
 
@@ -59,8 +98,21 @@ def create_session(name: str = "", agent_state: Optional[dict] = None,
         "kb_id": agent_state.get("kb_id", "") if agent_state else "",
         "agent_state": agent_state,
     }
-    with open(_session_path(sid), "w", encoding="utf-8") as f:
-        json.dump(data, f, ensure_ascii=False, indent=2)
+    fp = _session_path(sid)
+    tmp = tempfile.NamedTemporaryFile(
+        mode="w", suffix=".json", delete=False,
+        dir=SESSIONS_DIR, encoding="utf-8",
+    )
+    try:
+        json.dump(data, tmp, ensure_ascii=False, indent=2, cls=_SafeEncoder)
+        tmp.flush()
+        os.fsync(tmp.fileno())
+        tmp.close()
+        os.replace(tmp.name, str(fp))
+    except Exception:
+        tmp.close()
+        Path(tmp.name).unlink(missing_ok=True)
+        raise
     _session_log.info("创建会话", extra={"session_id": sid, "session_name": data["session_name"]})
     return data
 
@@ -79,39 +131,41 @@ def load_session(session_id: str) -> Optional[dict]:
 
 
 def save_session(session_id: str, agent_state: dict, session_name: str = ""):
-    """Atomically save session state to disk (temp file + rename, to avoid truncation on crash)."""
+    """Atomically save session state to disk in a thread-safe way."""
     _ensure_dir()
     fp = _session_path(session_id)
-    data = {}
-    if fp.exists():
-        with open(fp, "r", encoding="utf-8") as f:
-            data = json.load(f)
+    lock = _get_lock(session_id)
+    with lock:
+        data = {}
+        if fp.exists():
+            with open(fp, "r", encoding="utf-8") as f:
+                data = json.load(f)
 
-    data["session_id"] = session_id
-    if session_name:
-        data["session_name"] = session_name
-    if not data.get("session_name"):
-        data["session_name"] = f"报表 {data.get('created_at', _now_iso())[:10]}"
-    data["updated_at"] = _now_iso()
-    if not data.get("created_at"):
-        data["created_at"] = data["updated_at"]
-    data["agent_state"] = agent_state
+        data["session_id"] = session_id
+        if session_name:
+            data["session_name"] = session_name
+        if not data.get("session_name"):
+            data["session_name"] = f"报表 {data.get('created_at', _now_iso())[:10]}"
+        data["updated_at"] = _now_iso()
+        if not data.get("created_at"):
+            data["created_at"] = data["updated_at"]
+        data["agent_state"] = agent_state
 
-    # Atomic write: write a temp file first, then replace, to avoid truncating the JSON on crash
-    tmp = tempfile.NamedTemporaryFile(
-        mode="w", suffix=".json", delete=False,
-        dir=SESSIONS_DIR, encoding="utf-8",
-    )
-    try:
-        json.dump(data, tmp, ensure_ascii=False, indent=2)
-        tmp.flush()
-        os.fsync(tmp.fileno())
-        tmp.close()
-        os.replace(tmp.name, str(fp))
-    except Exception:
-        tmp.close()
-        Path(tmp.name).unlink(missing_ok=True)
-        raise
+        # Atomic write: write a temp file first, then replace, to avoid truncating the JSON on crash
+        tmp = tempfile.NamedTemporaryFile(
+            mode="w", suffix=".json", delete=False,
+            dir=SESSIONS_DIR, encoding="utf-8",
+        )
+        try:
+            json.dump(data, tmp, ensure_ascii=False, indent=2, cls=_SafeEncoder)
+            tmp.flush()
+            os.fsync(tmp.fileno())
+            tmp.close()
+            os.replace(tmp.name, str(fp))
+        except Exception:
+            tmp.close()
+            Path(tmp.name).unlink(missing_ok=True)
+            raise
 
 
 def get_session_state(session_id: str) -> Optional[dict]:

@@ -8,6 +8,8 @@
 - If the current JRXML is empty or too short (<200 characters), regenerate the complete JRXML from the OCR data and layout schema provided below rather than outputting a placeholder stub.
 - If the error is "字段 'field_N' 未在 <field> 部分声明" (field 'field_N' is not declared in the <field> section), you **must** add a `<field name="field_N" class="java.lang.String"/>` declaration for every missing field_N. These are placeholder fields and must not be deleted. Also make sure every $F{{field_N}} reference has a matching <field> declaration.
 - If the error is "字段 'field_N' 未在 <field> 部分声明" and OCR field data is available, try replacing $F{{field_N}} with the corresponding real field name from OCR (e.g. $F{{invoice_code}}), and update the <field> declaration and all references accordingly.
+- [MANDATORY] The corrected JRXML must guarantee that every $F{...} reference has a matching <field name="..."> declaration. A $F{field_name} without a matching field declaration is forbidden.
+- [MANDATORY] font tags must conform to the JasperReports XSD: <font fontName="..." size="..." isBold="..." isItalic="..." isUnderline="..."/>. Writing a fontName= attribute on the <font> tag is forbidden (incorrect form); the nested attribute format must be used (correct form).
 - **Always check and fix the namespace**: the correct root element format must be: `<jasperReport xmlns="http://jasperreports.sourceforge.net/jasperreports" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://jasperreports.sourceforge.net/jasperreports http://jasperreports.sourceforge.net/xsd/jasperreport.xsd">`. Remove all ns0: prefixes, remove all `xmlns:ns0` declarations, and remove the `ns0:` prefix from all element tags.
 
 Current JRXML (with errors):

@@ -4,6 +4,8 @@ JRXML must be compatible with the JasperReports 7.0.6 schema.
 Key rules:
 - Output only JRXML code: no explanation, no markdown markup.
 - Every field used in the report body must be declared in the <field name="..."> section.
+- [MANDATORY] A complete <fields> section listing every field used must be included under <jasperReport>. Format for each field: <field name="field_name" class="java.lang.String"/>. A $F{field_name} without a matching field declaration is forbidden.
+- [MANDATORY] font tag structure: use <font fontName="Serif" size="12"/> rather than attribute forms such as <fontName="Serif"/>. font tags must conform to the JasperReports XSD: <font fontName="..." size="..." isBold="..." isItalic="..." isUnderline="..."/>
 - The root element is <jasperReport> with the correct xmlns attributes. **Do not use the ns0: prefix on element tags**. Correct root element format:
 ```xml
 <jasperReport xmlns="http://jasperreports.sourceforge.net/jasperreports" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://jasperreports.sourceforge.net/jasperreports http://jasperreports.sourceforge.net/xsd/jasperreport.xsd">

@@ -3,6 +3,8 @@
 Key rules:
 - Output only JRXML code: no explanation, no markdown markup.
 - Use $F{{field_1}}, $F{{field_2}}, ... as placeholder field names and declare them in the <field> section.
+- [MANDATORY] A complete <fields> section listing every field used must be included under <jasperReport>. Format for each field: <field name="field_name" class="java.lang.String"/>. A $F{field_name} without a matching field declaration is forbidden.
+- [MANDATORY] font tag structure: use <font fontName="Serif" size="12"/> rather than attribute forms such as <fontName="Serif"/>. font tags must conform to the JasperReports XSD: <font fontName="..." size="..." isBold="..." isItalic="..." isUnderline="..."/>
 - The report structure must be correct (title, pageHeader, columnHeader, detail, pageFooter, and other bands).
 - Approximate element positions are fine; they will be adjusted precisely later.
 - The root element is <jasperReport xmlns="http://jasperreports.sourceforge.net/jasperreports" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://jasperreports.sourceforge.net/jasperreports http://jasperreports.sourceforge.net/xsd/jasperreport.xsd">; the namespace and schemaLocation must be exact, and no other URL (such as jaspersoft.com) may be used. **Do not use the ns0: prefix on element tags**.

+1
-1
Submodule rag updated: 687b3a8f90...5760153e7e
@@ -0,0 +1,126 @@
"""
Jaspersoft E2E test script
Usage: python scripts/run_e2e.py [--user-text "请根据图片生成结算单模板"]

Output:
- tmp/e2e_events_{HHMMSS}.json   full event stream
- tmp/e2e_log_{HHMMSS}.txt       node log
"""
import sys, os
sys.path.insert(0, os.path.dirname(os.path.dirname(__file__)))
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', buffering=1)

import requests, json, time, uuid
from pathlib import Path

BASE_URL = "http://localhost:8000"
TEST_IMAGE = Path(__file__).parent.parent / "test_image.jpg"
USER_TEXT = "请根据图片信息生成结算单模板"

ts = time.strftime("%H%M%S")
out_path = Path(__file__).parent.parent / "tmp" / f"e2e_events_{ts}.json"
log_path = Path(__file__).parent.parent / "tmp" / f"e2e_log_{ts}.txt"
out_path.parent.mkdir(parents=True, exist_ok=True)

def log(msg):
    print(msg, flush=True)
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(msg + "\n")

def run():
    log("=" * 60)
    log(f"E2E 测试开始 {time.strftime('%H:%M:%S')}")

    # 1. Create a session
    sid_resp = requests.post(f"{BASE_URL}/api/sessions", json={"session_id": "test"}, timeout=10)
    sid = sid_resp.json()["session_id"]
    log(f"[会话] {sid}")

    # 2. Upload the image
    with open(TEST_IMAGE, "rb") as f:
        up_resp = requests.post(
            f"{BASE_URL}/api/upload",
            files={"file": ("test_image.jpg", f, "image/jpeg")},
            data={"session_id": sid}, timeout=30,
        )
    fid = up_resp.json()["file_id"]
    log(f"[上传] file_id={fid}")

    # 3. Send the chat message
    log(f"[对话] 开始 pipeline...")
    start = time.time()

    events = []
    node_times = {}
    error_events = []

    r = requests.post(
        f"{BASE_URL}/api/sessions/{sid}/chat",
        json={"text": USER_TEXT, "file_ids": [fid]},
        stream=True, timeout=600,
    )
    log(f"[状态] HTTP {r.status_code}")

    for line in r.iter_lines():
        if not line:
            continue
        line = line.decode("utf-8", errors="replace")
        if line.startswith("data:"):
            try:
                data = json.loads(line[5:].strip())
                events.append(data)
                evt = data.get("event", "")
                d = data.get("data", {})
                node = d.get("node", "")

                if evt == "node_start":
                    node_times.setdefault(node, {"start": time.time() - start, "complete": None})
                    log(f" [开始] {node}")

                if evt == "node_complete":
                    if node in node_times and node_times[node]["complete"] is None:
                        dur = time.time() - start - node_times[node]["start"]
                        node_times[node]["complete"] = time.time() - start
                        detail = d.get("detail", "")[:80]
                        log(f" [完成] {node} ({dur:.1f}s) — {detail}")

                if evt == "error":
                    msg = d.get("message", str(data))[:200]
                    log(f" [错误] {msg}")
                    error_events.append(data)

                if evt == "result":
                    result = data.get("data", {})
                    elapsed = time.time() - start
                    jrxml = result.get("jrxml", "")
                    log(f"\n{'='*50}")
                    log(f"[完成] 耗时 {elapsed:.1f}s")
                    log(f" status: {result.get('status', 'N/A')}")
                    log(f" jrxml_length: {len(jrxml)}")
                    log(f" error: {result.get('error', 'None')[:200]}")

                if evt == "done":
                    log(f"\n[SSE Done]")

            except json.JSONDecodeError:
                pass

    elapsed_total = time.time() - start
    log(f"\n总耗时: {elapsed_total:.1f}s")
    log(f"共 {len(events)} 个事件,{len(node_times)} 个节点,{len(error_events)} 个错误")

    # Save
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump({
            "session_id": sid,
            "elapsed": elapsed_total,
            "events": events,
            "node_times": node_times,
            "error_events": error_events,
        }, f, ensure_ascii=False, indent=2)

    log(f"事件已保存: {out_path}")
    log(f"日志已保存: {log_path}")

if __name__ == "__main__":
    run()

Binary file not shown.
|
After Width: | Height: | Size: 450 KiB |