CS-conference paper pre-review / editing · Claude Code Skill · Workflow + Memory

Before a reviewer tears it apart,
let an AI jury do it first.

Ask an AI whether your paper is any good and you usually get one of two useless answers: polite praise or indiscriminate nitpicking.

PaperJury turns that into a closed loop and argues the other side. N reviewers read the whole paper; disputed issues go to an independent vote and a three-way verdict (fix it / needs you / no fix). Issues that can be fixed safely become patches, issues that lack experiments or evidence go to the author, and invalid charges are dismissed. Only edits you sign off on land, and the result is then compiled to confirm it still builds.

PaperJury v. Your Paper · No. 26-CV-0603
Three modes · direct-edit / review / auto · Skill + Workflow + Memory
Verdict board
Every issue must route somewhere
ledger.json
fix · valid-fixable: fixable at the text level; no new experiment, no claim drift.
queue · author-required: lacks experiments or evidence, or needs a research judgment call; goes to the author's to-do queue.
drop · invalid-drop: misread, hallucinated, duplicate, or out of scope; dismissed with the evidence recorded.
review → verdict → revise → verify · no silent edit
Docket · three issues, three rulings (format) · ledger.json
valid-fixable · headline gain mislabeled vs the paper's own table · "outperforms ... by at least 29.7 and 53.5"
author-required · self-selected CDNet subset, unfalsifiable rule · "exclude sequences unrelated to MOS"
invalid-drop · alleged contradiction is a misread of Table 2 · charge rests on a quote not in the text
Real ledger rows and the three-way verdict. Real issues that enter the ledger are not dumped wholesale into the manuscript: only safe fixes become patches, issues lacking experiments or evidence are left to the author, and misjudgments never reach the paper. A charge can land "no fix", which a yes-and rewriter cannot return.
Legend: Workflow (semantic fan-out, in sandbox, no fs) · Orchestrator, deterministic (Bash/Node, off-court) · Human in the loop · Agent (AI reviewer) · Memory / ledger
01

Three modes for three editing intents

global

These are not three isolated commands but three intents: small changes go through direct-edit, a pre-submission self-check goes through review, and auto is switched on explicitly only when multi-round automatic convergence is needed. Three entries, one engine: review and auto run the same courtroom engine; auto just removes the human second instance and swaps each human gate for a pre-authorized policy plus a return queue.

DIRECT-EDIT · common

Say the change, go straight back to the LaTeX

  • Suited to local rewrites, tightening, de-AI-ing, or turning an idea written in Chinese into an English paper paragraph
  • Say what to change in Chinese or English → toolkit drafts → self-check → the author signs off → land it
  • No panel, no ledger; fast
  • Layout edits use loop B: compile → render → visual confirm
REVIEW · occasional, pre-submission

Systematic hardening before submission

  • Suited to one strict self-check before submission, run by the courtroom per-issue adjudication engine (zoomed in the next sections)
  • Core = N domain reviewers each read the whole paper and raise issues, then routing by contestability sends the disputed ones to adjudication
  • Each adjudicated issue ends as valid-fixable / author-required / invalid-drop
  • A simple fixed 3-reviewer panel is also available as a quick check
AUTO · unattended · /goal

Multi-round review-revise under /goal, run to convergence

  • Suited to working continuously through a batch of clear, pre-authorized issues; same engine, human removed; gates → pre-authorized policy + return queue
  • Safe edits land automatically; issues that lack experiments or data, or whose judgment is uncertain, go to the to-do queue
  • Anti-drift: frozen spine + four-state meaning audit + per-passage cap
  • Stops when the gain from further edits shrinks or a cap is hit, so the paper does not drift further with each round: it ends at clerk convergence (or applied-quiescence / a hard limit) · completion = ledger.js gate
Shared engine core · shared across all modes
decompose.js · ledger.js · journal.js · apply-patch.js · anchor-diff.js · cross-ref.js · spine.js · compile-guard.js · compliance-check.js · meaning-audit (four-state)
Design principle: deterministic guards and semantic fan-out run separately, so the deterministic checks stay traceable and re-runnable while the semantic side concentrates on judgment calls.
02

How to trigger each mode: natural language defaults to the conservative modes, auto must be requested explicitly

global · entry

The three modes are not three commands but three intents: direct-edit and review are auto-routed by what is said; auto is the only explicit switch (/goal) and is never self-detected.

DIRECT-EDIT · common · natural language

Just say what to change

  • Describe one change in Chinese or English, no prefix needed:
    "tighten this intro", "polish this", "turn my Chinese idea for the intro into LaTeX", "de-AI this", "compress this to one line"
  • It picks the matching writing-toolkit prompt to draft → logic-check self-check → the author signs off → land it
  • On a large file with an ambiguous target, it asks which passage rather than guessing
REVIEW · pre-submission · say "review"

Ask it to critique / harden

  • State the intent: review / critique / 审稿 / 评审 / mock-review / harden
  • With scope: full (whole paper) or passage <section/paragraph/claim> (one point)
  • Runs the courtroom engine → human gates: per-issue direction → tiebreak → authorize and land
AUTO · unattended · explicit switch

Run under /goal to convergence

  • The only explicit entry: /goal "<verifiable completion condition>" (or config mode: auto)
  • Never self-detected; two up-front confirmations (spine + reviewer assignment) plus the pre-authorized bounded-aggressive policy together form auto's up-front sign-off
  • Safe fixes land automatically, risky ones go to the return queue; completion = ledger.js gate PASS, ending at clerk convergence / applied-quiescence / a hard limit
One-line decision · which mode to pick follows from the intent
  • Edit one spot → just say it
  • Want critique → say "review" (optionally full / passage)
  • Run unattended for multiple rounds → /goal + completion condition
All three modes share one engine core, one set of hard rules, one ledger, and one author sign-off gate (auto's sign-off = up-front policy + queue, see hard rule 1). direct-edit and review go through normal conversation; only auto needs an explicit /goal.
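The routing rule above can be pictured as a tiny dispatcher. This is a minimal sketch under stated assumptions, not PaperJury's actual router: the function name `routeMode` and the plain substring matching are hypothetical, and the keyword list is copied from the review triggers listed above. The one property it is meant to show is that auto is never inferred from wording.

```javascript
// Hypothetical sketch: routeMode and the substring matching are illustrative only.
// The keyword list mirrors the review triggers named in this section.
const REVIEW_WORDS = ["review", "critique", "审稿", "评审", "mock-review", "harden"];

function routeMode(utterance) {
  const text = utterance.trim().toLowerCase();
  // auto is the only explicit switch: it opens on /goal and is never self-detected.
  if (text.startsWith("/goal")) return "auto";
  // Naive substring match (an assumption); a real router would need word boundaries.
  if (REVIEW_WORDS.some((word) => text.includes(word))) return "review";
  // Everything else is treated as a single described change.
  return "direct-edit";
}

console.log(routeMode('/goal "ledger gate passes"'));
console.log(routeMode("please critique section 3"));
console.log(routeMode("tighten this intro"));
```

Note that a request such as "keep running until it is done" still falls through to direct-edit in this sketch, which is the conservative default the section describes.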
03

The engine pipeline: how one issue flows through

mid-level

Below is the path one issue takes from being raised to landing in the manuscript or entering the queue. Mechanical and minor issues skip the trial and take a separate quick polish path; only genuinely contested issues go to adjudication.

0

Decompose

Split into claim units, keeping a stable passage-id

decompose.js
1 · WF

Assign + holistic read

Assign N subfields; N domain reviewers each read the whole paper once

assign-reviewers · reading-check
2 · WF

Coverage + merge

Anti-skim audit + cross-reviewer dedup → route by contestability

coverage-auditor · merge
3 · WF

Trial (5→12 jurors)

Whole-paper defense / 5-juror tier with local context / escalate to 12 on no majority → judge routes three ways

trial
4 · WF

Recall audit

Mode A revives wrongly dropped issues + Mode B spot-checks strong-consensus majors before the edit

recall-audit
5 · WF

Drafter

valid-fixable → write a minimal patch

drafter
6

Edit-safety + apply

anchor-diff + cross-ref → meaning / edit-audit → apply

anchor-diff · cross-ref · apply-patch
7 · author

Clerk / final

At each round boundary, reconcile leftover issues and decide whether the run has converged; in review the author holds the gate, in auto the clerk closes out the round

clerk / queue
review: step 7 is the author's gate (it stops here, no auto-advance). auto: at step 7 the clerk reconciles this round's results into the cumulative ledger; the outer loop keeps running under /goal until clerk convergence, applied-quiescence (no new edits left to apply), or a hard limit, and the gate then flips to PASS.
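The three ways the auto loop can end are easier to compare side by side. The sketch below is only an illustration under assumptions: the round fields (`clerkConverged`, `appliedEdits`), the `maxRounds` option, and the reading of "hard limit" as a round cap are all hypothetical, and the real stop logic lives in the clerk workflow and the ledger.js gate.

```javascript
// Hypothetical sketch of the outer-loop stop decision in auto mode.
// Field names and the round-cap reading of "hard limit" are assumptions.
function stopReason(round, opts) {
  if (round.clerkConverged) return "clerk-convergence";
  // applied-quiescence: the round applied no new edits.
  if (round.appliedEdits === 0) return "applied-quiescence";
  if (round.index + 1 >= opts.maxRounds) return "hard-limit";
  return null; // keep iterating
}

function runAuto(rounds, opts) {
  for (const round of rounds) {
    const reason = stopReason(round, opts);
    if (reason) return { stoppedAt: round.index, reason };
  }
  // Only reached if the supplied rounds run out before any stop condition fires.
  return { stoppedAt: rounds.length - 1, reason: "input-exhausted" };
}

const rounds = [
  { index: 0, clerkConverged: false, appliedEdits: 3 },
  { index: 1, clerkConverged: false, appliedEdits: 0 },
];
console.log(runAuto(rounds, { maxRounds: 5 }));
```

In review mode none of this applies: the loop stops at step 7 every round and waits for the author.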
04

One case, one courtroom: who finds problems and who rules

local · trial

Each issue is handled as its own case. The reviewer who raises an issue only finds problems and takes no part in the ruling; an independent jury votes after seeing both sides' evidence, and the judge then issues the three-way verdict. The defendant is the passage under attack, not the issue itself. The floor plan follows the courtroom metaphor, and position is stance: prosecution left, defense right, jury center, judge above, appeal outside.

first instance · agents
Judge · presides, rules
1 presiding agent · rules only, no vote · sets close_criterion
Prosecution
N domain reviewers read the whole paper · file charge + quote → step away
Coverage + merge · anti-skim
coverage-auditor + merge · dedup · route by contestability (mechanical/minor → polish)
Trial jury · 5 fresh jurors with independent perspectives · escalate to 12 on no clear majority · decided at quorum + >60% one side
Juror perspectives: methodo · repro · theory · empirics · claims · novelty · stats · clarity · scope · deploy · most hostile · most charitable
The first tier sends 5 jurors: one at the most-hostile end, one at the most-charitable end, and the rest with mutually independent perspectives. Each gets local context with on-demand expansion. The panel escalates to 12 only when there is no clear majority, since adding jurors helps mainly when their perspectives genuinely differ.
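The decision rule for the jury (quorum plus more than 60% on one side, escalating from 5 to 12 when undecided) can be sketched as a small tally. Everything beyond those numbers is an assumption made for illustration: the vote labels `sustain` / `reject`, the quorum sizes (4 of 5, 9 of 12), the handling of abstentions, and the function names are hypothetical.

```javascript
// Hypothetical sketch of the 5 → 12 jury tally. Vote labels, quorum sizes,
// and abstention handling are assumptions; only ">60% one side" and the
// 5-then-12 tiers come from the text.
function tally(votes, { quorum, threshold = 0.6 }) {
  const counts = { sustain: 0, reject: 0 };
  for (const vote of votes) {
    if (vote === "sustain" || vote === "reject") counts[vote] += 1; // anything else counts as an abstention
  }
  const cast = counts.sustain + counts.reject;
  if (cast < quorum) return { decided: false, reason: "no-quorum", counts };
  if (counts.sustain / cast > threshold) return { decided: true, side: "sustain", counts };
  if (counts.reject / cast > threshold) return { decided: true, side: "reject", counts };
  return { decided: false, reason: "no-clear-majority", counts };
}

function trial(firstTierVotes, callExtraJurors) {
  const first = tally(firstTierVotes, { quorum: 4 });
  if (first.decided) return { ...first, tier: 5 };
  // Undecided first tier: bring the panel up to 12 fresh jurors in total.
  const allVotes = firstTierVotes.concat(callExtraJurors(12 - firstTierVotes.length));
  return { ...tally(allVotes, { quorum: 9 }), tier: 12 };
}

// A 3-2 split is exactly 60%, which is not "more than 60%", so it escalates.
const split = ["sustain", "sustain", "sustain", "reject", "reject"];
const extra = () => ["sustain", "sustain", "sustain", "sustain", "sustain", "sustain", "reject"];
console.log(trial(split, extra));
```

The judge does not appear in the tally on purpose: per the roster, the judge rules on the outcome but does not vote.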
Defense
author agent · defends the attacked passage with evidence
Drafter
same agent, a different stage · valid-fixable only
Bailiff · orchestrator · off-court
orchestrator + deterministic scripts · run guards / keep ledger / write journal / gate
appeal · recall audit → second instance
Roster · anti-confusion table (use it to keep roles straight)
| Courtroom role | Engine entity | Stance / duty | ⚠ Do not confuse |
| --- | --- | --- | --- |
| charge | one issue | the matter on trial | it is the charge, not the defendant, not an agent |
| defendant | the passage / claim under attack | passive, does not speak | not an agent, not an issue |
| prosecution | N domain reviewers (each reads the whole paper) | files charge + quote | steps away after filing; does not judge and does not defend its own issue |
| defense | author agent | argues its case with evidence: already addressed / out of scope / fixing it would drift | the author AGENT, not the human; later becomes the drafter |
| coverage + merge | coverage-auditor + merge (workflows) | anti-skim audit + cross-reviewer dedup + route by contestability | mechanical / minor items take the polish track, not the trial |
| trial jury | 5, escalating to 12, fresh jurors with independent perspectives | neutral; votes on both sides' evidence with local context | fresh, neither reviewer nor author; escalates to 12 only on no clear majority |
| judge | 1 presiding agent | presides, tallies, rules, sets criterion | rules only, does not vote |
| drafter | author agent (different stage) | minimal edit for valid-fixable; if new data is needed, honest softening or escalation to the human | same persona, different stage |
| recall audit | fresh skeptic | Mode A re-checks every drop; Mode B spot-checks strong-consensus majors before the edit | not the original reviewer (avoids entrenched positions) |
| second / final instance | the human author | sees only escalations; final ruling logged | the human, not the author agent |
| bailiff / orchestrator | orchestrator + deterministic scripts | runs guards / keeps ledger / writes journal | off-court; does not judge, only runs the deterministic guards + gate |
| clerk | clerk (workflow) | round boundary: reconcile carried open questions against this round's edits, dedup, decide convergence | a semantic workflow, not the bailiff (deterministic scripts) above |
05

Reading check, zoomed: read in full first, then judge accurately

local · reading-check

Organizing principle: recall belongs to reading, precision to the courtroom. Readers only find and cite; they do not judge validity. Every observation must quote the source: a reader who cannot quote has not really read the passage and may be fabricating, so the quote requirement works as both anti-skim and anti-hallucination. N domain reviewers each read the whole paper once, backstopped by a three-layer anti-skim mechanism.

Stage 1 · Assign + holistic read

assign-reviewers: name the N subfields the paper touches, then instantiate N domain reviewers
reading-check: each reviewer reads the whole paper once → weaknesses + overall_confidence + a per-section coverage report
Each finding must include a verbatim quote (verifiable).

Stage 2 · Three-layer anti-skim

L1 forced per-section coverage: deterministically verify each section's in_section_quote
L2 coverage-auditor: flag skimmed (reviewer, section) pairs
L3 each flag → a cap-1 targeted re-read to fill the gap
These three layers backstop "looked read but actually skimmed".
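Layer 1 is the only deterministic layer of the three, so it is the easiest to sketch. The following is a minimal illustration, assuming a report shape of `{ reviewer, quotes: { sectionId: in_section_quote } }` and whitespace-insensitive matching; the function names and both of those choices are hypothetical, not the shipped check.

```javascript
// Hypothetical sketch of the layer-1 check: every section must come with an
// in_section_quote that appears verbatim in that section's text.
// Whitespace is collapsed first (an assumption, to tolerate LaTeX line wraps).
function normalize(text) {
  return text.replace(/\s+/g, " ").trim();
}

function failedCoverage(sections, report) {
  const failures = [];
  for (const [sectionId, sectionText] of Object.entries(sections)) {
    const quote = report.quotes[sectionId];
    const quoted =
      typeof quote === "string" &&
      quote.trim() !== "" &&
      normalize(sectionText).includes(normalize(quote));
    // A missing, empty, or non-matching quote marks the (reviewer, section) pair.
    if (!quoted) failures.push({ reviewer: report.reviewer, section: sectionId });
  }
  return failures;
}

const sections = {
  intro: "We propose a\n   new method for segmentation.",
  experiments: "We evaluate on one dataset.",
};
const report = {
  reviewer: "r1",
  quotes: { intro: "propose a new method", experiments: "on all benchmarks" },
};
console.log(failedCoverage(sections, report));
```

A pair that fails here is a natural candidate for the layer-3 targeted re-read; whether a section was skimmed despite a valid quote is the semantic question left to the coverage-auditor in layer 2.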

Stage 3 · Merge + route

merge: semantic dedup across reviewers, merging raised_by (corroboration)
significance / kind: deterministically derived (MAX / substantive-dominates)
route by contestability: mechanical / minor → polish; substantive-major → trial
Output: deduped, quote-verified, coverage-provable issues → the router splits them two ways
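The deterministic half of Stage 3 (deriving significance and kind for a merged issue, then routing it) might look like the sketch below. It assumes the semantic dedup has already grouped duplicates, and it assumes a two-level scale (`minor` < `major`) and two kinds (`mechanical`, `substantive`); those value sets and the function names are guesses made for illustration.

```javascript
// Hypothetical sketch of Stage 3's deterministic derivation and routing.
// The two-level significance scale and the two kinds are assumptions.
const SIGNIFICANCE = ["minor", "major"]; // ascending order

function mergeDuplicates(group) {
  // group: issues that the semantic dedup already judged to be the same issue
  const raised_by = [...new Set(group.flatMap((issue) => issue.raised_by))];
  // MAX: the merged issue takes the highest significance any reviewer gave it.
  const significance = group.reduce(
    (max, issue) =>
      SIGNIFICANCE.indexOf(issue.significance) > SIGNIFICANCE.indexOf(max) ? issue.significance : max,
    SIGNIFICANCE[0]
  );
  // substantive-dominates: one substantive report makes the merged issue substantive.
  const kind = group.some((issue) => issue.kind === "substantive") ? "substantive" : "mechanical";
  return { ...group[0], raised_by, significance, kind };
}

function routeByContestability(issue) {
  return issue.kind === "substantive" && issue.significance === "major" ? "trial" : "polish";
}

const merged = mergeDuplicates([
  { id: "i1", raised_by: ["r1"], significance: "minor", kind: "mechanical" },
  { id: "i2", raised_by: ["r2", "r1"], significance: "major", kind: "substantive" },
]);
console.log(merged.raised_by, merged.significance, merged.kind, routeByContestability(merged));
```

Taking the maximum and letting substantive dominate both err toward sending an issue to trial, which fits the stated preference for never silently dropping something that might matter.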
Because N domain reviewers each read the whole paper, cross-section inconsistencies are caught directly, e.g. an abstract claiming "all benchmarks" against experiments on only one dataset.
06

Three-way routing and human gates: not every comment belongs in the paper

design principle

PaperJury does not trust reviewer comments by default. Every issue is verified first and then routed by its nature, in two stages: a deterministic contestability router first sends mechanical and minor issues to a separate polish track (no jury), and only substantive-major issues reach the trial. The three routes below are the judge's verdict inside the trial, downstream of the contestability router.

invalid-drop

Accurately judged invalid (hallucination / misread / duplicate / out of scope / inflated severity) → dropped, with the evidence kept.

→ still goes to the recall audit for a re-check

valid-fixable

Valid and text-fixable (a text problem, an overstated claim, or unclear structure), with no new experiments or data and no drift → draft a minimal edit, pass the guards, land it.

→ drafter · if it needs new data, reroute to author-required or soften honestly

author-required

Needs author-private information, new experiments or data, or a research judgment call → routed to the author (second instance) and placed in the author's to-do queue. This is correct handling, not a recall loss.

→ review hands it to the author · auto queues it
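Read as a decision procedure, the three routes reduce to a few ordered checks. The sketch below is one possible reading, not the judge's actual prompt or logic: the finding fields (`confident`, `valid`, `needsNewData`, `needsJudgmentCall`, `wouldDrift`) and the destination labels are hypothetical names for the conditions listed above.

```javascript
// Hypothetical sketch of the three-way verdict. Field names and destination
// labels are illustrative assumptions.
function verdict(finding) {
  // When unsure, hand it to the author: never silently drop or silently edit.
  if (!finding.confident) return "author-required";
  if (!finding.valid) return "invalid-drop";
  const textFixable =
    !finding.needsNewData && !finding.needsJudgmentCall && !finding.wouldDrift;
  return textFixable ? "valid-fixable" : "author-required";
}

function nextStop(ruling, mode) {
  if (ruling === "invalid-drop") return "recall-audit"; // drops are still re-checked
  if (ruling === "valid-fixable") return "drafter";
  return mode === "auto" ? "return-queue" : "author"; // review hands over, auto queues
}

const finding = { confident: true, valid: true, needsNewData: true };
const ruling = verdict(finding);
console.log(ruling, nextStop(ruling, "auto"));
```

Checking confidence before validity is a deliberate choice in this sketch: an issue the engine cannot classify with confidence lands in the queue rather than being dropped.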
Design aim: make the judgment accurate and route three ways by the nature of the issue, so that precision and recall need not be traded off. That trade-off mainly arises when a noisy validity score is thresholded; accurate judgment plus three-way routing is meant to let both improve together.
queue = "truly needs the author" + "could not be confidently classified". The first part is irreducible; the second shrinks as accuracy rises. Compute goes into verification / grounding / adjudication (raising accuracy), not into "finding more issues".
Polish track: issues the contestability router judges mechanical / minor skip the jury and get a batch copy-edit / light check; a misrouted major can escalate back to trial, and uncertain ones are queued (polish-review), never silently dropped.
Soft dimensions carry residual error; the rule is: when unsure, hand it to the author, and never silently drop or silently edit. There is no promise of perfection or of a zero queue.
07

Hard rules

throughout all modes
1 author sign-off: never edit the manuscript without the author's authorization; auto satisfies this via up-front policy sign-off + queue (a named carve-out), not per-edit sign-off.
2 isolation: prosecution, jury, and audit never cross-talk and never see the ledger; enforced by what is left out of the prompt plus an explicit ISOLATION instruction in every reviewer-type prompt.
3 quote + criterion required: every observation must quote the source (verifiable); a valid-fixable issue carries a close_criterion set by the judge.
4 zero leakage: back-translations, logs, and self-checks stay author-side and never enter the manuscript.
5 disagreement resolves: disagreement resolves through discussion, then override (logged), never a silent dismissal; any fix touching a spine anchor is queued.
6 no hardcoded paths: the skill carries no project paths or files; everything is resolved at runtime.
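Rule 3, together with the earlier requirement that every issue must route somewhere, lends itself to a mechanical check over ledger rows. The sketch below is a guess at what such a check could look like, not the real ledger.js gate: the row fields `id`, `verdict`, and `quote` are assumed names, and only `close_criterion` and the three verdict labels come from the text.

```javascript
// Hypothetical sketch of a rule-3 style check over ledger rows.
// Row field names other than close_criterion are assumptions.
const VERDICTS = new Set(["valid-fixable", "author-required", "invalid-drop"]);

function ruleViolations(rows) {
  const violations = [];
  for (const row of rows) {
    // Every issue must route somewhere.
    if (!VERDICTS.has(row.verdict)) violations.push(`${row.id}: no verdict`);
    // Every observation must quote the source.
    if (!row.quote) violations.push(`${row.id}: missing quote`);
    // A valid-fixable issue carries a close_criterion set by the judge.
    if (row.verdict === "valid-fixable" && !row.close_criterion) {
      violations.push(`${row.id}: missing close_criterion`);
    }
  }
  return violations;
}

const rows = [
  { id: "i1", verdict: "valid-fixable", quote: "outperforms ...", close_criterion: "label matches the table" },
  { id: "i2", verdict: "valid-fixable", quote: "we exclude sequences" },
  { id: "i3", quote: "" },
];
console.log(ruleViolations(rows));
```

An empty result would mean the rows satisfy these three conditions only; the other hard rules (isolation, zero leakage, sign-off) are not expressible as a row-level check of this kind.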