Commit Graph

3968 Commits

Author SHA1 Message Date
David Baity
c5ec53c8e0 benchmark: run B — claude-sonnet-4.6
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-12 13:07:47 -04:00
PAE
6b2c4c0f26 [deliverable] 56ff71f9-710e-47cc-a7ec-64f480757559_01.md 2026-03-12 17:06:58 +00:00
David Baity
4c65db1f90 benchmark: run A — claude-opus-4.6
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-12 13:03:52 -04:00
PAE
2cd7205ff4 [deliverable] 653f4c62-6dc6-407f-bdc2-1fea27c18d51_01.md 2026-03-12 16:50:50 +00:00
PAE
53bca98ad9 [deliverable] 56ff71f9-710e-47cc-a7ec-64f480757559_01.md 2026-03-12 16:50:25 +00:00
PAE
cd40433e88 [deliverable] 9180636b-dc73-4166-8d93-e77f40e9ef41_01.md 2026-03-12 16:42:54 +00:00
PAE
c4fcdacc83 [deliverable] 56ff71f9-710e-47cc-a7ec-64f480757559_01.md 2026-03-12 15:50:59 +00:00
PAE
93d7792ede [deliverable] 653f4c62-6dc6-407f-bdc2-1fea27c18d51_01.md 2026-03-12 15:50:57 +00:00
PAE
43c45d4599 [deliverable] 9180636b-dc73-4166-8d93-e77f40e9ef41_01.md 2026-03-12 15:50:31 +00:00
David Baity
1e2cb6e875 enforce word count in chapter_polish and book_chapter templates
- chapter_polish: add explicit 'chapter_target_words minimum' warning to think hint
  with instruction to EXPAND scenes to reach target length
- chapter_polish: add word_count criterion (30%) to adjudication, restructure weights
- book_chapter PASS 1: strengthen word count instruction with explicit stop-early warning
- book_chapter: add word_count criterion (30%) to adjudication, restructure weights

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-12 11:38:16 -04:00
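Illustration for commit 1e2cb6e875: a rough sketch of how a 30% word_count criterion could feed into a weighted adjudication score. Only the 30% weight comes from the commit message. The other criterion names and weights, the 0-100 scale, the proportional scoring rule, and both function names are hypothetical.

```python
# Hypothetical weights: only word_count = 0.30 is stated in the commit message.
WEIGHTS = {
    "word_count": 0.30,
    "prose_quality": 0.40,  # assumed criterion and weight
    "continuity": 0.30,     # assumed criterion and weight
}


def word_count_score(actual_words: int, target_words: int) -> float:
    """Score 0-100 that reaches 100 only once the target is met (assumed rule)."""
    if target_words <= 0:
        raise ValueError("target_words must be positive")
    return min(actual_words / target_words, 1.0) * 100.0


def weighted_score(scores: dict) -> float:
    """Combine per-criterion scores (each 0-100) using WEIGHTS."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


scores = {
    "word_count": word_count_score(1500, 3000),  # chapter at half the target
    "prose_quality": 90.0,
    "continuity": 90.0,
}
print(weighted_score(scores))
```

Under these assumed weights, a chapter that stops at half the target length loses 15 points from word count alone, which is the kind of penalty the commit appears to be aiming for.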
PAE
18b4689947 [deliverable] 56ff71f9-710e-47cc-a7ec-64f480757559_01.md 2026-03-12 15:23:54 +00:00
PAE
790370be31 [deliverable] 653f4c62-6dc6-407f-bdc2-1fea27c18d51_01.md 2026-03-12 15:21:56 +00:00
PAE
578366d072 [deliverable] 9180636b-dc73-4166-8d93-e77f40e9ef41_01.md 2026-03-12 15:21:06 +00:00
David Baity
ff38fff631 refactor: move all project folders into projects/ subdirectory
This change reorganizes the repository structure to keep the root directory
clean. All 15 project folders are now nested under projects/, alongside
infrastructure directories (agents/, templates/, deliverables/, rag/, skills/).

This allows the repository to grow without polluting the core service directories.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-12 11:09:34 -04:00
David Baity
db06dce05d feat: wire skills guides into templates, deduplicate Iris RAG
Skills guides wired (all were dead code — no templates declared skills:):
- book_chapter.yml: YAFictionGuide + RomanceFictionGuide + SciFiFictionGuide
- chapter_review.yml: same (Devon, Lane, Cora reviewers now have genre context)
- chapter_roundtable.yml: same (debate participants use genre craft knowledge)
- chapter_polish.yml: same (Iris polishes with full genre guide in context)
- short_story.yml: same
- blog_write.yml: BlogWritingGuide
- recipe_develop.yml: RecipeWritingGuide

All templates updated to include 'skills' in sections list so guides
are injected as SKILLS & GUIDES block in the prompt.

Iris RAG deduplication:
- agents/iris/rag/agent.rag.md: 15 near-identical entries -> 2 canonical
  Entry 1: Bible & Continuity Check requirement
  Entry 2: Editorial assignments (Devon/Lane/Cora with their roles)
  13 duplicates removed

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-12 09:39:05 -04:00
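Illustration for commit db06dce05d: one possible way to collapse near-identical RAG entries into canonical ones, using the standard-library difflib module. The commit message does not say how the deduplication was actually done (it may have been manual); the function name, the 0.9 similarity threshold, and the sample entries are assumptions.

```python
from difflib import SequenceMatcher
from typing import List


def dedupe_entries(entries: List[str], threshold: float = 0.9) -> List[str]:
    """Keep an entry only if it is not near-identical to one already kept."""
    kept: List[str] = []
    for entry in entries:
        is_duplicate = any(
            SequenceMatcher(None, entry, existing).ratio() >= threshold
            for existing in kept
        )
        if not is_duplicate:
            kept.append(entry)
    return kept


# Hypothetical sample entries, not the real contents of agent.rag.md.
entries = [
    "Always run the Bible & Continuity Check.",
    "Always run the Bible & Continuity Check!",
    "Editorial assignments: Devon, Lane, and Cora review each chapter.",
    "Always run the Bible & Continuity Check.",
]
canonical = dedupe_entries(entries)
print(len(canonical))
```

The first occurrence of each cluster is treated as canonical here; a real cleanup might instead pick the longest or most recent entry.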
David Baity
acccb65af7 fix: roundtable early exit, iteration cap, and ghost-agent prevention
chapter_roundtable.yml:
- Reduce max_iterations 9 → 5 (3 rounds of 3 editors is enough; 9 was
  burning credits in a retry loop after credit exhaustion)
- Add explicit 'Once any participant outputs CONSENSUS REACHED, the
  debate is over' — prevents continuation into wasted rounds

planning.yml:
- Add ANTI-HALLUCINATION RULE FOR AGENTS block: explicitly names the
  known ghost agents (Worldbuilder, Prose Engine, Plot Architect, etc.)
  and forbids their use; maps task types to canonical CLP agents so
  planning LLM has unambiguous fallback assignments

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-12 09:19:46 -04:00
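Illustration for commit acccb65af7: a minimal sketch of a roundtable loop with an iteration cap and an early exit once any participant outputs CONSENSUS REACHED. The commit enforces this through prompt wording in chapter_roundtable.yml; whether the orchestrator also checks for the marker in code is not stated, so run_roundtable, the round-robin turn order, and the speaker functions are hypothetical.

```python
from typing import Callable, List

CONSENSUS_MARKER = "CONSENSUS REACHED"

# A participant takes the transcript so far and returns its next message.
Participant = Callable[[List[str]], str]


def run_roundtable(participants: List[Participant], max_iterations: int = 5) -> List[str]:
    """Take turns round-robin; stop at the cap or at the first consensus marker."""
    transcript: List[str] = []
    for turn in range(max_iterations):
        speaker = participants[turn % len(participants)]
        message = speaker(transcript)
        transcript.append(message)
        if CONSENSUS_MARKER in message:
            break  # debate is over; no further turns are spent
    return transcript


# Hypothetical stand-ins for the three editors.
def devon(transcript: List[str]) -> str:
    return "Devon: the midpoint scene drags."


def lane(transcript: List[str]) -> str:
    return "Lane: agreed, trim it. CONSENSUS REACHED"


def cora(transcript: List[str]) -> str:
    return "Cora: nothing to add."


transcript = run_roundtable([devon, lane, cora])
print(len(transcript))
```

In this run Cora never speaks, because Lane's message ends the debate on the second turn.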
PAE
6c93574948 [deliverable] 9180636b-dc73-4166-8d93-e77f40e9ef41_01.md 2026-03-12 13:11:42 +00:00
PAE
3a1b072ddc [deliverable] 56ff71f9-710e-47cc-a7ec-64f480757559_01.md 2026-03-12 13:10:34 +00:00
PAE
a14ba61feb [deliverable] 653f4c62-6dc6-407f-bdc2-1fea27c18d51_01.md 2026-03-12 13:10:31 +00:00
David Baity
4c9222960d fix: template prompt bloat and variable substitution failures
- chapter_polish: remove sections:deliverables — chapter text already in
  {chapter_text}; this caused 150KB+ prompts for late chapters (40MB logs)
- chapter_roundtable: require structured CONSENSUS REACHED block so
  key_changes is always formatted as an extractable string; change
  key_changes schema from list to string to match
- book_chapter: remove sections:history to reduce context; restructure
  Pass 0 to plan-only (no prose output) so the chapter is only written
  once in Pass 1 instead of twice; add explicit instruction in package
  hint to copy full chapter_text into spawn context
- short_story: remove sections:history and sections:deliverables (standalone
  task, needs neither); restructure Pass 0 to plan-only, Pass 1 to write;
  add note to handle literal {genre_name} placeholders gracefully
- recipe_develop, ai_article_write, blog_write: remove sections:history
  (these standalone tasks do not need full project conversation history;
  deliverables kept so they can read the research/plan file)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-12 09:05:45 -04:00
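Illustration for commit 4c9222960d: a sketch of how a template variable such as {genre_name} can survive as a literal placeholder when no value is supplied, using Python's str.format_map. The project's actual templating mechanism is not shown in the log; render_template and KeepMissing are hypothetical names, and the template text is made up.

```python
class KeepMissing(dict):
    """Mapping that leaves unknown placeholders in the text unchanged."""

    def __missing__(self, key: str) -> str:
        return "{" + key + "}"


def render_template(template: str, values: dict) -> str:
    """Substitute known variables; unknown ones stay as literal {name}."""
    return template.format_map(KeepMissing(values))


prompt = render_template(
    "Write a {genre_name} short story about {topic}.",
    {"topic": "a lighthouse keeper"},  # genre_name intentionally missing
)
print(prompt)
```

A renderer like this avoids a KeyError but passes the raw placeholder to the model, which is why a prompt-level note on handling literal {genre_name} text can still be useful.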
PAE
5c1b1846fb [deliverable] 19f076ce-76a0-40da-915a-c1f2be1f1ff4_01.md 2026-03-12 09:34:13 +00:00
PAE
8328e13e2b [deliverable] 33fc675d-2bf2-40cc-8bbc-3348b600c976_01.md 2026-03-12 09:33:53 +00:00
PAE
21a6bc723f [deliverable] 3db67af8-1158-4055-8ae4-11835f10b0dc_01.md 2026-03-12 09:33:49 +00:00
PAE
f918ca856a [deliverable] f9551c95-a92c-488a-8896-7759268739ae_01.md 2026-03-12 09:33:17 +00:00
PAE
543f33a234 [deliverable] c3697679-8ace-4309-b177-c4c0d722afef_01.md 2026-03-12 09:33:15 +00:00
PAE
25129cd836 [deliverable] d73a0f51-0682-46e4-be98-1ffa13d6ac40_01.md 2026-03-12 09:32:55 +00:00
PAE
b8af72f8e5 [deliverable] 806f770b-16af-44e1-a450-d692203f4464_01.md 2026-03-12 09:32:46 +00:00
PAE
263a89e0b0 [deliverable] fcb57d00-1c74-462c-8fdc-d6709d7899d5_01.md 2026-03-12 09:32:36 +00:00
PAE
33fb53f84d [deliverable] 341fc5ca-897b-4cc4-a49e-aea0901323dd_01.md 2026-03-12 09:32:34 +00:00
PAE
d81808b670 [deliverable] 744b567c-f651-476e-a0ae-459fc77a1995_01.md 2026-03-12 09:32:00 +00:00
PAE
9ccbf70afc [deliverable] a36dbf16-b6d9-4dc6-94c8-e92e977006fe_01.md 2026-03-12 09:31:59 +00:00
PAE
f40c941edf [deliverable] 1ed156b5-03c7-4f4e-98b5-0bb35d7ac8bd_01.md 2026-03-12 09:31:51 +00:00
PAE
b5a3084092 [deliverable] 6dfe460d-a9de-4759-b8e2-63a5e1a7333c_01.md 2026-03-12 09:30:29 +00:00
PAE
d75c726f74 Adjudication: Task 58e35e21-c73d-4e1f-9d5c-a65ebd6340d7 2026-03-12 09:27:09 +00:00
PAE
7d61103301 [deliverable] review-ch-09-lane.md 2026-03-12 09:25:51 +00:00
PAE
7644087b46 [deliverable] review-ch-06-devon.md 2026-03-12 09:24:39 +00:00
PAE
960b145dcd Adjudication: Task 30b12723-ab6c-4742-8473-dd9b78f76b1e 2026-03-12 09:20:21 +00:00
PAE
ff4c7d4c5d Adjudication: Task dd7464eb-b0ba-4ead-a2bc-13d10557438d 2026-03-12 09:19:56 +00:00
PAE
5353a72352 [deliverable] review-ch-05-cora.md 2026-03-12 09:19:48 +00:00
PAE
2a5c5d4ce8 Adjudication: Task 021bb74c-4fc6-445b-9a46-c473fd6b045b 2026-03-12 09:18:21 +00:00
PAE
0b02289381 [deliverable] review-ch-04-lane.md 2026-03-12 09:16:06 +00:00
PAE
dab542b761 [deliverable] review-ch-09-cora.md 2026-03-12 09:13:56 +00:00
PAE
bc1d690b26 [deliverable] review-ch-10-agent-slug.md 2026-03-12 09:12:16 +00:00
PAE
416c33e7b1 [deliverable] review-ch-09-agent-slug.md 2026-03-12 09:11:45 +00:00
David Baity
09a0abe890 fix: chapter_review adjudication threshold 65→60, better criteria descriptions
- pass_threshold: 65→60 (reviews scoring 58-64 are high quality but barely
  missing the bar; editorial reviews are inherently subjective)
- deliverable_type: coordination→editorial_review (correct semantic type)
- Improved criteria descriptions to clearly signal this is an editorial
  feedback document, not a task completion report

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-12 05:11:41 -04:00
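Illustration for commit 09a0abe890: a small sketch of the threshold change, assuming a review passes when its score is greater than or equal to pass_threshold. The inclusive comparison and the function name are assumptions; the log gives only the two threshold values and the 58-64 score range.

```python
def passes(score: float, pass_threshold: float) -> bool:
    """Assumed rule: a review passes when score >= pass_threshold."""
    return score >= pass_threshold


OLD_THRESHOLD = 65
NEW_THRESHOLD = 60

for score in (58, 60, 64, 65):
    print(score, passes(score, OLD_THRESHOLD), passes(score, NEW_THRESHOLD))
```

Under this assumption, scores of 60 through 64 change from fail to pass, while 58 and 59 still fall below the new bar.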
PAE
63bed7b776 [deliverable] review-ch-05-agent-slug.md 2026-03-12 09:10:30 +00:00
PAE
d4ec009377 [deliverable] review-ch-06-agent-slug.md 2026-03-12 09:09:42 +00:00
David Baity
8dfb1bb140 fix: roundtable max_iterations 3→9 for 3-agent, 3-round deliberation
With 3 agents and max_iterations=3, effective_max=max(3,3)=3, meaning
each agent speaks exactly once per roundtable — no back-and-forth.
Adjudicator correctly rejected these as incomplete (score=0-58 on
completeness/consensus_clarity/actionability).

9 iterations = 3 full rounds for 3 agents, allowing genuine debate:
Round 1: Initial assessments | Round 2: Responses | Round 3: Consensus

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-12 05:08:52 -04:00
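Illustration for commit 8dfb1bb140: the iteration arithmetic from the message, sketched under the assumption that effective_max is the larger of max_iterations and the number of agents (inferred from "effective_max=max(3,3)=3") and that agents speak in strict rotation. The function names are hypothetical.

```python
def effective_max(max_iterations: int, num_agents: int) -> int:
    """Assumed rule: never allow fewer turns than there are agents."""
    return max(max_iterations, num_agents)


def full_rounds(max_iterations: int, num_agents: int) -> int:
    """Number of complete rounds in which every agent speaks once."""
    return effective_max(max_iterations, num_agents) // num_agents


for cap in (3, 9):
    print(cap, effective_max(cap, 3), full_rounds(cap, 3))
```

With three agents, a cap of 3 gives one full round and a cap of 9 gives three. By the same arithmetic, the later reduction to 5 in acccb65af7 gives one full round plus two extra turns.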
PAE
9f810974dd [deliverable] review-ch-10-agent-slug.md 2026-03-12 09:07:50 +00:00
PAE
c8e7f2e74c [deliverable] review-ch-07-agent-slug.md 2026-03-12 09:03:14 +00:00