feat(sprint83b): research paper quality for business plans

- Rewrite company_proposal.yml: 5-source research pipeline, research_synthesis
  step, anti-refusal agent_prompt, remove rag section, fix input_from bug,
  fix source_step by name, add bibliography structure
- Fix strategic_review.yml proposal_brief: remove 'incubation' anchor word,
  explicit CLO-first instruction to stop wrong-company hallucination

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
David Baity
2026-04-29 13:04:24 -04:00
parent a2fac827bf
commit 3cdde2d5ab
2 changed files with 291 additions and 216 deletions

View File

@@ -1,13 +1,29 @@
name: company_proposal name: company_proposal
description: "Produce a complete, board-ready business plan for a proposed new Crimson Leaf subsidiary. Includes web research with cited sources." description: "Produce a complete, board-ready business plan for a proposed new Crimson Leaf subsidiary. Includes 5-source web research with inline citations and a structured bibliography."
debug: true debug: true
model: power model: power
system: agent_prompt
agent_prompt: |
You are Edgar Chen, CEO of Crimson Leaf Holdings. You are writing a formal business plan.
NON-NEGOTIABLE RULES FOR THIS TASK:
1. WRITE THE DOCUMENT. Do NOT ask for clarification. Do NOT produce "holding frames."
Do NOT say "I need more information." You have the brief -- that is enough.
2. USE THE TASK MESSAGE. The company name, slug, and purpose are in the task message.
Use them EXACTLY. Do NOT rename, rebrand, or creatively reinterpret the company.
3. CITE EVERYTHING. Every factual claim MUST have a [Title](URL) citation from the
research data provided. No invented statistics. No uncited claims.
4. IF RESEARCH IS EMPTY for a section, write: "No web data found for this area.
Structural analysis: [your reasoning]" -- then continue. Never refuse.
5. COMPLETE THE DOCUMENT. All required sections must be present. No truncation.
sections: sections:
- agent - agent
- project - project
- rag
- message - message
- instructions - instructions
steps: steps:
- type: tool - type: tool
action: git_read_file action: git_read_file
@@ -17,17 +33,17 @@ steps:
optional: true optional: true
- type: think - type: think
max_tokens: 200 max_tokens: 150
output_key: research_query_1 output_key: research_query_1
hint: | hint: |
You are preparing to write a business plan. The task message above describes a specific company to propose.
The brief from the strategic review is in the task message above. Read it carefully. Extract the company type and purpose.
Generate ONE precise web search query to find market size and growth rate data Write ONE precise search query to find: market size and growth rate data
for this type of business. for this type of business or capability.
Output ONLY the single search query. No numbering. No explanation. No extra text. Output ONLY the search query. Nothing else. No numbering. No explanation.
Example: AI operations cost tracking software market size 2025 Example: "AI cost tracking software market size 2025 growth rate"
- type: tool - type: tool
capability: Tool_WebSearcher capability: Tool_WebSearcher
@@ -36,204 +52,273 @@ steps:
max_results: 6 max_results: 6
fetch_pages: 3 fetch_pages: 3
optional: true optional: true
output_key: market_research_1 output_key: research_1
- type: think - type: think
max_tokens: 600 max_tokens: 150
output_key: research_query_2 output_key: research_query_2
hint: | hint: |
Your previous research covered market size and competitors. The task message describes a specific company type.
Now formulate ONE focused search query to find: Write ONE search query to find: revenue models, pricing benchmarks,
- Revenue models and monetization strategies in this exact niche and monetization strategies for this exact type of business.
- Pricing benchmarks or average deal sizes
- Customer pain points or unmet needs documented online
Output ONLY the search query. Nothing else. Output ONLY the search query. Nothing else.
Example: "AI API cost management SaaS pricing models revenue"
- type: tool - type: tool
capability: Tool_WebSearcher capability: Tool_WebSearcher
input_from: last_text input_from: research_query_2
mode: research
max_results: 5
fetch_pages: 3
optional: true
output_key: research_2
- type: think
max_tokens: 150
output_key: research_query_3
hint: |
The task message describes a specific company type.
Write ONE search query to find: existing companies and competitors
in this space -- who already does this, their product names, and pricing.
Output ONLY the search query. Nothing else.
Example: "AI spending analytics platforms competitors 2025 enterprise"
- type: tool
capability: Tool_WebSearcher
input_from: research_query_3
mode: research
max_results: 5
fetch_pages: 3
optional: true
output_key: research_3
- type: think
max_tokens: 150
output_key: research_query_4
hint: |
The task message describes a specific company type.
Write ONE search query to find: case studies, success stories, or
proof that this type of business can work -- examples of companies
that built this and what outcomes they achieved.
Output ONLY the search query. Nothing else.
Example: "AI API cost visibility ROI case study implementation success"
- type: tool
capability: Tool_WebSearcher
input_from: research_query_4
mode: research mode: research
max_results: 5 max_results: 5
fetch_pages: 2 fetch_pages: 2
optional: true optional: true
output_key: market_research_2 output_key: research_4
- type: think - type: think
max_tokens: 2000 max_tokens: 150
output_key: exec_summary output_key: research_query_5
hint: | hint: |
You are writing a professional business plan for a proposed subsidiary. The task message describes a specific company type.
The brief from the strategic review is in the task message above. Write ONE search query to find: technology requirements, tools, or
regulatory context that would affect building this type of company --
what software, APIs, or compliance considerations apply.
IMPORTANT: Use the company name, slug, and purpose EXACTLY as stated in Output ONLY the search query. Nothing else.
the brief. Do NOT rename, rebrand, or creatively reinterpret the company. Example: "AI spend tracking tools technology stack requirements 2025"
The brief was written from the mission charter -- it is authoritative.
=== PRIOR PROPOSALS (read this before writing) === - type: tool
capability: Tool_WebSearcher
input_from: research_query_5
mode: research
max_results: 5
fetch_pages: 2
optional: true
output_key: research_5
- type: think
max_tokens: 4000
output_key: research_synthesis
hint: |
You have completed 5 web searches. Compile ALL findings into a structured
research synthesis. This will be cited throughout the business plan.
=== SEARCH 1: Market Size and Growth ===
{research_1}
=== END SEARCH 1 ===
=== SEARCH 2: Revenue Models and Pricing ===
{research_2}
=== END SEARCH 2 ===
=== SEARCH 3: Competitors and Existing Players ===
{research_3}
=== END SEARCH 3 ===
=== SEARCH 4: Case Studies and Success Stories ===
{research_4}
=== END SEARCH 4 ===
=== SEARCH 5: Technology and Regulatory Context ===
{research_5}
=== END SEARCH 5 ===
=== PRIOR PROPOSALS (do NOT repeat these) ===
{prior_proposals} {prior_proposals}
=== END PRIOR PROPOSALS === === END PRIOR PROPOSALS ===
The prior proposals above are your institutional memory. Study them. Produce a RESEARCH SYNTHESIS with this exact structure:
Do NOT repeat a company that was already proposed.
=== MARKET RESEARCH (primary - market size/growth) === ## Research Synthesis
{market_research_1}
=== END RESEARCH ===
=== MARKET RESEARCH (revenue/monetization) === ### Key Statistics
{market_research_2} List 5-10 specific data points found across all searches.
=== END RESEARCH === Format: - [STAT]: [value] -- Source: [Title](URL)
If a search returned nothing, note "No data found" for that category.
Write the EXECUTIVE SUMMARY section. Cite specific research findings inline ### Competitor Landscape
using [Title](URL) markdown format wherever you reference external data. List all named companies/products found in search 3.
Format: - [Company/Product]: [what they do] | [pricing if found] | [weakness if mentioned]
Cite each with [Title](URL).
Include: ### Case Studies Found
List any success stories or ROI examples from search 4.
If none: "No case studies found -- structural feasibility analysis follows in risk section."
### Technology Findings
Key tools, APIs, or requirements from search 5.
### Complete Source List
Number every URL found across all 5 searches.
[1] [Title](URL) -- what data this source provided
[2] [Title](URL) -- what data this source provided
(continue for all sources)
- type: think
max_tokens: 3000
output_key: exec_summary
hint: |
Write the EXECUTIVE SUMMARY section NOW.
Do NOT ask for clarification. Do NOT add preambles.
The company to propose is in the task message above -- use it EXACTLY.
=== RESEARCH SYNTHESIS ===
{research_synthesis}
=== END ===
1. PROPOSED COMPANY 1. PROPOSED COMPANY
- Full name and slug (EXACTLY as given in the brief) - Full name and slug EXACTLY from the task message
- One-sentence purpose statement - One-sentence purpose
- Which existing gap it closes - Which gap it closes
2. PROBLEM STATEMENT 2. PROBLEM STATEMENT
What is Crimson Leaf Holdings unable to do today without this company? What can Crimson Leaf NOT do today without this company? Be specific.
Be specific. Give examples of decisions we cannot make.
3. MARKET OPPORTUNITY 3. MARKET OPPORTUNITY
Use data from the research above. Cite every statistic. Cite every statistic with [Title](URL) from the research synthesis.
What is the addressable market? What is the growth trajectory? If no market data was found, say so and use structural analysis.
What do competitors charge? What are they missing?
4. PROPOSED SOLUTION 4. PROPOSED SOLUTION
How does this company close the gap? How does this close the gap? First 30 days, first 90 days.
What would it do in the first 30 days? First 90 days?
5. STRATEGIC FIT 5. STRATEGIC FIT
How does this company advance the primary mission (profitable AI publishing)? How does this advance the primary mission (profitable AI publishing)?
How does it interact with existing subsidiaries?
Write in professional business plan style. No fluff. No filler.
Every factual claim should be backed by a citation from the research.
- type: think
max_tokens: 1500
output_key: cost_model
hint: |
Write the COST MODEL AND FINANCIAL PROJECTIONS section.
Reference any pricing or market data from the research to ground your projections.
Include:
1. SETUP COSTS
- Gitea repo creation (one-time, zero API cost, ~5 minutes of Copilot time)
- Initial template development (estimate tasks * cost per task)
- Initial agent configuration
2. RECURRING OPERATIONAL COSTS
- Estimated tasks per week at steady state
- Average cost per task (power model: ~$0.05-0.15 per task typical range)
- Weekly and monthly API cost projection
3. COST-BENEFIT ANALYSIS
- What is the cost of NOT having this company? (lost revenue, blind decisions, etc.)
- What is the break-even point?
- Cite any relevant market benchmarks from the research above.
4. BUDGET CONSTRAINT CHECK
- Current Genesis Fund status: UNKNOWN (this is itself evidence for CLO need)
- Does this proposal create a self-funding loop? How?
Use real numbers from research where available. Estimate clearly where not.
- type: think
max_tokens: 1200
output_key: risk_analysis
hint: |
Write the RISK ANALYSIS AND ALTERNATIVES CONSIDERED section.
Include:
1. RISKS OF PROCEEDING
- Could this company create scope creep or distraction?
- Could it duplicate work already handled by an existing subsidiary?
Rate each risk: Low / Medium / High
2. RISKS OF NOT PROCEEDING
- What keeps getting worse if we wait another 30 days?
- Are there compounding risks?
Rate each risk: Low / Medium / High
3. COMPETITIVE RISK
Reference the competitor research above. Which competitors are already in this space?
What is their moat? Can we out-specialize them? Cite with [Title](URL).
4. ALTERNATIVES CONSIDERED
For each alternative, explain why it was rejected:
A. Solve with a new template in an existing company
B. Solve with a one-time manual report
C. Expand an existing subsidiary's charter
D. Wait -- this is not urgent enough yet
5. RECOMMENDATION
Given the above, do you still recommend proceeding?
If yes, state the minimum viable version.
- type: think - type: think
max_tokens: 2000 max_tokens: 2000
output_key: company_spec output_key: cost_model
hint: | hint: |
Write the PROPOSED COMPANY SPECIFICATION section. Write the COST MODEL AND FINANCIAL PROJECTIONS section NOW.
This is the technical blueprint David and Copilot will use to build the company. Cite from the research synthesis where available.
Include: === RESEARCH SYNTHESIS ===
{research_synthesis}
=== END ===
1. COMPANY RECORD 1. SETUP COSTS
- company_id: (leave as TBD -- David assigns) - Gitea repo creation (one-time, zero API cost)
- name: (full display name) - Template development estimate
- slug: (lowercase, underscores, e.g. crimson_leaf_operations) - Agent configuration
- parent_company: crimson_leaf
- mission: (one sentence)
- tagline: (marketing one-liner)
- type: (operations / research / production / marketing)
- status: active
2. PROPOSED AGENTS 2. RECURRING OPERATIONAL COSTS
For each agent: - Tasks per week at steady state
- Role title and name - Average cost per task (power model: ~$0.05-0.15 typical)
- Personality and focus in 2-3 sentences - Weekly and monthly API cost projection
- Primary responsibilities (bullet list)
- Model recommendation: power / standard / fast
- supported_templates list
3. PROPOSED TEMPLATES (MVP set) 3. COST-BENEFIT ANALYSIS
For each template: - Cost of NOT having this company?
- Template name and purpose - Break-even point?
- Key steps (2-5 sentences describing the pipeline) - Cite pricing benchmarks with [Title](URL) if found.
- Trigger: scheduled / on-demand / spawned-by
- Estimated cost per run
4. SCHEDULE 4. BUDGET CONSTRAINT CHECK
- What scheduled tasks should run? At what frequency? - Does this create a self-funding loop?
- What should run on demand vs on a schedule?
5. 90-DAY SUCCESS CRITERIA
List 3-5 measurable outcomes that would confirm this company is delivering value.
Each criterion must be verifiable without subjective judgment.
6. DEPENDENCIES
- What must exist before this company can operate?
- Any data feeds, integrations, or other companies required?
- type: think - type: think
max_tokens: 7000 max_tokens: 1500
output_key: risk_analysis
hint: |
Write the RISK ANALYSIS AND ALTERNATIVES CONSIDERED section NOW.
=== RESEARCH SYNTHESIS (competitor data) ===
{research_synthesis}
=== END ===
1. RISKS OF PROCEEDING -- rate each: Low / Medium / High
2. RISKS OF NOT PROCEEDING -- what gets worse? Rate each.
3. COMPETITIVE RISK
Use competitor data from the synthesis. Cite with [Title](URL).
4. ALTERNATIVES CONSIDERED
A. New template in existing company -- why rejected?
B. One-time manual report -- why rejected?
C. Expand existing subsidiary -- why rejected?
D. Wait -- why rejected?
5. RECOMMENDATION
Proceed? State the minimum viable version.
- type: think
max_tokens: 2500
output_key: company_spec
hint: |
Write the PROPOSED COMPANY SPECIFICATION NOW.
Use the EXACT company name and slug from the task message.
1. COMPANY RECORD
company_id: TBD (David assigns)
name: (from task message)
slug: (from task message)
parent_company: crimson_leaf
mission: (one sentence)
tagline: (one-liner)
type: (operations/research/production/marketing)
status: active
2. PROPOSED AGENTS
For each: role title, name, 2-3 sentence personality, responsibilities,
model recommendation, supported_templates list.
3. PROPOSED TEMPLATES (MVP set)
For each: name, purpose, key steps, trigger, estimated cost per run.
4. SCHEDULE -- what runs on what frequency?
5. 90-DAY SUCCESS CRITERIA
3-5 measurable outcomes verifiable without subjective judgment.
6. DEPENDENCIES -- what must exist before this company can operate?
- type: think
max_tokens: 9000
output_key: full_proposal output_key: full_proposal
hint: | hint: |
Assemble the complete business plan from all sections already written. Assemble the complete business plan NOW.
Format as a professional document David will read and approve or reject. Do NOT truncate any section. Do NOT add preamble notices.
Use the company name EXACTLY from the task message.
REQUIRED STRUCTURE: # Proposal: [Company Full Name from task message]
# Proposal: [Company Full Name]
Submitted by: Edgar Chen, CEO, Crimson Leaf Holdings Submitted by: Edgar Chen, CEO, Crimson Leaf Holdings
Task ID: {task.id} Task ID: {task.id}
Status: AWAITING DAVID'S APPROVAL Status: AWAITING DAVID'S APPROVAL
@@ -245,9 +330,9 @@ steps:
--- ---
## Market Research Sources ## Research Sources
List all sources cited in this document as a reference section: (Paste the "Complete Source List" from the research synthesis)
- [Title](URL) -- brief note on what data was drawn from this source {research_synthesis}
--- ---
@@ -267,20 +352,18 @@ steps:
--- ---
## Signature Block ## Signature Block
Edgar Chen certifies this proposal meets the governance requirements of the Edgar Chen certifies this proposal meets Crimson Leaf Holdings governance requirements:
Crimson Leaf Holdings charter:
- No existing subsidiary duplicates this charter - No existing subsidiary duplicates this charter
- No existing template or tool can solve this gap - No existing template or tool can solve this gap
- No proposal for this company has been submitted in the last 30 days - No proposal for this company has been submitted in the last 30 days
- A full business plan with market research is provided - A full business plan with 5-source web research and inline citations is provided
This proposal requires David Baity's explicit approval before any action is taken. This proposal requires David Baity's explicit approval before any action is taken.
No company will be created until approval is received.
Output ONLY the document content. Start with the # Proposal heading. Output ONLY the document. Start with the # Proposal heading.
- type: document - type: document
source_step: 9 source_step: full_proposal
dest_path: "deliverables/proposals/proposal-{task.id}.md" dest_path: "deliverables/proposals/proposal-{task.id}.md"
commit_msg: "proposal: company_proposal task={task.id}" commit_msg: "proposal: company_proposal task={task.id}"
@@ -290,20 +373,20 @@ steps:
hint: | hint: |
You just wrote a company proposal. Update the proposal index. You just wrote a company proposal. Update the proposal index.
Read the current index content: Current index content:
{prior_proposals} {prior_proposals}
Append a new entry at the bottom of the "## Submitted Proposals" section. Append a new entry at the bottom of the "Submitted Proposals" section.
Use this exact format: Use this EXACT format (plain ASCII only -- no emojis, no Unicode):
### [Company Full Name] -- Task {task.id} ### [Company Full Name] -- Task {task.id}
Date: 2026-04-29 Date: 2026-04-29
Status: AWAITING DAVID'S APPROVAL Status: AWAITING DAVID'S APPROVAL
Summary: [2-3 sentences: what was proposed, what gap it fills, key differentiator Summary: [2-3 sentences: what was proposed, what gap it fills, how it
from any prior proposals] differs from any prior proposals]
Output the COMPLETE updated index file content from the very beginning. Output the COMPLETE updated index file from the very beginning.
Do not truncate. Do not skip sections. Include all prior entries plus the new one. Do not truncate. Include all prior entries plus the new one.
- type: document - type: document
source_step: index_update source_step: index_update
@@ -314,17 +397,14 @@ steps:
target: channel target: channel
channel_name: "crimson_leaf:general" channel_name: "crimson_leaf:general"
hint: | hint: |
Write a brief, professional message to David announcing that a company proposal Write a professional 4-6 line message to David announcing the proposal is ready.
is ready for his review.
Include: Include:
- Which company is being proposed (name and slug) - Which company is proposed (name and slug from task message)
- The core gap it addresses (one sentence) - The core gap it addresses (one sentence)
- Key market data found (one statistic with citation) - One key market statistic with citation (from research synthesis if available)
- Where to find the full proposal (Gitea: pae/crimson_leaf/deliverables/proposals/) - Where to find it: pae/crimson_leaf/deliverables/proposals/
- A clear request for approval or feedback - A clear request for approval or feedback
Plain ASCII only. No emojis. No Unicode characters.
Keep it to 4-6 lines. Professional tone. No fluff.
adjudication: adjudication:
enabled: true enabled: true
@@ -333,13 +413,13 @@ adjudication:
criteria: criteria:
market_research: market_research:
weight: 30 weight: 30
description: "Real web-sourced market data with [Title](URL) citations -- no invented statistics" description: "At least 5 real web-sourced data points with [Title](URL) inline citations -- no invented statistics"
completeness: completeness:
weight: 25 weight: 25
description: "All required sections present: exec summary, cost model, risk analysis, company spec" description: "All required sections present: exec summary, cost model, risk analysis, company spec, source list"
specificity: specificity:
weight: 25 weight: 25
description: "Specific company name/slug, concrete 90-day metrics, no vague filler" description: "Specific company name/slug from task message, concrete 90-day metrics, no vague filler"
no_duplicates: no_duplicates:
weight: 20 weight: 20
description: "Proposed company does not duplicate an existing subsidiary or prior proposal" description: "Proposed company does not duplicate an existing subsidiary or prior proposal"

View File

@@ -152,47 +152,42 @@ steps:
max_tokens: 800 max_tokens: 800
output_key: proposal_brief output_key: proposal_brief
hint: | hint: |
The proposal decision is: {proposal_project_slug} Gate: if {proposal_project_slug} is "none", output exactly: none
If {proposal_project_slug} is "none", output exactly: none If the gate passes, you are writing a task brief for Crimson Leaf Operations (CLO).
This is the company brief that will be sent to the company_proposal pipeline.
Do NOT use anything from the board debate above -- that was about execution, not proposals.
Do NOT rename the company. Do NOT propose anything else.
STOP. Do not read the board debate above. Do not use anything from the === MISSION CHARTER (authoritative source for company definition) ===
board debate in this brief. The board debate is about execution -- this
step is a separate, independent process.
Read ONLY the mission charter and prior proposals below.
=== MISSION CHARTER (authoritative) ===
{mission_data} {mission_data}
=== PRIOR PROPOSALS (all rejected -- do NOT repeat) === === PRIOR PROPOSALS (all rejected -- you must not repeat these) ===
{prior_proposals_index} {prior_proposals_index}
=== ===
The mission charter identifies GAP 1 as the top priority: THE COMPANY TO PROPOSE: Crimson Leaf Operations (CLO)
Company: Crimson Leaf Operations (CLO) This is a FIXED requirement -- not a suggestion. CLO is the only company
Slug: crimson_leaf_operations authorized as GAP 1 in the mission charter.
Mission: INTERNAL financial visibility for Crimson Leaf Holdings.
Edgar cannot see API costs or revenue. CLO fixes that.
This is NOT a client-facing company. NOT a storefront. NOT an author service.
It is an internal accounting and analytics subsidiary.
The prior proposals were ALL rejected because they proposed a client-facing CLO is:
indie author DTC storefront. That is the wrong company entirely. - An INTERNAL operations and accounting subsidiary
- Its purpose: give Edgar (and David) financial visibility into API costs,
revenue, and margins across all subsidiaries
- It generates weekly P&L reports, tracks cost-per-task, surfaces burn rate
- It does NOT interact with external clients, authors, or customers
- It is NOT a storefront, agency, or services business
Write the task brief for company_proposal. Include: Write the task brief. Include:
- Company: Crimson Leaf Operations (exact name, no renaming) - Company name: Crimson Leaf Operations
- Slug: crimson_leaf_operations - Slug: crimson_leaf_operations
- Gap: Edgar has zero financial visibility (cannot see spend, revenue, or margins) - The gap it fills: Edgar has zero financial visibility right now
- What CLO does: tracks API costs per task/company, generates weekly P&L, - What CLO does specifically: cost tracking, P&L, margin visibility
gives David real cost-per-chapter and margin data - Why now: 30 more days of blind spending
- Why now: every strategic decision is made blind; 30 more days means 30 more - What it is NOT (to prevent the proposal template from going off-rails)
days of unknown burn rate - 90-day success criteria: weekly P&L in #general, cost-per-chapter visible
- 90-day success criteria: weekly P&L report delivered to #general,
cost-per-chapter tracked and visible in portfolio report
- What it is NOT: not a client agency, not a storefront, not author services
Output ONLY the brief text. No labels. No preamble. Output ONLY the brief text. No labels. No preamble. No "Here is the brief:".
- type: tool - type: tool
action: enqueue_strategy action: enqueue_strategy