### Submitted Proposals

### Crimson Leaf -- Task ee0c11c4-33d0-49ae-a8e1-f9ab2c34e35b

Date: 2026-04-29

Status: AWAITING DAVID'S APPROVAL

Summary: This proposal outlines the development of the Foreman Probe, a standardized suite of model probe tasks designed to benchmark LLM reasoning and instruction-following. It fills the gap in internal evaluation by providing a controlled environment for performance stress-testing. Unlike previous iterations, this approach focuses on the Foreman's specific task-creation logic to ensure higher difficulty scaling.