docs: LLM-ready documentation suite — LLM_CONTEXT.md, module deep-dives, JSON schemas, CLI reference

Complete documentation system for LLM consumption: primary LLM_CONTEXT.md
(27KB system overview), 4 module deep-dives (composer, reaper-builder,
reaper-scripting, calibrator), 2 JSON Schema draft-07 contracts, CLI
reference, and README correction (FL Studio -> REAPER identity).
renato97
2026-05-04 10:30:24 -03:00
parent b08dcccca2
commit 7bcd8052a9
13 changed files with 2402 additions and 29 deletions


@@ -0,0 +1,68 @@
# Design: LLM-Ready Documentation Suite
## Technical Approach
Write 9 documentation files, 8 of them under `docs/` (currently empty), that ground any LLM in the real REAPER `.rpp` system. One primary file (`LLM_CONTEXT.md`, target ~35KB) serves as single-read grounding. Four module deep-dives, two JSON Schemas, and one CLI reference provide targeted detail. The ninth file, README.md, gets a truth-preserving rewrite: FL Studio → REAPER, `.flp` → `.rpp`, no MCP. Zero code changes: all content is derived from reading source files (`src/core/schema.py`, `src/reaper_builder/__init__.py`, `src/reaper_scripting/commands.py`, `src/calibrator/__init__.py`, `src/composer/*.py`, `scripts/*.py`).
LLM_CONTEXT.md sections follow a "what → how → where" flow: system identity first (REAPER target, `.rpp` format), then data model (all 11 dataclasses), then pipeline (compose → build → calibrate → validate → ReaScript), then module map, CLI reference, conventions, and extension guide. Each section self-contains the critical details so no multi-hop reading is required.
## Architecture Decisions
### Decision 1: LLM_CONTEXT.md — Flat Linear Sections
| Aspect | Flat Sections | Hierarchical (index + sub-pages) |
|--------|--------------|----------------------------------|
| LLM context window | One read → full knowledge | Multi-hop; LLM must follow links |
| Maintenance | Single file, coherent | Multiple files, drift risk |
| Size budget | ~35KB (within 40KB cap) | Index ~5KB, but requires cross-referencing |
| **Chosen** | ✅ | |
**Rationale**: LLMs process flat context windows efficiently. A self-contained 35KB file means the LLM reads once and understands architecture, data model, pipeline, module layout, and CLI entry points. Module deep-dives (`docs/modules/*.md`) are supplementary for targeted modifications, not prerequisites. Avoids the anti-pattern of forcing an LLM to chase links for basic system comprehension.
### Decision 2: JSON Schema — Hand-Written from Dataclass Source
| Aspect | Hand-Written | Auto-Generated (dataclasses-jsonschema) |
|--------|-------------|----------------------------------------|
| Exact draft-07 | Full control | Libraries target draft-2020-12 |
| Optional/Union types | Can match `str\|None` → `{"type": ["string","null"]}` exactly | May lose nullability nuance |
| Dependency | None (docs-only) | Adds pip dependency |
| Scale | 11 dataclasses — manageable | Overkill for this count |
| **Chosen** | ✅ | |
**Rationale**: 11 dataclasses total (`SongDefinition`, `SongMeta`, `TrackDef`, `ClipDef`, `MidiNote`, `SectionDef`, `PluginDef`, `PatternDef`, `ArrangementItemDef`, `CCEvent`, `ArrangementTrack`) are small enough for manual authoring. Manual writing ensures exact draft-07 compliance and correct handling of `Optional` types (`str | None`) and `field(default_factory=list)`. Auto-generation would add a tooling dependency for a documentation-only change and risk schema draft mismatches (spec requires draft-07, proposal mentions draft-2020-12; see Open Questions).
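To make the `Optional` and `default_factory` mapping concrete, here is a minimal sketch of a hand-written draft-07 fragment. `ExampleClip` and its field names are hypothetical stand-ins, not the real definitions in `src/core/schema.py`, and the choice to mark only default-less fields as required is an assumption.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


# Hypothetical stand-in for a dataclass in src/core/schema.py.
# Field names are illustrative assumptions only.
@dataclass
class ExampleClip:
    name: str
    color: Optional[str] = None  # same meaning as `str | None`
    notes: list = field(default_factory=list)


# Hand-written draft-07 schema mirroring ExampleClip.
EXAMPLE_CLIP_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "ExampleClip",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        # Optional[str] becomes a type array that also admits null.
        "color": {"type": ["string", "null"], "default": None},
        # field(default_factory=list) becomes an optional array.
        "notes": {"type": "array", "default": []},
    },
    # Assumption: only fields without a default are required.
    "required": ["name"],
    "additionalProperties": False,
}

if __name__ == "__main__":
    print(json.dumps(EXAMPLE_CLIP_SCHEMA, indent=2))
```

Writing the type array by hand keeps the nullability explicit, which is the nuance the comparison table flags as a risk for generators.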
## File Changes
| File | Action | Description |
|------|--------|-------------|
| `docs/LLM_CONTEXT.md` | Create | Primary entry point: architecture diagram, dataclasses, pipeline, module map, CLI, conventions, extension guide. Target 35KB. |
| `docs/CLI.md` | Create | Complete CLI reference: `scripts/compose.py` (--bpm, --key, --output, --seed, --emotion, --inversion, --no-calibrate), `scripts/generate.py` (--bpm, --key, --output, --seed, --emotion, --inversion, --validate), `scripts/run_in_reaper.py` (<rpp_path>, --output, --timeout, --plugins-config, --action) |
| `docs/modules/composer.md` | Create | Pattern generators (`patterns.py`, `rhythm.py`), chord engine (`chords.py`), melody engine (`melody_engine.py`), converters, templates, variation |
| `docs/modules/reaper-builder.md` | Create | `RPPBuilder` class, `PLUGIN_REGISTRY` (~150 entries), `ALIAS_MAP`, `PLUGIN_PRESETS`, preset transformer, `render.py` headless render |
| `docs/modules/reaper-scripting.md` | Create | `ReaScriptGenerator`, `ReaScriptCommand`, `ReaScriptResult`, command/result JSON contract, `ProtocolVersionError` |
| `docs/modules/calibrator.md` | Create | `Calibrator.apply()` post-processing, mix calibration presets (`calibrator/presets.py`) |
| `docs/schemas/song-definition.json` | Create | JSON Schema draft-07 for `SongDefinition` + all nested dataclasses |
| `docs/schemas/reascript-protocol.json` | Create | JSON Schema draft-07 for `ReaScriptCommand` and `ReaScriptResult` |
| `README.md` | Modify | Rewrite: REAPER `.rpp` target, real CLI scripts, link to `docs/LLM_CONTEXT.md`, remove FL Studio/MCP |
## Testing Strategy
| Layer | What to Test | Approach |
|-------|-------------|----------|
| Link integrity | All `[text](./path)` and `[text](#heading)` references | Shell script: extract all markdown links, verify each target exists on disk or as heading in destination file |
| Schema validity | `song-definition.json`, `reascript-protocol.json` | Validate each `.json` against JSON Schema meta-schema (draft-07) using `ajv` or Python `jsonschema`; validate a sample `SongDefinition.to_json()` output against `song-definition.json` |
| Field name accuracy | Dataclass field names in docs match `schema.py` | `grep` cross-check: for each field name in docs, verify exact match in source; run `grep -r "bpm\|velocity\|send_reverb" docs/` and diff against `schema.py` dataclass attributes |
| README truthiness | No FL Studio/MCP mentions | `grep -i "fl studio\|\.flp\|mcp" README.md` must return empty |
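The link-integrity row could be implemented along these lines. This is a sketch in Python rather than the shell script the table mentions, and the function names and the GitHub-style slug rule are assumptions about how headings are turned into anchors.

```python
import re
from pathlib import Path

LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
HEADING_RE = re.compile(r"^#{1,6}\s+(.*)$", re.MULTILINE)
SCHEME_RE = re.compile(r"^[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def slugify(heading: str) -> str:
    """Approximate GitHub-style anchors (assumption about the renderer)."""
    slug = re.sub(r"[^\w\s-]", "", heading.strip().lower())
    return re.sub(r"\s+", "-", slug)


def anchors(md_path: Path) -> set:
    """Collect the anchor slugs of every heading in a markdown file."""
    text = md_path.read_text(encoding="utf-8")
    return {slugify(h) for h in HEADING_RE.findall(text)}


def broken_links(docs_root: Path) -> list:
    """Return one message per relative link whose target does not exist."""
    problems = []
    for md in sorted(docs_root.rglob("*.md")):
        for target in LINK_RE.findall(md.read_text(encoding="utf-8")):
            if SCHEME_RE.match(target):  # skip http:, mailto:, etc.
                continue
            path_part, _, fragment = target.partition("#")
            dest = (md.parent / path_part).resolve() if path_part else md
            if not dest.exists():
                problems.append(f"{md}: missing file {target}")
            elif fragment and dest.suffix == ".md" and fragment not in anchors(dest):
                problems.append(f"{md}: missing heading {target}")
    return problems
```

Calling `broken_links(Path("docs"))` and failing when the list is non-empty would cover both the `./path` and `#heading` cases; links inside fenced code blocks are not excluded by this sketch.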
## Open Questions
- [ ] **Schema draft version**: Spec requires draft-07, proposal mentions draft-2020-12. Recommend draft-07 as it has wider LLM/tool support and matches spec requirement.
- [ ] **Module doc depth**: Should module docs include internal helper signatures (e.g., `_section_active`, `_get_kick_cache`) or only the public API surface (`RPPBuilder.write()`, `ChordEngine.progression()`)? Recommend public API only to avoid maintenance burden.
## Dependencies
None. All content derived from reading existing source files. No new packages, no code changes.
## Rollout
`git add docs/ README.md && git commit`. All changes additive to `docs/` + README.md rewrite. No code paths depend on docs. Zero-risk rollout.


@@ -0,0 +1,65 @@
# Proposal: LLM-Ready Documentation
## Intent
No LLM can read this codebase and understand it. README.md is wrong (claims FL Studio `.flp`, mentions non-existent MCP server). `.sdd/design.md` references dead `FLPBuilder`. `docs/` is empty. An LLM encountering this project today would be misled for 3+ rounds before discovering it targets REAPER `.rpp`, not FL Studio `.flp`. We need a single entry-point document that instantly grounds any LLM in the real architecture.
## Scope
### In Scope
- `docs/LLM_CONTEXT.md` — primary entry point: architecture, data model, pipeline, module map, CLI reference, naming conventions, how to extend
- `docs/CLI.md` — complete CLI reference (compose, generate, run_in_reaper)
- `docs/modules/composer.md` — composition engine: pattern generators, converters, melody engine
- `docs/modules/reaper-builder.md` — RPP format, `RPPBuilder`, plugin registry (~150 plugins), presets
- `docs/modules/reaper-scripting.md` — ReaScript generation protocol, command/result JSON bridge
- `docs/modules/calibrator.md` — mix calibration presets and post-processing pipeline
- `docs/schemas/song-definition.json` — JSON Schema for `SongDefinition` (SongMeta, TrackDef, ClipDef, MidiNote, SectionDef)
- `docs/schemas/reascript-protocol.json` — JSON Schema for command.json / result.json contract
- `README.md` — fix: FL Studio → REAPER, remove MCP references, correct project structure
### Out of Scope
- `.sdd/design.md` update (separate change)
- Sphinx/pdoc API docs generation
- i18n docs
- Human-focused tutorials
## Capabilities
### New Capabilities
None — documentation only; no behavioral capability changes.
### Modified Capabilities
None.
## Approach
Hybrid: one primary file (`LLM_CONTEXT.md`, ~30KB) as immediate grounding document for any LLM session, plus modular deep-dive docs and JSON Schemas for contract validation. All docs live under existing `docs/` directory. Schemas use standard JSON Schema draft-2020-12. README.md gets a truth-preserving rewrite. Zero code changes — docs are generated manually with LLM assistance, not via docstring extraction.
## Affected Areas
| Area | Impact | Description |
|------|--------|-------------|
| `docs/` | New | Fill with 8 files: LLM_CONTEXT.md, CLI.md, 4 modules/*.md, 2 schemas/*.json |
| `README.md` | Modified | Fix FL Studio → REAPER, remove MCP, correct structure |
## Risks
| Risk | Likelihood | Mitigation |
|------|------------|------------|
| LLM_CONTEXT.md diverges from code | Low | Docs reference exact source files; verified during spec phase via grep cross-check |
| Plugin registry changes break builder docs | Low | Registry path is stable; doc notes it reflects current `PLUGIN_REGISTRY` dict |
## Rollback Plan
`git revert` the commit. All changes are additive to `docs/` + README.md rewrite; no code paths depend on docs. Revert is instant with zero side effects.
## Dependencies
- None (read-only source inspection; no new packages)
## Success Criteria
- [ ] LLM given only `docs/LLM_CONTEXT.md` correctly identifies: REAPER target, RPP format, module structure, CLI entry points
- [ ] JSON Schemas validate against actual `SongDefinition` and reascript command structures
- [ ] README.md accurately reflects project name, target DAW, and real features
- [ ] All module docs reference actual source files and class names


@@ -0,0 +1,59 @@
# llm-docs Specification
## Purpose
Documentation-only capability. Defines 9 LLM-ready files that ground any LLM in the REAPER `.rpp` generation system. Zero code changes — all artifacts are additive under `docs/` plus a README rewrite.
## Requirements
### Requirement: LLM_CONTEXT.md Aggregates System Knowledge
`docs/LLM_CONTEXT.md` MUST be a single entry-point file under 40KB that an LLM can read to understand the entire system. Content MUST include: system description, ASCII architecture diagram, all dataclass definitions from `src/core/schema.py`, module index mapping each `src/` directory to its role, pipeline steps (compose → build → calibrate → validate), plugin system (`PLUGIN_REGISTRY`, `ALIAS_MAP`, presets), ReaScript protocol, CLI reference for `scripts/compose.py`, `scripts/generate.py`, `scripts/run_in_reaper.py`, naming conventions, and extension guide. File MUST be valid markdown.
#### Scenario: LLM grounds from LLM_CONTEXT.md alone
- GIVEN an LLM with no prior knowledge of fl_control
- WHEN it reads `docs/LLM_CONTEXT.md`
- THEN it identifies REAPER `.rpp` as the target format, understands `SongDefinition` → `TrackDef` → `ClipDef` as the core data model, knows module layout under `src/`, and locates CLI entry points
- AND the file is under 40KB valid markdown with all required sections present
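The size and section requirements of this scenario lend themselves to an automated check. The sketch below assumes that "40KB" means 40 × 1024 bytes and that each required section can be recognised by a keyword in a heading; the keyword list is hypothetical and the real section titles may differ.

```python
from pathlib import Path

# Assumption: the spec's 40KB cap is read as 40 KiB.
MAX_BYTES = 40 * 1024

# Hypothetical heading keywords, one per required section.
REQUIRED_KEYWORDS = [
    "architecture", "data model", "pipeline", "module",
    "cli", "conventions", "extension",
]


def check_context_file(path: Path) -> list:
    """Return a list of problems; an empty list means the file passes."""
    raw = path.read_bytes()
    problems = []
    if len(raw) >= MAX_BYTES:
        problems.append(f"file is {len(raw)} bytes, limit is {MAX_BYTES}")
    # Lines starting with '#' are treated as headings. A '#' comment
    # inside a fenced code block would also match; the sketch ignores that.
    headings = [
        line.lstrip("#").strip().lower()
        for line in raw.decode("utf-8").splitlines()
        if line.startswith("#")
    ]
    for keyword in REQUIRED_KEYWORDS:
        if not any(keyword in h for h in headings):
            problems.append(f"no heading mentions {keyword!r}")
    return problems
```

Measuring bytes rather than characters matters here because the architecture diagram and arrows may contain multi-byte UTF-8 characters.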
### Requirement: Module Deep-Dives Enable Targeted Modification
Each module doc under `docs/modules/` MUST document: public API signatures with types, data flow in/out, dependencies, and known gotchas. Each MUST reference `LLM_CONTEXT.md` as parent.
#### Scenario: LLM modifies a module after reading its deep-dive
- GIVEN `docs/modules/reaper-builder.md` exists
- WHEN an LLM reads it
- THEN it identifies `RPPBuilder`, `PLUGIN_REGISTRY` format (key → display_name, filename, uid_guid), `write()` signature, and dependency on `src/core/schema.py`
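As an illustration of the registry shape named in this scenario (key → display_name, filename, uid_guid), the following sketch uses a named tuple and an alias lookup. The container types, the entry values, and the `resolve_plugin` helper are all hypothetical; the real `PLUGIN_REGISTRY` and `ALIAS_MAP` in `src/reaper_builder/` may be structured differently.

```python
from typing import NamedTuple, Optional


class PluginEntry(NamedTuple):
    """One registry value: the three fields the spec names."""
    display_name: str
    filename: str
    uid_guid: str


# Placeholder data only; not taken from the real registry.
EXAMPLE_REGISTRY = {
    "example_synth": PluginEntry(
        display_name="Example Synth",
        filename="ExampleSynth.vst3",
        uid_guid="{00000000-0000-0000-0000-000000000000}",
    ),
}

# Hypothetical aliases mapping a short name to a registry key.
EXAMPLE_ALIASES = {"synth": "example_synth"}


def resolve_plugin(name: str) -> Optional[PluginEntry]:
    """Resolve an alias to its key, then look the key up in the registry."""
    key = EXAMPLE_ALIASES.get(name, name)
    return EXAMPLE_REGISTRY.get(key)
```

Under these assumptions, `resolve_plugin("synth")` and `resolve_plugin("example_synth")` return the same entry, and an unknown name returns `None`.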
### Requirement: JSON Schemas Match Dataclass Definitions
`docs/schemas/song-definition.json` MUST be JSON Schema draft-07 matching every field in `SongDefinition`, `SongMeta`, `TrackDef`, `ClipDef`, `MidiNote`, `SectionDef`, `PluginDef`, `PatternDef`, `ArrangementItemDef`, `CCEvent`, and `ArrangementTrack` as defined in `src/core/schema.py`. `docs/schemas/reascript-protocol.json` MUST match `ReaScriptCommand` and `ReaScriptResult` in `src/reaper_scripting/commands.py`. Field names, types, and required/optional flags MUST match the dataclass source exactly.
#### Scenario: Schema validates SongDefinition instance
- GIVEN a `SongDefinition` serialized via `to_json()`
- WHEN validated against `song-definition.json`
- THEN zero schema violations; every field name, type, and required constraint matches the dataclass
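Beyond validating an instance, the "field names and required flags match" part of this requirement can be cross-checked directly against the dataclass. The sketch below is one possible approach using only the standard library; `ExampleNote` is a hypothetical dataclass, and treating "no default" as "required" is an assumption about how the schemas are authored.

```python
import dataclasses
from dataclasses import dataclass, field


def schema_mismatches(cls, schema: dict) -> list:
    """Compare a dataclass's fields with a JSON Schema object definition."""
    declared = set(schema.get("properties", {}))
    required = set(schema.get("required", []))
    names, without_default = set(), set()
    for f in dataclasses.fields(cls):
        names.add(f.name)
        no_default = f.default is dataclasses.MISSING
        no_factory = f.default_factory is dataclasses.MISSING
        if no_default and no_factory:
            without_default.add(f.name)
    problems = [f"field missing from schema: {n}" for n in sorted(names - declared)]
    problems += [f"schema property not in dataclass: {n}" for n in sorted(declared - names)]
    # Symmetric difference: flagged on either side but not both.
    problems += [f"required flag mismatch: {n}" for n in sorted(required ^ without_default)]
    return problems


# Hypothetical dataclass for demonstration; not the real MidiNote.
@dataclass
class ExampleNote:
    pitch: int
    start: float
    velocity: int = 100
    tags: list = field(default_factory=list)


EXAMPLE_NOTE_SCHEMA = {
    "type": "object",
    "properties": {
        "pitch": {"type": "integer"},
        "start": {"type": "number"},
        "velocity": {"type": "integer"},
        "tags": {"type": "array"},
    },
    "required": ["pitch", "start"],
}
```

This complements rather than replaces instance validation with a draft-07 validator such as the one in the Python `jsonschema` package, since it does not inspect types.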
### Requirement: README.md Reflects True System Identity
`README.md` MUST NOT reference FL Studio, `.flp` files, or MCP server. MUST reference REAPER `.rpp` generation, list real CLI entry points (`compose.py`, `generate.py`, `run_in_reaper.py`), and point to `docs/LLM_CONTEXT.md`.
#### Scenario: New developer reads README
- GIVEN a developer reading `README.md`
- THEN they see "REAPER", ".rpp", and `docs/LLM_CONTEXT.md`
- AND they see NO mention of "FL Studio", ".flp", or "MCP"
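This scenario can be turned into a small check over the README text. The sketch below mirrors the forbidden and required strings listed in the requirement; the function name is an assumption, and the case-insensitive match on "mcp" will also flag that substring inside longer words.

```python
import re

# Terms the README must not contain (case-insensitive).
FORBIDDEN = re.compile(r"fl studio|\.flp|mcp", re.IGNORECASE)

# Strings the README must contain, taken from the scenario.
REQUIRED = ["REAPER", ".rpp", "docs/LLM_CONTEXT.md"]


def readme_problems(text: str) -> list:
    """Return one message per forbidden hit or missing required string."""
    problems = [f"forbidden term: {m.group(0)!r}" for m in FORBIDDEN.finditer(text)]
    problems += [f"missing required text: {r!r}" for r in REQUIRED if r not in text]
    return problems
```

A caller would read `README.md` as UTF-8 text, pass it to `readme_problems`, and fail the check if the returned list is non-empty.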
### Requirement: Cross-Reference Integrity
All internal markdown links in docs MUST resolve to existing files or headings. Module names in docs MUST match actual Python module names under `src/`. Dataclass field names in docs MUST match `src/core/schema.py` exactly.
#### Scenario: Links resolve and names match code
- GIVEN all doc files under `docs/`
- WHEN an LLM or human follows every `[link](./path)` reference
- THEN every link resolves; no broken cross-references exist
- AND all module and field names match their definitions in code


@@ -0,0 +1,33 @@
# Tasks: LLM-Ready Documentation
## Phase 1: Primary Entry Point
- [x] 1.1 `docs/LLM_CONTEXT.md` — Write full entry-point (~35KB). Sections: system identity (REAPER `.rpp`), ASCII architecture diagram, all 11 dataclasses from `src/core/schema.py`, pipeline (compose → build → calibrate → validate → ReaScript), module map (`src/` dirs → roles), plugin system (`PLUGIN_REGISTRY`, `ALIAS_MAP`, presets), ReaScript protocol, CLI reference (compose.py, generate.py, run_in_reaper.py), naming conventions, extension guide. Valid markdown, under 40KB.
## Phase 2: Module Deep-Dives
- [x] 2.1 `docs/modules/composer.md` — Document `src/composer/`: pattern generators (`patterns.py`, `rhythm.py`), chord engine (`chords.py`), melody engine (`melody_engine.py`), converters, templates, variation. Public API signatures, data flow, dependencies. Link to `LLM_CONTEXT.md`.
- [x] 2.2 `docs/modules/reaper-builder.md` — Document `RPPBuilder` class, `PLUGIN_REGISTRY` (~150 entries: key→display_name,filename,uid_guid), `ALIAS_MAP`, `PLUGIN_PRESETS`, preset transformer, `render.py` headless render. `write()` signature. Dependency on `src/core/schema.py`. Link to `LLM_CONTEXT.md`.
- [x] 2.3 `docs/modules/reaper-scripting.md` — Document `ReaScriptCommand`, `ReaScriptResult` (from `src/reaper_scripting/commands.py`), `ReaScriptGenerator`, command/result JSON contract, `ProtocolVersionError`. Link to `LLM_CONTEXT.md`.
- [x] 2.4 `docs/modules/calibrator.md` — Document `Calibrator.apply()` post-processing pipeline, mix calibration presets (`src/calibrator/presets.py`). Link to `LLM_CONTEXT.md`.
## Phase 3: Schemas and CLI Reference
- [x] 3.1 `docs/schemas/song-definition.json` — JSON Schema draft-07 for `SongDefinition` + all nested dataclasses (`SongMeta`, `TrackDef`, `ClipDef`, `MidiNote`, `SectionDef`, `PluginDef`, `PatternDef`, `ArrangementItemDef`, `CCEvent`, `ArrangementTrack`). Match field names and types exactly from `src/core/schema.py`.
- [x] 3.2 `docs/schemas/reascript-protocol.json` — JSON Schema draft-07 for `ReaScriptCommand` and `ReaScriptResult`. Match field names and types from `src/reaper_scripting/commands.py`.
- [x] 3.3 `docs/CLI.md` — Complete CLI reference: `compose.py` (--bpm, --key, --output, --seed, --emotion, --inversion, --no-calibrate), `generate.py` (--bpm, --key, --output, --seed, --emotion, --inversion, --validate), `run_in_reaper.py` (<rpp_path>, --output, --timeout, --plugins-config, --action). Link to `LLM_CONTEXT.md`.
## Phase 4: README Correction and Verification
- [x] 4.1 `README.md` — Replace FL Studio/`.flp` with REAPER/`.rpp`. Remove all MCP server references. List real CLI scripts. Link to `docs/LLM_CONTEXT.md`.
- [x] 4.2 Verify all internal links — Extract every `[text](path)` and `[text](#heading)` from all doc files; confirm each target file/heading exists.
- [x] 4.3 Validate JSON schemas — Validate `song-definition.json` and `reascript-protocol.json` against JSON Schema draft-07 meta-schema. Validate sample `SongDefinition.to_json()` output against `song-definition.json`.
- [x] 4.4 Verify field names match code — `grep` each dataclass field name found in docs against `src/core/schema.py`. `grep -i "fl studio\|\.flp\|mcp" README.md` must return empty.