renato97 7bcd8052a9 docs: LLM-ready documentation suite — LLM_CONTEXT.md, module deep-dives, JSON schemas, CLI reference
Complete documentation system for LLM consumption: primary LLM_CONTEXT.md
(27KB system overview), 4 module deep-dives (composer, reaper-builder,
reaper-scripting, calibrator), 2 JSON Schema draft-07 contracts, CLI
reference, and README correction (FL Studio -> REAPER identity).
2026-05-04 10:30:24 -03:00


# Design: LLM-Ready Documentation Suite
## Technical Approach
Produce 9 documentation files that ground any LLM in the real REAPER `.rpp` system: 8 new files under `docs/` (currently empty) plus a rewritten `README.md`. One primary file (`LLM_CONTEXT.md`, target ~35KB) serves as single-read grounding. Four module deep-dives, two JSON Schemas, and one CLI reference provide targeted detail. `README.md` gets a truth-preserving rewrite: FL Studio → REAPER, `.flp` → `.rpp`, no MCP. There are no code changes: all content is derived from reading source files (`src/core/schema.py`, `src/reaper_builder/__init__.py`, `src/reaper_scripting/commands.py`, `src/calibrator/__init__.py`, `src/composer/*.py`, `scripts/*.py`).
`LLM_CONTEXT.md` sections follow a "what → how → where" flow: system identity first (REAPER target, `.rpp` format), then data model (all 11 dataclasses), then pipeline (compose → build → calibrate → validate → ReaScript), then module map, CLI reference, conventions, and extension guide. Each section carries its own critical details, so no multi-hop reading is required.
## Architecture Decisions
### Decision 1: LLM_CONTEXT.md — Flat Linear Sections
| Aspect | Flat Sections | Hierarchical (index + sub-pages) |
|--------|--------------|----------------------------------|
| LLM context window | One read → full knowledge | Multi-hop; LLM must follow links |
| Maintenance | Single file, coherent | Multiple files, drift risk |
| Size budget | ~35KB (within 40KB cap) | Index ~5KB, but requires cross-referencing |
| **Chosen** | ✅ | |
**Rationale**: LLMs process flat context windows efficiently. A self-contained 35KB file means the LLM reads once and understands architecture, data model, pipeline, module layout, and CLI entry points. Module deep-dives (`docs/modules/*.md`) are supplementary for targeted modifications, not prerequisites. Avoids the anti-pattern of forcing an LLM to chase links for basic system comprehension.
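The size budget in the table above can be checked mechanically. The following is a minimal sketch, assuming the 40KB cap means 40 × 1024 bytes; the helper name `within_budget` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Assumption: "40KB cap" is interpreted as 40 * 1024 bytes.
SIZE_CAP_BYTES = 40 * 1024


def within_budget(path: Path, cap: int = SIZE_CAP_BYTES) -> bool:
    """Return True if the file at `path` is at or below the byte cap."""
    return path.stat().st_size <= cap


if __name__ == "__main__":
    doc = Path("docs/LLM_CONTEXT.md")
    if doc.exists():
        size = doc.stat().st_size
        print(f"{doc}: {size} bytes, within budget: {within_budget(doc)}")
    else:
        print(f"{doc} not found; run from the repository root")
```

Running it from the repository root after writing the file gives a quick pass/fail signal before committing.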
### Decision 2: JSON Schema — Hand-Written from Dataclass Source
| Aspect | Hand-Written | Auto-Generated (dataclasses-jsonschema) |
|--------|-------------|----------------------------------------|
| Exact draft-07 | Full control | Libraries target draft-2020-12 |
| Optional/Union types | Can map `str\|None` → `{"type": ["string","null"]}` exactly | May lose nullability nuance |
| Dependency | None (docs-only) | Adds pip dependency |
| Scale | 11 dataclasses — manageable | Overkill for this count |
| **Chosen** | ✅ | |
**Rationale**: The 11 dataclasses (`SongDefinition`, `SongMeta`, `TrackDef`, `ClipDef`, `MidiNote`, `SectionDef`, `PluginDef`, `PatternDef`, `ArrangementItemDef`, `CCEvent`, `ArrangementTrack`) are few enough for manual authoring. Manual writing ensures exact draft-07 compliance and correct handling of `Optional` types (`str | None`) and `field(default_factory=list)`. Auto-generation would add a tooling dependency for a documentation-only change and risk schema draft mismatches (the spec requires draft-07 while the proposal mentions draft-2020-12; see Open Questions).
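To illustrate the hand-written mapping, here is a minimal sketch using a hypothetical `ExampleClip` dataclass. Its fields are invented for illustration and are not the real attributes in `src/core/schema.py`. It shows how an `Optional` field and a `default_factory=list` field could be expressed in draft-07, along with a small hypothetical helper that catches drift between dataclass fields and schema properties.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class ExampleClip:
    # Hypothetical stand-in; real field names live in src/core/schema.py.
    name: str
    color: Optional[str] = None                # str | None
    notes: list = field(default_factory=list)  # defaults to an empty list


# Hand-written draft-07 schema mirroring the dataclass above.
EXAMPLE_CLIP_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "ExampleClip",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        # Optional[str] becomes a two-member type array.
        "color": {"type": ["string", "null"], "default": None},
        # default_factory=list becomes an array with an empty default.
        "notes": {"type": "array", "default": []},
    },
    # Only fields without a default are required.
    "required": ["name"],
    "additionalProperties": False,
}


def schema_covers_dataclass(cls, schema: dict) -> bool:
    """Return True if schema property names exactly match dataclass field names."""
    return {f.name for f in fields(cls)} == set(schema["properties"])


if __name__ == "__main__":
    print(json.dumps(EXAMPLE_CLIP_SCHEMA, indent=2))
    print("fields match:", schema_covers_dataclass(ExampleClip, EXAMPLE_CLIP_SCHEMA))
```

A field-name check like `schema_covers_dataclass` is cheap to run against each real dataclass and keeps the manual approach from drifting silently when `schema.py` changes.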
## File Changes
| File | Action | Description |
|------|--------|-------------|
| `docs/LLM_CONTEXT.md` | Create | Primary entry point: architecture diagram, dataclasses, pipeline, module map, CLI, conventions, extension guide. Target 35KB. |
| `docs/CLI.md` | Create | Complete CLI reference: `scripts/compose.py` (--bpm, --key, --output, --seed, --emotion, --inversion, --no-calibrate), `scripts/generate.py` (--bpm, --key, --output, --seed, --emotion, --inversion, --validate), `scripts/run_in_reaper.py` (<rpp_path>, --output, --timeout, --plugins-config, --action) |
| `docs/modules/composer.md` | Create | Pattern generators (`patterns.py`, `rhythm.py`), chord engine (`chords.py`), melody engine (`melody_engine.py`), converters, templates, variation |
| `docs/modules/reaper-builder.md` | Create | `RPPBuilder` class, `PLUGIN_REGISTRY` (~150 entries), `ALIAS_MAP`, `PLUGIN_PRESETS`, preset transformer, `render.py` headless render |
| `docs/modules/reaper-scripting.md` | Create | `ReaScriptGenerator`, `ReaScriptCommand`, `ReaScriptResult`, command/result JSON contract, `ProtocolVersionError` |
| `docs/modules/calibrator.md` | Create | `Calibrator.apply()` post-processing, mix calibration presets (`calibrator/presets.py`) |
| `docs/schemas/song-definition.json` | Create | JSON Schema draft-07 for `SongDefinition` + all nested dataclasses |
| `docs/schemas/reascript-protocol.json` | Create | JSON Schema draft-07 for `ReaScriptCommand` and `ReaScriptResult` |
| `README.md` | Modify | Rewrite: REAPER `.rpp` target, real CLI scripts, link to `docs/LLM_CONTEXT.md`, remove FL Studio/MCP |
## Testing Strategy
| Layer | What to Test | Approach |
|-------|-------------|----------|
| Link integrity | All `[text](./path)` and `[text](#heading)` references | Shell script: extract all markdown links, verify each target exists on disk or as heading in destination file |
| Schema validity | `song-definition.json`, `reascript-protocol.json` | Validate each `.json` against JSON Schema meta-schema (draft-07) using `ajv` or Python `jsonschema`; validate a sample `SongDefinition.to_json()` output against `song-definition.json` |
| Field name accuracy | Dataclass field names in docs match `schema.py` | `grep` cross-check: for each field name in docs, verify exact match in source; run `grep -r "bpm\|velocity\|send_reverb" docs/` and diff against `schema.py` dataclass attributes |
| README truthiness | No FL Studio/MCP mentions | `grep -i "fl studio\|\.flp\|mcp" README.md` must return empty |
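The link-integrity row above describes a shell script. The same check can be sketched in Python to stay in the project's language. This is a minimal sketch under stated assumptions: heading anchors follow an approximation of GitHub-style slugs, external URLs are skipped, and the function names are hypothetical. It does not exclude links or headings inside fenced code blocks or inline code, so example links such as `[text](./path)` quoted in prose would be reported.

```python
import re
from pathlib import Path

# Inline markdown links: [text](target). Reference-style links are not handled.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# ATX headings: one to six '#' followed by spaces or tabs and the heading text.
HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.*)$", re.MULTILINE)
# Targets with a URL scheme (https:, mailto:, ...) are treated as external.
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def slugify(heading: str) -> str:
    """Approximate GitHub-style anchors: lowercase, drop punctuation, spaces to hyphens."""
    text = heading.strip().lower()
    text = re.sub(r"[^\w\s-]", "", text)
    return re.sub(r"\s", "-", text)


def anchors_in(md_path: Path) -> set:
    """Collect the slug of every heading in a markdown file."""
    text = md_path.read_text(encoding="utf-8")
    return {slugify(h) for h in HEADING_RE.findall(text)}


def broken_links(md_path: Path) -> list:
    """Return link targets whose file or heading anchor cannot be found."""
    broken = []
    text = md_path.read_text(encoding="utf-8")
    for target in LINK_RE.findall(text):
        if SCHEME_RE.match(target):
            continue  # external link, not checked here
        file_part, _, anchor = target.partition("#")
        dest = (md_path.parent / file_part).resolve() if file_part else md_path
        if not dest.exists():
            broken.append(target)
        elif anchor and dest.suffix == ".md" and anchor not in anchors_in(dest):
            broken.append(target)
    return broken


if __name__ == "__main__":
    for md in sorted(Path("docs").rglob("*.md")):
        for target in broken_links(md):
            print(f"{md}: broken link -> {target}")
```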
## Open Questions
- [ ] **Schema draft version**: Spec requires draft-07, proposal mentions draft-2020-12. Recommend draft-07 as it has wider LLM/tool support and matches spec requirement.
- [ ] **Module doc depth**: Should module docs include internal helper signatures (e.g., `_section_active`, `_get_kick_cache`) or only the public API surface (`RPPBuilder.write()`, `ChordEngine.progression()`)? Recommend public API only to avoid maintenance burden.
## Dependencies
None. All content derived from reading existing source files. No new packages, no code changes.
## Rollout
`git add docs/ README.md && git commit`. All changes additive to `docs/` + README.md rewrite. No code paths depend on docs. Zero-risk rollout.