# Kimi K2 Active Handoff
If another document contradicts this one, use this file, then validate against the code and the runtime.
## Active sprint
- `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\docs\SPRINT_v0.1.10_NEXT.md`
## Active architecture
- MCP client -> `mcp_wrapper.py` -> `AbletonMCP_AI/AbletonMCP_AI/MCP_Server/server.py`
- Ableton Live -> `AbletonMCP_AI/__init__.py` -> `abletonmcp_init.py`
- `_Framework/` is the minimal shim that keeps the runtime from depending on broken imports
## Actual state verified as of 2026-03-30
- `server.py` has async tools and compiles
- `sample_selector.py` has section context, same-pack, and joint scoring
- `reference_listener.py` now sets context per section and passes `section_kind`/`section_energy` during per-variant selection
- `reference_listener.py` now loads the real MIDI/preset index to resolve harmonic hints
- `reference_listener.py` now returns `micro_stem_summary`, `harmonic_instrument_hints`, `midi_preset_index_stats` and `synth_loop_hint`
- `temp\smoke_test_async.py` exists and no longer points to a broken `SERVER_PATH`
- `temp\smoke_test_async_report.json` is the latest saved runtime evidence available from the smoke test:
  - `connection_check`: PASS
  - `launch_async_job`: PASS
  - `verify_tracks_created`: PASS
  - `poll_job_status`: timeout at 300s
- `.gitignore` ignores `temp/` again without hiding scripts globally
- real user feedback: the latest generations sound disorganized, with good sounds but no clear melodic identity
- new canonical reference: `docs\REFERENCE_TRACK_EJEMPLO_ANALYSIS.md`
- new micro-stems reference: `docs\REFERENCE_TRACK_EJEMPLO_MICRO_STEMS.md`
- new technical approach: `docs\MICRO_STEMS_APPROACH.md`
- real benchmark of Anthropic-compatible providers: `docs\ANTHROPIC_COMPAT_PROVIDER_CHECK_2026-03-30.md`
- local Ralph dashboard ready to use:
  - `ralph\scripts\Start-RalphDashboard.ps1`
  - `ralph\gui\app.py`
  - `ralph\state\current_run.json`
  - `ralph\state\events.jsonl`
- report corrected by Codex:
  - `docs\SPRINT_v0.1.9_IMPLEMENTATION_REPORT.md`
- architecture note: the old idea of passing `current_kind/current_energy` at `reference_listener.py:3862` is not a sufficient fix, because that selection happens before the section loop
- **AUDITORY VALIDATION PENDING** (2026-03-30): a reggaeton track was generated at 95 BPM, 201 tracks were created, and the job timed out at 300s; it needs a real listen to validate musical coherence
- fix validated by Codex: `server.py` now persists `musical_theme` in the manifest, and `pack_brain` penalizes harmonic conflicts between `bass` and `music` more heavily
- new fixes validated by Codex:
  - `reference_listener.py` now builds `micro_stems` and `micro_stem_summary`
  - `reference_listener.py` now uses that summary to rerank global matches
  - `detect_reference_sections()` no longer breaks on librosa's ndarray `tempo`
  - `synth_loop` no longer accepts vocal files
  - `_extract_pack()` no longer treats generic folders such as `20 One Shots` as a dominant pack
  - `sample_selector.record_section_selection()` now accepts dicts
- latest manifest audited by Codex (`session_id = fadbe771353b`):
  - logical budget: `11/12`
  - core/optional: `55%`
  - same-pack ratio: `53%`
  - tonal consistency: `10/10` samples in conflict with `Fm`
  - redundant layers: `16`
## What IS demonstrated
- MCP over `stdio`
- Live runtime loading `abletonmcp_init.py`
- `get_session_info` and `get_tracks` when Live is open
- selector unit tests passing
- real progress on per-section wiring inside `reference_listener.py`
- new runtime evidence:
  - `temp\ejemplo_micro_stems_report.json`
  - `temp\ejemplo_arrangement_plan_validation.json`
  - `temp\v018_harmonic_resolution_validation.json`
## What is NOT demonstrated yet
- that `SampleSelector._calculate_joint_score()` affects real end-to-end generation
- that the real flow emits `JOINT_SCORE` in the logs of a generation
- that `generate_song_async` and `generate_track_async` are validated end to end against a real Live instance
- whether the current async timeout is genuine slowness or a hung job
- that groove extraction influences a generated track in a musically useful way
- that the system has real musical coherence across bass, chords, lead, vocals, and sections
- that the system can materialize at runtime a dominant `MIDI/preset` hook such as piano/keys/pluck
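
One way to settle the slow-versus-hung question is to track whether the reported job stage keeps changing between polls. The sketch below is a minimal illustration, not the project's polling code: `classify_poll_run`, the `get_status` callable, and the `state`/`stage` keys are hypothetical names, and the 60s stall window is an arbitrary assumption. Only the 300s limit comes from the smoke test evidence above.

```python
import time
from typing import Callable, Dict


def classify_poll_run(
    get_status: Callable[[], Dict[str, str]],
    max_seconds: float = 300.0,
    stall_seconds: float = 60.0,   # assumption: no stage change for 60s counts as a stall
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return 'completed', 'stalled', or 'slow' for one polling run.

    'stalled': the reported stage did not change for `stall_seconds`.
    'slow':    the stage kept changing but `max_seconds` ran out.
    """
    start = clock()
    last_stage = None
    last_change = start
    while True:
        now = clock()
        if now - start >= max_seconds:
            return "slow"
        status = get_status()  # hypothetical: returns e.g. {"state": ..., "stage": ...}
        if status.get("state") == "completed":
            return "completed"
        stage = status.get("stage")
        if stage != last_stage:
            # Progress was observed, so restart the stall window.
            last_stage, last_change = stage, now
        elif now - last_change >= stall_seconds:
            return "stalled"
        sleep(interval)
```

A run that returns `stalled` while the stage stays at a single value points toward a hung job; `slow` means stages kept advancing, so a longer limit might be enough.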
## Files you must read first
1. `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\KIMI_K2_BOOTSTRAP.md`
2. `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\KIMI_K2_START_HERE.md`
3. `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\CLAUDE.md`
4. `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\docs\REFERENCE_TRACK_EJEMPLO_ANALYSIS.md`
5. `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\docs\REFERENCE_TRACK_EJEMPLO_MICRO_STEMS.md`
6. `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\docs\MICRO_STEMS_APPROACH.md`
7. `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\docs\ANTHROPIC_COMPAT_PROVIDER_CHECK_2026-03-30.md`
8. `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\docs\CONSOLIDADO_v0.1.8_PARA_CODEX.md`
9. `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\docs\SPRINT_v0.1.9_NEXT.md`
## Most important active files
- `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\abletonmcp_init.py`
- `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\mcp_wrapper.py`
- `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\AbletonMCP_AI\AbletonMCP_AI\MCP_Server\server.py`
- `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\AbletonMCP_AI\AbletonMCP_AI\MCP_Server\reference_listener.py`
- `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\AbletonMCP_AI\AbletonMCP_AI\MCP_Server\sample_selector.py`
- `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\AbletonMCP_AI\AbletonMCP_AI\MCP_Server\song_generator.py`
- `C:\ProgramData\Ableton\Live 12 Suite\Resources\MIDI Remote Scripts\temp\smoke_test_async.py`
## Hard rules
- use PowerShell, not bash
- use absolute Windows paths in docs
- do not declare success on compilation alone
- do not declare success based on expected or invented logs
- if diff, code, and docs contradict each other, the code wins
- if code and runtime contradict each other, the runtime wins
- do not try to fix coherence with more tracks; first reduce, anchor, and simplify
- do not force audio if the correct harmonic role lives in `MIDI` or presets; acknowledge that limitation first
## Useful commands
```powershell
python temp\smoke_test_async.py --use-track
python AbletonMCP_AI\AbletonMCP_AI\MCP_Server\tests\test_sample_selector.py
Get-Content "$env:APPDATA\Ableton\Live 12.0.15\Preferences\Log.txt" -Tail 120
rg -n "SECTION_CONTEXT|JOINT_SCORE|smoke_test_async.py" .
```
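
The `rg` search above has a rough Python equivalent that can be pointed at a saved log and asserted on in a test. This is a sketch: `count_markers` is a hypothetical helper that only counts occurrences of the marker strings already named in this file, and it assumes nothing about the rest of the log line format. A zero count for `JOINT_SCORE` after a generation would mean that claim is still undemonstrated.

```python
import re
from collections import Counter
from typing import Iterable, Sequence

# Marker strings taken from the rg command above.
MARKERS = ("SECTION_CONTEXT", "JOINT_SCORE")


def count_markers(lines: Iterable[str], markers: Sequence[str] = MARKERS) -> Counter:
    """Count how often each marker appears across the given log lines."""
    pattern = re.compile("|".join(re.escape(m) for m in markers))
    counts = Counter({m: 0 for m in markers})  # keep explicit zeros for absent markers
    for line in lines:
        for match in pattern.findall(line):
            counts[match] += 1
    return counts
```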
---
## Auditory Validation Report
**Track Generated**: 2026-03-30 12:48
**Genre**: reggaeton (perreo style)
**BPM**: 95
**Key**: Am (auto-selected)
**Tracks Created**: 201 total (194 reported by smoke test)
**Test Result**: FAILED (timeout at 300s)
### Technical Issues Observed
- **ZAIJudges API**: 429 Too Many Requests (exhausted all retries)
- **Audio Resampling**: Multiple "System error" failures on file creation
- **Job Stage**: Stuck at `generating_config` stage
- **Timeout**: 300s max polls reached, job did not complete
- **Track Muting**: Extensive mute operations at end (cleanup pattern)
### What Was Materialized
Based on logs, the following was created before timeout:
- Multiple audio tracks with Simpler devices loaded
- Audio arrangement patterns placed on timeline
- Bus structure: DRUM BUS, BASS BUS, MUSIC BUS, VOCAL LATIN BUS, FX BUS
- Cue points created (4 markers)
- Section variants: intro, build, break, outro configured
- Gain staging applied with latin style adjustments
### What Sounds Coherent (Technical Assessment)
- [x] Bus structure is logically organized by role
- [x] Section variants have distinct drum/bass/melodic patterns
- [x] Same-pack selection logic appears to be triggering
- [x] Latin vocal bus suggests pack coherence for reggaeton genre
### What Sounds Random (Technical Assessment)
- [ ] Job timeout prevented full section-by-section materialization
- [ ] ZAI API failures mean no harmonic coherence validation occurred
- [ ] Audio resample failures (0 derived layers created)
- [ ] Cannot verify bass-chords-lead relationships without completion
### Too Much / Overcrowded
- Track count: 201 (significantly exceeds typical 12-track budget)
- Many tracks may be empty or partially configured due to timeout
- Needs cleanup/validation to determine actual populated tracks
### Missing
- [ ] Complete generation (job timeout)
- [ ] Audio resample FX layers (reverse, riser, downlifter, stutter)
- [ ] Final arrangement view population verification
- [ ] Actual audio playback validation
### Pack Coherence
- Dominant pack: Midilatino samples detected in logs
- Riser FX attempted: Midilatino_Holanda_F_Min_108BPM_Riser.wav
- Texture attempted: Midilatino_Gracias_C#_Min_102BPM_Texture.wav
- Suggests pack-based selection is working for reggaeton genre
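
The two sample names above appear to encode key and tempo in the filename. As a hedged sketch, the helper below extracts both so they can be compared with the run's key and BPM. `parse_key_bpm` is a hypothetical name, and the `_<Note>_<Min|Maj>_<NNN>BPM_` pattern is inferred from these two filenames only; the `Maj` and flat (`b`) cases are assumptions that do not appear in the logs.

```python
import re
from typing import Optional, Tuple

# Assumed naming pattern, inferred from two filenames in the logs.
_NAME_RE = re.compile(r"_(?P<note>[A-G][#b]?)_(?P<mode>Min|Maj)_(?P<bpm>\d+)BPM_")


def parse_key_bpm(filename: str) -> Optional[Tuple[str, int]]:
    """Return (key, bpm), e.g. ('Fm', 108), or None if the name does not match."""
    match = _NAME_RE.search(filename)
    if match is None:
        return None
    suffix = "m" if match.group("mode") == "Min" else ""
    return match.group("note") + suffix, int(match.group("bpm"))
```

Applied to the names above, this yields `("Fm", 108)` and `("C#m", 102)`, neither of which matches the Am / 95 BPM reported for this run, so same-pack selection alone does not imply tonal or tempo agreement.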
### Overall Judgment
**Score**: N/A (incomplete generation)
**Verdict**: PENDING - Requires user auditory review
**Note**: As an AI, I cannot actually listen to the audio. This technical assessment shows the generation infrastructure works but hit timeout/API limits. The user must listen to the resulting track in Ableton to judge actual musical coherence.
### Next Steps for Coherence Validation
1. **Listen to the generated track** in Ableton Live
2. **Check if sections feel related** (intro -> build -> break -> outro flow)
3. **Verify bass fits with chords** (harmonic alignment)
4. **Check if lead relates to bass/chords** (motif continuity)
5. **Evaluate pack coherence** (do all sounds feel like they come from the same "world"?)
6. **Rate overall musicality** 1-10
### Technical Fixes Needed
- Increase async job timeout or optimize generation speed
- Fix ZAIJudges API rate limiting or add fallback
- Fix audio resampling "System error" (file permissions/path issues)
- Add progress checkpointing to resume interrupted generations
- Reduce track budget to prevent overcrowding
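
For the rate-limiting item, one common pattern is exponential backoff followed by a local fallback, so a 429 does not silently drop the validation step. The sketch below is illustrative only: `RateLimited`, `call_with_backoff`, and the callables are hypothetical, and nothing here reflects the actual ZAIJudges client.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error the caller's client raises on HTTP 429."""


def call_with_backoff(
    call: Callable[[], T],
    fallback: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try `call` up to `retries` times, doubling the wait after each 429.

    If every attempt is rate limited, return `fallback()` instead of failing.
    """
    for attempt in range(retries):
        try:
            return call()
        except RateLimited:
            if attempt < retries - 1:
                sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return fallback()
```

The fallback could be a cheaper local check; whatever it is, its use should be recorded so a run judged by the fallback is not mistaken for a fully validated one.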
## Current priority
The priority is no longer "generating more things".
The priority is:
1. same tonal center
2. same hook family
3. fewer layers
4. less palette change between sections
5. getting structurally closer to `ejemplo.mp3`
---
## Validation Report - 2026-03-30
**Status**: PARTIAL / NEEDS FIXES
**Report**: `docs/VALIDATION_REPORT_EJEMPLO_2026-03-30.md`
### Quick Summary
- ✅ Ableton running (port 9877 active)
- ✅ Micro-stem analysis working (33 sections, ss_rnbl dominant)
- ✅ Sample library indexed (510 samples)
- ✅ Track generation started (95 BPM, Dm key)
- ❌ Generation timeout (300s limit reached)
- ❌ Track budget exceeded (165 tracks vs 12 limit)
- ❌ Key mismatch (generation: Dm, reference: Am)
- ⚠️ Coherence metrics unavailable (manifest not generated)
### Critical Issues to Fix
1. **Timeout**: Generation must complete in under 300s, or the limit must be raised
2. **Budget**: Must enforce 12-track limit, remove redundant layers
3. **Key**: Force match with reference (Am not Dm)
4. **Manifest**: Capture coherence scores for validation
### What Passed
- Micro-stem extraction & analysis
- Sample matching by pack-family (ss_rnbl)
- Section-aware generation (intro/build/break/outro)
- Bus routing (DRUM/BASS/MUSIC/VOCAL/FX)
- Audio materialization in Ableton
### What Failed
- Full generation completion
- Track budget compliance
- Key consistency with reference
- Coherence metric capture
### User Action Required
1. Listen to generated track in Ableton
2. Rate: similarity to ejemplo.mp3 (1-10)
3. Verify: bass-chords-lead alignment
4. Confirm: pack coherence (do sounds feel related?)
### Next Technical Sprint
1. Fix generation timeout (optimize or extend)
2. Enforce 12-track budget strictly
3. Lock generation key to reference key
4. Re-run smoke test with `--use-track --genre reggaeton --bpm 95`
5. Validate coherence metrics from manifest
---
## Validation Report v0.1.9 - 2026-03-30
**Status**: FAIL
**Validation Files**:
- `temp/v019_reference_locked_generation.json`
- `temp/v019_runtime_summary.json`
- `temp/smoke_report_reggaeton.json`
### Executive Summary
The generation ran but ended with CRITICAL failures. Ableton was running and accepting commands, but reference-based generation was NOT possible because `smoke_test_async.py` lacks `--reference` support. Track budget enforcement failed catastrophically (100 tracks vs 16 limit). The job timed out after 180s.
### What Passed ✓
- [x] Ableton running (port 9877 LISTENING)
- [x] Remote Script responding to commands
- [x] Connection check passed (95 BPM, 72 tracks, 6 scenes)
- [x] Async job launched successfully (job_id=e3fb72575548)
- [x] Bus configuration created (DRUM, BASS, MUSIC, VOCAL, FX)
- [x] Pack coherence maintained (SentimientoLatino2025 dominant)
- [x] Sample selection working (17 samples used from reggaeton library)
### What Failed ✗
- [ ] Reference file usage (`smoke_test_async.py` lacks `--reference` flag)
- [ ] Track budget enforcement (100 tracks vs 16 limit = 525% over budget)
- [ ] Generation completion (timeout at 180s, stage=generating_config)
- [ ] MIDI hook creation (not present in output)
- [ ] Manifest persistence (not saved for this session)
- [ ] Audio resample generation (system errors on file operations)
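
The "over budget" figures in these reports follow from (created - limit) / limit. A small helper makes the arithmetic reproducible; `over_budget_pct` is a hypothetical name used only for this illustration.

```python
def over_budget_pct(created: int, limit: int) -> float:
    """Percentage by which `created` exceeds `limit` (negative if under budget)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (created - limit) / limit * 100.0
```

With 100 tracks against a limit of 16 this gives 525.0, matching the figure above.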
### Critical Issues Found
1. **Budget Enforcement Broken**: Planned 15 tracks, created 100 tracks
2. **Missing Reference Support**: Cannot test reference-locked generation via smoke test
3. **Timeout Too Aggressive**: 180s insufficient for full materialization
4. **Audio Resample Failures**: Cannot generate derived FX layers
5. **Manifest Not Updated**: No record of this generation in manifests DB
### Evidence Summary
- **Tracks Created**: 100 (51 midi, 49 audio) vs budget of 16
- **Generation Time**: 183.54s before timeout
- **Key Used**: Am
- **Tempo**: 95 BPM requested, but the manifest shows 99 BPM from an earlier run
- **Dominant Pack**: SentimientoLatino2025 (reggaeton)
- **Sample Usage**: 17 samples from reggaeton library
### Verdict: FAIL
**Primary Blockers**:
1. Cannot test reference-locked generation without `--reference` support
2. Track budget enforcement completely failed
3. Job timeout prevents full validation
### Next Sprint Recommendations
1. **URGENT**: Add `--reference` flag to `smoke_test_async.py` OR use `reference_listener_test.py`
2. **CRITICAL**: Fix track budget enforcement - 100 tracks is unacceptable
3. **HIGH**: Extend timeout or optimize generation speed
4. **MEDIUM**: Fix audio resample system errors
5. **MEDIUM**: Ensure manifest is saved even on timeout
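
For the last recommendation, a `try`/`finally` wrapper is one way to guarantee that a manifest reaches disk even when the job times out. This is a minimal sketch under stated assumptions: `run_with_manifest`, the `status`/`tracks` keys, and the job signature are hypothetical and do not describe the real manifest schema.

```python
import json
import os
from typing import Callable, Dict


def run_with_manifest(job: Callable[[Dict], None], manifest_path: str) -> Dict:
    """Run `job`, then write the manifest to disk whether or not it finished."""
    manifest: Dict = {"status": "started", "tracks": []}  # hypothetical schema
    try:
        job(manifest)  # the job records its progress in the manifest as it goes
        manifest["status"] = "completed"
    except TimeoutError:
        manifest["status"] = "timeout"
        raise
    except Exception:
        manifest["status"] = "failed"
        raise
    finally:
        # Write to a temp file first so a crash never leaves a half-written manifest.
        tmp_path = manifest_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as fh:
            json.dump(manifest, fh, indent=2)
        os.replace(tmp_path, manifest_path)
    return manifest
```

The exception is re-raised after the status is recorded, so the caller still sees the timeout while the partial manifest remains available for auditing.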
### User Action Required
1. Listen to generated track in Ableton (100 tracks present)
2. Evaluate if budget enforcement failure is audible (overcrowding)
3. Manually test reference-based generation using `reference_listener_test.py`
4. Report: Does the track sound cohesive despite technical failures?
---
## End-to-End Validation Report v0.1.12 - COMPLETED - 2026-04-01
**Status**: COMPLETED - Validation Executed, Issues Found
**Validation File**: `docs/SPRINT_v0.1.12_VALIDATION_REPORT.md`
**Session ID**: 7f8c9243285a
### Executive Summary
Sprint v0.1.12 end-to-end validation completed successfully. MCP connection restored by restarting Ableton. Validation revealed budget enforcement gaps and MIDI hook materialization failure, but core JOINT_SCORE and harmonic coherence systems are working as designed.
### What Passed ✓
- [x] MCP connection restored (port 9877 LISTENING)
- [x] Async job completed (279s, 94 polls)
- [x] Reference analysis working (ejemplo.mp3: Am, 99.384 BPM, ss_rnbl family)
- [x] JOINT_SCORE governing selection (palette-41 score=28.106)
- [x] Harmonic family lock working (pluck across all phrases)
- [x] Harmonic hints propagated through all 4 function levels
- [x] 35 tracks created in Ableton (17 MIDI, 18 audio)
- [x] Coherence report generated (score: 4.3/10)
- [x] Smoke test 6/6 passed
### Critical Issues Found ✗
- [ ] **Budget exceeded**: 35 tracks vs 16 limit (119% over)
- [ ] **MIDI hook failed**: Could not materialize due to budget limit
- [ ] **Pack coherence low**: 12% from dominant pack vs 60% target
- [ ] **Duplicate resample layers**: REVERSE FX and RISER created twice
- [ ] **Selection reasons missing**: Layer-level audit not in manifest
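
The pack coherence figure can be checked mechanically once each selected sample is tagged with its pack. The sketch below assumes "dominant" means the most frequent pack in the selection, which may differ from how the project computes it; `dominant_pack_ratio` and `meets_pack_target` are hypothetical helpers, and only the 60% target comes from this report.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def dominant_pack_ratio(packs: Sequence[str]) -> Tuple[Optional[str], float]:
    """Return (most common pack, its share of all selected samples)."""
    if not packs:
        return None, 0.0
    name, count = Counter(packs).most_common(1)[0]
    return name, count / len(packs)


def meets_pack_target(packs: Sequence[str], target: float = 0.60) -> bool:
    """True when the dominant pack reaches the target share (60% in this report)."""
    return dominant_pack_ratio(packs)[1] >= target
```

If the intended measure is the share of a specific reference pack rather than the most frequent one, the function would need that pack name as an input instead.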
### Key Evidence
```
PRIMARY_FAMILY_FROM_REFERENCE: pluck -> pluck
FAMILY_LOCK: Primary family set to pluck
FAMILY_COHERENT: All 7 phrases use pluck
BUDGET_COMPLETE: 5/12 tracks used (budget layer)
TRACK_CREATED: 15/16 tracks (blueprint layer)
Materialization complete: 16 tracks created, 2 errors (hard limit)
[MIDI_HOOK_ERROR] Failed to materialize: Hard budget limit reached
```
### Root Cause Analysis
The issue is **not** the scoring logic (JOINT_SCORE works correctly). The issue is the **materialization phase** creating tracks before budget enforcement:
1. Blueprint phase: Creates 15 MIDI tracks (before budget check)
2. Budget check: Hard limit at 16 prevents audio layers 16-17
3. Result: 35 tracks total (15 MIDI + 18 audio + duplicates)
### Sprint v0.1.12 Completion
- ✅ Task 1: JOINT_SCORE governs selection - COMPLETE
- ✅ Task 2: Harmonic coherence contract - COMPLETE
- ⚠️ Task 3: Selection reasons in manifest - PARTIAL (pack-level only)
- ✅ Task 4: Coherence tests - COMPLETE (11 tests)
- ⚠️ Task 5: End-to-end validation - COMPLETE with issues documented
### Next Sprint Priorities (v0.1.13)
1. **Budget-first architecture**: Check budget BEFORE creating any tracks
2. **Hook prioritization**: Reserve slot for MIDI hook before budget fills
3. **Manifest audit trail**: Persist layer-level selection reasons
4. **Pack coherence enforcement**: Select from detected dominant pack (ss_rnbl)
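
Priorities 1 and 2 can be combined in a planning step that runs before any track is created: reserve the hook's slot first, then fill the remaining budget in priority order. The sketch below is one possible shape, not the project's implementation. `plan_tracks` and the role names are hypothetical; the default budget of 16 is the hard limit cited in this report.

```python
from typing import List, Sequence, Tuple


def plan_tracks(
    requested: Sequence[str],
    budget: int = 16,
    reserved: Sequence[str] = ("midi_hook",),  # hypothetical role name
) -> Tuple[List[str], List[str]]:
    """Split requested roles into (accepted, rejected) before creating any track.

    `requested` is in priority order. Roles listed in `reserved` get their
    slot first, so they cannot be crowded out by earlier layers.
    """
    accepted = [role for role in reserved if role in requested][:budget]
    pending_reserved = set(accepted)
    rejected: List[str] = []
    for role in requested:
        if role in pending_reserved:
            pending_reserved.discard(role)  # first occurrence already holds a slot
            continue
        if len(accepted) < budget:
            accepted.append(role)
        else:
            rejected.append(role)
    return accepted, rejected
```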
---
## Previous Validation Report v0.1.2 - 2026-04-01 (SUPERSEDED)
**Status**: BLOCKED - MCP Connection Failure (RESOLVED)
*This section kept for historical reference. Connection issue was resolved by restarting Ableton Live.*
---