Building a Chord Progression Assistant: An AI Development Journey
When I set out to build a chord progression assistant, I had a simple goal: help musicians discover chords that fit in their progressions using AI. What started as a proof of concept project evolved into a tool I've found myself turning to while writing music.
In an afternoon, I built a chord progression assistant that suggests harmonically valid, emotionally targeted chords. The stack is FastAPI + Nginx + HAProxy on Docker Compose. I started with a locally hosted Mistral LLM (25–30s latency) and migrated to OpenAI (~5s). Instead of a bespoke music-theory engine, I embedded validation, key detection, and intent-guided suggestions in prompts with JSON outputs. The result is an API and a simple UI that feel creative and fast, with clear fallbacks.
Using AI to Make AI
From the beginning, this was developed using AI tools. I used opencode to give plaintext instructions to AI coding agents, with a mix of Claude Sonnet (primary), ChatGPT (typically playing the senior developer giving feedback on the Claude model's work), and Grok Code (currently free, used when API usage limits blocked development). A few times I fired up VS Code to manually write a critical function, tweak prompts endlessly, or configure the environment. But 90% of the code, frontend and backend, was written by agentic AI.
The Foundation: Architecture First
The first major decision was choosing the right stack. I went with FastAPI for the backend: it's fast, has excellent auto-documentation (Swagger), and handles async operations beautifully. For the frontend, I deliberately chose vanilla CSS and JavaScript over framework complexity. Sometimes the simplest solution is the best solution, and a static HTML/CSS/JS frontend served by Nginx fit the bill perfectly.
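To give a sense of the shape, here's a minimal sketch of what the suggestion endpoint could look like; the route, the request model, and the get_ai_suggestions helper are illustrative stand-ins rather than the project's actual code. FastAPI serves the Swagger UI at /docs with no extra work.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Chord Progression Assistant")

class SuggestionRequest(BaseModel):
    progression: list[str]      # e.g. ["C", "Am7", "F"]
    intent: str = "continue"    # what the user wants the next chord to do

async def get_ai_suggestions(progression: list[str], intent: str) -> dict:
    # placeholder for the LLM call sketched later in the article
    return {"detected_key": "C major", "suggestions": []}

@app.post("/api/suggestions")
async def suggest_chords(req: SuggestionRequest) -> dict:
    # async all the way down, so one slow AI call doesn't block other requests
    return await get_ai_suggestions(req.progression, req.intent)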
The containerization strategy used Docker Compose with three services: a FastAPI backend, an Nginx frontend server, and HAProxy handling SSL/TLS while routing API calls to the backend and static requests to the frontend.
The Music Theory Challenge
One of the early challenges was handling music theory. Musicians use chord notation that looks like hieroglyphics to laypeople: C#add9b5, Am7/G, F#aug... and detecting the musical key from a chord progression requires understanding harmonic relationships. I initially implemented simple regex parsing as an MVP, but the real breakthrough came when I embedded this logic into the AI prompts rather than building a separate music theory engine. Letting the AI handle the whole analysis meant I could focus on application architecture while getting sophisticated harmonic analysis "for free."
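The regex MVP was roughly this; the pattern below is an illustrative approximation, not the exact one I shipped, and it deliberately accepts more than it should:

import re

# Rough chord-symbol matcher: root note, optional accidental, optional
# quality/extension, optional alterations, optional slash bass
# (e.g. C#add9b5, Am7/G, F#aug).
CHORD_RE = re.compile(
    r"^[A-G][#b]?"                      # root
    r"(maj|min|m|dim|aug|sus[24]?)?"    # quality
    r"(\d+)?"                           # extension (7, 9, 11, 13)
    r"(add\d+|b5|#5|b9|#9|#11|b13)*"    # alterations
    r"(/[A-G][#b]?)?$"                  # slash bass
)

def parse_progression(text: str) -> list[str]:
    tokens = re.split(r"[,\s]+", text.strip())
    return [t for t in tokens if t and CHORD_RE.match(t)]

# parse_progression("C Am7/G F#aug") -> ["C", "Am7/G", "F#aug"]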
AI rEvolution: From Local to Cloud
The most significant architectural change was migrating from a local Ollama server to OpenAI's GPT-4o mini. Initially, I built around a locally hosted AI model, Mistral 7B, running on my Nvidia Tesla P4. Performance was less than ideal, taking 25–30 seconds to return results. I made several attempts to optimize: a "model warming" service that periodically pinged the local server to keep the model loaded in memory, avoiding the delay of loading it into VRAM; quantized models to lower VRAM usage; and other models entirely (Qwen2.5-3B-Instruct), all to no avail. The Tesla P4 just didn't have the processing power to generate tokens fast enough.
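The warming service amounted to a loop pinging the local server; a minimal sketch, assuming Ollama's default port and its /api/generate endpoint with the keep_alive option (the interval and model name are illustrative):

import time
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def keep_model_warm(model: str = "mistral", interval_s: int = 240) -> None:
    """Periodically send a tiny request so the model stays loaded in VRAM."""
    while True:
        try:
            requests.post(
                OLLAMA_URL,
                json={"model": model, "prompt": "", "keep_alive": "10m"},
                timeout=60,
            )
        except requests.RequestException:
            pass  # the warmer is best-effort; real requests still go through
        time.sleep(interval_s)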
The migration to OpenAI transformed the entire user experience. Response times dropped to ~5 seconds, and I could remove significant infrastructure complexity. The migration preserved all existing functionality while eliminating management overhead. And I did it with one prompt in opencode and about 5 minutes of thinking: "Switch from using ollama to OpenAI APIs, add configuration for the openai token, and ensure the spirit of the prompt is maintained when converting"
Extensive testing determined that GPT-4o-mini just didn't understand music theory well enough. Switching to GPT-4.1 solved this while keeping costs around $0.01 per request.
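The backend call itself is a thin wrapper around the Chat Completions API. Here's a hedged sketch of what it can look like with the official Python SDK; the prompt text, the function name (carried over from the earlier sketch), and the use of response_format are illustrative rather than my exact implementation:

import json
from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def get_ai_suggestions(progression: list[str], intent: str) -> dict:
    response = await client.chat.completions.create(
        model="gpt-4.1",
        response_format={"type": "json_object"},  # ask for machine-parseable JSON
        messages=[
            {"role": "system", "content": "You are a music theory assistant. ..."},
            {"role": "user", "content": f"Progression: {' '.join(progression)}\nIntent: {intent}"},
        ],
    )
    return json.loads(response.choices[0].message.content)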
User Experience Iterations
The frontend went through several major iterations. The original system used technical music theory terms like "tonic," "subdominant," and "dominant" to describe chord functions. I found these uninspiring.
I replaced this with an emotional tagging system using three categories:
- Mood: nostalgic, hopeful, melancholic, uplifting, mysterious, dreamy
- Energy: flowing, driving, floating, grounded, suspended, resolved
- Color: warm, bright, dark, crystalline, lush, sparse
This transformation changed the tool from a technical music theory calculator into an inspirational creative assistant. Musicians could now understand not just what chord to play, but why it would create the feeling they wanted.
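One way to keep those tags consistent from request to request is to pin the vocabulary in the prompt instead of leaving it to the model's imagination; a small sketch of that idea (illustrative, not my exact prompt text):

MOODS  = ["nostalgic", "hopeful", "melancholic", "uplifting", "mysterious", "dreamy"]
ENERGY = ["flowing", "driving", "floating", "grounded", "suspended", "resolved"]
COLORS = ["warm", "bright", "dark", "crystalline", "lush", "sparse"]

TAGGING_RULES = (
    f"Describe each suggestion with one mood from {MOODS}, "
    f"one energy from {ENERGY}, and one color from {COLORS}."
)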
Later, I added charts that show how to play the suggested chords on a standard-tuned guitar, via a JavaScript helper from scales-chords.com. Clicking a chart takes you to their site, which lists a variety of fingerings for each chord.
The Streaming Experiment
One interesting detour was implementing Server-Sent Events for real-time chord suggestions. Waiting 30+ seconds for batch results from the local AI model was a terrible user experience, so I built progressive loading where suggestions appeared one by one as they were generated.
This required extensive prompt engineering to ensure each suggestion excluded previously suggested chords within the same request, and frontend JavaScript to handle streaming data with graceful fallbacks. The implementation worked beautifully, until the OpenAI migration made it obsolete overnight. Suddenly, batch requests for 5 chords took 5 seconds instead of 30, making streaming unnecessary.
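For the record, the retired streaming variant looked roughly like this; the endpoint path and the get_one_suggestion helper are illustrative stand-ins, not the code as shipped:

import json
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def get_one_suggestion(progression: str, exclude: list[str]) -> dict:
    # placeholder standing in for a single-chord LLM call
    return {"chord": "Am7", "explanation": "placeholder", "tension_level": "low"}

@app.get("/api/suggestions/stream")
async def stream_suggestions(progression: str, count: int = 5):
    async def event_stream():
        already_suggested: list[str] = []
        for _ in range(count):
            # one AI call per suggestion, telling the model which chords to avoid
            suggestion = await get_one_suggestion(progression, exclude=already_suggested)
            already_suggested.append(suggestion["chord"])
            yield f"data: {json.dumps(suggestion)}\n\n"   # one SSE frame per suggestion
        yield "event: done\ndata: {}\n\n"
    return StreamingResponse(event_stream(), media_type="text/event-stream")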
The lesson: decisions that solve current problems may become technical debt when underlying constraints change.
AI Prompt Engineering as Core Architecture
One of the most interesting aspects was how prompt engineering became a core architectural component. Rather than building separate services for key detection, chord validation, and music theory analysis, I embedded all this logic into carefully structured AI prompts.
The system validates input using AI rather than regex patterns:
FIRST, validate that the input contains actual chord symbols...
If the input does NOT contain recognizable chord symbols, return this exact JSON:
{"error": "invalid_input", "message": "..."}
This approach handles edge cases that traditional parsing missed, while providing contextual error messages that help users understand what went wrong.
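On the backend, that contract turns error handling into a simple branch on the parsed response. A sketch of the idea; the HTTP status mapping is my illustration rather than a documented choice:

from fastapi import HTTPException

def handle_ai_response(parsed: dict) -> dict:
    # The prompt guarantees either an error object or a suggestions payload.
    if parsed.get("error") == "invalid_input":
        # Surface the model's contextual message directly to the user.
        raise HTTPException(
            status_code=422,
            detail=parsed.get("message", "Input was not a chord progression."),
        )
    return parsed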
The prompt instructs the AI model to respond in a specific, bespoke JSON format, something like this:
{
  "detected_key": "key name (e.g., C major, A minor)",
  "confidence": 0.8,
  "key_analysis": "Brief explanation of why this key was chosen",
  "suggestions": [
    {
      "chord": "chord symbol",
      "explanation": "3–4 sentence explanation of why and how the chord fits",
      "mood": "1 word description of the mood this chord presents within the progression",
      "energy": "1 word description of the energy this chord gives within the progression",
      "color": "1 word description of the color/vibe this chord adds to the progression",
      "tension_level": "low|medium|high"
    }
  ]
}
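Because the format is fixed, the backend can validate the model's reply before anything reaches the UI. Here's a sketch using Pydantic models that mirror the schema above; the class names are mine:

from typing import Literal
from pydantic import BaseModel

class ChordSuggestion(BaseModel):
    chord: str
    explanation: str
    mood: str
    energy: str
    color: str
    tension_level: Literal["low", "medium", "high"]

class SuggestionResponse(BaseModel):
    detected_key: str
    confidence: float
    key_analysis: str
    suggestions: list[ChordSuggestion]

# result = SuggestionResponse.model_validate(json.loads(raw_ai_output))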
From Predictive AI to Creative Assistant
The system evolved from simple "next chord" suggestions to intent-driven recommendations. Users can specify whether they want to:
- Continue the progression naturally
- Create resolution and closure
- Add harmonic tension or color
- Brighten or darken the mood
- Build turnarounds back to the beginning
Each intent triggers different AI guidance, transforming a generic suggestion engine into a targeted creative tool that understands musical goals.
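Under the hood, each intent can simply map to a short guidance clause appended to the prompt. A sketch of the idea; the wording below is illustrative, not my production prompt:

INTENT_GUIDANCE = {
    "continue":   "Suggest chords that extend the progression naturally.",
    "resolve":    "Favor cadential moves that create resolution and closure.",
    "tension":    "Favor chords that add harmonic tension or color.",
    "brighten":   "Lean toward major or added-tone brightness.",
    "darken":     "Lean toward minor chords, modal mixture, or darker colors.",
    "turnaround": "Build a turnaround that leads back to the first chord.",
}

def build_prompt(progression: list[str], intent: str) -> str:
    base = f"The user's progression is: {' '.join(progression)}.\n"
    return base + INTENT_GUIDANCE.get(intent, INTENT_GUIDANCE["continue"])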
The End Result
What emerged is a comprehensive chord progression assistant: AI integration with proper error handling, a responsive user interface, and decent performance.
More importantly, it showcases how AI can be embedded into domain-specific applications: not as a chatbot or generic assistant, but as a specialized tool that understands context, validates input, and provides structured, actionable output.
The final architecture is surprisingly simple: a stateless API that transforms musical input into AI-powered suggestions, served through a clean web interface with sensible defaults and powerful customization options. Sometimes the best software is the kind that solves real problems without getting in the way. I use this on my phone with my guitar on my shoulder.
For developers building AI-integrated applications, the key insight is treating AI as just another infrastructure component: powerful enough to replace hundreds of lines of code, but requiring the same attention to error handling, performance monitoring, and graceful degradation as any other external service. The magic isn't in the AI itself, but in how thoughtfully you prompt it and integrate it into solutions that genuinely help users accomplish their goals.
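Concretely, that means wrapping the AI call the way you'd wrap any flaky upstream dependency. A sketch of the pattern; the retry count, logging, and fallback payload are illustrative choices:

import logging
from typing import Any, Awaitable, Callable

logger = logging.getLogger("chord-assistant")

FALLBACK: dict[str, Any] = {
    "detected_key": "unknown",
    "confidence": 0.0,
    "key_analysis": "AI service unavailable; showing generic suggestions.",
    "suggestions": [{"chord": "C", "explanation": "Safe default while the AI is down.",
                     "mood": "neutral", "energy": "grounded", "color": "warm",
                     "tension_level": "low"}],
}

async def suggestions_with_fallback(
    ai_call: Callable[[list[str], str], Awaitable[dict]],  # e.g. the get_ai_suggestions sketch above
    progression: list[str],
    intent: str,
    retries: int = 2,
) -> dict:
    for attempt in range(retries):
        try:
            return await ai_call(progression, intent)
        except Exception as exc:  # timeouts, rate limits, malformed JSON, ...
            logger.warning("AI call failed (attempt %d): %s", attempt + 1, exc)
    return FALLBACK  # degrade gracefully instead of surfacing a 500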
Lessons in AI-First Development
Building this application taught me several lessons about AI-first software development:
- Embed intelligence in prompts: Complex logic often fits in AI prompts rather than traditional algorithms
- Design for AI failure: Fallback mechanisms are essential. AI services can be slow, expensive, or unavailable; falling back to local AI or sensible default recommendations keeps the tool usable.
- Structured outputs are crucial: JSON structure and careful prompt engineering ensure reliable data formats
- Performance vs. cost trade-offs: Local AI means control but operational overhead; cloud AI means simplicity but ongoing costs