# Changelog

All notable changes to CallProxy will be documented in this file.

> **Bubble quality tracking** → see [docs/BUBBLE-HISTORY.md](docs/BUBBLE-HISTORY.md)
## [0.9.0] - 2026-03-22

**Milestone: Unified Stable Voice Engines** | All three engines (Gemini+Twilio, Retell, Vapi) produce correct transcripts
### Fixed — Vapi Engine

- Eliminated duplicate transcript bubbles — `conversation-update` is now the single source of truth; individual `transcript` webhook events are skipped when conversation tracking is active
- Strengthened dedup logic — `_emitTranscript()` now checks ALL previously sent entries (not just the last one) to prevent duplicates from overlapping event paths
- Final transcript guard — `_processFinalTranscript()` correctly skips entries that were already sent during the call
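The strengthened dedup can be sketched roughly as follows. This is an illustrative reconstruction, not the actual CallProxy code: the `_sentEntries` field, the `emit` callback, and the equality check on `role`/`text` are assumptions about how `_emitTranscript()` might track what was already delivered.

```javascript
// Hypothetical sketch: before emitting a transcript entry, compare it against
// every entry already sent this call (not just the last one), so duplicates
// arriving via a different event path are dropped.
class TranscriptEmitter {
  constructor(emit) {
    this.emit = emit;          // callback that pushes a bubble to the UI
    this._sentEntries = [];    // everything already delivered this call
  }

  _isDuplicate(entry) {
    return this._sentEntries.some(
      (sent) => sent.role === entry.role && sent.text === entry.text
    );
  }

  _emitTranscript(entry) {
    if (this._isDuplicate(entry)) return false; // overlapping event path — skip
    this._sentEntries.push(entry);
    this.emit(entry);
    return true;
  }
}
```

Checking the full history instead of only the last entry is what closes the gap when two event paths (e.g. `conversation-update` and a late `transcript` event) interleave.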
### Fixed — Retell Engine

- Prevented webhook/WS transcript collision — the `transcript_updated` webhook now only pushes entries when the Custom LLM WebSocket hasn't delivered any (the WS has been the primary source since v0.7.10)
- `call_ended` webhook guard — already had the `lastTranscriptLength === 0` check (confirmed correct)
### Verified — Gemini+Twilio Engine

- Turn-based transcript (delta → turn_complete → final bubble) — no changes needed, already stable
- Briefing data (collectedData, conversationStrategy, researchContext, objective) passes through correctly
### Architecture

Transcript source hierarchy per engine:

- Gemini+Twilio: turn-based events (no dedup needed)
- Retell: Custom LLM WebSocket → webhook fallback → API polling fallback
- Vapi: conversation-update → transcript event fallback → final transcript fallback

Each fallback only activates when the primary source hasn't delivered data.
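The activation rule above can be expressed as a small guard. This is a minimal sketch under stated assumptions — the counter and method names are illustrative, not the real implementation:

```javascript
// Illustrative sketch of the per-engine source hierarchy: a fallback source
// only contributes entries when the higher-priority source has stayed silent.
class TranscriptSources {
  constructor() {
    this.deliveredByPrimary = 0; // entries delivered by the primary source
  }

  onPrimary(entries) {
    this.deliveredByPrimary += entries.length;
    return entries; // primary always wins
  }

  onFallback(entries) {
    // Fallback activates only when the primary source delivered nothing
    if (this.deliveredByPrimary > 0) return [];
    return entries;
  }
}
```

The same guard shape covers both the Retell webhook-vs-WS collision fix and the Vapi `transcript`-event skip: whichever path is primary for that engine increments the counter, and everything below it defers.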
## [0.7.14] - 2026-03-20 22:15

**Bubbles: ✅ WORKING** | Source: Custom LLM WS
### Fixed

- Agent bubbles appear only after Retell confirms speech (no phantom bubbles)
- `lastTranscriptShown` is tracked to prevent duplicates during the call
- All transcript entries (agent + user) come from the `response_required` event
### Known Issue

- After the call ends (receptionist hangs up), bubbles re-arrive and mix with existing ones
- Root cause: likely the final transcript fetch duplicates already-shown entries
### Test

- Transcript `0d4551d9` — clean 12-entry dialog, no dupes, no garbage
## [0.7.13] - 2026-03-20 22:00

**Bubbles: ❌ fragments** | Source: Custom LLM WS

### Changed

- Show only finalized bubbles (no partials) — but agent fragments got truncated to single words
## [0.7.12] - 2026-03-20 21:53

### Fixed

- Single transcript source: Custom LLM WS only (removed polling/webhook for transcript)
## [0.7.10] - 2026-03-20 21:45

### Added

- Retell Custom LLM WebSocket integration — real-time transcript via WS instead of polling
## [0.7.05] - 2026-03-20 20:07

### Fixed

- Always fetch the final transcript on endCall
- Dynamic version display everywhere
## [0.6.0] - 2026-03-20

### Added

- Retell AI Voice Engine — full integration with Retell AI as an alternative voice stack
  - Dynamic LLM + agent creation per call
  - Phone calls via the Retell API with SIP trunking through Twilio
  - Webhook handling for `call_started`, `call_ended`, `call_analyzed` events
  - Polling for real-time call status updates
  - Final transcript retrieval after the call ends
- SIP Trunking — Twilio Elastic SIP Trunk configured for Retell AI
  - Retell IP whitelist (18.98.16.120/30)
  - Origination URI for inbound (sip:sip.retellai.com)
  - Phone number +14587777791 imported to Retell via SIP trunking
  - Enables calling international numbers (not just US)
- Claude 4.6 Sonnet as the default LLM for the Retell voice engine
- Live transcript in UI — call transcripts pushed to the browser during Retell calls
  - WS stays alive during the call phase (`call_started` event + reconnect)
  - Normalized transcript format (role/text ↔ speaker/original)
  - Final transcript fetch with retry after the call ends
- Number pronunciation — prompt instructs the AI to dictate numbers in pairs (48, 34, 11) instead of large numbers
- Changelog — version link in UI opens the changelog
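The transcript normalization mentioned above (role/text ↔ speaker/original) might look like this. The field names on both sides are assumptions inferred from that note, not the exact Retell or CallProxy schemas:

```javascript
// Hedged sketch: convert a Retell-style { role, text } transcript entry into
// the { speaker, original } shape the Gemini+Twilio UI path already uses.
// Field names are assumptions based on the "role/text ↔ speaker/original" note.
function normalizeEntry(entry) {
  if ('speaker' in entry) return entry; // already in the UI shape
  return {
    speaker: entry.role === 'agent' ? 'agent' : 'user',
    original: entry.text,
  };
}
```

Normalizing at the boundary lets the UI render bubbles identically regardless of which engine produced the entry.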
### Fixed

- Retell API endpoint paths (mixed v2/non-v2 across endpoints)
- WS disconnect during Retell calls (missing `call_started` handler in UI)
- Transcript format mismatch between the Retell and Gemini+Twilio engines
- Session cleanup race condition (polling vs webhook vs endCall)
- SIP trunk auth error (removed the credential list requirement, IP ACL only)

### Changed

- Retell LLM model updated from gpt-4o to claude-4.6-sonnet
- RETELL_FROM_NUMBER switched from the Retell-purchased number to the Twilio SIP trunk number
## [0.5.0] - 2026-03-20

### Added

- VoiceEngine abstraction — pluggable voice engine architecture
  - Base class with startCall/endCall/sendUserAudio/injectText interface
  - Factory pattern for engine selection (gemini-twilio, retell, vapi)
  - Per-call engine creation with voice stack selector
- Setup panel with 5 configuration toggles
  - Languages (I speak / Call in)
  - Briefing mode (Pre-filled / Empty)
  - Call Mode (AI Test / I Test / Real Call)
  - Call Channel (Browser / Test Phone / Real Phone)
  - Voice Stack (Gemini+Twilio / Retell AI / Vapi)
  - localStorage persistence for all settings
- Call recording — CallRecorder for real calls (inbound/outbound/mixed WAV)
- Transcript saving to data/transcripts/{sessionId}.json
- Retell AI account registered (nina@bookly.net), phone number purchased
- Vapi AI account registered (nina@bookly.net)
- Receiver.html full refactor — 3 screens (waiting/incoming/active), dark theme
- 4-Agent browser mode — Proxy + Oxy + Receptionist + User in browser
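The VoiceEngine abstraction and factory can be sketched as below. This is a minimal illustration of the interface named above; the stub engine classes and the `ENGINES` map are assumptions, not the actual CallProxy classes:

```javascript
// Minimal sketch of the pluggable VoiceEngine architecture: a base class
// exposing startCall/endCall/sendUserAudio/injectText, plus a factory keyed
// by the voice stack selector. Concrete engines here are illustrative stubs.
class VoiceEngine {
  async startCall(opts) { throw new Error('not implemented'); }
  async endCall() { throw new Error('not implemented'); }
  sendUserAudio(chunk) { throw new Error('not implemented'); }
  injectText(text) { throw new Error('not implemented'); }
}

class GeminiTwilioEngine extends VoiceEngine {}
class RetellEngine extends VoiceEngine {}
class VapiEngine extends VoiceEngine {}

const ENGINES = {
  'gemini-twilio': GeminiTwilioEngine,
  retell: RetellEngine,
  vapi: VapiEngine,
};

// One engine instance per call, selected by the Voice Stack toggle
function createVoiceEngine(stack) {
  const Engine = ENGINES[stack];
  if (!Engine) throw new Error(`Unknown voice stack: ${stack}`);
  return new Engine();
}
```

Creating a fresh engine per call (rather than a shared singleton) keeps per-call state like transcripts and dedup history isolated between sessions.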
### Fixed

- Phone number cleaning (strips text like "(Sanitas)" from numbers)
- Auto-sync between Call Mode and Call Channel
- Real Call mode forces the Real Phone channel
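The phone-number cleaning fix might be implemented along these lines. The function name and the exact rules are assumptions — the changelog only says annotations like "(Sanitas)" are stripped:

```javascript
// Illustrative sketch: strip annotations such as "(Sanitas)" and any other
// non-dial characters from a number, preserving a leading "+".
function cleanPhoneNumber(raw) {
  const hasPlus = raw.trim().startsWith('+');
  const digits = raw.replace(/\D/g, ''); // drop everything but digits
  return (hasPlus ? '+' : '') + digits;
}
```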
## [0.4.0] - 2026-03-19

### Added

- Initial voice briefing with Gemini Live
- Case context system with pre-loaded facts
- Gemini+Twilio voice engine for real phone calls
- Browser-based call mode with WebRTC
## [0.6.1] - 2026-03-20

### Added

- Real-time transcript via `transcript_updated` webhook — bubbles appear during the call
- Ask User detection — the `[ВОПРОС]` pattern in agent speech triggers the ask_user UI prompt

### Fixed

- Event listeners kept alive until `endCall()` completes (the transcript was previously lost)
- Server crash from escaped backticks in the changelog route
- Debug logging added for when the WS is dead during a transcript push