Changelog

All notable changes to CallProxy will be documented in this file.
> **Bubble quality tracking** → see [docs/BUBBLE-HISTORY.md](docs/BUBBLE-HISTORY.md)

[0.9.0] - 2026-03-22

**Milestone: Unified Stable Voice Engines** | All three engines (Gemini+Twilio, Retell, Vapi) produce correct transcripts

Fixed — Vapi Engine

Eliminated duplicate transcript bubbles — `conversation-update` is now the single source of truth; individual `transcript` webhook events are skipped when conversation tracking is active

Strengthened dedup logic — `_emitTranscript()` now checks ALL previously sent entries (not just the last one) to prevent duplicates from overlapping event paths

Final transcript guard — `_processFinalTranscript()` correctly skips when entries were already sent during the call
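The strengthened dedup check can be pictured with a minimal sketch like the one below. It is an illustration only, not the project's actual `_emitTranscript()`: the `TranscriptEmitter` class, the `role|text` key, and the `emit` callback are assumptions made for the example.

```javascript
// Hypothetical sketch: dedup against every entry already sent, not just the last one.
class TranscriptEmitter {
  constructor(emit) {
    this.emit = emit;          // assumed callback that pushes a bubble to the UI
    this.sentKeys = new Set(); // keys of all entries sent so far
  }

  // Returns true if the entry was emitted, false if it was a duplicate.
  emitTranscript(entry) {
    const key = `${entry.role}|${entry.text.trim()}`;
    if (this.sentKeys.has(key)) return false;
    this.sentKeys.add(key);
    this.emit(entry);
    return true;
  }
}

const shown = [];
const emitter = new TranscriptEmitter((entry) => shown.push(entry));
emitter.emitTranscript({ role: "assistant", text: "Hello" });
emitter.emitTranscript({ role: "user", text: "Hi there" });
emitter.emitTranscript({ role: "assistant", text: "Hello" }); // duplicate from an overlapping path, skipped
console.log(shown.length); // 2
```

Keying on role and text also suppresses a genuinely repeated utterance (for example a second "Yes"), so a real implementation may want to add a turn index or timestamp to the key.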

Fixed — Retell Engine

Prevented webhook/WS transcript collision — `transcript_updated` webhook now only pushes entries when Custom LLM WebSocket hasn't delivered any (WS is the primary source since v0.7.10)

`call_ended` webhook guard — the existing `lastTranscriptLength === 0` check was reviewed and confirmed correct; no change needed

Verified — Gemini+Twilio Engine

Turn-based transcript (delta → turn_complete → final bubble) — no changes needed, already stable

Briefing data (collectedData, conversationStrategy, researchContext, objective) passes through correctly
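The turn-based flow (delta → turn_complete → final bubble) might look roughly like the sketch below. The event shapes (`type`, `role`, `text`) and the `TurnAssembler` name are assumptions for illustration; the actual Gemini+Twilio engine code is not shown in this changelog.

```javascript
// Hypothetical sketch: accumulate streaming deltas and emit one bubble per completed turn.
class TurnAssembler {
  constructor() {
    this.buffer = "";  // text of the turn in progress
    this.role = null;  // speaker of the turn in progress
    this.bubbles = []; // finalized bubbles
  }

  handle(event) {
    if (event.type === "delta") {
      this.role = event.role;
      this.buffer += event.text;
    } else if (event.type === "turn_complete") {
      const text = this.buffer.trim();
      if (text) this.bubbles.push({ role: this.role, text });
      this.buffer = "";
      this.role = null;
    }
  }
}

const assembler = new TurnAssembler();
assembler.handle({ type: "delta", role: "agent", text: "Good " });
assembler.handle({ type: "delta", role: "agent", text: "morning." });
assembler.handle({ type: "turn_complete" });
console.log(assembler.bubbles); // [ { role: 'agent', text: 'Good morning.' } ]
```

Because each turn yields exactly one bubble, this shape needs no dedup step, which matches the "no dedup needed" note for this engine.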

Architecture

Transcript source hierarchy per engine:

Gemini+Twilio: turn-based events (no dedup needed)

Retell: Custom LLM WebSocket → webhook fallback → API polling fallback

Vapi: conversation-update → transcript event fallback → final transcript fallback

Each fallback only activates when the primary source hasn't delivered data
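One way to express the fallback rule is a small gate that accepts a source only if no higher-priority source has delivered yet. This is a sketch under assumptions: the source names, `SOURCE_PRIORITY`, and `TranscriptSourceGate` are hypothetical labels for the hierarchy above, not identifiers from the codebase.

```javascript
// Hypothetical sketch: sources are listed in priority order; a fallback is accepted
// only while no higher-priority source has delivered entries.
const SOURCE_PRIORITY = {
  retell: ["custom-llm-ws", "webhook", "api-polling"],
  vapi: ["conversation-update", "transcript-event", "final-transcript"],
};

class TranscriptSourceGate {
  constructor(engine) {
    this.order = SOURCE_PRIORITY[engine];
    this.delivered = new Set(); // sources that have delivered at least once
  }

  // Returns true if entries from `source` should be accepted.
  accept(source) {
    const rank = this.order.indexOf(source);
    if (rank === -1) return false; // unknown source
    const higherDelivered = this.order
      .slice(0, rank)
      .some((s) => this.delivered.has(s));
    if (higherDelivered) return false;
    this.delivered.add(source);
    return true;
  }
}

const gate = new TranscriptSourceGate("retell");
console.log(gate.accept("webhook"));       // true: WS has not delivered yet
console.log(gate.accept("custom-llm-ws")); // true: primary source
console.log(gate.accept("webhook"));       // false: primary has delivered
```

A fallback that fires before the primary source wakes up can still overlap with it, so a gate like this would normally sit alongside entry-level dedup.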

[0.7.14] - 2026-03-20 22:15

**Bubbles: ✅ WORKING** | Source: Custom LLM WS

Fixed

Agent bubbles appear only after Retell confirms speech (no phantom bubbles)

Track `lastTranscriptShown` to prevent duplicates during call

All transcript entries (agent + user) are now taken from the `response_required` event
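A sketch of how `lastTranscriptShown` could prevent duplicates during the call is given below. It assumes the counter holds the number of entries already displayed and that each `response_required` event carries the full transcript so far as an array of `{ role, content }` objects; both points are assumptions, and `RetellTranscriptTracker` is a hypothetical name.

```javascript
// Hypothetical sketch: show only entries not yet displayed, tracked by count.
class RetellTranscriptTracker {
  constructor() {
    this.lastTranscriptShown = 0; // number of entries already shown (assumed meaning)
  }

  // `event.transcript` is assumed to hold the full transcript so far.
  onResponseRequired(event) {
    const fresh = event.transcript.slice(this.lastTranscriptShown);
    this.lastTranscriptShown = event.transcript.length;
    return fresh;
  }
}

const tracker = new RetellTranscriptTracker();
const first = tracker.onResponseRequired({
  transcript: [
    { role: "agent", content: "Hello" },
    { role: "user", content: "Hi" },
  ],
});
const second = tracker.onResponseRequired({
  transcript: [
    { role: "agent", content: "Hello" },
    { role: "user", content: "Hi" },
    { role: "agent", content: "How can I help?" },
  ],
});
console.log(first.length, second.length); // 2 1
```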

Known Issue

After call ends (receptionist hangs up), bubbles re-arrive and mix with existing ones

Likely root cause: the final transcript fetch re-sends entries that were already shown

Test

Transcript `0d4551d9` — clean 12-entry dialog, no dupes, no garbage

[0.7.13] - 2026-03-20 22:00

**Bubbles: ❌ fragments** | Source: Custom LLM WS

Changed

Show only finalized bubbles (no partials) — side effect: agent utterances were truncated to single-word fragments

[0.7.12] - 2026-03-20 21:53

Fixed

Single transcript source: Custom LLM WS only (removed polling/webhook for transcript)

[0.7.10] - 2026-03-20 21:45

Added

Retell Custom LLM WebSocket integration — real-time transcript via WS instead of polling

[0.7.05] - 2026-03-20 20:07

Fixed

Always fetch final transcript on endCall

Dynamic version display everywhere

[0.6.0] - 2026-03-20

Added

Retell AI Voice Engine — full integration with Retell AI as alternative voice stack

Dynamic LLM + Agent creation per call

Phone calls via Retell API with SIP trunking through Twilio

Webhook handling for call_started, call_ended, call_analyzed events

Polling for real-time call status updates

Final transcript retrieval after call ends

SIP Trunking — Twilio Elastic SIP Trunk configured for Retell AI

Retell IP whitelist (18.98.16.120/30)

Origination URI for inbound (sip:sip.retellai.com)

Phone number +14587777791 imported to Retell via SIP trunking

Enables calling international numbers (not just US)

Claude 4.6 Sonnet as default LLM for Retell voice engine

Live transcript in UI — call transcripts pushed to browser during Retell calls

WS stays alive during call phase (call_started event + reconnect)

Normalized transcript format (role/text ↔ speaker/original)

Final transcript fetch with retry after call ends

Number pronunciation — prompt instructs AI to dictate numbers in pairs (48, 34, 11) instead of large numbers

Changelog — version link in UI opens changelog
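The normalized transcript format listed under "Live transcript in UI" (role/text ↔ speaker/original) amounts to a field mapping. The sketch below shows one possible shape; the function names are hypothetical, and which engine produces which shape is not stated in this changelog.

```javascript
// Hypothetical sketch: convert between the two entry shapes named in the changelog.
function toSpeakerFormat(entry) {
  return { speaker: entry.role, original: entry.text };
}

function toRoleFormat(entry) {
  return { role: entry.speaker, text: entry.original };
}

// Accepts either shape and returns the speaker/original shape.
function normalizeEntry(entry) {
  return "speaker" in entry ? entry : toSpeakerFormat(entry);
}

const roleEntry = { role: "agent", text: "Hello" };
const uiEntry = normalizeEntry(roleEntry);
console.log(uiEntry);               // { speaker: 'agent', original: 'Hello' }
console.log(toRoleFormat(uiEntry)); // { role: 'agent', text: 'Hello' }
```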

Fixed

Retell API endpoint paths (mixed v2/non-v2 across endpoints)

WS disconnect during Retell calls (missing call_started handler in UI)

Transcript format mismatch between Retell and Gemini+Twilio engines

Session cleanup race condition (polling vs webhook vs endCall)

SIP trunk auth error (removed credential list requirement, IP ACL only)
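The changelog does not say how the session cleanup race was resolved. One common approach, sketched here purely as an assumption, is to make cleanup idempotent so that polling, the webhook, and `endCall` can all request it while the work runs only once. `SessionRegistry` and its methods are hypothetical names.

```javascript
// Hypothetical sketch: idempotent cleanup shared by several competing callers.
class SessionRegistry {
  constructor() {
    this.sessions = new Map(); // sessionId -> cleanup callback
  }

  add(sessionId, onCleanup) {
    this.sessions.set(sessionId, onCleanup);
  }

  // Returns true only for the first caller; later callers are no-ops.
  cleanup(sessionId, reason) {
    const onCleanup = this.sessions.get(sessionId);
    if (!onCleanup) return false; // already cleaned up by another path
    this.sessions.delete(sessionId);
    onCleanup(reason);
    return true;
  }
}

const registry = new SessionRegistry();
const reasons = [];
registry.add("abc", (reason) => reasons.push(reason));
registry.cleanup("abc", "webhook"); // runs the cleanup
registry.cleanup("abc", "polling"); // no-op
registry.cleanup("abc", "endCall"); // no-op
console.log(reasons); // [ 'webhook' ]
```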

Changed

Retell LLM model updated from gpt-4o to claude-4.6-sonnet

RETELL_FROM_NUMBER switched from Retell-purchased to Twilio SIP trunk number

[0.5.0] - 2026-03-20

Added

VoiceEngine abstraction — pluggable voice engine architecture

Base class with startCall/endCall/sendUserAudio/injectText interface

Factory pattern for engine selection (gemini-twilio, retell, vapi)

Per-call engine creation with voice stack selector
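The abstraction described above could be sketched as follows. Only the four method names and the three engine keys come from this changelog; the class names, the `ENGINES` map, and `createVoiceEngine` are hypothetical, and the subclasses are left empty on purpose.

```javascript
// Hypothetical sketch of a pluggable engine interface and factory.
class VoiceEngine {
  async startCall(options) { throw new Error("startCall not implemented"); }
  async endCall() { throw new Error("endCall not implemented"); }
  sendUserAudio(chunk) { throw new Error("sendUserAudio not implemented"); }
  injectText(text) { throw new Error("injectText not implemented"); }
}

// Placeholder subclasses; real engines would override the methods above.
class GeminiTwilioEngine extends VoiceEngine {}
class RetellEngine extends VoiceEngine {}
class VapiEngine extends VoiceEngine {}

const ENGINES = new Map([
  ["gemini-twilio", GeminiTwilioEngine],
  ["retell", RetellEngine],
  ["vapi", VapiEngine],
]);

// Creates a fresh engine instance for each call.
function createVoiceEngine(name) {
  const Engine = ENGINES.get(name);
  if (!Engine) throw new Error(`Unknown voice engine: ${name}`);
  return new Engine();
}

const engine = createVoiceEngine("retell");
console.log(engine instanceof RetellEngine); // true
console.log(engine instanceof VoiceEngine);  // true
```

Creating a new instance per call keeps per-call state (transcript counters, sockets) from leaking between calls.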

Setup panel with 5 configuration toggles

Languages (I speak / Call in)

Briefing mode (Pre-filled / Empty)

Call Mode (AI Test / I Test / Real Call)

Call Channel (Browser / Test Phone / Real Phone)

Voice Stack (Gemini+Twilio / Retell AI / Vapi)

localStorage persistence for all settings

Call recording — CallRecorder for real calls (inbound/outbound/mixed WAV)

Transcript saving to data/transcripts/{sessionId}.json

Retell AI account registered (nina@bookly.net), phone number purchased

Vapi AI account registered (nina@bookly.net)

Receiver.html full refactor — 3 screens (waiting/incoming/active), dark theme

4-Agent browser mode — Proxy + Oxy + Receptionist + User in browser

Fixed

Phone number cleaning (strips text like "(Sanitas)" from numbers)

Auto-sync between Call Mode and Call Channel

Real Call mode forces Real Phone channel
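The phone number cleaning fix listed above can be illustrated with a small sketch. The exact rules used in CallProxy are not given here, so this is an assumption: parenthesized notes that contain letters are removed, and only digits plus a leading `+` are kept. `cleanPhoneNumber` is a hypothetical name and the numbers are made up.

```javascript
// Hypothetical sketch: strip notes like "(Sanitas)" and keep digits plus a leading "+".
function cleanPhoneNumber(raw) {
  // Remove parenthesized groups that contain at least one letter, e.g. "(Sanitas)".
  // Groups with digits only, such as "(555)", are left for the digit filter below.
  const withoutNotes = raw.replace(/\([^)]*\p{L}[^)]*\)/gu, " ").trim();
  const hasPlus = withoutNotes.startsWith("+");
  const digits = withoutNotes.replace(/\D/g, "");
  return (hasPlus ? "+" : "") + digits;
}

console.log(cleanPhoneNumber("+1 (555) 010-0199 (Sanitas)")); // +15550100199
console.log(cleanPhoneNumber("(555) 010-0142 (office)"));     // 5550100142
```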

[0.4.0] - 2026-03-19

Added

Initial voice briefing with Gemini Live

Case context system with pre-loaded facts

Gemini+Twilio voice engine for real phone calls

Browser-based call mode with WebRTC

[0.6.1] - 2026-03-20

Added

Real-time transcript via `transcript_updated` webhook — bubbles appear during the call

Ask User detection — `[ВОПРОС]` pattern (Russian for "QUESTION") in agent speech triggers ask_user UI prompt
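Detecting the marker could look like the sketch below. It assumes the question text follows the `[ВОПРОС]` marker within the same utterance; `detectAskUser` and the returned object shape are hypothetical.

```javascript
// Hypothetical sketch: detect the [ВОПРОС] marker and extract the text after it.
const ASK_USER_MARKER = "[ВОПРОС]";

// Returns an ask_user payload, or null when the marker is absent.
function detectAskUser(agentText) {
  const index = agentText.indexOf(ASK_USER_MARKER);
  if (index === -1) return null;
  const question = agentText.slice(index + ASK_USER_MARKER.length).trim();
  return { type: "ask_user", question };
}

console.log(detectAskUser("One moment. [ВОПРОС] What is the date of birth?"));
// { type: 'ask_user', question: 'What is the date of birth?' }
console.log(detectAskUser("Thank you, goodbye.")); // null
```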

Fixed

Event listeners kept alive until `endCall()` completes (transcript was lost)

Server crash from escaped backticks in changelog route

Debug logging when WS is dead during transcript push