POST /api/v1/ai-sessions/{call_id}/control is the single endpoint your voice AI uses to drive a live call. The JSON body’s event field is the discriminator — there are 25 verbs across three groups: call-state primitives (13), conferencing + supervisor (9), and lead / context (3).

Authentication

| Credential | When to use | Scope |
| --- | --- | --- |
| Per-call callback token (yt_cb_<26-char ULID>) | Production: your voice AI gets one in the WS metadata frame, valid 30 min, bound to one call_id | voice_agent:control |
| Tenant API key | Admin / dev / testing; explicit opt-in only | ai_sessions:control_admin |
The callback token auto-rotates at minute 25 of the call via a token_refresh text frame on the WS. Agents that ignore the refresh lose control-verb access at the 30-min mark; the call continues without interruption.
ai_sessions:control_admin is opt-in. Tenant API keys do not carry control-verb permission by default. Add the scope to a key explicitly when you want to drive calls from a dashboard or test harness.
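The token rotation described above can be handled with a small holder object that swaps the token whenever a refresh frame arrives. This is a minimal sketch, not the SDK's implementation: the frame layout (`{"type": "token_refresh", "callback_token": "..."}`) and the `CallbackTokenHolder` name are assumptions made for illustration, since this page does not document the frame's fields.

```python
import json
from dataclasses import dataclass


@dataclass
class CallbackTokenHolder:
    """Holds the current per-call callback token (hypothetical helper)."""

    token: str

    def handle_text_frame(self, raw: str) -> bool:
        """Return True if the frame was a token refresh that was applied.

        ASSUMPTION: the refresh frame is JSON shaped like
        {"type": "token_refresh", "callback_token": "yt_cb_..."}.
        """
        try:
            frame = json.loads(raw)
        except json.JSONDecodeError:
            return False  # not JSON, so not a refresh frame
        if not isinstance(frame, dict) or frame.get("type") != "token_refresh":
            return False
        new_token = frame.get("callback_token")
        if not isinstance(new_token, str) or not new_token.startswith("yt_cb_"):
            return False  # ignore malformed refreshes, keep the old token
        self.token = new_token
        return True
```

Reading `holder.token` at the moment each control request is built, rather than caching it at call start, keeps control-verb access working past the 30-minute mark.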

Request envelope

POST /api/v1/ai-sessions/{call_id}/control HTTP/1.1
Authorization: Bearer <yt_cb_... | yt_live_...>
Content-Type: application/json
Idempotency-Key: <uuid v4>

{ "event": "<verb>", ...verb-specific fields }
  • Tenant fence. call_id must belong to the token’s tenant. Cross-tenant access returns 403.
  • Rate limit. 10 req/sec per call_id.
  • Idempotency. The Yotel SDKs auto-generate an Idempotency-Key for state-changing verbs. Replays of the same (call_id, key) pair return the cached response. See Idempotency.
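For callers not using an SDK, the envelope above can be assembled with the standard library alone. The sketch below only builds the request object; it does not send it. `BASE_URL` is a placeholder and `build_control_request` is a hypothetical helper name, neither taken from this page.

```python
import json
import urllib.request
import uuid

# ASSUMPTION: placeholder host; substitute your real API base URL.
BASE_URL = "https://api.example.com"


def build_control_request(call_id, token, event, idempotency_key=None, **fields):
    """Build (but do not send) a control-verb request for one call."""
    body = {"event": event, **fields}
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        # Generate a fresh UUID v4 unless the caller supplies a key to reuse.
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/ai-sessions/{call_id}/control",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would dispatch it. To retry safely, pass the same `idempotency_key` on each attempt instead of letting a new one be generated.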

Response envelope

{
  "ok": true,
  "call_id": "<uuid>",
  "dispatched_at": "2026-05-01T14:37:08.102Z",
  "result": {}
}
result carries verb-specific data (e.g. {"conference_id": "..."}) for verbs that have a return; empty {} otherwise.
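A raw response can be parsed into a typed object along these lines. The field names come from the envelope above; the `ControlResponse` class and `parse_control_response` function are hypothetical names used only for this sketch.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ControlResponse:
    ok: bool
    call_id: str
    dispatched_at: datetime
    result: dict = field(default_factory=dict)


def parse_control_response(raw: str) -> ControlResponse:
    """Parse the JSON response envelope into a ControlResponse."""
    data = json.loads(raw)
    # Normalise the trailing "Z" so fromisoformat also works before Python 3.11.
    timestamp = data["dispatched_at"].replace("Z", "+00:00")
    return ControlResponse(
        ok=bool(data["ok"]),
        call_id=data["call_id"],
        dispatched_at=datetime.fromisoformat(timestamp),
        # Verbs without a return value send {}; treat a missing key the same way.
        result=data.get("result") or {},
    )
```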

Group 1 — Call-state primitives (13)

Operate on the AI’s own call leg.

transfer

Bridge the caller to an agent queue, an E.164 number, or a SIP URI. The AI’s leg drops; outcome ← transferred_*. Fires ai_session.transferred + ai_session.ended.
{ "event": "transfer", "destination_type": "agent_queue",
  "destination": "tier1", "metadata": { "intent": "billing" } }
Python
client.ai_sessions.transfer(
    call_id, destination_type="agent_queue", destination="tier1",
    metadata={"intent": "billing"}, callback_token=cb_token,
)

hangup

End the call. outcome ← hangup; fires ai_session.ended.
{ "event": "hangup", "metadata": { "reason": "task_complete" } }
TypeScript
await client.aiSessions.hangup(callId, {
  metadata: { reason: "task_complete" },
  callbackToken,
});

log

Append a structured entry to ai_sessions.metadata.logs[]. No state effect, no webhook — useful for breadcrumbs in the audit log.
{ "event": "log", "metadata": { "step": "intent_classified",
  "intent": "renewal" } }
Python
client.ai_sessions.log(call_id, metadata={"step": "intent_classified"})

mute / unmute

Mute one party; the mute persists until unmute is sent or the call ends. target is "caller" (default) or "agent" (the AI’s leg).
{ "event": "mute", "target": "caller" }
Python
client.ai_sessions.mute(call_id, target="caller")
client.ai_sessions.unmute(call_id, target="caller")

hold / unhold

Place the caller on hold; optional moh_url for music. unhold clears the state.
{ "event": "hold", "moh_url": "https://cdn.example.com/moh.wav" }
TypeScript
await client.aiSessions.hold(callId, { moh_url: mohUrl, callbackToken });
await client.aiSessions.unhold(callId, { callbackToken });

send_dtmf

Inject DTMF digits into the call (e.g. when navigating an external IVR after a transfer attempt).
{ "event": "send_dtmf", "digits": "1234#" }
Python
client.ai_sessions.send_dtmf(call_id, digits="1234#")

play_audio

Play either a pre-recorded URL or TTS text — exactly one of audio_url / tts_text must be set. See the TTS limitation below.
{ "event": "play_audio",
  "audio_url": "https://cdn.example.com/prompts/greeting.wav" }
Python
client.ai_sessions.play_audio(call_id, audio_url=greeting_url)
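The exactly-one-of rule can be checked client-side before the request is sent, which avoids a round trip that would end in a 400. A minimal sketch; `build_play_audio` is a hypothetical helper, not an SDK function.

```python
def build_play_audio(audio_url=None, tts_text=None):
    """Build a play_audio body, enforcing exactly one of audio_url / tts_text."""
    # True when both are None or both are set; either case is invalid.
    if (audio_url is None) == (tts_text is None):
        raise ValueError("set exactly one of audio_url or tts_text")
    body = {"event": "play_audio"}
    if audio_url is not None:
        body["audio_url"] = audio_url
    else:
        body["tts_text"] = tts_text
    return body
```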

set_disposition

Save a disposition code on the call. source is recorded as "ai".
{ "event": "set_disposition", "disposition": "interested",
  "notes": "Asked for callback Tue 2pm" }
Python
client.ai_sessions.set_disposition(
    call_id, disposition="interested", notes="callback Tue 2pm"
)

recording_pause / recording_resume

Stop/start recording mid-call (e.g. PCI compliance during card capture). Optional reason is stored on the audit log.
{ "event": "recording_pause", "reason": "pci" }
Python
client.ai_sessions.recording_pause(call_id, reason="pci")
client.ai_sessions.recording_resume(call_id)
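Because a forgotten recording_resume leaves the rest of the call unrecorded, it can help to wrap the pair in a context manager so the resume runs even if the card-capture step raises. This sketch assumes only the two SDK calls shown above; `recording_paused` is a hypothetical wrapper name.

```python
from contextlib import contextmanager


@contextmanager
def recording_paused(client, call_id, reason=None):
    """Pause recording for the duration of the with-block, then resume."""
    client.ai_sessions.recording_pause(call_id, reason=reason)
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions raised inside the block.
        client.ai_sessions.recording_resume(call_id)
```

Usage would look like `with recording_paused(client, call_id, reason="pci"): ...`, with the card capture inside the block. If the pause call itself fails, the resume is not attempted.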

get_call_state

Read-only snapshot of current state. No idempotency key needed.
{ "event": "get_call_state" }
Returns:
{ "ok": true, "result": {
    "state": "in_progress", "duration_s": 47, "answered": true,
    "current_participants": ["caller", "ai_agent"],
    "is_recording": true,
    "is_muted": {"caller": false, "agent": false},
    "is_held": false, "conference_id": null, "disposition": null
}}

Group 2 — Conferencing + supervisor (9)

Add other participants to the call, monitor or barge into another call leg, or escalate to a human supervisor.

conference_start

Promote the AI’s two-leg call into a conference. The AI stays in. Fires ai_session.conference_changed.
{ "event": "conference_start" }
Python
client.ai_sessions.conference_start(call_id)

conference_add

Dial a participant into the conference. participant_type is one of e164 | sip_uri | agent | supervisor.
{ "event": "conference_add", "participant_type": "agent",
  "destination": "ag-7", "metadata": {"reason": "warm_handoff"} }
Python
client.ai_sessions.conference_add(
    call_id, participant_type="agent", destination="ag-7"
)

conference_remove

Kick a participant by member_id. member_id comes from the participants[] array on ai_session.conference_changed.
{ "event": "conference_remove", "member_id": "mem-42" }

conference_leave

The AI’s leg drops; the rest of the conference continues.
{ "event": "conference_leave" }

request_supervisor

Enqueue a supervisor escalation. UI alert + WebSocket push to on-duty supervisors. Fires ai_session.escalated twice: once on invocation (supervisor_id null), once on claim.
{ "event": "request_supervisor",
  "reason": "Caller is angry, needs senior rep",
  "urgency": "high" }
Python
client.ai_sessions.request_supervisor(
    call_id, reason="Caller is angry", urgency="high"
)
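Since ai_session.escalated fires twice, a webhook consumer needs to tell the two deliveries apart. The sketch below does so by checking whether supervisor_id is null. The payload layout (a top-level `type` and `supervisor_id`) is an assumption for illustration; check the webhook reference for the real shape.

```python
def escalation_phase(event: dict) -> str:
    """Classify an ai_session.escalated delivery as 'requested' or 'claimed'.

    ASSUMPTION: the payload carries top-level "type" and "supervisor_id" keys.
    """
    if event.get("type") != "ai_session.escalated":
        raise ValueError("not an ai_session.escalated event")
    # supervisor_id is null on invocation and set once a supervisor claims it.
    return "requested" if event.get("supervisor_id") is None else "claimed"
```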

whisper

One-way audio injection into another call leg (typically an agent’s ear during a coaching session). Requires target_call_id and an existing monitor or conference relationship — bare invocation on an unrelated call returns 403.
{ "event": "whisper", "target_call_id": "<agent-call-uuid>",
  "audio_url": "https://cdn.example.com/coach/upsell-tip.wav" }
Python
client.ai_sessions.whisper(
    call_id, target_call_id=agent_call_id, audio_url=tip_url
)

barge

Promote from monitor-only to a full participant in target_call_id. Both sides hear the AI.
{ "event": "barge", "target_call_id": "<agent-call-uuid>" }

monitor_start / monitor_stop

Open or close a one-way audio fork from a target call into the AI’s WS. mode is listen (default) or listen_and_whisper.
{ "event": "monitor_start",
  "target_call_id": "<agent-call-uuid>", "mode": "listen" }
Python
client.ai_sessions.monitor_start(
    call_id, target_call_id=agent_call_id, mode="listen"
)
client.ai_sessions.monitor_stop(call_id, target_call_id=agent_call_id)
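Because whisper returns 403 without an existing monitor or conference relationship, a coaching flow has to open the monitor first. The sketch below lists the request bodies in the order they might be sent. It assumes that listen_and_whisper is the mode a later whisper needs and that monitor_stop takes target_call_id in its body, mirroring the Python call above; `coaching_whisper_steps` is a hypothetical helper.

```python
def coaching_whisper_steps(agent_call_id, tip_url):
    """Return the ordered request bodies for a monitor, whisper, stop flow."""
    return [
        # 1. Establish the monitor relationship so whisper is permitted.
        {"event": "monitor_start", "target_call_id": agent_call_id,
         "mode": "listen_and_whisper"},
        # 2. Inject the coaching audio into the agent's leg.
        {"event": "whisper", "target_call_id": agent_call_id,
         "audio_url": tip_url},
        # 3. Close the audio fork when coaching is done.
        {"event": "monitor_stop", "target_call_id": agent_call_id},
    ]
```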

Group 3 — Lead / context (3)

Persist data back to the lead row or schedule follow-up work.

set_lead_field

Patch one custom field on the lead’s custom_fields JSONB.
{ "event": "set_lead_field", "field": "preferred_callback_window",
  "value": "weekday_evenings" }
Python
client.ai_sessions.set_lead_field(
    call_id, field="preferred_callback_window", value="weekday_evenings"
)

set_lead_status

Override the lead status. Recorded as status_set_by='ai'.
{ "event": "set_lead_status", "status": "qualified" }
Python
client.ai_sessions.set_lead_status(call_id, status="qualified")

schedule_callback

Insert a callbacks row. The campaign engine picks it up at scheduled_at. Optional voice_agent_id to route the callback to a different agent (e.g. escalate Tier-1 → Tier-2).
{ "event": "schedule_callback",
  "scheduled_at": "2026-05-05T14:00:00Z",
  "reason": "Customer requested Tuesday 2pm",
  "voice_agent_id": "<tier2-agent-id>" }
Python
client.ai_sessions.schedule_callback(
    call_id, scheduled_at="2026-05-05T14:00:00Z",
    reason="callback requested",
)
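The scheduled_at examples use a UTC timestamp with a trailing Z, so a callback time agreed in the caller's local time has to be converted first. A minimal sketch using only the standard library; `to_scheduled_at` is a hypothetical helper, and the UTC-4 offset is just an example.

```python
from datetime import datetime, timedelta, timezone


def to_scheduled_at(when: datetime) -> str:
    """Format a timezone-aware datetime as a UTC ISO-8601 string ending in Z."""
    if when.tzinfo is None:
        # A naive datetime is ambiguous, so refuse to guess its zone.
        raise ValueError("scheduled_at needs a timezone-aware datetime")
    return when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Usage: 10:00 at UTC-4 is 14:00 UTC.
eastern = timezone(timedelta(hours=-4))
scheduled_at = to_scheduled_at(datetime(2026, 5, 5, 10, 0, tzinfo=eastern))
```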

Idempotency

Idempotency-Key is honored on every state-changing verb (read-only log and get_call_state ignore it). The Python and TypeScript SDKs auto-generate a UUID v4 per call so retries are safe by default; override with idempotency_key=... (Python) / idempotencyKey: ... (TS) when you want cross-process dedup.
| Class of verb | Behaviour |
| --- | --- |
| transfer, hangup, conference_* | Strongly idempotent: a second invocation hits a state-machine 409, but a replay with the same key returns the cached 200. |
| log, set_lead_field, set_disposition, set_lead_status | Last-write-wins; the idempotency key only suppresses retry storms. |
| play_audio, whisper, send_dtmf | Without a key, every replay repeats the action. With a key, duplicates are deduped within a 60s window. |
Replay window: 24h on (call_id, idempotency_key).
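When retrying by hand rather than through an SDK, the key has to be generated once per logical operation and reused on every attempt; a fresh key per attempt defeats the dedup. A minimal sketch, where `send` is a hypothetical callable that performs one HTTP attempt with the given key:

```python
import uuid


def call_with_retries(send, max_attempts=3):
    """Call send(idempotency_key) until it succeeds, reusing ONE key throughout."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    key = str(uuid.uuid4())  # generated once, outside the retry loop
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(key)
        except ConnectionError as exc:  # retry only transport-level failures
            last_error = exc
    raise last_error
```

With this shape, a request that reached the server but whose response was lost is answered from the cached response on the next attempt instead of being executed a second time.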

TTS not yet supported

play_audio and whisper accept a tts_text field in the v1 protocol, but the dispatcher currently returns 422 when that shape is sent — TTS rendering isn’t wired in v1. Use audio_url with a pre-rendered WAV/MP3 for now. TTS support is tracked for v1.1.

Error reference

| HTTP | SDK class (Python / TS) | Typical cause |
| --- | --- | --- |
| 400 | ValidationError / ValidationError | Body schema invalid (e.g. both audio_url and tts_text set, missing target_call_id) |
| 401 | AuthenticationError / AuthenticationError | Callback token expired or bad signature |
| 403 | PermissionDenied / PermissionDenied | Tenant fence (call_id ≠ token tenant), missing scope, or target_call_id lacks monitor/conference relationship |
| 404 | NotFoundError / NotFoundError | Call doesn’t exist |
| 409 | ConflictError / ConflictError | Call already terminated, conference state mismatch, or voice_agent in use during delete |
| 422 | ValidationError | TTS not wired (use audio_url) |
| 424 | NoVoiceAgentConfigured / NoVoiceAgentConfigured | Override hierarchy resolved no agent; set a tenant default or pin per campaign/flow |
| 429 | RateLimitedError / RateLimitedError | 10 req/sec/call_id exceeded, or max_concurrent cap on the agent hit. retry_after_s returned |
| 5xx | ServerError / ServerError | Yotel-internal; request_id returned for support |
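One way to act on the table above is to retry only 429 and 5xx and treat every other status as final. The sketch below returns the wait before the next attempt. It assumes retry_after_s is a top-level field of the JSON error body, which this page does not spell out, and the backoff constants are arbitrary. `retry_delay` is a hypothetical helper.

```python
def retry_delay(status, body, attempt, base=0.5, cap=8.0):
    """Return seconds to wait before retrying, or None if not retryable.

    ASSUMPTION: on 429 the error body carries a top-level "retry_after_s".
    """
    backoff = min(cap, base * 2 ** attempt)  # capped exponential backoff
    if status == 429:
        retry_after = body.get("retry_after_s")
        if isinstance(retry_after, (int, float)):
            return float(retry_after)  # prefer the server's hint
        return backoff
    if 500 <= status <= 599:
        return backoff
    # 400/401/403/404/409/422/424 will not succeed on a blind retry.
    return None
```

Retrying a 5xx on a state-changing verb is only safe when every attempt carries the same Idempotency-Key.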
