feat(vela): mock push-to-talk transcript updates

2026-04-08 20:13:36 +02:00
parent 103bb11954
commit 98bcc543f5
8 changed files with 179 additions and 6 deletions

@@ -62,7 +62,8 @@ type ClientEvent =
- `mocked.turn.trigger` is accepted only when no other mocked turn is already in flight for that session
- a mocked turn emits deterministic `transcript.final`, `response.text.delta`, `response.completed`, and `session.state` events in protocol-valid order
- `input_audio.append` updates the ephemeral session record and moves the session to `listening`
- `input_audio.commit` resets the minimal buffered state and returns the session to `idle`
- each accepted `input_audio.append` emits one deterministic `transcript.partial` for the current placeholder turn
- `input_audio.commit` emits exactly one deterministic `transcript.final`, resets the minimal buffered state, and returns the session to `idle`
- after a completed placeholder input cycle, the same socket can still send `mocked.turn.trigger`
- `response.cancel` is safe to send even when no mocked turn is active
- `response.cancel` stops any still-pending mocked turn events for the active turn and resets the minimal session state back to `idle`
@@ -90,6 +91,8 @@ Notes:
- the UI disables the mocked-turn control until `session.ready` arrives, while disconnected, or while a mocked turn is already in flight
- the UI disables the mic control while disconnected, before `session.ready`, or while a mocked turn is already in flight
- pressing the mic control sends one placeholder `input_audio.append` chunk and releasing it sends `input_audio.commit`
- while a placeholder push-to-talk turn is in progress, the UI renders the latest `transcript.partial`
- after placeholder commit, the UI renders the `transcript.final` and clears the partial-only display
- the UI copy explicitly labels the mic button as a control shell and not real microphone capture
- the UI shows a cancel control and enables it only while a mocked turn is active
- after cancel returns the gateway to `idle`, the UI clears the active-turn indicator but keeps any transcript or response text that was already rendered
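The partial/final rendering rules above could be modeled as a small pure reducer. This is only a sketch: `TranscriptView`, `TranscriptEvent`, and `reduceTranscript` are hypothetical names that do not come from the codebase, and keeping an earlier final transcript visible while a new partial arrives is an assumption.

```typescript
// Hypothetical UI view state; field names are illustrative, not from the repo.
interface TranscriptView {
  partial: string | null; // latest transcript.partial, shown only during a turn
  final: string | null;   // last transcript.final that was rendered
}

// Only the two transcript events relevant to this sketch.
type TranscriptEvent =
  | { type: "transcript.partial"; text: string }
  | { type: "transcript.final"; text: string };

function reduceTranscript(
  view: TranscriptView,
  event: TranscriptEvent,
): TranscriptView {
  switch (event.type) {
    case "transcript.partial":
      // Each partial replaces the previous one; any earlier final is kept (assumption).
      return { ...view, partial: event.text };
    case "transcript.final":
      // The final transcript is rendered and the partial-only display is cleared.
      return { partial: null, final: event.text };
  }
}
```

A pure reducer keeps the display rules testable without a socket or a DOM; the real UI may well hold this state differently.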
@@ -144,6 +147,29 @@ Notes:
- no audio, STT, LLM, TTS, or external providers participate in this flow
- `response.cancel` can stop the mocked turn early, suppress any later mocked response events for that turn, and return the session to `idle`
### Deterministic placeholder push-to-talk transcript sequence
For this increment, the existing mic-control shell still sends placeholder `input_audio.append` on press and `input_audio.commit` on release. The gateway now translates that flow into deterministic mocked transcript events and nothing else:
```text
input_audio.append #1
→ session.state(listening) when entering the turn
→ transcript.partial("[mocked partial] Placeholder push-to-talk transcript in progress.")
input_audio.append #N (N > 1)
→ transcript.partial("[mocked partial] Placeholder push-to-talk transcript in progress (N chunks).")
input_audio.commit after N appends
→ transcript.final("[mocked final] Placeholder push-to-talk transcript completed from N appended chunk(s).")
→ session.state(idle)
```
Safe deterministic edge cases for this mocked placeholder flow:
- commit without any prior append is accepted and emits `transcript.final("[mocked final] Placeholder push-to-talk transcript completed without appended audio.")`
- repeated appends during one placeholder turn are accepted; each append replaces the latest partial transcript with a deterministic value derived from the chunk count
- placeholder commit does not automatically start assistant thinking, response streaming, or audio playback
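The sequence and edge cases above can be sketched as two pure handlers that return the events to emit. The names `PlaceholderSession`, `ServerEvent`, `onInputAudioAppend`, and `onInputAudioCommit` are hypothetical and not taken from the gateway code. Emitting `session.state(idle)` after a commit with no prior append is also an assumption, since the contract only states that such a commit emits `transcript.final`.

```typescript
// Hypothetical event and session shapes; the real gateway types may differ.
type ServerEvent =
  | { type: "session.state"; state: "idle" | "listening" }
  | { type: "transcript.partial"; text: string }
  | { type: "transcript.final"; text: string };

interface PlaceholderSession {
  state: "idle" | "listening";
  appendedChunks: number; // minimal buffered state: count of accepted appends
}

function onInputAudioAppend(session: PlaceholderSession): ServerEvent[] {
  const events: ServerEvent[] = [];
  // Announce `listening` only when the turn is entered, not on every append.
  if (session.state !== "listening") {
    session.state = "listening";
    events.push({ type: "session.state", state: "listening" });
  }
  session.appendedChunks += 1;
  const n = session.appendedChunks;
  const text =
    n === 1
      ? "[mocked partial] Placeholder push-to-talk transcript in progress."
      : `[mocked partial] Placeholder push-to-talk transcript in progress (${n} chunks).`;
  // Exactly one partial per accepted append.
  events.push({ type: "transcript.partial", text });
  return events;
}

function onInputAudioCommit(session: PlaceholderSession): ServerEvent[] {
  const n = session.appendedChunks;
  const text =
    n === 0
      ? "[mocked final] Placeholder push-to-talk transcript completed without appended audio."
      : `[mocked final] Placeholder push-to-talk transcript completed from ${n} appended chunk(s).`;
  // Reset the buffered state and return to idle; no assistant response is started.
  session.appendedChunks = 0;
  session.state = "idle";
  return [
    { type: "transcript.final", text },
    { type: "session.state", state: "idle" }, // assumed to be emitted in the no-append case too
  ];
}
```

Returning the events instead of writing to a socket keeps the ordering rules easy to check in isolation; a real handler would presumably forward each returned event to the client in order.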
## Contract Scope for This Increment
This contract is intentionally limited to the smallest event set needed to unblock: