feat(vela): add mocked turn transcript response slice
@@ -15,7 +15,7 @@ Current UI baseline:

- the browser opens a WebSocket directly to `/ws`
- the UI tracks connection status separately from gateway session status
- the UI currently consumes server events but does not send `session.start` or any audio events yet
- the UI can send `mocked.turn.trigger` after `session.ready` while connected to request one deterministic mocked turn for the active session

## WebSocket Message Envelope

@@ -40,6 +40,7 @@ This increment intentionally keeps the envelope minimal:
```ts
type ClientEvent =
  | { type: "session.start"; payload: {} }
  | { type: "mocked.turn.trigger"; payload: {} }
  | { type: "input_audio.append"; payload: { chunk: string } }
  | { type: "input_audio.commit"; payload: {} }
  | { type: "response.cancel"; payload: {} };
@@ -48,6 +49,7 @@ type ClientEvent =
#### Client event intent

- `session.start` initializes a voice session without locking in transport or auth details yet
- `mocked.turn.trigger` asks the gateway to run one obviously mocked, deterministic transcript/response turn
- `input_audio.append` carries a chunk of captured input audio as an encoded string
- `input_audio.commit` marks the current buffered user turn as ready for downstream processing
- `response.cancel` interrupts the active listen/think/speak flow
@@ -56,9 +58,12 @@ type ClientEvent =

- on connect, the gateway creates an ephemeral in-memory session and emits `session.ready` plus `session.state`
- `session.start` is treated as an idempotent session acknowledgment; the gateway re-sends `session.ready` and `session.state`
- `mocked.turn.trigger` is accepted only when no other mocked turn is already in flight for that session
- a mocked turn emits deterministic `transcript.final`, `response.text.delta`, `response.completed`, and `session.state` events in protocol-valid order
- `input_audio.append` updates the ephemeral session record and moves the session to `listening`
- `input_audio.commit` resets the minimal buffered state and returns the session to `idle`
- `response.cancel` resets the minimal session state back to `idle`
- a second mocked-turn trigger during an active mocked turn produces `error` with code `mocked_turn_in_flight`
- malformed JSON produces `error` with code `invalid_json`
- invalid envelopes or unsupported client event names produce `error` with code `invalid_message`
- malformed WebSocket frames are rejected without crashing the gateway process
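
The validation and in-flight rules above can be sketched as a single classification step. This is a minimal illustration, not the gateway's actual code: `classifyClientMessage`, `SessionRecord`, `mockedTurnInFlight`, and the `Classified` result shape are hypothetical names, and treating a missing or non-object `payload` as an invalid envelope is an assumption. Only the event names and the three error codes come from this document.

```ts
// Client event names taken from the ClientEvent union in this document.
const CLIENT_EVENT_TYPES: readonly string[] = [
  "session.start",
  "mocked.turn.trigger",
  "input_audio.append",
  "input_audio.commit",
  "response.cancel",
];

// Hypothetical per-session record; the real session shape is not shown here.
interface SessionRecord {
  mockedTurnInFlight: boolean;
}

// Hypothetical result shape used only for this sketch.
type Classified =
  | { ok: true; type: string }
  | { ok: false; code: "invalid_json" | "invalid_message" | "mocked_turn_in_flight" };

function classifyClientMessage(raw: string, session: SessionRecord): Classified {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    // Malformed JSON maps to `invalid_json`.
    return { ok: false, code: "invalid_json" };
  }

  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, code: "invalid_message" };
  }

  const { type, payload } = parsed as { type?: unknown; payload?: unknown };

  // Unsupported event names map to `invalid_message`.
  if (typeof type !== "string" || CLIENT_EVENT_TYPES.indexOf(type) === -1) {
    return { ok: false, code: "invalid_message" };
  }

  // Assumption: an envelope without an object payload counts as invalid.
  if (typeof payload !== "object" || payload === null) {
    return { ok: false, code: "invalid_message" };
  }

  // Only one mocked turn may be in flight per session.
  if (type === "mocked.turn.trigger" && session.mockedTurnInFlight) {
    return { ok: false, code: "mocked_turn_in_flight" };
  }

  return { ok: true, type };
}
```

Returning a result value instead of throwing keeps the caller free to decide how the `error` event is serialized, which this document does not specify.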
@@ -79,6 +84,7 @@ Notes:

- this UI state is transport-oriented and is separate from the shared gateway `session.state` payload
- `session.state` currently reflects the gateway session phase (`idle`, `listening`, `thinking`, `speaking`)
- the UI disables the mocked-turn control while disconnected, before `session.ready` arrives, and while a mocked turn is already in flight
- the UI treats malformed server messages, browser WebSocket errors, and gateway `error` events as safe error states instead of throwing
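
The rule for enabling the mocked-turn control can be expressed as one predicate. This is a sketch under stated assumptions: `MockedTurnUiState`, its field names, the `ConnectionStatus` values, and `canTriggerMockedTurn` are hypothetical and do not come from the UI code.

```ts
// Hypothetical transport-level connection status tracked by the UI.
type ConnectionStatus = "connecting" | "connected" | "disconnected";

// Hypothetical slice of UI state relevant to the mocked-turn control.
interface MockedTurnUiState {
  connection: ConnectionStatus;
  sessionReady: boolean; // true once `session.ready` has arrived
  mockedTurnInFlight: boolean; // true between the trigger and the end of the turn
}

// The control is enabled only when all three conditions from the notes hold.
function canTriggerMockedTurn(ui: MockedTurnUiState): boolean {
  return ui.connection === "connected" && ui.sessionReady && !ui.mockedTurnInFlight;
}
```

Keeping the check in one pure function makes it easy to unit test separately from the WebSocket wiring.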

### Server → Client
@@ -109,6 +115,27 @@ type ServerEvent =
- `response.completed` marks the current assistant turn as done
- `error` is the minimal recoverable failure shape for both UI and gateway work

### Deterministic mocked turn sequence

For this increment, `mocked.turn.trigger` produces one fixed interaction for the active session:

```text
session.state(listening)
→ transcript.final("[mocked user] What is the current mocked vertical slice?")
→ session.state(thinking)
→ session.state(speaking)
→ response.text.delta("[mocked assistant] ")
→ response.text.delta("This is a deterministic mocked response from the gateway vertical slice.")
→ response.completed
→ session.state(idle)
```

Notes:

- the content is intentionally fixed and obviously mocked
- no audio, STT, LLM, TTS, or external providers participate in this flow
- `response.cancel` can stop the mocked turn early and return the session to `idle`
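
The fixed sequence above can be written down as data, which makes the ordering easy to assert in tests. This is a simplified sketch: `MockedStep`, `MOCKED_TURN_STEPS`, and `assembleResponseText` are hypothetical names, and the `{ type, value }` shape is a stand-in for the real `ServerEvent` payloads, which are not shown in this diff.

```ts
// Simplified stand-in for a server event: the event name plus one optional string.
type MockedStep = { type: string; value?: string };

// The eight steps of the deterministic mocked turn, in emission order.
const MOCKED_TURN_STEPS: readonly MockedStep[] = [
  { type: "session.state", value: "listening" },
  { type: "transcript.final", value: "[mocked user] What is the current mocked vertical slice?" },
  { type: "session.state", value: "thinking" },
  { type: "session.state", value: "speaking" },
  { type: "response.text.delta", value: "[mocked assistant] " },
  {
    type: "response.text.delta",
    value: "This is a deterministic mocked response from the gateway vertical slice.",
  },
  { type: "response.completed" },
  { type: "session.state", value: "idle" },
];

// Concatenates the text deltas the way a UI would to render the assistant reply.
function assembleResponseText(steps: readonly MockedStep[]): string {
  let text = "";
  for (const step of steps) {
    if (step.type === "response.text.delta" && step.value !== undefined) {
      text += step.value;
    }
  }
  return text;
}
```

Because the content is fixed, a UI or gateway test can compare emitted events against this list verbatim instead of matching patterns.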

## Contract Scope for This Increment

This contract is intentionally limited to the smallest event set needed to unblock:
@@ -118,6 +145,7 @@ This contract is intentionally limited to the smallest event set needed to unblo

Explicitly deferred for later increments:

- freeform typed user input
- tool-calling events
- streamed TTS/output-audio events
- reconnect/resume semantics