feat(vela): start mocked response flow after push-to-talk commit

2026-04-08 21:20:17 +02:00
parent 98bcc543f5
commit 28712443cc
8 changed files with 284 additions and 25 deletions
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -36,7 +36,7 @@ The repository now includes separate runnable workspaces for the UI and gateway
 - PWA enabled
 - WebSocket client

-The current implementation is a minimal SvelteKit app with a single voice-session shell page. The shipped UI can open and close a browser WebSocket connection to the gateway `/ws` endpoint, show explicit connection status (`not connected`, `connecting`, `connected`, `disconnected`, `error`), expose mic control shell interactions that emit placeholder `input_audio.append` / `input_audio.commit` events, trigger one deterministic mocked turn while connected, render deterministic placeholder partial/final transcripts for the push-to-talk shell, and render the mocked user transcript plus mocked assistant response for the existing mocked-turn path. This remains a shell only: there is no real microphone capture, real provider integration, or audio playback yet.
+The current implementation is a minimal SvelteKit app with a single voice-session shell page. The shipped UI can open and close a browser WebSocket connection to the gateway `/ws` endpoint, show explicit connection status (`not connected`, `connecting`, `connected`, `disconnected`, `error`), expose mic control shell interactions that emit placeholder `input_audio.append` / `input_audio.commit` events, trigger one deterministic mocked turn while connected, render deterministic placeholder partial/final transcripts for the push-to-talk shell, and stream the mocked assistant response both for `mocked.turn.trigger` and for push-to-talk commits. This remains a shell only: there is no real microphone capture, real provider integration, or audio playback yet.

 #### Responsibilities

@@ -104,8 +104,8 @@ The current implementation is a minimal Fastify service with `/`, `/health`, and
 - `GET /ws` documents the route for plain HTTP clients and returns `426 Upgrade Required`
 - WebSocket upgrades on `/ws` create an ephemeral session immediately
 - the gateway sends `session.ready` followed by `session.state` (`idle`) when the socket is established
-- valid minimal client events, including placeholder `input_audio.append` / `input_audio.commit`, can move the session between `idle` and `listening`
-- placeholder `input_audio.append` emits deterministic mocked `transcript.partial` events and `input_audio.commit` emits one deterministic mocked `transcript.final`
+- valid minimal client events, including placeholder `input_audio.append` / `input_audio.commit`, can move the session through the mocked turn states on one socket
+- placeholder `input_audio.append` emits deterministic mocked `transcript.partial` events and `input_audio.commit` emits one deterministic mocked `transcript.final` before starting the existing mocked assistant response stream
 - `mocked.turn.trigger` drives a fixed transcript/response event sequence over the existing shared protocol
 - only one mocked turn is allowed in flight per session at a time
 - invalid JSON, invalid envelopes, and malformed frames are handled defensively so the process stays up
@@ -116,13 +116,13 @@ The current implementation is a minimal Fastify service with `/`, `/health`, and
 - exposes connect, disconnect, mic-control shell interactions, and mocked-turn controls
 - does not request microphone permission or capture real microphone audio
 - only emits placeholder `input_audio.append` / `input_audio.commit` events; it does not send real audio data or play back audio
-- renders the latest placeholder partial transcript during a push-to-talk shell turn and replaces it with the final deterministic transcript on commit
+- renders the latest placeholder partial transcript during a push-to-talk shell turn, replaces it with the final deterministic transcript on commit, and appends streamed mocked assistant text for that same push-to-talk turn
 - reads mocked transcript and mocked response events from the shared protocol contract

 ## Voice Pipeline

 ```text
-Mic control shell / mocked turn button → Placeholder `input_audio.append` / `input_audio.commit` or mocked session flow → Deterministic transcript events → Mocked response text events when using mocked.turn.trigger → UI
+Mic control shell / mocked turn button → Placeholder `input_audio.append` / `input_audio.commit` or mocked session flow → Deterministic transcript events → Shared mocked response engine → Mocked response text events → UI
 ```

 This mocked vertical slice intentionally stands in for the future real pipeline: