feat(vela-ui): add placeholder push-to-talk control shell
@@ -36,7 +36,7 @@ The repository now includes separate runnable workspaces for the UI and gateway
 - PWA enabled
 - WebSocket client

-The current implementation is a minimal SvelteKit app with a single voice-session shell page. The shipped UI can open and close a browser WebSocket connection to the gateway `/ws` endpoint, show explicit connection status (`not connected`, `connecting`, `connected`, `disconnected`, `error`), trigger one deterministic mocked turn while connected, and render the mocked user transcript plus mocked assistant response for the active session. Microphone capture, real provider integration, and audio playback are still future work.
+The current implementation is a minimal SvelteKit app with a single voice-session shell page. The shipped UI can open and close a browser WebSocket connection to the gateway `/ws` endpoint, show explicit connection status (`not connected`, `connecting`, `connected`, `disconnected`, `error`), expose mic control shell interactions that emit placeholder `input_audio.append` / `input_audio.commit` events, trigger one deterministic mocked turn while connected, and render the mocked user transcript plus mocked assistant response for the active session. This remains a shell only: there is no real microphone capture, real provider integration, or audio playback yet.

 #### Responsibilities

@@ -104,7 +104,7 @@ The current implementation is a minimal Fastify service with `/`, `/health`, and
 - `GET /ws` documents the route for plain HTTP clients and returns `426 Upgrade Required`
 - WebSocket upgrades on `/ws` create an ephemeral session immediately
 - the gateway sends `session.ready` followed by `session.state` (`idle`) when the socket is established
-- valid minimal client events can move the session between `idle` and `listening`
+- valid minimal client events, including placeholder `input_audio.append` / `input_audio.commit`, can move the session between `idle` and `listening`
 - `mocked.turn.trigger` drives a fixed transcript/response event sequence over the existing shared protocol
 - only one mocked turn is allowed in flight per session at a time
 - invalid JSON, invalid envelopes, and malformed frames are handled defensively so the process stays up
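The `idle` / `listening` transitions listed in this hunk can be read as a small state machine. The following is a minimal sketch under stated assumptions, not the gateway's actual code: `PlaceholderSession`, `PlaceholderAudioEvent`, `createPlaceholderSession`, and `applyPlaceholderAudioEvent` are hypothetical names, and the bare `{ type }` event shape is an assumption because the real envelope is not shown here.

```typescript
// Hypothetical sketch of the placeholder input-audio transitions.
// Phase names come from the docs; every other name is an assumption.
type SessionPhase = "idle" | "listening" | "thinking" | "speaking";

// Assumed minimal event shape; the real envelope may carry more fields.
type PlaceholderAudioEvent =
  | { type: "input_audio.append" }
  | { type: "input_audio.commit" };

interface PlaceholderSession {
  phase: SessionPhase;
  bufferedChunks: number; // placeholder counter only, no real audio is stored
}

function createPlaceholderSession(): PlaceholderSession {
  return { phase: "idle", bufferedChunks: 0 };
}

// Returns a new record instead of mutating, so each transition is easy to test.
function applyPlaceholderAudioEvent(
  session: PlaceholderSession,
  event: PlaceholderAudioEvent,
): PlaceholderSession {
  if (event.type === "input_audio.append") {
    // append: record the placeholder chunk and move to `listening`
    return { phase: "listening", bufferedChunks: session.bufferedChunks + 1 };
  }
  // commit: reset the minimal buffered state and return to `idle`
  return { phase: "idle", bufferedChunks: 0 };
}
```

Keeping the transition function pure is a design choice made for this sketch; the real gateway may keep mutable per-socket state instead.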
@@ -112,15 +112,15 @@ The current implementation is a minimal Fastify service with `/`, `/health`, and
 ### Current UI shell behavior

 - renders a minimal developer-focused voice-session panel
-- exposes connect, disconnect, and mocked-turn controls
-- does not request microphone permission
-- does not send or process audio data
+- exposes connect, disconnect, mic-control shell interactions, and mocked-turn controls
+- does not request microphone permission or capture real microphone audio
+- only emits placeholder `input_audio.append` / `input_audio.commit` events; it does not send real audio data or play back audio
 - reads mocked transcript and mocked response events from the shared protocol contract

 ## Voice Pipeline

 ```text
-Mocked turn button → Gateway mocked session flow → Transcript events → Response text events → UI
+Mic control shell / mocked turn button → Placeholder `input_audio.append` / `input_audio.commit` or mocked session flow → Transcript events → Response text events → UI
 ```

 This mocked vertical slice intentionally stands in for the future real pipeline:

@@ -35,7 +35,7 @@ Prove the end-to-end interaction model with mocked or stubbed providers.
 - [x] bootstrap `vela-gateway` as a runnable Fastify app in the Yarn workspace
 - [x] add the first UI voice-session shell with connect/disconnect controls and explicit WebSocket status
 - [x] create a minimal mocked-turn UI with transcript and response text over the shared WebSocket session
-- create a minimal UI with mic control
+- [x] create a minimal UI with mic control
 - [x] create a gateway WebSocket session skeleton
 - [x] implement a mocked transcript/response vertical slice over the existing WebSocket session
 - implement mocked STT flow for partial transcript events
@@ -184,10 +184,12 @@ Polish the system after the core voice loop is reliable.
 - `apps/vela-ui` now boots as a minimal SvelteKit app with a starter page
 - `apps/vela-ui` now includes a minimal voice-session shell that can connect to the gateway `/ws` endpoint and display developer-visible session status
 - `apps/vela-ui` can now trigger one deterministic mocked turn while connected and render the mocked transcript plus assistant response for the active session
+- `apps/vela-ui` now exposes a visible push-to-talk mic control shell that sends placeholder `input_audio.append` / `input_audio.commit` events without requesting browser mic permission or capturing real audio
 - `apps/vela-ui` now includes browser-level coverage for the mocked transcript/response slice, including connect, disconnect, and disconnected-state trigger guarding
 - `apps/vela-gateway` now boots as a minimal Fastify app with `/` and `/health` endpoints
 - `apps/vela-gateway` now exposes a minimal `/ws` WebSocket session skeleton with ephemeral in-memory sessions and defensive message handling
 - `apps/vela-gateway` now accepts `mocked.turn.trigger` and emits protocol-valid mocked transcript/response events with one in-flight mocked turn per session
+- `apps/vela-gateway` now supports placeholder input-audio append/commit cycles before running another mocked turn on the same socket
 - `apps/vela-ui` now exposes a cancel control for active mocked turns and keeps already-rendered transcript/response text visible after cancellation
 - `apps/vela-gateway` now honors `response.cancel` during mocked turns by stopping pending mocked response events, returning the session to `idle`, and allowing a new mocked turn on the same socket
 - `apps/vela-protocol` now provides the shared WebSocket event contract for the UI and gateway
@@ -16,6 +16,7 @@ Current UI baseline:
 - the browser opens a WebSocket directly to `/ws`
 - the UI tracks connection status separately from gateway session status
 - the UI can send `mocked.turn.trigger` after `session.ready` while connected to request one deterministic mocked turn for the active session
+- the UI exposes a push-to-talk mic control shell that sends placeholder `input_audio.append` on press and `input_audio.commit` on release without capturing real audio

 ## WebSocket Message Envelope

@@ -62,6 +63,7 @@ type ClientEvent =
 - a mocked turn emits deterministic `transcript.final`, `response.text.delta`, `response.completed`, and `session.state` events in protocol-valid order
 - `input_audio.append` updates the ephemeral session record and moves the session to `listening`
 - `input_audio.commit` resets the minimal buffered state and returns the session to `idle`
+- after a completed placeholder input cycle, the same socket can still send `mocked.turn.trigger`
 - `response.cancel` is safe to send even when no mocked turn is active
 - `response.cancel` stops any still-pending mocked turn events for the active turn and resets the minimal session state back to `idle`
 - a second mocked-turn trigger during an active mocked turn produces `error` with code `mocked_turn_in_flight`
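The one-turn-in-flight rule and the `response.cancel` behavior described in this hunk could be modeled roughly as follows. This is a sketch under stated assumptions rather than the gateway's implementation: `MockedTurnSession`, `MockedServerEvent`, the payload fields (`text`, `phase`), the mocked strings, and the exact position of the `session.state` events in the sequence are all hypothetical. Only the event names and the `mocked_turn_in_flight` code come from the docs.

```typescript
// Hypothetical sketch: one in-flight mocked turn per session, plus cancel.
type MockedServerEvent =
  | { type: "transcript.final"; text: string }
  | { type: "response.text.delta"; text: string }
  | { type: "response.completed" }
  | { type: "session.state"; phase: "idle" | "thinking" | "speaking" }
  | { type: "error"; code: "mocked_turn_in_flight" };

class MockedTurnSession {
  // Events still waiting to be emitted for the active turn.
  private pending: MockedServerEvent[] = [];

  get inFlight(): boolean {
    return this.pending.length > 0;
  }

  // Handles `mocked.turn.trigger`. Returns an error event when a turn is
  // already active, otherwise schedules the fixed sequence and returns null.
  trigger(): MockedServerEvent | null {
    if (this.inFlight) {
      return { type: "error", code: "mocked_turn_in_flight" };
    }
    this.pending = [
      { type: "session.state", phase: "thinking" }, // assumed ordering
      { type: "transcript.final", text: "mocked user transcript" },
      { type: "session.state", phase: "speaking" }, // assumed ordering
      { type: "response.text.delta", text: "mocked assistant response" },
      { type: "response.completed" },
      { type: "session.state", phase: "idle" },
    ];
    return null;
  }

  // Emits the next scheduled event, or null when nothing is pending.
  next(): MockedServerEvent | null {
    return this.pending.shift() ?? null;
  }

  // Handles `response.cancel`: drops pending events and reports `idle`.
  // Safe to call when no turn is active (assumed to still report `idle`).
  cancel(): MockedServerEvent {
    this.pending = [];
    return { type: "session.state", phase: "idle" };
  }
}
```

In this sketch a caller would drain `next()` on a timer or loop and send each event over the socket; after `cancel()`, the queue is empty, so a fresh `trigger()` on the same session is accepted again.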
@@ -86,6 +88,9 @@ Notes:
 - this UI state is transport-oriented and is separate from the shared gateway `session.state` payload
 - `session.state` currently reflects the gateway session phase (`idle`, `listening`, `thinking`, `speaking`)
 - the UI disables the mocked-turn control until `session.ready` arrives, while disconnected, or while a mocked turn is already in flight
+- the UI disables the mic control while disconnected, before `session.ready`, or while a mocked turn is already in flight
+- pressing the mic control sends one placeholder `input_audio.append` chunk and releasing it sends `input_audio.commit`
+- the UI copy explicitly labels the mic button as a control shell and not real microphone capture
 - the UI shows a cancel control and enables it only while a mocked turn is active
 - after cancel returns the gateway to `idle`, the UI clears the active-turn indicator but keeps any transcript or response text that was already rendered
 - the UI treats malformed server messages, browser WebSocket errors, and gateway `error` events as safe error states instead of throwing
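The control gating and push-to-talk notes in this hunk can be sketched as derived state plus a pair of press/release handlers. The names below (`UiShellState`, `primaryControlsDisabled`, `cancelEnabled`, `createPushToTalk`, `PushToTalkEvent`) are hypothetical, the bare `{ type }` event shape is an assumption, and treating every status other than `connected` as "disconnected" is an interpretation of the notes, not confirmed behavior.

```typescript
// Hypothetical sketch of the UI control gating and push-to-talk shell.
type ConnectionStatus =
  | "not connected"
  | "connecting"
  | "connected"
  | "disconnected"
  | "error";

interface UiShellState {
  connection: ConnectionStatus;
  sessionReady: boolean; // true once `session.ready` has arrived
  mockedTurnActive: boolean; // true while a mocked turn is in flight
}

// The notes give the mic control and the mocked-turn control the same
// disabling conditions, so one predicate covers both in this sketch.
function primaryControlsDisabled(state: UiShellState): boolean {
  return (
    state.connection !== "connected" ||
    !state.sessionReady ||
    state.mockedTurnActive
  );
}

// The cancel control is enabled only while a mocked turn is active.
function cancelEnabled(state: UiShellState): boolean {
  return state.mockedTurnActive;
}

// Assumed minimal outgoing event shape; no audio payload is attached.
type PushToTalkEvent =
  | { type: "input_audio.append" }
  | { type: "input_audio.commit" };

function createPushToTalk(
  getState: () => UiShellState,
  send: (event: PushToTalkEvent) => void,
) {
  let pressed = false;
  return {
    // Press: send one placeholder chunk, unless the control is disabled.
    press(): void {
      if (pressed || primaryControlsDisabled(getState())) return;
      pressed = true;
      send({ type: "input_audio.append" });
    },
    // Release: commit only if a matching press was actually sent.
    release(): void {
      if (!pressed) return;
      pressed = false;
      send({ type: "input_audio.commit" });
    },
  };
}
```

Tracking `pressed` locally keeps a stray release (for example, a pointer-up without a prior accepted press) from emitting an unmatched `input_audio.commit`; that guard is a choice made for this sketch.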