feat(vela-ui): add voice session shell

Add a minimal UI shell that connects to the gateway WebSocket and exposes developer-visible session state. Align the architecture, protocol, setup, integration, and backlog docs with the current UI increment.
2026-04-08 18:40:45 +02:00
parent fa5a458003
commit 4b11703c93
7 changed files with 317 additions and 20 deletions


@@ -36,18 +36,35 @@ The repository now includes separate runnable workspaces for the UI and gateway
- PWA enabled
- WebSocket client
The current implementation is a minimal SvelteKit app with a single starter page. PWA behavior, microphone capture, and the WebSocket client are later increments.
The current implementation is a minimal SvelteKit app with a single voice-session shell page. The shipped UI can open and close a browser WebSocket connection to the gateway `/ws` endpoint, show explicit connection status (`not connected`, `connecting`, `connected`, `disconnected`, `error`), and surface session metadata for developers. Microphone capture, transcript rendering, interrupt controls, streamed assistant response display, and audio playback are not part of the current shell and remain future work.
#### Responsibilities
Current shell responsibilities:
- connection state rendering
- developer-oriented session metadata rendering
- browser session connect/disconnect controls
Future UI responsibilities:
- audio capture from microphone
- audio playback for TTS
- UI state rendering
- session management
- broader voice-session UI state rendering
- interrupt handling
#### Main Screen
Current shell:
- developer-focused voice-session panel
- connect button
- disconnect button
- connection status indicator
- session metadata display
Future interactive voice screen:
- large mic button
- live transcript
- streamed assistant response text
@@ -85,6 +102,14 @@ The current implementation is a minimal Fastify service with `/`, `/health`, and
- valid minimal client events can move the session between `idle` and `listening`
- invalid JSON, invalid envelopes, and malformed frames are handled defensively so the process stays up
### Current UI shell behavior
- renders a minimal developer-focused voice-session panel
- exposes connect and disconnect controls only
- does not request microphone permission
- does not send or process audio data
- reads `session.ready`, `session.state`, and `error` messages from the shared protocol contract
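
A minimal sketch of how the shell could read those three server messages defensively. The field layout (`type` and `payload` at the top level of the parsed JSON) and the helper name `parseServerMessage` are assumptions for illustration; the real shape is whatever the shared `@vela/protocol` envelope defines.

```ts
// Result of reading one incoming frame. Hypothetical type for this sketch only.
type ParsedServerMessage =
  | { ok: true; type: "session.ready" | "session.state" | "error"; payload: unknown }
  | { ok: false; reason: string };

// The three server event names the shell is documented to read.
const KNOWN_TYPES = ["session.ready", "session.state", "error"] as const;

// Never throws: every failure is turned into an { ok: false } result so the
// UI can move to a safe error state instead of crashing.
function parseServerMessage(raw: unknown): ParsedServerMessage {
  if (typeof raw !== "string") return { ok: false, reason: "non-text frame" };

  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "invalid JSON" };
  }

  if (typeof data !== "object" || data === null) {
    return { ok: false, reason: "not an object" };
  }

  // Assumed layout: { type, payload }. Adjust to the real envelope fields.
  const { type, payload } = data as { type?: unknown; payload?: unknown };
  const known = KNOWN_TYPES.find((t) => t === type);
  if (known === undefined) return { ok: false, reason: "unknown event type" };

  return { ok: true, type: known, payload };
}
```

Returning a tagged result rather than throwing keeps the decision about what to render in one place in the UI.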
## Voice Pipeline
```text


@@ -33,7 +33,8 @@ Prove the end-to-end interaction model with mocked or stubbed providers.
- [x] bootstrap `vela-ui` as a runnable SvelteKit app in the Yarn workspace
- [x] bootstrap `vela-gateway` as a runnable Fastify app in the Yarn workspace
- create a minimal UI with mic control, state indicator, transcript, and response text
- [x] add the first UI voice-session shell with connect/disconnect controls and explicit WebSocket status
- create a minimal UI with mic control, transcript, and response text
- [x] create a gateway WebSocket session skeleton
- implement mocked STT flow for partial and final transcript events
- implement mocked LLM response streaming
@@ -179,6 +180,7 @@ Polish the system after the core voice loop is reliable.
## Current Progress Notes
- `apps/vela-ui` now boots as a minimal SvelteKit app with a starter page
- `apps/vela-ui` now includes a minimal voice-session shell that can connect to the gateway `/ws` endpoint and display developer-visible session status
- `apps/vela-gateway` now boots as a minimal Fastify app with `/` and `/health` endpoints
- `apps/vela-gateway` now exposes a minimal `/ws` WebSocket session skeleton with ephemeral in-memory sessions and defensive message handling
- `apps/vela-protocol` now provides the shared WebSocket event contract for the UI and gateway


@@ -5,6 +5,7 @@
- `vela-ui` is implemented as a SvelteKit application
- `vela-gateway` is implemented as a Fastify service
- `vela-gateway` now exposes `/ws` as the minimal WebSocket session entrypoint using the shared `@vela/protocol` contract
- `vela-ui` now opens a minimal browser WebSocket client against that `/ws` entrypoint and surfaces connection/session status for developers
- current integration work beyond the gateway WebSocket/session baseline remains future implementation
## Gateway Session Contract
@@ -14,6 +15,7 @@
- message format: `@vela/protocol` `MessageEnvelope<{ type, payload }>`
- current server behavior: acknowledge connect with `session.ready` and `session.state`
- safety baseline: invalid JSON, invalid envelopes, and malformed frames return protocol errors or close that socket without taking down the service
- current UI behavior: connect/disconnect only, no microphone access, no audio payloads, and safe error-state handling for `open`/`error`/`close`
## STT (Speech-to-Text)


@@ -11,6 +11,12 @@ Current gateway baseline:
- the gateway sends `session.ready` and `session.state` immediately after a successful socket upgrade
- the gateway accepts JSON text messages only in the shared envelope shape
Current UI baseline:
- the browser opens a WebSocket directly to `/ws`
- the UI tracks connection status separately from gateway session status
- the UI currently consumes server events but does not send `session.start` or any audio events yet
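
As an illustration of the first point, the `/ws` URL can be derived from an HTTP base URL with the standard `URL` API. The helper name `gatewayWsUrl` and the idea of passing in a gateway base URL are assumptions of this sketch, not something the commit specifies.

```ts
// Hypothetical helper: turn an http(s) gateway base URL into the ws(s) URL
// of the `/ws` endpoint. Uses only the standard WHATWG URL API.
function gatewayWsUrl(gatewayBaseUrl: string): string {
  // Resolve the absolute path `/ws` against the base, dropping any base path.
  const url = new URL("/ws", gatewayBaseUrl);
  // Keep TLS when the base uses it: https -> wss, anything else -> ws.
  url.protocol = url.protocol === "https:" ? "wss:" : "ws:";
  return url.toString();
}
```

The result can then be passed to the browser `WebSocket` constructor, for example `new WebSocket(gatewayWsUrl(baseUrl))`, where `baseUrl` is wherever the gateway is served in a given setup.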
## WebSocket Message Envelope
Every WebSocket message uses one envelope format:
@@ -57,6 +63,24 @@ type ClientEvent =
- invalid envelopes or unsupported client event names produce `error` with code `invalid_message`
- malformed WebSocket frames are rejected without crashing the gateway process
### UI connection shell behavior
The UI currently exposes a small browser-side connection state machine for the WebSocket transport:
```text
not connected
→ connecting
→ connected
→ disconnected
→ error
```
Notes:
- this UI state is transport-oriented and is separate from the shared gateway `session.state` payload
- `session.state` currently reflects the gateway session phase (`idle`, `listening`, `thinking`, `speaking`)
- the UI treats malformed server messages, browser WebSocket errors, and gateway `error` events as safe error states instead of throwing
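
The transport state machine above can be sketched as a pure transition function. The status names come from the documented list; the event names (`connect`, `open`, `close`, `error`, `disconnect`) and the rule that a later `close` does not overwrite an `error` are assumptions made for this sketch.

```ts
// Transport-level status, as listed in the docs.
type ConnectionStatus =
  | "not connected"
  | "connecting"
  | "connected"
  | "disconnected"
  | "error";

// Hypothetical event names: the two UI controls plus the browser WebSocket callbacks.
type ConnectionEvent = "connect" | "open" | "close" | "error" | "disconnect";

function nextStatus(current: ConnectionStatus, event: ConnectionEvent): ConnectionStatus {
  switch (event) {
    case "connect":
      // Ignore repeated connect clicks while a socket is opening or open.
      return current === "connecting" || current === "connected" ? current : "connecting";
    case "open":
      // Only a pending connection can become connected.
      return current === "connecting" ? "connected" : current;
    case "error":
      return "error";
    case "close":
    case "disconnect":
      // Browsers usually fire `close` after `error`; keep the error visible.
      if (current === "error") return "error";
      // Nothing was ever opened, so there is nothing to disconnect.
      if (current === "not connected") return "not connected";
      return "disconnected";
  }
}
```

Because the function only maps a status and an event to the next status, it stays independent of the gateway `session.state` payload, which the UI tracks separately.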
### Server → Client
```ts


@@ -66,7 +66,8 @@ mise exec -- yarn build:gateway
## Notes
- the concrete framework choices are now SvelteKit for `vela-ui` and Fastify for `vela-gateway`
- the UI is intentionally minimal and does not yet include mic capture, transcript rendering, or WebSocket session state
- the gateway is intentionally minimal and does not yet expose the planned WebSocket contract
- the UI is intentionally minimal and currently includes only a developer-facing WebSocket voice-session shell
- the UI does not yet include mic capture, transcript rendering, assistant response rendering, or audio playback
- the gateway now exposes the minimal shared-protocol `/ws` WebSocket contract used by that shell
- if your shell is configured for mise activation, plain `yarn` commands can be used after `mise install`
- update this document when the repo layout or package manager workflow changes