diff --git a/apps/vela-ui/README.md b/apps/vela-ui/README.md
index f8d47f5..0240b1a 100644
--- a/apps/vela-ui/README.md
+++ b/apps/vela-ui/README.md
@@ -5,5 +5,6 @@ This workspace contains the Vela browser UI as a minimal SvelteKit app.
 Current status:
 - SvelteKit app boots in the Yarn workspace
-- root page shows the initial Vela UI starter screen
-- PWA features and voice interaction flows remain future increments
+- root page shows a minimal voice-session shell with connect/disconnect controls
+- the shell can connect to the gateway `/ws` endpoint and display developer-visible session status
+- microphone capture, transcript rendering, and audio playback remain future increments
diff --git a/apps/vela-ui/src/routes/+page.svelte b/apps/vela-ui/src/routes/+page.svelte
index 7a6f7af..fb8156c 100644
--- a/apps/vela-ui/src/routes/+page.svelte
+++ b/apps/vela-ui/src/routes/+page.svelte
@@ -2,43 +2,262 @@
 Vela UI

Vela UI

-

Minimal SvelteKit starter

+

Voice session shell

- This workspace now runs as the browser shell for Vela. The voice controls, transcript, and
- streaming session UI will be added in later increments.
+ This minimal browser shell can connect to the gateway WebSocket and expose developer-visible
+ session status. Microphone capture, transcript rendering, and audio playback remain future
+ increments.

 Shared protocol package loaded with {CLIENT_EVENT_TYPES.length} client event types and
- {SERVER_EVENT_TYPES.length} server event types.
+ {SERVER_EVENT_TYPES.length} server event types across {SESSION_STATES.length} gateway session
+ states.

+
+ + +
+
- Status
- {appStatus}
+ UI connection state
+ {connectionState}
- Next
- {nextFocus}
+ Connection detail
+ {connectionDetail}
+
+
+ Gateway WebSocket URL
+ {gatewayWebSocketUrl}
+
+
+ Session ID
+ {sessionId}
+
+
+ Gateway session state
+ {gatewaySessionState}
+
+
+ Last server event
+ {lastServerEvent}
+
+
+ Last error
+ {lastError}
+
+
+ Last close
+ {lastClose}
+
+
+ Connection attempts
+ {connectionAttempts}
+
+
+ Protocol package
+ {PROTOCOL_PACKAGE_NAME}
@@ -91,9 +310,32 @@
   margin-top: 1rem;
 }
+ .controls {
+   margin-top: 1.5rem;
+   display: flex;
+   gap: 0.75rem;
+   flex-wrap: wrap;
+ }
+
+ button {
+   padding: 0.8rem 1.1rem;
+   border: 1px solid #2b4a6b;
+   border-radius: 0.75rem;
+   background: #12233a;
+   color: #e6eef8;
+   font: inherit;
+   cursor: pointer;
+ }
+
+ button:disabled {
+   opacity: 0.55;
+   cursor: not-allowed;
+ }
+
 .meta {
   margin-top: 1.5rem;
   display: grid;
+   grid-template-columns: repeat(auto-fit, minmax(14rem, 1fr));
   gap: 1rem;
 }
diff --git a/docs/architecture.md b/docs/architecture.md
index f612609..6660e46 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -36,18 +36,35 @@ The repository now includes separate runnable workspaces for the UI and gateway
 - PWA enabled
 - WebSocket client
-The current implementation is a minimal SvelteKit app with a single starter page. PWA behavior, microphone capture, and the WebSocket client are later increments.
+The current implementation is a minimal SvelteKit app with a single voice-session shell page. The shipped UI can open and close a browser WebSocket connection to the gateway `/ws` endpoint, show explicit connection status (`not connected`, `connecting`, `connected`, `disconnected`, `error`), and surface session metadata for developers. Microphone capture, transcript rendering, interrupt controls, streamed assistant response display, and audio playback are not part of the current shell and remain future work.
 #### Responsibilities
+Current shell responsibilities:
+
+- connection state rendering
+- developer-oriented session metadata rendering
+- browser session connect/disconnect controls
+
+Future UI responsibilities:
+
 - audio capture from microphone
 - audio playback for TTS
-- UI state rendering
-- session management
+- broader voice-session UI state rendering
 - interrupt handling
 #### Main Screen
+Current shell:
+
+- developer-focused voice-session panel
+- connect button
+- disconnect button
+- connection status indicator
+- session metadata display
+
+Future interactive voice screen:
+
 - large mic button
 - live transcript
 - streamed assistant response text
@@ -85,6 +102,14 @@ The current implementation is a minimal Fastify service with `/`, `/health`, and
 - valid minimal client events can move the session between `idle` and `listening`
 - invalid JSON, invalid envelopes, and malformed frames are handled defensively so the process stays up
+### Current UI shell behavior
+
+- renders a minimal developer-focused voice-session panel
+- exposes connect and disconnect controls only
+- does not request microphone permission
+- does not send or process audio data
+- reads `session.ready`, `session.state`, and `error` messages from the shared protocol contract
+
 ## Voice Pipeline
 ```text
diff --git a/docs/backlog.md b/docs/backlog.md
index f2282c3..638d83b 100644
--- a/docs/backlog.md
+++ b/docs/backlog.md
@@ -33,7 +33,8 @@ Prove the end-to-end interaction model with mocked or stubbed providers.
 - [x] bootstrap `vela-ui` as a runnable SvelteKit app in the Yarn workspace
 - [x] bootstrap `vela-gateway` as a runnable Fastify app in the Yarn workspace
-- create a minimal UI with mic control, state indicator, transcript, and response text
+- [x] add the first UI voice-session shell with connect/disconnect controls and explicit WebSocket status
+- create a minimal UI with mic control, transcript, and response text
 - [x] create a gateway WebSocket session skeleton
 - implement mocked STT flow for partial and final transcript events
 - implement mocked LLM response streaming
@@ -179,6 +180,7 @@ Polish the system after the core voice loop is reliable.
 ## Current Progress Notes
 - `apps/vela-ui` now boots as a minimal SvelteKit app with a starter page
+- `apps/vela-ui` now includes a minimal voice-session shell that can connect to the gateway `/ws` endpoint and display developer-visible session status
 - `apps/vela-gateway` now boots as a minimal Fastify app with `/` and `/health` endpoints
 - `apps/vela-gateway` now exposes a minimal `/ws` WebSocket session skeleton with ephemeral in-memory sessions and defensive message handling
 - `apps/vela-protocol` now provides the shared WebSocket event contract for the UI and gateway
diff --git a/docs/integrations.md b/docs/integrations.md
index c092447..571f025 100644
--- a/docs/integrations.md
+++ b/docs/integrations.md
@@ -5,6 +5,7 @@
 - `vela-ui` is implemented as a SvelteKit application
 - `vela-gateway` is implemented as a Fastify service
 - `vela-gateway` now exposes `/ws` as the minimal WebSocket session entrypoint using the shared `@vela/protocol` contract
+- `vela-ui` now opens a minimal browser WebSocket client against that `/ws` entrypoint and surfaces connection/session status for developers
 - current integration work beyond the gateway WebSocket/session baseline remains future implementation
 ## Gateway Session Contract
@@ -14,6 +15,7 @@
 - message format: `@vela/protocol` `MessageEnvelope<{ type, payload }>`
 - current server behavior: acknowledge connect with `session.ready` and `session.state`
 - safety baseline: invalid JSON, invalid envelopes, and malformed frames return protocol errors or close that socket without taking down the service
+- current UI behavior: connect/disconnect only, no microphone access, no audio payloads, and safe error-state handling for `open`/`error`/`close`
 ## STT (Speech-to-Text)
diff --git a/docs/protocol.md b/docs/protocol.md
index fa2f346..f0a8e31 100644
--- a/docs/protocol.md
+++ b/docs/protocol.md
@@ -11,6 +11,12 @@ Current gateway baseline:
 - the gateway sends `session.ready` and `session.state` immediately after a successful socket upgrade
 - the gateway accepts JSON text messages only in the shared envelope shape
+Current UI baseline:
+
+- the browser opens a WebSocket directly to `/ws`
+- the UI tracks connection status separately from gateway session status
+- the UI currently consumes server events but does not send `session.start` or any audio events yet
+
 ## WebSocket Message Envelope
 Every WebSocket message uses one envelope format:
@@ -57,6 +63,24 @@ type ClientEvent =
 - invalid envelopes or unsupported client event names produce `error` with code `invalid_message`
 - malformed WebSocket frames are rejected without crashing the gateway process
+### UI connection shell behavior
+
+The UI currently exposes a small browser-side connection state machine for the WebSocket transport:
+
+```text
+not connected
+  → connecting
+  → connected
+  → disconnected
+  → error
+```
+
+Notes:
+
+- this UI state is transport-oriented and is separate from the shared gateway `session.state` payload
+- `session.state` currently reflects the gateway session phase (`idle`, `listening`, `thinking`, `speaking`)
+- the UI treats malformed server messages, browser WebSocket errors, and gateway `error` events as safe error states instead of throwing
+
 ### Server → Client
 ```ts
diff --git a/docs/setup.md b/docs/setup.md
index dfe2b6f..ca97522 100644
--- a/docs/setup.md
+++ b/docs/setup.md
@@ -66,7 +66,8 @@ mise exec -- yarn build:gateway
 ## Notes
 - the concrete framework choices are now SvelteKit for `vela-ui` and Fastify for `vela-gateway`
-- the UI is intentionally minimal and does not yet include mic capture, transcript rendering, or WebSocket session state
-- the gateway is intentionally minimal and does not yet expose the planned WebSocket contract
+- the UI is intentionally minimal and currently includes only a developer-facing WebSocket voice-session shell
+- the UI does not yet include mic capture, transcript rendering, assistant response rendering, or audio playback
+- the gateway now exposes the minimal shared-protocol `/ws` WebSocket contract used by that shell
 - if your shell is configured for mise activation, plain `yarn` commands can be used after `mise install`
 - update this document when the repo layout or package manager workflow changes
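
Reviewer note: the UI connection shell behavior described in the `docs/protocol.md` hunk above (a transport-level state machine plus safe handling of malformed server messages) could be modeled roughly as follows. This is a minimal sketch and not the code in `+page.svelte`, whose script is not visible in this diff. The names `ConnectionState`, `TransportEvent`, `nextConnectionState`, `ServerMessage`, and `parseServerMessage` are hypothetical, and the envelope check assumes only the `type` and `payload` fields mentioned in `docs/integrations.md`; the real `@vela/protocol` envelope may carry more.

```typescript
// Hypothetical sketch: none of these names come from the PR itself.

// The five transport states listed in docs/protocol.md.
type ConnectionState =
  | 'not connected'
  | 'connecting'
  | 'connected'
  | 'disconnected'
  | 'error';

// Assumed inputs: a user-initiated connect plus the browser WebSocket
// `open`, `close`, and `error` events.
type TransportEvent =
  | { kind: 'connect' }
  | { kind: 'open' }
  | { kind: 'close' }
  | { kind: 'error' };

function nextConnectionState(
  current: ConnectionState,
  event: TransportEvent
): ConnectionState {
  switch (event.kind) {
    case 'connect':
      // Ignore a second connect while a socket is pending or open.
      return current === 'connecting' || current === 'connected'
        ? current
        : 'connecting';
    case 'open':
      return current === 'connecting' ? 'connected' : current;
    case 'close':
      // Browsers typically fire `close` after `error`; keep the error visible.
      return current === 'error' || current === 'not connected'
        ? current
        : 'disconnected';
    case 'error':
      return 'error';
  }
}

// Assumed minimal envelope shape: only `type` and `payload` are checked.
type ServerMessage = { type: string; payload: unknown };

// Returns null instead of throwing so callers can move to a safe error state.
function parseServerMessage(raw: unknown): ServerMessage | null {
  if (typeof raw !== 'string') return null;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== 'object' || parsed === null) return null;
    const candidate = parsed as Record<string, unknown>;
    if (typeof candidate.type !== 'string' || !('payload' in candidate)) {
      return null;
    }
    return { type: candidate.type, payload: candidate.payload };
  } catch {
    return null;
  }
}
```

Keeping the transition logic in a pure function like this would keep the transport state separate from the gateway `session.state` payload, as the protocol notes require, and would make it testable without opening a real socket.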