feat: bootstrap vela UI and gateway workspace
Establish the monorepo, tooling, and starter apps so UI and gateway development can begin from a documented, runnable baseline.
183
docs/backlog.md
Normal file
@@ -0,0 +1,183 @@
# Vela Phased Backlog

This backlog is the implementation plan translated into phased, actionable work. It should be updated whenever implementation changes scope, ordering, or done criteria.

## Phase 1 — Foundation and Contracts

### Goal

Establish the boundaries, protocol, and state model for the system before integrating providers.

### Backlog Items

- [x] define repository structure for `vela-ui` and `vela-gateway`
- define the WebSocket event contract used by the UI and gateway
- define the session state machine and interrupt semantics
- define provider adapter interfaces for STT, LLM, TTS, and tools
- document error handling and cancellation behavior
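
As a starting point for the WebSocket event contract item above, the sketch below shows one possible shape: a discriminated union per direction plus a parser that rejects anything it does not recognize. Every event name and field here (`audio.start`, `turnId`, `parseClientEvent`, and so on) is a hypothetical placeholder, since the real contract has not been defined yet.

```typescript
// Hypothetical event names and fields; the real contract is still to be defined.
type ClientEvent =
  | { type: "audio.start"; turnId: string }
  | { type: "audio.stop"; turnId: string }
  | { type: "interrupt"; turnId: string };

type ServerEvent =
  | { type: "transcript.partial"; turnId: string; text: string }
  | { type: "transcript.final"; turnId: string; text: string }
  | { type: "assistant.delta"; turnId: string; text: string }
  | { type: "error"; turnId: string; code: string; recoverable: boolean };

const CLIENT_EVENT_TYPES = ["audio.start", "audio.stop", "interrupt"] as const;

// Parse an untrusted WebSocket text frame. Returns null instead of throwing,
// so the gateway can answer with an error event rather than crash.
function parseClientEvent(raw: string): ClientEvent | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const candidate = value as { type?: unknown; turnId?: unknown };
  if (typeof candidate.turnId !== "string") return null;
  const type = CLIENT_EVENT_TYPES.find((t) => t === candidate.type);
  // Only known fields are copied, so unexpected extras never reach the session.
  return type ? { type, turnId: candidate.turnId } : null;
}

// Server events are built by gateway code, so serializing is enough.
function serializeServerEvent(event: ServerEvent): string {
  return JSON.stringify(event);
}
```

Sharing these types between `vela-ui` and `vela-gateway` would let the compiler catch contract drift on both sides, though where they live in the workspace is left open here.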

### Exit Criteria

- protocol and state machine are documented
- UI and gateway responsibilities are explicit
- interrupt behavior is defined for every active phase
- provider boundaries are clear enough to implement mocks first
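
One way to make "interrupt behavior is defined for every active phase" checkable is a transition table. The sketch below is illustrative only: the state names (`idle`, `listening`, `thinking`, `speaking`) and input names are assumptions, not the documented state machine.

```typescript
// Hypothetical states and inputs; the documented state machine may differ.
type SessionState = "idle" | "listening" | "thinking" | "speaking";
type SessionInput =
  | "mic_opened"
  | "final_transcript"
  | "response_ready"
  | "playback_done"
  | "interrupt";

// Each active state lists an explicit "interrupt" transition back to idle.
const TRANSITIONS: Record<SessionState, Partial<Record<SessionInput, SessionState>>> = {
  idle: { mic_opened: "listening" },
  listening: { final_transcript: "thinking", interrupt: "idle" },
  thinking: { response_ready: "speaking", interrupt: "idle" },
  speaking: { playback_done: "idle", interrupt: "idle" },
};

// Inputs that are not valid for the current state are ignored rather than
// thrown, so a stray or late event leaves the session where it was.
function nextState(state: SessionState, input: SessionInput): SessionState {
  return TRANSITIONS[state][input] ?? state;
}
```

With a table like this, a unit test can loop over every non-idle state and assert that `interrupt` leads to `idle`, which turns the exit criterion into something that can fail in CI.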

## Phase 2 — Vertical Slice Skeleton

### Goal

Prove the end-to-end interaction model with mocked or stubbed providers.

### Backlog Items

- [x] bootstrap `vela-ui` as a runnable SvelteKit app in the Yarn workspace
- [x] bootstrap `vela-gateway` as a runnable Fastify app in the Yarn workspace
- create a minimal UI with mic control, state indicator, transcript, and response text
- create a gateway WebSocket session skeleton
- implement mocked STT flow for partial and final transcript events
- implement mocked LLM response streaming
- implement stubbed audio playback or placeholder TTS output
- implement interrupt handling across the mocked pipeline

### Exit Criteria

- one client can complete a voice turn through the real UI↔gateway contract
- transcript appears in the UI
- assistant text appears progressively or in structured steps
- audio playback or stubbed playback is visible to the user
- interrupt stops the active response and resets state cleanly

## Phase 3 — Real STT Integration

### Goal

Replace the mocked transcription layer with a real streaming STT provider.

### Backlog Items

- integrate `whisper.cpp` behind the STT adapter
- support partial and final transcript delivery
- handle audio format conversion if browser capture format differs
- handle late transcript events after cancellation
- expose recoverable error handling for STT failures

### Exit Criteria

- live mic audio produces usable transcripts
- partial and final results reach the UI
- cancellation prevents late transcript results from corrupting session state
- STT failure paths are visible and recoverable
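
The late-transcript criterion above can be met by tagging each turn with an identifier and dropping events whose tag no longer matches the active turn. The following is a minimal sketch under that assumption; `TranscriptGate`, `TranscriptEvent`, and the numeric `turnId` are hypothetical names, not part of the existing codebase.

```typescript
// Hypothetical shape of an event coming back from the STT adapter.
interface TranscriptEvent {
  turnId: number;
  text: string;
  final: boolean;
}

// Tracks which turn is allowed to update session state.
class TranscriptGate {
  private activeTurn: number | null = null;
  private nextTurn = 1;

  // Begin a new turn and return the id the STT request should carry.
  startTurn(): number {
    this.activeTurn = this.nextTurn++;
    return this.activeTurn;
  }

  // Cancellation clears the active turn, so results still in flight
  // from the provider no longer match anything.
  cancel(): void {
    this.activeTurn = null;
  }

  // Only events from the currently active turn may reach session state.
  accept(event: TranscriptEvent): boolean {
    return event.turnId === this.activeTurn;
  }
}
```

Because turn ids are never reused, a result from a cancelled turn is rejected even if a new turn has already started by the time it arrives.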

## Phase 4 — Ollama Streaming and Tool Calling

### Goal

Replace the mocked reasoning layer with real LLM orchestration.

### Backlog Items

- integrate Ollama behind the LLM adapter
- stream assistant text deltas to the UI
- define and validate tool-calling schema
- reject invalid or unsafe tool calls
- support interrupt during active thinking

### Exit Criteria

- assistant responses stream from Ollama
- invalid tool requests fail safely
- cancellation stops active model work
- the LLM cannot directly execute external actions
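
To illustrate "invalid tool requests fail safely", the sketch below validates a model-proposed tool call against an allow-list before anything is executed. The tool names (`web_search`, `ha_get_state`), the argument names, and the `validateToolCall` helper are assumptions for illustration; the actual schema is a backlog item and is not defined yet.

```typescript
type ToolCall = { name: string; args: Record<string, string> };
type ValidationResult =
  | { ok: true; call: ToolCall }
  | { ok: false; reason: string };

// Hypothetical allow-list: tool name -> required string arguments.
const TOOL_SCHEMAS = new Map<string, readonly string[]>([
  ["web_search", ["query"]],
  ["ha_get_state", ["entity_id"]],
]);

// Validate untrusted model output. The gateway, not the LLM, decides
// whether the returned call is ever executed.
function validateToolCall(raw: unknown): ValidationResult {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, reason: "tool call is not an object" };
  }
  const { name, args } = raw as { name?: unknown; args?: unknown };
  if (typeof name !== "string") {
    return { ok: false, reason: "missing tool name" };
  }
  const required = TOOL_SCHEMAS.get(name);
  if (!required) {
    return { ok: false, reason: `unknown tool: ${name}` };
  }
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    return { ok: false, reason: "arguments must be an object" };
  }
  const given = args as Record<string, unknown>;
  const clean: Record<string, string> = {};
  for (const key of required) {
    const value = given[key];
    if (typeof value !== "string" || value.trim() === "") {
      return { ok: false, reason: `missing or invalid argument: ${key}` };
    }
    clean[key] = value;
  }
  // Arguments outside the schema are dropped rather than forwarded.
  return { ok: true, call: { name, args: clean } };
}
```

Returning a result object instead of throwing keeps the failure path explicit, so the orchestrator can report the reason back to the model or the user without a try/catch around every call.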

## Phase 5 — Tool Layer

### Goal

Introduce useful tools in increasing order of operational risk.

### Backlog Items

- implement SearXNG search adapter
- normalize search results for LLM consumption
- implement Home Assistant read actions
- implement Home Assistant write actions gated by confirmation
- implement clarification flow for ambiguous tool requests

### Exit Criteria

- web search works end-to-end
- Home Assistant read queries work for approved entities
- Home Assistant write actions require explicit confirmation
- ambiguous actions do not execute automatically

## Phase 6 — Kokoro TTS

### Goal

Convert assistant text responses into spoken output.

### Backlog Items

- integrate Kokoro behind the TTS adapter
- support streamed audio when practical
- add a temporary fallback for full-response playback if streaming is not ready
- stop or suppress playback correctly on interrupt

### Exit Criteria

- spoken output plays in the UI
- interrupt stops or suppresses playback reliably
- any non-streaming fallback is explicitly documented as temporary

## Phase 7 — Resilience and Performance

### Goal

Make the system robust enough for routine use on the target hardware.

### Backlog Items

- handle disconnect and reconnect cleanly
- add bounded timeouts for STT, LLM, tool, and TTS calls
- measure latency by pipeline stage
- improve buffering and recovery paths for flaky network dependencies
- validate behavior under cancellation and partial failure

### Exit Criteria

- common network and provider failures do not leave sessions stuck
- latency is measurable at each major stage
- user-visible recovery paths exist for expected failure modes
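
For the bounded-timeout item above, one common pattern is to race the provider call against a timer and abort the call when the timer wins. The sketch below assumes each adapter call accepts an `AbortSignal`; `withTimeout` and `TimeoutError` are hypothetical helpers, not existing code.

```typescript
// Carries the pipeline stage so the UI can show which step timed out.
class TimeoutError extends Error {
  constructor(readonly stage: string, readonly ms: number) {
    super(`${stage} timed out after ${ms} ms`);
    this.name = "TimeoutError";
  }
}

// Run `work` with an upper bound on its duration. On timeout the signal is
// aborted so the provider call can stop, and the returned promise rejects.
async function withTimeout<T>(
  stage: string,
  ms: number,
  work: (signal: AbortSignal) => Promise<T>,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort();
      reject(new TimeoutError(stage, ms));
    }, ms);
  });
  try {
    return await Promise.race([work(controller.signal), timeout]);
  } finally {
    // Clear the timer on every path so a finished call leaves nothing pending.
    clearTimeout(timer);
  }
}
```

A caller might wrap an LLM request as `withTimeout("llm", 30000, (signal) => fetch(url, { signal }))`, where the 30-second budget and `url` are placeholders to be tuned per stage on the target hardware.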

## Phase 8 — Productization and Secondary Surfaces

### Goal

Polish the system after the core voice loop is reliable.

### Backlog Items

- add PWA installability and UX polish
- implement `/history`
- implement `/settings`
- implement `/admin`
- document operational settings and maintenance guidance

### Exit Criteria

- the app is installable as a PWA
- secondary screens exist without degrading the core voice loop
- supporting docs reflect the implemented behavior

## Ongoing Documentation Tasks

- update docs whenever implementation changes the protocol, architecture, integrations, deployment, or backlog order
- mark completed backlog items or split phases into smaller slices as work progresses
- keep root `README.md` as the entrypoint and keep detailed technical docs in `docs/`

## Current Progress Notes

- `apps/vela-ui` now boots as a minimal SvelteKit app with a starter page
- `apps/vela-gateway` now boots as a minimal Fastify app with `/` and `/health` endpoints
- backend framework choice is now concrete: Fastify