Vela Phased Backlog

This backlog translates the implementation plan into phased, actionable work. Update it whenever implementation changes the scope, ordering, or exit criteria.

Phase 1 — Foundation and Contracts

Goal

Establish the boundaries, protocol, and state model for the system before integrating providers.

Backlog Items

  • define repository structure for vela-ui and vela-gateway
  • define the WebSocket event contract shared by the UI and gateway through a shared package
  • define the session state machine and interrupt semantics
  • define provider adapter interfaces for STT, LLM, TTS, and tools
  • document error handling and cancellation behavior
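
As a rough illustration of the event contract item above, the sketch below models events as TypeScript discriminated unions with a runtime parser for incoming frames. Every event name and field here (`audio.start`, `transcript.partial`, `turnId`, `recoverable`, and so on) is hypothetical; the actual contract belongs to the shared package and may look quite different.

```typescript
// Hypothetical event shapes; the real contract is defined in the shared package.
type ClientEvent =
  | { type: "audio.start"; turnId: number }
  | { type: "audio.stop"; turnId: number }
  | { type: "interrupt"; turnId: number };

type ServerEvent =
  | { type: "transcript.partial"; turnId: number; text: string }
  | { type: "transcript.final"; turnId: number; text: string }
  | { type: "assistant.delta"; turnId: number; text: string }
  | { type: "error"; turnId: number; message: string; recoverable: boolean };

const CLIENT_EVENT_TYPES = ["audio.start", "audio.stop", "interrupt"] as const;

// Parse an incoming WebSocket text frame. Returns null for anything malformed
// so the caller can reject the frame instead of handling an exception.
function parseClientEvent(raw: string): ClientEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const candidate = data as { type?: unknown; turnId?: unknown };
  const knownType = CLIENT_EVENT_TYPES.find((t) => t === candidate.type);
  if (knownType === undefined) return null;
  const turnId = candidate.turnId;
  if (typeof turnId !== "number" || !Number.isInteger(turnId)) return null;
  return { type: knownType, turnId };
}

// Serialize an outgoing event; the type annotation keeps senders on-contract.
function encodeServerEvent(event: ServerEvent): string {
  return JSON.stringify(event);
}
```

Carrying a turn identifier on every event is one possible way to let both sides discard messages that belong to a cancelled turn.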

Exit Criteria

  • protocol and state machine are documented
  • UI and gateway responsibilities are explicit
  • interrupt behavior is defined for every active phase
  • provider boundaries are clear enough to implement mocks first
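
One way to make "interrupt behavior is defined for every active phase" checkable is a table-driven state machine in which the interrupt transition is handled before any phase-specific lookup. The phase and trigger names below are assumptions chosen for illustration, not the project's documented state model.

```typescript
// Hypothetical phases and triggers; the documented state machine may differ.
type Phase = "idle" | "listening" | "thinking" | "speaking";
type Trigger =
  | "mic.start"
  | "transcript.final"
  | "response.start"
  | "response.end"
  | "interrupt";

// Allowed non-interrupt transitions per phase.
const TRANSITIONS: Record<Phase, Partial<Record<Trigger, Phase>>> = {
  idle: { "mic.start": "listening" },
  listening: { "transcript.final": "thinking" },
  thinking: { "response.start": "speaking" },
  speaking: { "response.end": "idle" },
};

function nextPhase(phase: Phase, trigger: Trigger): Phase {
  // Interrupt is valid in every phase and always resets to idle.
  if (trigger === "interrupt") return "idle";
  // Triggers that are not valid in the current phase leave it unchanged.
  return TRANSITIONS[phase][trigger] ?? phase;
}
```

Because the interrupt rule sits outside the table, adding a new phase cannot accidentally leave it without interrupt handling.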

Phase 2 — Vertical Slice Skeleton

Goal

Prove the end-to-end interaction model with mocked or stubbed providers.

Backlog Items

  • bootstrap vela-ui as a runnable SvelteKit app in the Yarn workspace
  • bootstrap vela-gateway as a runnable Fastify app in the Yarn workspace
  • create a minimal UI with mic control, state indicator, transcript, and response text
  • create a gateway WebSocket session skeleton
  • implement mocked STT flow for partial and final transcript events
  • implement mocked LLM response streaming
  • implement stubbed audio playback or placeholder TTS output
  • implement interrupt handling across the mocked pipeline
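
The mocked STT flow could be as small as a scripted adapter that reveals one more word per audio chunk and then emits a final transcript. This is only a sketch: `SttAdapter`, `TranscriptEvent`, and `MockStt` are hypothetical names, and the adapter is synchronous purely to keep the example short. A real adapter would most likely be asynchronous.

```typescript
type TranscriptEvent =
  | { kind: "partial"; text: string }
  | { kind: "final"; text: string };

// Hypothetical adapter boundary; synchronous here only for brevity.
interface SttAdapter {
  transcribe(audioChunks: Iterable<Uint8Array>): Iterable<TranscriptEvent>;
}

class MockStt implements SttAdapter {
  private script: string[];

  constructor(script: string[]) {
    this.script = script;
  }

  // Ignores the audio content: each chunk reveals one more scripted word as a
  // partial, and the full script is emitted as the final transcript.
  *transcribe(audioChunks: Iterable<Uint8Array>): Generator<TranscriptEvent> {
    let revealed = 0;
    for (const _chunk of audioChunks) {
      if (revealed < this.script.length) {
        revealed += 1;
        yield { kind: "partial", text: this.script.slice(0, revealed).join(" ") };
      }
    }
    yield { kind: "final", text: this.script.join(" ") };
  }
}
```

A mock like this lets the UI exercise partial and final rendering before any real provider is wired in.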

Exit Criteria

  • one client can complete a voice turn through the real UI↔gateway contract
  • transcript appears in the UI
  • assistant text appears progressively or in structured steps
  • audio playback or stubbed playback is visible to the user
  • interrupt stops the active response and resets state cleanly

Phase 3 — Real STT Integration

Goal

Replace the mocked transcription layer with a real streaming STT provider.

Backlog Items

  • integrate whisper.cpp behind the STT adapter
  • support partial and final transcript delivery
  • handle audio format conversion if the browser capture format differs from the format the STT provider expects
  • handle late transcript events after cancellation
  • expose recoverable error handling for STT failures

Exit Criteria

  • live mic audio produces usable transcripts
  • partial and final results reach the UI
  • cancellation prevents late transcript results from corrupting session state
  • STT failure paths are visible and recoverable
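
A common way to keep late transcript results from corrupting session state is to stamp each turn with an identifier and drop any result whose identifier is no longer current. The `TranscriptSession` class below is a hypothetical sketch of that guard, not the gateway's actual session object.

```typescript
class TranscriptSession {
  private activeTurn = 0;
  transcript = "";

  // Begin a new turn and return the ID that STT results must carry.
  startTurn(): number {
    this.activeTurn += 1;
    this.transcript = "";
    return this.activeTurn;
  }

  // Cancelling bumps the turn counter, which invalidates in-flight results.
  cancel(): void {
    this.activeTurn += 1;
    this.transcript = "";
  }

  // Returns true if the result was applied, false if it was stale and dropped.
  applyTranscript(turnId: number, text: string): boolean {
    if (turnId !== this.activeTurn) return false;
    this.transcript = text;
    return true;
  }
}
```

For example, a result that arrives after `cancel()` still carries the old turn ID, so `applyTranscript` returns `false` and the session state stays clean.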

Phase 4 — Ollama Streaming and Tool Calling

Goal

Replace the mocked reasoning layer with real LLM orchestration.

Backlog Items

  • integrate Ollama behind the LLM adapter
  • stream assistant text deltas to the UI
  • define and validate tool-calling schema
  • reject invalid or unsafe tool calls
  • support interrupt during active thinking

Exit Criteria

  • assistant responses stream from Ollama
  • invalid tool requests fail safely
  • cancellation stops active model work
  • the LLM cannot directly execute external actions
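
The sketch below shows one possible shape for "reject invalid or unsafe tool calls": the model only proposes a call as data, and the gateway checks it against an allowlist of validators before anything runs. The tool name `web_search` and its `query` argument are assumptions made for illustration; the real tool-calling schema is still to be defined.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type ValidationResult =
  | { ok: true; call: ToolCall }
  | { ok: false; reason: string };

// A validator returns null when the arguments are acceptable, or a reason.
type Validator = (args: Record<string, unknown>) => string | null;

const validateWebSearch: Validator = (args) => {
  const query = args["query"];
  return typeof query === "string" && query.trim().length > 0
    ? null
    : "query must be a non-empty string";
};

// Hypothetical allowlist: anything not registered here is rejected.
const TOOL_VALIDATORS = new Map<string, Validator>([
  ["web_search", validateWebSearch],
]);

function validateToolCall(raw: unknown): ValidationResult {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, reason: "tool call must be an object" };
  }
  const { name, args } = raw as { name?: unknown; args?: unknown };
  if (typeof name !== "string") {
    return { ok: false, reason: "tool name must be a string" };
  }
  const validator = TOOL_VALIDATORS.get(name);
  if (validator === undefined) {
    return { ok: false, reason: `unknown tool: ${name}` };
  }
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    return { ok: false, reason: "args must be an object" };
  }
  const typedArgs = args as Record<string, unknown>;
  const problem = validator(typedArgs);
  return problem === null
    ? { ok: true, call: { name, args: typedArgs } }
    : { ok: false, reason: problem };
}
```

Only a result with `ok: true` would ever be handed to the tool layer, which keeps the model from executing external actions directly.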

Phase 5 — Tool Layer

Goal

Introduce useful tools in increasing order of operational risk.

Backlog Items

  • implement SearXNG search adapter
  • normalize search results for LLM consumption
  • implement Home Assistant read actions
  • implement Home Assistant write actions gated by confirmation
  • implement clarification flow for ambiguous tool requests
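
Normalizing search results for LLM consumption might mean trimming each hit to a few fields, dropping duplicates, and capping both the count and snippet length. The sketch assumes the raw payload is an object with a `results` array whose items carry `title`, `url`, and `content` string fields; treat that shape, the function name, and the default limits as assumptions to verify against the SearXNG instance actually in use.

```typescript
type NormalizedResult = { title: string; url: string; snippet: string };

// Assumed input shape: { results: [{ title, url, content }, ...] }.
function normalizeResults(
  raw: unknown,
  limit = 5,
  maxSnippet = 200,
): NormalizedResult[] {
  const results = (raw as { results?: unknown } | null)?.results;
  if (!Array.isArray(results)) return [];

  const seen = new Set<string>();
  const out: NormalizedResult[] = [];
  for (const item of results) {
    if (out.length >= limit) break;
    if (typeof item !== "object" || item === null) continue;
    const { title, url, content } = item as {
      title?: unknown;
      url?: unknown;
      content?: unknown;
    };
    // Skip entries without the fields the model needs, and duplicate URLs.
    if (typeof title !== "string" || typeof url !== "string") continue;
    if (seen.has(url)) continue;
    seen.add(url);
    // Collapse whitespace so snippets stay compact in the prompt.
    const text =
      typeof content === "string" ? content.replace(/\s+/g, " ").trim() : "";
    out.push({ title: title.trim(), url, snippet: text.slice(0, maxSnippet) });
  }
  return out;
}
```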

Exit Criteria

  • web search works end-to-end
  • Home Assistant read queries work for approved entities
  • Home Assistant write actions require explicit confirmation
  • ambiguous actions do not execute automatically
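
The confirmation and clarification rules above can be expressed as a single decision function that runs before any tool executes. `PlannedAction`, `Decision`, and the example tool and entity names are hypothetical; the sketch only illustrates the gating order (resolve ambiguity first, then require confirmation for writes).

```typescript
type ToolRisk = "read" | "write";

type PlannedAction = {
  tool: string;
  risk: ToolRisk;
  candidateEntities: string[]; // entities the request could refer to
  confirmed: boolean; // true only after the user explicitly confirmed
};

type Decision =
  | { action: "execute" }
  | { action: "ask_confirmation"; prompt: string }
  | { action: "ask_clarification"; prompt: string };

function decide(planned: PlannedAction): Decision {
  // Ambiguous or unresolved targets never execute automatically.
  if (planned.candidateEntities.length === 0) {
    return { action: "ask_clarification", prompt: "No matching entity found. Which one did you mean?" };
  }
  if (planned.candidateEntities.length > 1) {
    return {
      action: "ask_clarification",
      prompt: `Which one did you mean: ${planned.candidateEntities.join(", ")}?`,
    };
  }
  // Write actions require explicit confirmation; reads can run directly.
  if (planned.risk === "write" && !planned.confirmed) {
    return {
      action: "ask_confirmation",
      prompt: `Run ${planned.tool} on ${planned.candidateEntities[0]}?`,
    };
  }
  return { action: "execute" };
}
```

Checking ambiguity before confirmation means a stale "yes" cannot authorize an action whose target is still unclear.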

Phase 6 — Kokoro TTS

Goal

Convert assistant text responses into spoken output.

Backlog Items

  • integrate Kokoro behind the TTS adapter
  • support streamed audio when practical
  • add a temporary fallback for full-response playback if streaming is not ready
  • stop or suppress playback correctly on interrupt

Exit Criteria

  • spoken output plays in the UI
  • interrupt stops or suppresses playback reliably
  • any non-streaming fallback is explicitly documented as temporary

Phase 7 — Resilience and Performance

Goal

Make the system robust enough for routine use on the target hardware.

Backlog Items

  • handle disconnect and reconnect cleanly
  • add bounded timeouts for STT, LLM, tool, and TTS calls
  • measure latency by pipeline stage
  • improve buffering and recovery paths for flaky network dependencies
  • validate behavior under cancellation and partial failure
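
Bounded timeouts for provider calls can be sketched with `Promise.race` and an `AbortController`, so that the timed-out work is also told to stop. `withTimeout` and `TimeoutError` are hypothetical helpers, and the stage labels are placeholders. Whether a given provider client honors an `AbortSignal` needs to be checked per integration.

```typescript
class TimeoutError extends Error {
  stage: string;

  constructor(stage: string, ms: number) {
    super(`${stage} timed out after ${ms} ms`);
    this.name = "TimeoutError";
    this.stage = stage;
  }
}

// Runs `work` with a deadline. On timeout, the signal is aborted so the work
// can stop early, and the returned promise rejects with a TimeoutError.
async function withTimeout<T>(
  stage: string,
  ms: number,
  work: (signal: AbortSignal) => Promise<T>,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      controller.abort();
      reject(new TimeoutError(stage, ms));
    }, ms);
  });
  try {
    return await Promise.race([work(controller.signal), deadline]);
  } finally {
    // Always clear the timer so a fast result does not leave it pending.
    clearTimeout(timer);
  }
}
```

A call site might look like `withTimeout("llm", 30000, (signal) => callModel(prompt, signal))`, where `callModel` stands in for whatever the LLM adapter exposes.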

Exit Criteria

  • common network and provider failures do not leave sessions stuck
  • latency is measurable at each major stage
  • user-visible recovery paths exist for expected failure modes
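
To make latency measurable per stage, a small recorder with an injectable clock is often enough to start with. The stage names mirror the pipeline described in this backlog, but `LatencyRecorder` and its methods are hypothetical, and the summary statistics are deliberately minimal.

```typescript
type Stage = "stt" | "llm" | "tool" | "tts";

class LatencyRecorder {
  private samples = new Map<Stage, number[]>();
  private now: () => number;

  // The clock is injectable so tests can control time.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Start timing a stage; call the returned function when the stage ends.
  start(stage: Stage): () => number {
    const startedAt = this.now();
    return () => {
      const elapsedMs = this.now() - startedAt;
      const list = this.samples.get(stage) ?? [];
      list.push(elapsedMs);
      this.samples.set(stage, list);
      return elapsedMs;
    };
  }

  // Returns null when no samples exist for the stage yet.
  summary(stage: Stage): { count: number; meanMs: number; maxMs: number } | null {
    const list = this.samples.get(stage);
    if (list === undefined || list.length === 0) return null;
    const total = list.reduce((sum, value) => sum + value, 0);
    return { count: list.length, meanMs: total / list.length, maxMs: Math.max(...list) };
  }
}
```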

Phase 8 — Productization and Secondary Surfaces

Goal

Polish the system after the core voice loop is reliable.

Backlog Items

  • add PWA installability and UX polish
  • implement /history
  • implement /settings
  • implement /admin
  • document operational settings and maintenance guidance

Exit Criteria

  • the app is installable as a PWA
  • secondary screens exist without degrading the core voice loop
  • supporting docs reflect the implemented behavior

Ongoing Documentation Tasks

  • update docs whenever implementation changes the protocol, architecture, integrations, deployment, or backlog order
  • mark completed backlog items or split phases into smaller slices as work progresses
  • keep the root README.md as the entry point and keep detailed technical docs in docs/

Current Progress Notes

  • apps/vela-ui now boots as a minimal SvelteKit app with a starter page
  • apps/vela-gateway now boots as a minimal Fastify app with / and /health endpoints
  • apps/vela-protocol now provides the shared WebSocket event contract for the UI and gateway
  • backend framework choice is now concrete: Fastify