# Vela Overview

## Objective

Vela is a fully local, voice-first assistant system with:

- local-first architecture and no mandatory cloud dependencies
- natural TTS output via Kokoro
- voice-driven interaction as the primary interface
- integrations with Home Assistant and SearXNG
- a lightweight SvelteKit PWA
- remote LLM inference via Ollama on a NAS

## Core Design Principles

### Voice-first

- UI optimized for speaking instead of typing
- minimal visual clutter
- real-time feedback through partial transcripts and streaming responses

### Local-first

- no required cloud APIs
- all services self-hosted
- browser used for capture and playback only

### Tool-driven intelligence

- the LLM does not directly control external systems
- all external actions route through explicit tools

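The tool-routing principle above can be sketched as a small registry that sits between model output and any side effect. This is a minimal illustration, not the gateway's actual design: `ToolCall`, `ToolResult`, `ToolRegistry`, and the `home.lights_on` tool name are all hypothetical, and the handler is a synchronous stand-in where a real one would likely be asynchronous.

```typescript
// Hypothetical types: the model emits a ToolCall, never a direct API request.
type ToolCall = { tool: string; args: Record<string, unknown> };
type ToolResult = { ok: true; output: string } | { ok: false; error: string };
type ToolHandler = (args: Record<string, unknown>) => ToolResult;

class ToolRegistry {
  private handlers = new Map<string, ToolHandler>();

  register(tool: string, handler: ToolHandler): void {
    this.handlers.set(tool, handler);
  }

  // The only path from model output to an external action.
  dispatch(call: ToolCall): ToolResult {
    const handler = this.handlers.get(call.tool);
    if (!handler) {
      return { ok: false, error: `unknown tool: ${call.tool}` };
    }
    try {
      return handler(call.args);
    } catch (err) {
      return { ok: false, error: err instanceof Error ? err.message : String(err) };
    }
  }
}

const registry = new ToolRegistry();

// Stand-in handler; a real one would call the integration it wraps.
registry.register("home.lights_on", (args) => {
  const entity = args["entity_id"];
  if (typeof entity !== "string") {
    return { ok: false, error: "entity_id must be a string" };
  }
  return { ok: true, output: `turned on ${entity}` };
});

const allowed = registry.dispatch({ tool: "home.lights_on", args: { entity_id: "light.kitchen" } });
const rejected = registry.dispatch({ tool: "shell.exec", args: { cmd: "reboot" } });
console.log(allowed, rejected);
```

Because unregistered tool names are rejected rather than executed, the set of registered handlers doubles as an allowlist of what the assistant can do.
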
### Low-latency interaction

- streaming STT partial results
- streaming LLM token output
- streaming TTS audio chunks
- interruptible responses

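One way to picture how streaming and interruption fit together is cooperative cancellation: the producer checks a shared signal before emitting each chunk. The sketch below uses the standard `AbortController`; the `streamChunks` helper and the sample chunks are hypothetical, and a synchronous generator stands in for what would be an asynchronous stream in a real pipeline.

```typescript
// Hypothetical helper: yields response chunks until the signal is aborted.
function* streamChunks(chunks: readonly string[], signal: AbortSignal): Generator<string> {
  for (const chunk of chunks) {
    if (signal.aborted) return; // stop producing as soon as the user interrupts
    yield chunk;
  }
}

const controller = new AbortController();
const played: string[] = [];

for (const chunk of streamChunks(["Turning", " on", " the", " lights"], controller.signal)) {
  played.push(chunk);
  // Simulate the user interrupting after the second chunk has been played.
  if (played.length === 2) controller.abort();
}

console.log(played.join("")); // "Turning on"
```

The same signal could be handed to each streaming stage, so that a single interruption stops text and audio output together.
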
## Product Scope

### Primary Interface

- browser-based PWA
- push-to-talk interaction
- transcript and response display
- playback of streamed or returned audio

### Secondary Screens

- `/history`
- `/settings`
- `/admin`

These screens are lower priority than the main voice loop and should be implemented after the core interaction path is stable.

## Repository Layout

- `apps/vela-ui`: minimal SvelteKit browser UI
- `apps/vela-gateway`: minimal Fastify gateway service
- `docs/`: technical documentation and phased backlog

Use Yarn workspaces from the repository root to manage these packages.

## Primary User Flow

```text
User presses mic
→ audio streaming starts
→ transcript appears
→ final transcript sent
→ assistant processes
→ response streams as text and audio
→ user can interrupt anytime
```

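The flow above can be read as a small state machine. The sketch below is illustrative only: the state and event names are hypothetical, and it assumes that an interruption is a mic press while the assistant is processing or responding, which this document does not specify.

```typescript
// Hypothetical state and event names for the voice loop.
type FlowState = "idle" | "listening" | "processing" | "responding";
type FlowEvent = "mic_pressed" | "final_transcript" | "response_started" | "response_done";

const transitions: Record<FlowState, Partial<Record<FlowEvent, FlowState>>> = {
  idle: { mic_pressed: "listening" },
  listening: { final_transcript: "processing" },
  // Assumption: pressing the mic again interrupts and starts a new capture.
  processing: { response_started: "responding", mic_pressed: "listening" },
  responding: { response_done: "idle", mic_pressed: "listening" },
};

function nextState(state: FlowState, event: FlowEvent): FlowState {
  // Events that do not apply in the current state are ignored.
  return transitions[state][event] ?? state;
}

const events: FlowEvent[] = ["mic_pressed", "final_transcript", "response_started", "mic_pressed"];
let state: FlowState = "idle";
const trace: FlowState[] = [state];
for (const event of events) {
  state = nextState(state, event);
  trace.push(state);
}

console.log(trace.join(" -> "));
// idle -> listening -> processing -> responding -> listening
```

Keeping the transitions in a single table makes the interruption rule explicit and easy to change once the protocol is settled.
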
## Non-Goals for v1

- full conversational memory system
- emotion simulation or personality modeling
- multi-user identity separation
- offline LLM on the NanoPi
- wake word and other future extensions listed in architecture docs

## Documentation Map

- [Architecture](architecture.md)
- [Protocol](protocol.md)
- [Integrations](integrations.md)
- [Deployment](deployment.md)
- [Setup](setup.md)
- [Backlog](backlog.md)