feat: bootstrap vela UI and gateway workspace
Establish the monorepo, tooling, and starter apps so UI and gateway development can begin from a documented, runnable baseline.
docs/architecture.md (new file):
# Vela Architecture

## High-Level Architecture
```text
[ Browser (PWA UI) ]
          |
      WebSocket
          |
[ Vela Gateway (NanoPi R6S) ]
          |
          +--> STT (local or NAS)
          +--> Ollama (NAS GPU)
          +--> Kokoro TTS (NAS or NanoPi)
          +--> Home Assistant
          +--> SearXNG
```
## Repository Structure

```text
apps/
  vela-ui/
  vela-gateway/
```

The repository now includes separate runnable workspaces for the UI and gateway so implementation can proceed independently while staying aligned through shared documentation.

## Core Components
### Frontend — `vela-ui`

#### Tech

- SvelteKit
- PWA enabled
- WebSocket client

The current implementation is a minimal SvelteKit app with a single starter page. PWA behavior, microphone capture, and the WebSocket client are later increments.
#### Responsibilities

- audio capture from microphone
- audio playback for TTS
- UI state rendering
- session management
- interrupt handling

#### Main Screen

- large mic button
- live transcript
- streamed assistant response text
- state indicator:
  - idle
  - listening
  - thinking
  - speaking
- interrupt button during speaking
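The state indicator above can be sketched as a small state machine. This is a hypothetical sketch; the state and event names are assumptions, not the actual `vela-ui` implementation.

```typescript
// Hypothetical sketch of the UI state indicator; event names are
// assumptions, not the actual vela-ui implementation.
type UiState = "idle" | "listening" | "thinking" | "speaking";

type UiEvent =
  | "mic_pressed"      // user taps the large mic button
  | "final_transcript" // STT produced the final transcript
  | "tts_started"      // assistant audio playback begins
  | "tts_finished"     // playback done
  | "interrupt";       // interrupt button during speaking

// Legal transitions; any event not listed keeps the current state.
const transitions: Record<UiState, Partial<Record<UiEvent, UiState>>> = {
  idle:      { mic_pressed: "listening" },
  listening: { final_transcript: "thinking", interrupt: "idle" },
  thinking:  { tts_started: "speaking" },
  speaking:  { tts_finished: "idle", interrupt: "idle" },
};

function next(state: UiState, event: UiEvent): UiState {
  return transitions[state][event] ?? state;
}
```

Keeping transitions in one table makes the interrupt behavior easy to audit: an interrupt during speaking or listening drops straight back to idle.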
### Backend — `vela-gateway`

#### Tech

- Fastify (Node)
- WebSocket-based session layer

The current implementation is a minimal Fastify service with `/` and `/health` HTTP endpoints. The WebSocket session layer is a later increment.
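As a dependency-free sketch of the same endpoint shape, here is the `/` and `/health` surface using Node's built-in `http` module. The real service uses Fastify; the response bodies here are assumptions for illustration, not the actual gateway contract.

```typescript
// Dependency-free sketch of the gateway's `/` and `/health` endpoints
// using Node's built-in http module. The actual service uses Fastify;
// the JSON response shapes here are assumptions.
import { createServer } from "node:http";

const server = createServer((req, res) => {
  if (req.url === "/health") {
    res.writeHead(200, { "content-type": "application/json" });
    res.end(JSON.stringify({ status: "ok" }));
  } else if (req.url === "/") {
    res.writeHead(200, { "content-type": "application/json" });
    res.end(JSON.stringify({ service: "vela-gateway" }));
  } else {
    res.writeHead(404);
    res.end();
  }
});

server.listen(0); // ephemeral port for local testing
```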
#### Responsibilities

- session lifecycle
- audio ingestion
- STT orchestration
- LLM orchestration
- tool execution
- TTS orchestration
- event streaming
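The event-streaming responsibility could be modeled as a tagged message type sent as JSON over the WebSocket session layer. The message names and fields below are assumptions, not a documented protocol:

```typescript
// Hypothetical shape for the gateway -> UI event stream; message names
// and fields are assumptions, not a documented protocol.
type GatewayEvent =
  | { type: "transcript.partial"; text: string }
  | { type: "transcript.final"; text: string }
  | { type: "response.delta"; text: string }   // streamed assistant text
  | { type: "response.done" }
  | { type: "tts.chunk"; audioBase64: string } // audio stream to the UI
  | { type: "error"; message: string };

// Events travel as JSON text frames over the WebSocket session layer.
function encode(ev: GatewayEvent): string {
  return JSON.stringify(ev);
}

function decode(raw: string): GatewayEvent {
  return JSON.parse(raw) as GatewayEvent;
}
```

A discriminated union like this lets the UI switch on `type` and get the matching fields type-checked on each branch.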
## Voice Pipeline

```text
Mic → Gateway → STT → Transcript
                    → LLM → Tool Calls → Results
                    → LLM → Final Response
                    → TTS → Audio Stream → UI
```
## Gateway Internal Flow

```text
1. Receive audio
2. Run STT (streaming)
3. Emit partial transcripts
4. On final:
   → call LLM
5. LLM decides:
   → direct response OR tool call
6. Execute tool
7. Feed result back to LLM
8. Generate final response
9. Send text stream
10. Send TTS stream
```
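Steps 4–10 of the flow above can be sketched as an async pipeline. The stage functions here are stubs standing in for the STT/LLM/TTS services; the names and signatures are assumptions for illustration, not the gateway's actual interfaces.

```typescript
// Sketch of steps 4-10 of the gateway flow with stubbed stages; the
// function names and signatures are assumptions for illustration.
type LlmTurn =
  | { kind: "response"; text: string }
  | { kind: "tool_call"; tool: string; args: unknown };

interface Stages {
  llm(prompt: string): Promise<LlmTurn>;
  runTool(tool: string, args: unknown): Promise<string>;
  tts(text: string): Promise<Uint8Array>;
}

// Given a final transcript, produce the response text and TTS audio.
async function handleFinalTranscript(transcript: string, s: Stages) {
  let turn = await s.llm(transcript);                           // 4-5: LLM decides
  if (turn.kind === "tool_call") {
    const result = await s.runTool(turn.tool, turn.args);       // 6: execute tool
    turn = await s.llm(`${transcript}\n[tool result] ${result}`); // 7-8: feed back
  }
  const text = turn.kind === "response" ? turn.text : "";
  const audio = await s.tts(text);                              // 10: TTS stream
  return { text, audio };                                       // 9: text stream
}
```

Because the stages are injected, the same pipeline shape works whether STT/LLM/TTS live on the NAS or the NanoPi.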
## LLM Layer

### Location

- NAS with RTX 3050 (8 GB)

### Role

- intent parsing
- tool selection
- response generation

### Constraints

- must use a tool-calling schema
- must not directly control systems
- target approximately 7B-class models because of hardware limits
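A tool-calling schema of the kind the first constraint describes could look like the sketch below. The tool names and argument shapes are assumptions, not Vela's actual contract with the model; the point is that the gateway validates and executes tool calls, so the model never controls systems directly.

```typescript
// Hypothetical tool-calling schema; tool names and argument shapes are
// assumptions, not Vela's actual contract with the model.
interface ToolCall {
  tool: string;                  // e.g. "home_assistant.call_service" (assumed name)
  args: Record<string, unknown>; // validated by the gateway before execution
}

// The gateway, not the model, executes tools: the model only emits a
// ToolCall as JSON, keeping direct system control out of the LLM layer.
function parseToolCall(raw: string): ToolCall | null {
  try {
    const v = JSON.parse(raw);
    if (typeof v?.tool === "string" && typeof v?.args === "object" && v.args !== null) {
      return { tool: v.tool, args: v.args };
    }
  } catch {
    // not valid JSON: fall through and reject
  }
  return null;
}
```

Rejecting anything that does not parse into this shape is what makes the "must not directly control systems" constraint enforceable at the gateway boundary.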
## Naming

- system: **Vela**
- gateway: `vela-gateway`
- UI: `vela-ui`
- voice profile: `vela-neutral`