assistant/docs/integrations.md
Johannes Kresner 4b11703c93 feat(vela-ui): add voice session shell
Add a minimal UI shell that connects to the gateway WebSocket and exposes developer-visible session state. Align the architecture, protocol, setup, integration, and backlog docs with the current UI increment.
2026-04-08 18:40:45 +02:00

# Vela Integrations and Tool Safety
## Current Runtime Baseline
- `vela-ui` is implemented as a SvelteKit application
- `vela-gateway` is implemented as a Fastify service
- `vela-gateway` now exposes `/ws` as the minimal WebSocket session entrypoint using the shared `@vela/protocol` contract
- `vela-ui` now opens a minimal browser WebSocket client against that `/ws` entrypoint and surfaces connection/session status for developers
- integration work beyond the gateway WebSocket/session baseline is not yet implemented
## Gateway Session Contract
- transport: WebSocket on `/ws`
- session storage: in-memory only, one ephemeral record per live connection
- message format: `@vela/protocol` `MessageEnvelope<{ type, payload }>`
- current server behavior: acknowledge connect with `session.ready` and `session.state`
- safety baseline: invalid JSON, invalid envelopes, and malformed frames return protocol errors or close that socket without taking down the service
- current UI behavior: connect/disconnect only, no microphone access, no audio payloads, and safe error-state handling for `open`/`error`/`close`
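
The safety baseline above (invalid JSON and invalid envelopes produce a protocol error rather than a crash) could be sketched roughly as follows. The `Envelope` type here is a simplified stand-in for `MessageEnvelope` from `@vela/protocol`, and `parseEnvelope` and the error codes are hypothetical names used only for illustration.

```typescript
// Simplified stand-in for the real `MessageEnvelope` type (assumption).
type Envelope = { type: string; payload: unknown };

type ParseResult =
  | { ok: true; envelope: Envelope }
  | { ok: false; error: "invalid_json" | "invalid_envelope" };

// Parse one raw WebSocket text frame without ever throwing.
function parseEnvelope(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "invalid_json" };
  }
  // An envelope must be a plain object with a string `type` and a `payload` key.
  if (
    typeof data !== "object" ||
    data === null ||
    Array.isArray(data) ||
    typeof (data as Record<string, unknown>).type !== "string" ||
    !("payload" in data)
  ) {
    return { ok: false, error: "invalid_envelope" };
  }
  const { type, payload } = data as Envelope;
  return { ok: true, envelope: { type, payload } };
}
```

A handler built on this would reply with a protocol error or close the socket when `ok` is `false`, so a bad frame affects only its own connection.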
## STT (Speech-to-Text)
### Primary Option
- `whisper.cpp`
### Deployment
- start on NanoPi
- move to NAS if latency on the NanoPi is too high
### Requirements
- streaming transcription
- partial and final output
- low latency, with sub-second response preferred
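
As an illustration of the partial and final output requirement, a consumer might keep the latest partial hypothesis separate from finalized text. This is a minimal sketch; the `TranscriptEvent` shape and the `TranscriptBuffer` class are assumptions and not part of `whisper.cpp` or the current codebase.

```typescript
// Hypothetical event shape emitted by an STT adapter (assumption).
type TranscriptEvent = { kind: "partial" | "final"; text: string };

class TranscriptBuffer {
  private committed: string[] = [];
  private pending = "";

  // A partial replaces the current hypothesis; a final commits it.
  apply(event: TranscriptEvent): void {
    if (event.kind === "final") {
      this.committed.push(event.text);
      this.pending = "";
    } else {
      this.pending = event.text;
    }
  }

  // Finalized segments followed by the current partial, if any.
  get text(): string {
    return [...this.committed, this.pending]
      .filter((segment) => segment !== "")
      .join(" ");
  }
}
```

Keeping partials replaceable lets the UI show text while the user is still speaking without committing to an early, possibly wrong, hypothesis.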
## TTS (Text-to-Speech)
### Engine
- Kokoro TTS
### Deployment
- prefer NAS for more compute headroom
### API Contract
```http
POST /speak
Content-Type: application/json

{
  "text": "...",
  "voice": "vela",
  "format": "wav"
}
```
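
A gateway-side caller for the `POST /speak` contract might assemble the request as shown below. This is a sketch only: `buildSpeakRequest`, the `SpeakRequest` type, and the base URL in the usage note are hypothetical, and the JSON content type is an assumption based on the body shape.

```typescript
// Request body as described by the contract above.
interface SpeakRequest {
  text: string;
  voice: string;
  format: "wav";
}

// Build the URL and fetch-style options for a /speak call (hypothetical helper).
function buildSpeakRequest(baseUrl: string, text: string, voice = "vela") {
  if (text.trim() === "") {
    throw new Error("text must not be empty");
  }
  const body: SpeakRequest = { text, voice, format: "wav" };
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/speak`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}
```

For example, `buildSpeakRequest("http://nas.local:8880", "Hello")` yields a `url` and `init` pair that can be passed to `fetch(url, init)`; the host and port are placeholders.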
### Requirements
- streaming audio preferred
- low startup latency
- interrupt support
## Tool System
### Home Assistant Tool
#### Functions
```ts
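// Tool signatures exposed to the LLM; parameter and return types are illustrative.
declare function turn_on(entity_id: string): Promise<void>;
declare function turn_off(entity_id: string): Promise<void>;
declare function set_temperature(entity_id: string, value: number): Promise<void>;
declare function get_state(entity_id: string): Promise<unknown>;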
```
#### Backend
- REST API
- optional Conversation API
#### Safety
- require confirmation for destructive actions
- require confirmation for irreversible or significant state changes
- keep secrets server-side only
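
One way the four tool functions could map onto the Home Assistant REST API, with a confirmation check for write actions, is sketched below. The `ToolCall` and `HaRequest` types and both helpers are hypothetical. Routing `turn_on`/`turn_off` through the generic `homeassistant` service domain and `set_temperature` through `climate.set_temperature` is an assumption about the adapter design, and the bearer token would be attached server-side only.

```typescript
type ToolCall =
  | { name: "turn_on"; entity_id: string }
  | { name: "turn_off"; entity_id: string }
  | { name: "set_temperature"; entity_id: string; value: number }
  | { name: "get_state"; entity_id: string };

interface HaRequest {
  method: "GET" | "POST";
  path: string;
  body?: Record<string, unknown>;
}

// Translate a tool call into a Home Assistant REST request description.
function toHaRequest(call: ToolCall): HaRequest {
  switch (call.name) {
    case "turn_on":
    case "turn_off":
      return {
        method: "POST",
        path: `/api/services/homeassistant/${call.name}`,
        body: { entity_id: call.entity_id },
      };
    case "set_temperature":
      return {
        method: "POST",
        path: "/api/services/climate/set_temperature",
        body: { entity_id: call.entity_id, temperature: call.value },
      };
    case "get_state":
      return {
        method: "GET",
        path: `/api/states/${encodeURIComponent(call.entity_id)}`,
      };
  }
}

// Reads pass through; every state-changing call waits for user confirmation.
function needsConfirmation(call: ToolCall): boolean {
  return call.name !== "get_state";
}
```

Treating every non-read call as confirmable is deliberately conservative; a real adapter might narrow this to the destructive or significant changes listed above.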
### SearXNG Tool
#### Endpoint
```http
GET /search?q=...&format=json
```
#### Flow
- query SearXNG
- return top results
- let the LLM summarize the result set
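
The flow above could be sketched as two small helpers: one that builds the query path and one that trims the response before it reaches the LLM. The helper names, the default limit of five, and the `SearchResult` fields (`title`, `url`, `content`) are assumptions about the JSON response shape and should be checked against the deployed SearXNG instance.

```typescript
// Assumed subset of a SearXNG JSON result entry.
interface SearchResult {
  title: string;
  url: string;
  content?: string;
}

// Build the request path for the endpoint above, escaping the query.
function buildSearchPath(query: string): string {
  return `/search?q=${encodeURIComponent(query)}&format=json`;
}

// Keep only the first `limit` results and only the fields the LLM needs.
function topResults(
  response: { results?: SearchResult[] },
  limit = 5,
): SearchResult[] {
  return (response.results ?? [])
    .slice(0, limit)
    .map(({ title, url, content }) => ({ title, url, content }));
}
```

Trimming the result set before summarization keeps the prompt small and avoids passing unneeded response fields to the model.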
## Safety Rules
- the LLM does not directly control systems
- all external actions go through explicit tool adapters
- Home Assistant write actions require confirmation
- frontend must not contain Home Assistant tokens or other secrets
- ambiguous tool intents should be clarified instead of guessed