Spaces: Running
Eddie Hudson committed
Commit · 6af0b1a · 0 parent(s)
Initial commit
Files changed:
- .env.example +2 -0
- .gitignore +7 -0
- .python-version +1 -0
- LICENSE +21 -0
- README.md +212 -0
- index.html +143 -0
- moltbot_body/__init__.py +3 -0
- moltbot_body/audio/__init__.py +1 -0
- moltbot_body/audio/head_wobbler.py +171 -0
- moltbot_body/audio/speech_tapper.py +268 -0
- moltbot_body/clawdbot_handler.py +672 -0
- moltbot_body/main.py +322 -0
- moltbot_body/moves.py +849 -0
- pyproject.toml +49 -0
- style.css +395 -0
- uv.lock +0 -0
.env.example
ADDED
@@ -0,0 +1,2 @@
CLAWDBOT_TOKEN="PUT YOUR CLAWDBOT TOKEN HERE"
ELEVENLABS_API_KEY="PUT YOUR ELEVENLABS API KEY HERE"
.gitignore
ADDED
@@ -0,0 +1,7 @@
.venv/
*.egg-info/
*.code-workspace
.env
build/
dist/
__pycache__/
.python-version
ADDED
@@ -0,0 +1 @@
3.12
LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
README.md
ADDED
@@ -0,0 +1,212 @@
---
title: Moltbot Body
emoji: 🤖
colorFrom: green
colorTo: blue
sdk: static
pinned: false
short_description: Give Moltbot a physical presence with Reachy Mini
tags:
  - reachy_mini
  - reachy_mini_python_app
  - clawdbot
  - moltbot
---

# Moltbot's Body

> **Security Warning**: This project uses Moltbot, which runs AI-generated code with access to your system. Ensure you understand the security implications before installation. Only run Moltbot from trusted sources and review its permissions carefully. See the [Moltbot Security documentation](https://docs.molt.bot/gateway/security) for details.

Reachy Mini integration with Moltbot — giving Moltbot a physical presence.

## What is Moltbot?

[Moltbot](https://docs.molt.bot/start/getting-started) is an AI assistant platform that can connect to various chat surfaces (WhatsApp, Telegram, Discord, etc.) and execute tasks autonomously. This project extends Moltbot by giving it a physical robot body using [Reachy Mini](https://huggingface.co/spaces/pollen-robotics/Reachy_Mini), a small expressive robot from Pollen Robotics.

With this integration, Moltbot can:

- Listen to speech via the robot's microphone
- Transcribe speech locally using Whisper
- Generate responses through the Moltbot gateway
- Speak responses through ElevenLabs TTS
- Move its head expressively while speaking

## Architecture

```
Microphone → VAD → Whisper STT → Moltbot Gateway → ElevenLabs TTS → Speaker
                                                          ↓
                                                   MovementManager
                                                   HeadWobbler (speech-driven head movement)
```
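The pipeline above can be sketched as one conversation turn in Python. All component names below are illustrative stand-ins, not the project's actual API (the real logic lives in `moltbot_body/main.py`); the sketch only shows how data flows from transcription to streamed, wobble-synchronized speech.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    """Record of one conversation turn (illustrative only)."""
    heard: str
    spoken: list = field(default_factory=list)


def run_turn(transcribe, respond, synthesize, play, wobble) -> Turn:
    """One turn: STT → gateway → TTS, wobbling the head while speaking."""
    text = transcribe()              # Whisper STT on the captured audio
    reply = respond(text)            # Moltbot gateway produces the answer
    turn = Turn(heard=text)
    for chunk in synthesize(reply):  # streamed TTS audio chunks
        play(chunk)                  # → speaker
        wobble(chunk)                # → HeadWobbler (audio-driven motion)
        turn.spoken.append(chunk)
    return turn


# Wire it with trivial stand-ins to show the data flow:
turn = run_turn(
    transcribe=lambda: "hello",
    respond=lambda text: f"you said {text}",
    synthesize=lambda reply: iter([reply[:7], reply[7:]]),
    play=lambda chunk: None,
    wobble=lambda chunk: None,
)
print(turn.spoken)  # ['you sai', 'd hello']
```

The key property the real pipeline shares with this sketch is that TTS audio is consumed chunk by chunk, so playback and head movement start before synthesis finishes.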

## Prerequisites

Before running this project, you need:

### 1. Moltbot Gateway (Required)

Moltbot must be installed and the gateway must be running. Follow the [Moltbot Getting Started guide](https://docs.molt.bot/start/getting-started) to:

1. Install the CLI: `curl -fsSL https://molt.bot/install.sh | bash`
2. Run the onboarding wizard: `moltbot onboard --install-daemon`
3. Start the gateway: `moltbot gateway --port 18789`

Verify it's running:
```bash
moltbot gateway status
```

### 2. Reachy Mini Robot (Required)

You need a [Reachy Mini](https://huggingface.co/spaces/pollen-robotics/Reachy_Mini) robot from Pollen Robotics with its daemon running.

Verify the daemon is running:
```bash
curl -s http://localhost:8000/api/daemon/status | jq .state
```

### 3. ElevenLabs Account (Required)

Sign up at [ElevenLabs](https://elevenlabs.io/) and get an API key for text-to-speech.

### 4. Python 3.12+ and uv

This project requires Python 3.12 or later and uses [uv](https://docs.astral.sh/uv/) for package management.

## Setup

```bash
git clone <this-repo>
cd reachy
uv sync
```

### Environment Variables

Create a `.env` file:

```bash
CLAWDBOT_TOKEN=your_gateway_token
ELEVENLABS_API_KEY=your_elevenlabs_key
```

Get your gateway token from the Moltbot configuration; if either variable is not set, it is pulled from the Moltbot config automatically.

## Running

```bash
# Make sure the Reachy Mini daemon is running
curl -s http://localhost:8000/api/daemon/status | jq .state

# Make sure the Moltbot gateway is running
moltbot gateway status

# Start Moltbot's body
uv run moltbot-body
```

## CLI Options

| Flag | Description |
|------|-------------|
| `--debug` | Enable debug logging (verbose output) |
| `--profile` | Enable the timing profiler - prints a detailed timing breakdown after each conversation turn |
| `--profile-once` | Profile one conversation turn, then exit (useful for benchmarking) |
| `--robot-name NAME` | Specify the robot name for connection (if you have multiple robots) |
| `--gateway-url URL` | Moltbot gateway URL (default: `http://localhost:18789`) |

### Examples

```bash
# Run with debug logging
uv run moltbot-body --debug

# Profile a single conversation turn
uv run moltbot-body --profile-once

# Connect to a specific robot and gateway
uv run moltbot-body --robot-name my-reachy --gateway-url http://192.168.1.100:18789
```

### Profiling Output

When using `--profile` or `--profile-once`, you'll see a detailed timing breakdown after each turn:

```
============================================================
CONVERSATION TIMING PROFILE
============================================================

📝 User: "Hello, how are you?"
🤖 Assistant: "I'm doing well, thank you for asking!"

------------------------------------------------------------
TIMING BREAKDOWN
------------------------------------------------------------

🎤 Speech Detection:
   Duration spoken: 1.23s

📜 Whisper Transcription:
   Time: 0.45s

🧠 LLM (Moltbot):
   Time to first token: 0.32s
   Streaming time: 1.15s
   Total time: 1.47s
   Tokens: 42 (36.5 tok/s)

🔊 TTS (ElevenLabs):
   Time to first audio: 0.28s
   Total streaming: 1.82s
   Audio chunks: 15

------------------------------------------------------------
END-TO-END LATENCY
------------------------------------------------------------

⏱️ Speech end → First audio: 1.05s
⏱️ Total turn time: 4.50s

============================================================
```

## Features

- **Voice Activation**: Listens for speech and processes it once silence is detected
- **Whisper STT**: Local speech-to-text transcription using faster-whisper
- **Moltbot Brain**: Claude-powered responses via the Moltbot gateway API
- **ElevenLabs TTS**: Natural voice output with streaming
- **Head Wobble**: Audio-driven head movement while speaking for natural expressiveness
- **Movement Manager**: 100 Hz control loop for smooth robot motion
- **Breathing Animation**: Gentle idle breathing when not actively engaged

## Tips for a Better Experience

### Use a Low-Latency Inference Provider

For natural, conversational interactions, response latency is critical. The time from when you stop speaking to when the robot starts responding should ideally be under 1 second.

Consider using a fast inference provider like [Groq](https://groq.com/), which offers extremely low latency for supported models. You can configure this in your Moltbot settings. Use the `--profile` flag to measure your end-to-end latency and identify bottlenecks.

### Let Moltbot Help You Set Up

Since Moltbot is an AI coding assistant, you can chat with it to help configure and customize the robot body! Try asking Moltbot (via any of its chat surfaces) to:

- Help you tune the head movement parameters
- Adjust the voice activation sensitivity
- Add new expressions or gestures
- Debug connection issues

Moltbot can read and modify this codebase, so it's a great collaborator for extending the robot's capabilities.

## Roadmap

- [ ] Face tracking (look at the person speaking)
- [ ] DoA-based head tracking (direction-of-arrival speaker localization)
- [ ] Wake word detection
- [ ] Expression gestures

## License

MIT License - see [LICENSE](LICENSE) for details.
index.html
ADDED
@@ -0,0 +1,143 @@
<!doctype html>
<html>

<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <title>Moltbot Body - Reachy Mini App</title>
  <link rel="preconnect" href="https://fonts.googleapis.com">
  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
  <link href="https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@400;500;600;700&family=Manrope:wght@400;500;600&display=swap" rel="stylesheet">
  <link rel="stylesheet" href="style.css" />
</head>

<body>
  <header class="hero">
    <div class="topline">
      <div class="brand">
        <span class="logo">🤖</span>
        <span class="brand-name">Moltbot Body</span>
      </div>
      <div class="pill">Voice conversation · Moltbot AI · Expressive motion</div>
    </div>
    <div class="hero-grid">
      <div class="hero-copy">
        <p class="eyebrow">Reachy Mini App</p>
        <h1>Give Moltbot a physical presence.</h1>
        <p class="lede">
          Connect your Moltbot AI assistant to a Reachy Mini robot. Listen through the microphone, transcribe with Whisper, respond through Moltbot, and speak with natural TTS—all while moving expressively.
        </p>
        <div class="hero-actions">
          <a class="btn primary" href="#highlights">Explore features</a>
          <a class="btn ghost" href="#architecture">See how it works</a>
        </div>
        <div class="hero-badges">
          <span>Local Whisper STT</span>
          <span>Moltbot Gateway</span>
          <span>ElevenLabs TTS</span>
          <span>Expressive head movement</span>
        </div>
      </div>
      <div class="hero-visual">
        <div class="glass-card">
          <div class="architecture-preview">
            <pre>
Microphone → VAD → Whisper STT
              ↓
      Moltbot Gateway
              ↓
   ElevenLabs TTS → Speaker
              ↓
      MovementManager
        HeadWobbler
            </pre>
          </div>
          <p class="caption">End-to-end voice conversation pipeline with expressive robot motion.</p>
        </div>
      </div>
    </div>
  </header>

  <section id="highlights" class="section features">
    <div class="section-header">
      <p class="eyebrow">What's inside</p>
      <h2>A complete voice interface for your robot</h2>
      <p class="intro">
        Moltbot Body combines speech recognition, AI conversation, and expressive motion into a seamless experience.
      </p>
    </div>
    <div class="feature-grid">
      <div class="feature-card">
        <span class="icon">🎤</span>
        <h3>Voice activation</h3>
        <p>Listens continuously and detects when you're speaking using voice activity detection.</p>
      </div>
      <div class="feature-card">
        <span class="icon">📝</span>
        <h3>Local transcription</h3>
        <p>Fast, private speech-to-text using Whisper running locally on your machine.</p>
      </div>
      <div class="feature-card">
        <span class="icon">🧠</span>
        <h3>Moltbot brain</h3>
        <p>Claude-powered responses through the Moltbot gateway with full tool access and memory.</p>
      </div>
      <div class="feature-card">
        <span class="icon">🔊</span>
        <h3>Natural voice</h3>
        <p>High-quality streaming text-to-speech through ElevenLabs for natural conversation.</p>
      </div>
      <div class="feature-card">
        <span class="icon">💃</span>
        <h3>Expressive motion</h3>
        <p>Audio-driven head wobble and breathing animations bring the robot to life while speaking.</p>
      </div>
      <div class="feature-card">
        <span class="icon">⚡</span>
        <h3>Low latency</h3>
        <p>Optimized pipeline with profiling tools to measure and minimize response time.</p>
      </div>
    </div>
  </section>

  <section id="architecture" class="section story">
    <div class="story-grid">
      <div class="story-card">
        <p class="eyebrow">How it works</p>
        <h3>From speech to response in under a second</h3>
        <ul class="story-list">
          <li><span>🎤</span> Robot microphone captures your voice continuously.</li>
          <li><span>🔇</span> Voice Activity Detection identifies when you stop speaking.</li>
          <li><span>📝</span> Whisper transcribes your speech locally and privately.</li>
          <li><span>🧠</span> Moltbot gateway processes your message with full AI capabilities.</li>
          <li><span>🔊</span> ElevenLabs streams natural voice output in real-time.</li>
          <li><span>🤖</span> Head wobbles expressively while the robot speaks.</li>
        </ul>
      </div>
      <div class="story-card secondary">
        <p class="eyebrow">Prerequisites</p>
        <h3>What you need to get started</h3>
        <p class="story-text">
          This app requires a running Moltbot gateway, an ElevenLabs API key for TTS, and a Reachy Mini robot connected to your network.
        </p>
        <div class="chips">
          <span class="chip">Moltbot Gateway</span>
          <span class="chip">ElevenLabs API</span>
          <span class="chip">Reachy Mini</span>
          <span class="chip">Python 3.12+</span>
        </div>
      </div>
    </div>
  </section>

  <footer class="footer">
    <p>
      Moltbot Body — giving Moltbot a physical presence with Reachy Mini.
      Learn more about <a href="https://docs.molt.bot/" target="_blank" rel="noopener">Moltbot</a> and
      <a href="https://huggingface.co/spaces/pollen-robotics/Reachy_Mini" target="_blank" rel="noopener">Reachy Mini</a>.
    </p>
  </footer>

</body>

</html>
moltbot_body/__init__.py
ADDED
@@ -0,0 +1,3 @@
"""Moltbot's physical body - Reachy Mini integration with Clawdbot."""

__version__ = "0.1.0"
moltbot_body/audio/__init__.py
ADDED
@@ -0,0 +1 @@
"""Audio processing modules for head movement."""
moltbot_body/audio/head_wobbler.py
ADDED
@@ -0,0 +1,171 @@
"""Moves head given audio samples."""

import time
import queue
import base64
import logging
import threading
from typing import Tuple
from collections.abc import Callable

import numpy as np
from numpy.typing import NDArray

from moltbot_body.audio.speech_tapper import HOP_MS, SwayRollRT


SAMPLE_RATE = 24000
MOVEMENT_LATENCY_S = 0.2  # seconds between audio and robot movement
logger = logging.getLogger(__name__)


class HeadWobbler:
    """Converts audio deltas (base64) into head movement offsets."""

    def __init__(self, set_speech_offsets: Callable[[Tuple[float, float, float, float, float, float]], None]) -> None:
        """Initialize the head wobbler."""
        self._apply_offsets = set_speech_offsets
        self._base_ts: float | None = None
        self._hops_done: int = 0

        self.audio_queue: "queue.Queue[Tuple[int, int, NDArray[np.int16]]]" = queue.Queue()
        self.sway = SwayRollRT()

        # Synchronization primitives
        self._state_lock = threading.Lock()
        self._sway_lock = threading.Lock()
        self._generation = 0

        self._stop_event = threading.Event()
        self._thread: threading.Thread | None = None

    def feed(self, delta_b64: str) -> None:
        """Thread-safe: push audio into the consumer queue."""
        buf = np.frombuffer(base64.b64decode(delta_b64), dtype=np.int16).reshape(1, -1)
        with self._state_lock:
            generation = self._generation
        self.audio_queue.put((generation, SAMPLE_RATE, buf))

    def start(self) -> None:
        """Start the head wobbler loop in a thread."""
        self._stop_event.clear()
        self._thread = threading.Thread(target=self.working_loop, daemon=True)
        self._thread.start()
        logger.debug("Head wobbler started")

    def stop(self) -> None:
        """Stop the head wobbler loop."""
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join()
        logger.debug("Head wobbler stopped")

    def working_loop(self) -> None:
        """Convert audio deltas into head movement offsets."""
        hop_dt = HOP_MS / 1000.0

        logger.debug("Head wobbler thread started")
        while not self._stop_event.is_set():
            queue_ref = self.audio_queue
            try:
                chunk_generation, sr, chunk = queue_ref.get_nowait()  # (gen, sr, data)
            except queue.Empty:
                # Sleep briefly so the loop does not busy-wait while idle
                time.sleep(MOVEMENT_LATENCY_S)
                continue

            try:
                with self._state_lock:
                    current_generation = self._generation
                if chunk_generation != current_generation:
                    continue

                if self._base_ts is None:
                    with self._state_lock:
                        if self._base_ts is None:
                            self._base_ts = time.monotonic()

                pcm = np.asarray(chunk).squeeze(0)
                with self._sway_lock:
                    results = self.sway.feed(pcm, sr)

                i = 0
                while i < len(results):
                    with self._state_lock:
                        if self._generation != current_generation:
                            break
                        base_ts = self._base_ts
                        hops_done = self._hops_done

                    if base_ts is None:
                        base_ts = time.monotonic()
                        with self._state_lock:
                            if self._base_ts is None:
                                self._base_ts = base_ts
                            hops_done = self._hops_done

                    target = base_ts + MOVEMENT_LATENCY_S + hops_done * hop_dt
                    now = time.monotonic()

                    if now - target >= hop_dt:
                        lag_hops = int((now - target) / hop_dt)
                        drop = min(lag_hops, len(results) - i - 1)
                        if drop > 0:
                            with self._state_lock:
                                self._hops_done += drop
                                hops_done = self._hops_done
                            i += drop
                            continue

                    if target > now:
                        time.sleep(target - now)
                        with self._state_lock:
                            if self._generation != current_generation:
                                break

                    r = results[i]
                    offsets = (
                        r["x_mm"] / 1000.0,
                        r["y_mm"] / 1000.0,
                        r["z_mm"] / 1000.0,
                        r["roll_rad"],
                        r["pitch_rad"],
                        r["yaw_rad"],
                    )

                    with self._state_lock:
                        if self._generation != current_generation:
                            break

                    self._apply_offsets(offsets)

                    with self._state_lock:
                        self._hops_done += 1
                    i += 1
            finally:
                queue_ref.task_done()
        logger.debug("Head wobbler thread exited")

    def reset(self) -> None:
        """Reset the internal state."""
        with self._state_lock:
            self._generation += 1
            self._base_ts = None
            self._hops_done = 0

        # Drain any queued audio chunks from previous generations
        drained_any = False
        while True:
            try:
                _, _, _ = self.audio_queue.get_nowait()
            except queue.Empty:
                break
            else:
                drained_any = True
                self.audio_queue.task_done()

        with self._sway_lock:
            self.sway.reset()

        if drained_any:
            logger.debug("Head wobbler queue drained during reset")
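A minimal, self-contained illustration of the decode step that `HeadWobbler.feed()` performs on each base64 audio delta: base64 → int16 PCM → a `(1, N)` array ready for the queue. The sample values here are synthetic test data.

```python
import base64

import numpy as np

# Synthetic int16 PCM samples standing in for one TTS audio delta
samples = np.array([0, 1000, -1000, 32767], dtype=np.int16)
delta_b64 = base64.b64encode(samples.tobytes()).decode("ascii")

# The decode path used by feed(): bytes → int16 → shape (1, N)
buf = np.frombuffer(base64.b64decode(delta_b64), dtype=np.int16).reshape(1, -1)
print(buf.shape)  # (1, 4)
```

Keeping the extra leading axis lets the consumer treat each delta as a one-channel chunk, which `working_loop()` later flattens with `squeeze(0)`.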
moltbot_body/audio/speech_tapper.py
ADDED
|
@@ -0,0 +1,268 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
from __future__ import annotations
|
| 2 |
+
import math
|
| 3 |
+
from typing import Any, Dict, List
|
| 4 |
+
from itertools import islice
|
| 5 |
+
from collections import deque
|
| 6 |
+
|
| 7 |
+
import numpy as np
|
| 8 |
+
from numpy.typing import NDArray
|
| 9 |
+
|
| 10 |
+
|
| 11 |
+
# Tunables
|
| 12 |
+
SR = 16_000
|
| 13 |
+
FRAME_MS = 20
|
| 14 |
+
HOP_MS = 50
|
| 15 |
+
|
| 16 |
+
SWAY_MASTER = 1.5
|
| 17 |
+
SENS_DB_OFFSET = +4.0
|
| 18 |
+
VAD_DB_ON = -35.0
|
| 19 |
+
VAD_DB_OFF = -45.0
|
| 20 |
+
VAD_ATTACK_MS = 40
|
| 21 |
+
VAD_RELEASE_MS = 250
|
| 22 |
+
ENV_FOLLOW_GAIN = 0.65
|
| 23 |
+
|
| 24 |
+
SWAY_F_PITCH = 2.2
|
| 25 |
+
SWAY_A_PITCH_DEG = 4.5
|
| 26 |
+
SWAY_F_YAW = 0.6
|
| 27 |
+
SWAY_A_YAW_DEG = 7.5
|
| 28 |
+
SWAY_F_ROLL = 1.3
|
| 29 |
+
SWAY_A_ROLL_DEG = 2.25
|
| 30 |
+
SWAY_F_X = 0.35
|
| 31 |
+
SWAY_A_X_MM = 4.5
|
| 32 |
+
SWAY_F_Y = 0.45
|
| 33 |
+
SWAY_A_Y_MM = 3.75
|
| 34 |
+
SWAY_F_Z = 0.25
|
| 35 |
+
SWAY_A_Z_MM = 2.25
|
| 36 |
+
|
| 37 |
+
SWAY_DB_LOW = -46.0
|
| 38 |
+
SWAY_DB_HIGH = -18.0
|
| 39 |
+
LOUDNESS_GAMMA = 0.9
|
| 40 |
+
SWAY_ATTACK_MS = 50
|
| 41 |
+
SWAY_RELEASE_MS = 250
|
| 42 |
+
|
| 43 |
+
# Derived
|
| 44 |
+
FRAME = int(SR * FRAME_MS / 1000)
|
| 45 |
+
HOP = int(SR * HOP_MS / 1000)
|
| 46 |
+
ATTACK_FR = max(1, int(VAD_ATTACK_MS / HOP_MS))
|
| 47 |
+
RELEASE_FR = max(1, int(VAD_RELEASE_MS / HOP_MS))
|
| 48 |
+
SWAY_ATTACK_FR = max(1, int(SWAY_ATTACK_MS / HOP_MS))
|
| 49 |
+
SWAY_RELEASE_FR = max(1, int(SWAY_RELEASE_MS / HOP_MS))
|
| 50 |
+
|
| 51 |
+
|
| 52 |
+
def _rms_dbfs(x: NDArray[np.float32]) -> float:
|
| 53 |
+
"""Root-mean-square in dBFS for float32 mono array in [-1,1]."""
|
| 54 |
+
# numerically stable rms (avoid overflow)
|
| 55 |
+
x = x.astype(np.float32, copy=False)
|
| 56 |
+
rms = np.sqrt(np.mean(x * x, dtype=np.float32) + 1e-12, dtype=np.float32)
|
| 57 |
+
return float(20.0 * math.log10(float(rms) + 1e-12))
|
| 58 |
+
|
| 59 |
+
|
| 60 |
+
def _loudness_gain(db: float, offset: float = SENS_DB_OFFSET) -> float:
|
| 61 |
+
"""Normalize dB into [0,1] with gamma; clipped to [0,1]."""
|
| 62 |
+
t = (db + offset - SWAY_DB_LOW) / (SWAY_DB_HIGH - SWAY_DB_LOW)
|
| 63 |
+
if t < 0.0:
|
| 64 |
+
t = 0.0
|
| 65 |
+
elif t > 1.0:
|
| 66 |
+
t = 1.0
|
| 67 |
+
return t**LOUDNESS_GAMMA if LOUDNESS_GAMMA != 1.0 else t
|
| 68 |
+
|
| 69 |
+
|
| 70 |
+
def _to_float32_mono(x: NDArray[Any]) -> NDArray[np.float32]:
    """Convert an arbitrary PCM array to float32 mono in [-1, 1].

    Accepts shapes: (N,), (1, N), (N, 1), (C, N), (N, C).
    """
    a = np.asarray(x)
    if a.ndim == 0:
        return np.zeros(0, dtype=np.float32)

    # If 2D, decide which axis is channels (prefer the small first dim).
    if a.ndim == 2:
        # e.g., (channels, samples) if channels is small (<= 8)
        if a.shape[0] <= 8 and a.shape[0] <= a.shape[1]:
            a = np.mean(a, axis=0)
        else:
            a = np.mean(a, axis=1)
    elif a.ndim > 2:
        a = np.mean(a.reshape(a.shape[0], -1), axis=0)

    # Now 1D: cast or scale.
    if np.issubdtype(a.dtype, np.floating):
        return a.astype(np.float32, copy=False)
    # Integer PCM: scale by the dtype's full range.
    info = np.iinfo(a.dtype)
    scale = float(max(-info.min, info.max))
    return a.astype(np.float32) / (scale if scale != 0.0 else 1.0)


def _resample_linear(x: NDArray[np.float32], sr_in: int, sr_out: int) -> NDArray[np.float32]:
    """Lightweight linear resampler for short buffers."""
    if sr_in == sr_out or x.size == 0:
        return x
    # Guard tiny sizes.
    n_out = int(round(x.size * sr_out / sr_in))
    if n_out <= 1:
        return np.zeros(0, dtype=np.float32)
    t_in = np.linspace(0.0, 1.0, num=x.size, dtype=np.float32, endpoint=True)
    t_out = np.linspace(0.0, 1.0, num=n_out, dtype=np.float32, endpoint=True)
    return np.interp(t_out, t_in, x).astype(np.float32, copy=False)


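The linear resampler trades fidelity for speed by placing both buffers on a shared [0, 1] time axis and interpolating. A quick standalone check of the length arithmetic:

```python
import numpy as np


def resample_linear(x: np.ndarray, sr_in: int, sr_out: int) -> np.ndarray:
    """Linear resampler mirroring the helper above."""
    if sr_in == sr_out or x.size == 0:
        return x
    n_out = int(round(x.size * sr_out / sr_in))
    if n_out <= 1:
        return np.zeros(0, dtype=np.float32)
    t_in = np.linspace(0.0, 1.0, num=x.size, dtype=np.float32)
    t_out = np.linspace(0.0, 1.0, num=n_out, dtype=np.float32)
    return np.interp(t_out, t_in, x).astype(np.float32)


x = np.arange(480, dtype=np.float32)  # 10 ms at 48 kHz
y = resample_linear(x, 48000, 16000)
print(y.size)  # 160 samples: the same 10 ms at 16 kHz
```

Because both axes share their endpoints, the first and last samples pass through unchanged; only the interior is interpolated.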
class SwayRollRT:
    """Feed audio chunks → per-hop sway outputs.

    Usage:
        rt = SwayRollRT()
        rt.feed(pcm_int16_or_float, sr) -> List[dict]
    """

    def __init__(self, rng_seed: int = 7):
        """Initialize state."""
        self._seed = int(rng_seed)
        self.samples: deque[float] = deque(maxlen=10 * SR)  # sliding window for VAD/env
        self.carry: NDArray[np.float32] = np.zeros(0, dtype=np.float32)

        self.vad_on = False
        self.vad_above = 0
        self.vad_below = 0

        self.sway_env = 0.0
        self.sway_up = 0
        self.sway_down = 0

        rng = np.random.default_rng(self._seed)
        self.phase_pitch = float(rng.random() * 2 * math.pi)
        self.phase_yaw = float(rng.random() * 2 * math.pi)
        self.phase_roll = float(rng.random() * 2 * math.pi)
        self.phase_x = float(rng.random() * 2 * math.pi)
        self.phase_y = float(rng.random() * 2 * math.pi)
        self.phase_z = float(rng.random() * 2 * math.pi)
        self.t = 0.0

    def reset(self) -> None:
        """Reset state (VAD/env/buffers/time) but keep initial phases/seed."""
        self.samples.clear()
        self.carry = np.zeros(0, dtype=np.float32)
        self.vad_on = False
        self.vad_above = 0
        self.vad_below = 0
        self.sway_env = 0.0
        self.sway_up = 0
        self.sway_down = 0
        self.t = 0.0

    def feed(self, pcm: NDArray[Any], sr: int | None) -> List[Dict[str, float]]:
        """Stream in a PCM chunk. Returns a list of sway dicts, one per hop (HOP_MS).

        Args:
            pcm: np.ndarray, shape (N,) or (C, N)/(N, C); int or float.
            sr: sample rate of `pcm` (None -> assume SR).

        """
        sr_in = SR if sr is None else int(sr)
        x = _to_float32_mono(pcm)
        if x.size == 0:
            return []
        if sr_in != SR:
            x = _resample_linear(x, sr_in, SR)
            if x.size == 0:
                return []

        # Append to the carry buffer and consume fixed HOP chunks.
        if self.carry.size:
            self.carry = np.concatenate([self.carry, x])
        else:
            self.carry = x

        out: List[Dict[str, float]] = []

        while self.carry.size >= HOP:
            hop = self.carry[:HOP]
            remaining: NDArray[np.float32] = self.carry[HOP:]
            self.carry = remaining

            # Keep a sliding window for VAD/env computation.
            # (deque accepts any iterable; list() for small HOP is fine)
            self.samples.extend(hop.tolist())
            if len(self.samples) < FRAME:
                self.t += HOP_MS / 1000.0
                continue

            frame = np.fromiter(
                islice(self.samples, len(self.samples) - FRAME, len(self.samples)),
                dtype=np.float32,
                count=FRAME,
            )
            db = _rms_dbfs(frame)

            # VAD with hysteresis + attack/release.
            if db >= VAD_DB_ON:
                self.vad_above += 1
                self.vad_below = 0
                if not self.vad_on and self.vad_above >= ATTACK_FR:
                    self.vad_on = True
            elif db <= VAD_DB_OFF:
                self.vad_below += 1
                self.vad_above = 0
                if self.vad_on and self.vad_below >= RELEASE_FR:
                    self.vad_on = False

            if self.vad_on:
                self.sway_up = min(SWAY_ATTACK_FR, self.sway_up + 1)
                self.sway_down = 0
            else:
                self.sway_down = min(SWAY_RELEASE_FR, self.sway_down + 1)
                self.sway_up = 0

            up = self.sway_up / SWAY_ATTACK_FR
            down = 1.0 - (self.sway_down / SWAY_RELEASE_FR)
            target = up if self.vad_on else down
            self.sway_env += ENV_FOLLOW_GAIN * (target - self.sway_env)
            # Clamp to [0, 1].
            if self.sway_env < 0.0:
                self.sway_env = 0.0
            elif self.sway_env > 1.0:
                self.sway_env = 1.0

            loud = _loudness_gain(db) * SWAY_MASTER
            env = self.sway_env
            self.t += HOP_MS / 1000.0

            # Oscillators.
            pitch = (
                math.radians(SWAY_A_PITCH_DEG)
                * loud
                * env
                * math.sin(2 * math.pi * SWAY_F_PITCH * self.t + self.phase_pitch)
            )
            yaw = (
                math.radians(SWAY_A_YAW_DEG)
                * loud
                * env
                * math.sin(2 * math.pi * SWAY_F_YAW * self.t + self.phase_yaw)
            )
            roll = (
                math.radians(SWAY_A_ROLL_DEG)
                * loud
                * env
                * math.sin(2 * math.pi * SWAY_F_ROLL * self.t + self.phase_roll)
            )
            x_mm = SWAY_A_X_MM * loud * env * math.sin(2 * math.pi * SWAY_F_X * self.t + self.phase_x)
            y_mm = SWAY_A_Y_MM * loud * env * math.sin(2 * math.pi * SWAY_F_Y * self.t + self.phase_y)
            z_mm = SWAY_A_Z_MM * loud * env * math.sin(2 * math.pi * SWAY_F_Z * self.t + self.phase_z)

            out.append(
                {
                    "pitch_rad": pitch,
                    "yaw_rad": yaw,
                    "roll_rad": roll,
                    "pitch_deg": math.degrees(pitch),
                    "yaw_deg": math.degrees(yaw),
                    "roll_deg": math.degrees(roll),
                    "x_mm": x_mm,
                    "y_mm": y_mm,
                    "z_mm": z_mm,
                },
            )

        return out
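The `feed` loop decouples arbitrary input chunk sizes from the fixed analysis hop via the carry buffer. A minimal standalone sketch of that framing, with an assumed 160-sample hop (10 ms at 16 kHz; the actual value derives from `SR` and `HOP_MS` defined earlier in the file):

```python
import numpy as np

HOP = 160  # assumed hop size for illustration (10 ms at 16 kHz)


def consume_hops(carry: np.ndarray, chunk: np.ndarray) -> tuple[np.ndarray, list[np.ndarray]]:
    """Append chunk to carry, then split off as many fixed-size hops as possible."""
    carry = np.concatenate([carry, chunk]) if carry.size else chunk
    hops = []
    while carry.size >= HOP:
        hops.append(carry[:HOP])
        carry = carry[HOP:]
    return carry, hops


carry = np.zeros(0, dtype=np.float32)
total_hops = 0
for n in (100, 300, 50, 200):  # irregular chunk sizes, 650 samples total
    carry, hops = consume_hops(carry, np.zeros(n, dtype=np.float32))
    total_hops += len(hops)
print(total_hops, carry.size)  # 4 hops emitted, 10 samples carried over
```

However audio arrives, the downstream VAD and oscillator math always sees exactly one hop of samples at a time.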
moltbot_body/clawdbot_handler.py
ADDED
@@ -0,0 +1,672 @@
"""Handler that bridges audio I/O with Clawdbot via Whisper STT and ElevenLabs TTS."""

import io
import os
import json
import time
import base64
import queue
import asyncio
import logging
import tempfile
import threading
from dataclasses import dataclass, field
from typing import TYPE_CHECKING, Tuple, Optional, Callable, AsyncIterator
from pathlib import Path

import httpx
import numpy as np
import soundfile as sf
import websockets
from httpx_sse import aconnect_sse
from numpy.typing import NDArray
from dotenv import load_dotenv, find_dotenv

if TYPE_CHECKING:
    from moltbot_body.audio.head_wobbler import HeadWobbler

load_dotenv(find_dotenv())

logger = logging.getLogger(__name__)


@dataclass
class ConversationTiming:
    """Tracks timing for a single conversation turn."""

    # Speech detection
    speech_start: float = 0.0
    speech_end: float = 0.0

    # Transcription
    transcription_start: float = 0.0
    transcription_end: float = 0.0

    # LLM
    llm_request_start: float = 0.0
    llm_first_token: float = 0.0
    llm_last_token: float = 0.0
    llm_token_count: int = 0

    # TTS
    tts_websocket_open: float = 0.0
    tts_first_audio: float = 0.0
    tts_last_audio: float = 0.0
    tts_audio_chunks: int = 0

    # Overall
    turn_start: float = 0.0
    turn_end: float = 0.0

    # Content
    user_text: str = ""
    assistant_text: str = ""

    def print_summary(self) -> None:
        """Print a formatted timing summary."""
        print("\n" + "=" * 60)
        print("CONVERSATION TIMING PROFILE")
        print("=" * 60)

        print(f"\n📝 User: \"{self.user_text[:80]}{'...' if len(self.user_text) > 80 else ''}\"")
        print(f"🤖 Assistant: \"{self.assistant_text[:80]}{'...' if len(self.assistant_text) > 80 else ''}\"")

        print("\n" + "-" * 60)
        print("TIMING BREAKDOWN")
        print("-" * 60)

        # Speech duration
        speech_duration = self.speech_end - self.speech_start if self.speech_end else 0
        print("\n🎤 Speech Detection:")
        print(f"   Duration spoken: {speech_duration:.2f}s")

        # Transcription
        transcription_time = self.transcription_end - self.transcription_start if self.transcription_end else 0
        print("\n📜 Whisper Transcription:")
        print(f"   Time: {transcription_time:.2f}s")

        # LLM
        if self.llm_first_token:
            llm_ttft = self.llm_first_token - self.llm_request_start
            llm_total = self.llm_last_token - self.llm_request_start if self.llm_last_token else 0
            llm_streaming = self.llm_last_token - self.llm_first_token if self.llm_last_token else 0
            tokens_per_sec = self.llm_token_count / llm_streaming if llm_streaming > 0 else 0
            print("\n🧠 LLM (Clawdbot):")
            print(f"   Time to first token: {llm_ttft:.2f}s")
            print(f"   Streaming time: {llm_streaming:.2f}s")
            print(f"   Total time: {llm_total:.2f}s")
            print(f"   Tokens: {self.llm_token_count} ({tokens_per_sec:.1f} tok/s)")

        # TTS
        if self.tts_first_audio:
            tts_ttfa = self.tts_first_audio - self.tts_websocket_open
            tts_total = self.tts_last_audio - self.tts_websocket_open if self.tts_last_audio else 0
            print("\n🔊 TTS (ElevenLabs):")
            print(f"   Time to first audio: {tts_ttfa:.2f}s")
            print(f"   Total streaming: {tts_total:.2f}s")
            print(f"   Audio chunks: {self.tts_audio_chunks}")

        # End-to-end
        print("\n" + "-" * 60)
        print("END-TO-END LATENCY")
        print("-" * 60)

        if self.tts_first_audio and self.speech_end:
            e2e_to_first_audio = self.tts_first_audio - self.speech_end
            print(f"\n⏱️  Speech end → First audio: {e2e_to_first_audio:.2f}s")

        total_turn = self.turn_end - self.turn_start if self.turn_end else 0
        print(f"⏱️  Total turn time: {total_turn:.2f}s")

        print("\n" + "=" * 60 + "\n")


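Every derived number in `print_summary` is a plain difference of `time.time()` stamps. A standalone sketch with illustrative (not measured) timestamps:

```python
# Illustrative wall-clock stamps (seconds, time.time()-style); not real measurements.
llm_request_start = 100.0
llm_first_token = 100.5
llm_last_token = 103.0
llm_token_count = 50

ttft = llm_first_token - llm_request_start    # time to first token
streaming = llm_last_token - llm_first_token  # streaming duration
tokens_per_sec = llm_token_count / streaming

print(ttft, streaming, tokens_per_sec)  # 0.5 2.5 20.0
```

The same pattern (first event minus request start, last event minus first event) yields the TTS and end-to-end figures as well.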
# Audio settings
SAMPLE_RATE = 16000  # Whisper expects 16 kHz
SILENCE_THRESHOLD = 0.015  # RMS threshold for silence detection
SILENCE_DURATION = 0.8  # Seconds of silence to end an utterance (reduced for responsiveness)
MIN_SPEECH_DURATION = 0.3  # Minimum speech duration to process


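The gate driven by `SILENCE_THRESHOLD` is a plain RMS comparison per frame. A standalone sketch with the same threshold value:

```python
import numpy as np

SILENCE_THRESHOLD = 0.015  # same value as the module constant above


def is_speech(frame: np.ndarray) -> bool:
    """True when the frame's RMS exceeds the silence threshold."""
    rms = float(np.sqrt(np.mean(frame ** 2)))
    return rms > SILENCE_THRESHOLD

quiet = np.full(160, 0.001, dtype=np.float32)  # near-silence
loud = np.full(160, 0.1, dtype=np.float32)     # clear speech-level energy
print(is_speech(quiet), is_speech(loud))  # False True
```

A fixed RMS threshold is simple but sensitive to microphone gain; the hysteresis and minimum-duration checks downstream compensate for brief false triggers.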
class ClawdbotHandler:
    """Handles the Clawdbot conversation loop with Whisper STT and ElevenLabs TTS."""

    def __init__(
        self,
        gateway_url: str = "http://localhost:18789",
        gateway_token: Optional[str] = None,
        elevenlabs_api_key: Optional[str] = None,
        elevenlabs_voice_id: str = "qA5SHJ9UjGlW2QwXWR7w",
        head_wobbler: Optional["HeadWobbler"] = None,
        on_listening: Optional[Callable[[], None]] = None,
        on_thinking: Optional[Callable[[], None]] = None,
        on_speaking: Optional[Callable[[], None]] = None,
        profile_mode: bool = False,
        on_profile_complete: Optional[Callable[[ConversationTiming], None]] = None,
    ):
        """Initialize the handler.

        Args:
            gateway_url: Clawdbot gateway URL
            gateway_token: Gateway auth token
            elevenlabs_api_key: ElevenLabs API key
            elevenlabs_voice_id: ElevenLabs voice ID
            head_wobbler: HeadWobbler instance for audio-driven head movement
            on_listening: Callback when listening starts
            on_thinking: Callback when processing/thinking
            on_speaking: Callback when speaking starts
            profile_mode: If True, print a timing summary after each turn
            on_profile_complete: Callback with timing data after each turn completes

        """
        self.gateway_url = gateway_url
        self.gateway_token = gateway_token or os.getenv("CLAWDBOT_TOKEN")
        self.elevenlabs_api_key = elevenlabs_api_key or os.getenv("ELEVENLABS_API_KEY")
        self.elevenlabs_voice_id = elevenlabs_voice_id
        self.head_wobbler = head_wobbler

        # State callbacks
        self.on_listening = on_listening
        self.on_thinking = on_thinking
        self.on_speaking = on_speaking

        # Profiling
        self.profile_mode = profile_mode
        self.on_profile_complete = on_profile_complete
        self._current_timing: Optional[ConversationTiming] = None
        self._timing_history: list[ConversationTiming] = []

        # Audio buffers
        self._audio_buffer: list[NDArray[np.float32]] = []
        self._buffer_lock = threading.Lock()

        # Speech detection state
        self._is_speaking = False
        self._silence_start: Optional[float] = None
        self._speech_start: Optional[float] = None

        # Output queue for TTS audio
        self.output_queue: asyncio.Queue[Tuple[int, NDArray[np.float32]]] = asyncio.Queue()

        # Whisper model (lazy load)
        self._whisper_model = None

        # Processing state
        self._processing = False
        self._stop_event = threading.Event()

    def _load_whisper(self):
        """Lazily load the Whisper model."""
        if self._whisper_model is None:
            from faster_whisper import WhisperModel

            logger.info("Loading Whisper model...")
            self._whisper_model = WhisperModel("small.en")
            logger.info("Whisper model loaded")
        return self._whisper_model

    def _compute_rms(self, audio: NDArray[np.float32]) -> float:
        """Compute the RMS of an audio signal."""
        return float(np.sqrt(np.mean(audio ** 2)))

    async def receive(self, audio_frame: Tuple[int, NDArray]) -> None:
        """Receive an audio frame from the microphone.

        Args:
            audio_frame: Tuple of (sample_rate, audio_data)

        """
        input_sr, audio_data = audio_frame

        # Convert to float32 if needed
        if audio_data.dtype == np.int16:
            audio_data = audio_data.astype(np.float32) / 32768.0

        # Resample to 16 kHz if needed
        if input_sr != SAMPLE_RATE:
            from scipy.signal import resample

            num_samples = int(len(audio_data) * SAMPLE_RATE / input_sr)
            audio_data = resample(audio_data, num_samples).astype(np.float32)

        # Check for speech
        rms = self._compute_rms(audio_data)
        is_speech = rms > SILENCE_THRESHOLD

        now = time.time()

        with self._buffer_lock:
            if is_speech:
                # Speech detected
                if not self._is_speaking:
                    self._is_speaking = True
                    self._speech_start = now
                    self._silence_start = None
                    if self.on_listening:
                        self.on_listening()
                    logger.debug("Speech started")

                self._audio_buffer.append(audio_data)
                self._silence_start = None

            else:
                # Silence
                if self._is_speaking:
                    # Still accumulating (the user might resume speaking)
                    self._audio_buffer.append(audio_data)

                    if self._silence_start is None:
                        self._silence_start = now
                    elif now - self._silence_start > SILENCE_DURATION:
                        # End of utterance
                        speech_duration = now - (self._speech_start or now)
                        if speech_duration >= MIN_SPEECH_DURATION:
                            # Process the utterance
                            audio_to_process = np.concatenate(self._audio_buffer)
                            speech_start_time = self._speech_start
                            speech_end_time = now
                            self._audio_buffer.clear()
                            self._is_speaking = False
                            self._silence_start = None

                            # Process in the background with timing info
                            asyncio.create_task(self._process_utterance(
                                audio_to_process,
                                speech_start_time,
                                speech_end_time,
                            ))
                        else:
                            # Too short, discard
                            self._audio_buffer.clear()
                            self._is_speaking = False
                            self._silence_start = None
                            logger.debug("Utterance too short, discarding")

    async def _process_utterance(
        self,
        audio: NDArray[np.float32],
        speech_start: Optional[float] = None,
        speech_end: Optional[float] = None,
    ) -> None:
        """Process a complete utterance: transcribe, stream to Clawdbot, stream TTS."""
        if self._processing:
            logger.warning("Already processing, skipping utterance")
            return

        self._processing = True

        # Initialize timing for this turn
        timing = ConversationTiming()
        timing.turn_start = time.time()
        timing.speech_start = speech_start or timing.turn_start
        timing.speech_end = speech_end or timing.turn_start
        self._current_timing = timing

        try:
            if self.on_thinking:
                self.on_thinking()

            # 1. Transcribe with Whisper
            logger.info("Transcribing...")
            timing.transcription_start = time.time()
            transcript = await self._transcribe(audio)
            timing.transcription_end = time.time()

            if not transcript or transcript.strip() == "":
                logger.debug("Empty transcript, skipping")
                return

            timing.user_text = transcript
            logger.info(f"User said: {transcript}")

            # 2. Stream from Clawdbot directly to TTS via WebSocket.
            #    This starts speaking as soon as LLM tokens arrive.
            logger.info("Streaming response...")
            if self.on_speaking:
                self.on_speaking()

            await self._stream_llm_to_tts(transcript, timing)

            timing.turn_end = time.time()

            # Print/record the timing summary
            if self.profile_mode:
                timing.print_summary()

            self._timing_history.append(timing)

            if self.on_profile_complete:
                self.on_profile_complete(timing)

        except Exception as e:
            logger.error(f"Error processing utterance: {e}", exc_info=True)
        finally:
            self._processing = False
            self._current_timing = None

    async def _transcribe(self, audio: NDArray[np.float32]) -> str:
        """Transcribe audio using Whisper."""
        model = self._load_whisper()

        # Write to a temp file (Whisper expects a file path)
        with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
            sf.write(f.name, audio, SAMPLE_RATE)
            temp_path = f.name

        try:
            # Run in an executor so transcription doesn't block the event loop
            loop = asyncio.get_event_loop()
            segments, _ = await loop.run_in_executor(
                None,
                lambda: model.transcribe(temp_path, language="en"),
            )
            # faster-whisper returns an iterator of segments
            return "".join(segment.text for segment in segments).strip()
        finally:
            Path(temp_path).unlink(missing_ok=True)

    async def _stream_clawdbot(self, message: str) -> AsyncIterator[str]:
        """Stream a response from Clawdbot via the OpenAI-compatible SSE endpoint.

        Uses httpx-sse for proper SSE parsing without buffering issues.
        """
        async with httpx.AsyncClient(timeout=httpx.Timeout(120.0)) as client:
            headers = {
                "Content-Type": "application/json",
                "x-clawdbot-agent-id": "main",
            }
            if self.gateway_token:
                headers["Authorization"] = f"Bearer {self.gateway_token}"

            url = f"{self.gateway_url}/v1/chat/completions"
            payload = {
                "model": "clawdbot:main",
                "messages": [{"role": "user", "content": message}],
                "user": "moltbot-body",
                "stream": True,
            }

            stream_start_time = time.time()
            logger.info(f"[STREAM] Opening SSE connection to {url}")

            try:
                async with aconnect_sse(
                    client,
                    "POST",
                    url,
                    json=payload,
                    headers=headers,
                ) as event_source:
                    # Check response status
                    event_source.response.raise_for_status()

                    connection_time = time.time() - stream_start_time
                    content_type = event_source.response.headers.get("content-type", "")
                    logger.info(f"[STREAM] SSE connected in {connection_time:.2f}s, content-type: {content_type}")

                    first_event_time = None
                    event_count = 0

                    # Iterate over SSE events (no buffering!)
                    async for sse in event_source.aiter_sse():
                        event_count += 1
                        now = time.time()
                        elapsed = now - stream_start_time

                        if first_event_time is None:
                            first_event_time = now
                            logger.info(f"[STREAM] First SSE event at {elapsed:.2f}s")

                        # Check for stream end
                        if sse.data == "[DONE]":
                            break

                        # Parse the JSON data
                        try:
                            data = json.loads(sse.data)
                            choices = data.get("choices", [])
                            if choices:
                                delta = choices[0].get("delta", {})
                                content = delta.get("content", "")
                                if content:
                                    logger.debug(f"[STREAM] Event {event_count} at {elapsed:.2f}s: {content[:50]}")
                                    yield content
                        except json.JSONDecodeError:
                            logger.debug(f"[STREAM] Non-JSON SSE data: {sse.data[:50]}")
                            continue

                    # Log stream completion stats
                    stream_end_time = time.time()
                    total_stream_time = stream_end_time - stream_start_time
                    if first_event_time:
                        time_to_first = first_event_time - stream_start_time
                        streaming_duration = stream_end_time - first_event_time
                        logger.info(f"[STREAM] Complete: {event_count} events in {total_stream_time:.2f}s "
                                    f"(TTFE: {time_to_first:.2f}s, streaming: {streaming_duration:.2f}s)")
                    else:
                        logger.warning(f"[STREAM] Complete: No events received in {total_stream_time:.2f}s")

            except httpx.HTTPStatusError as e:
                logger.error(f"Clawdbot HTTP error: {e.response.status_code} - {e.response.text[:200]}")
            except Exception as e:
                logger.error(f"Clawdbot streaming error: {e}")

    async def _stream_llm_to_tts(
        self,
        message: str,
        timing: Optional[ConversationTiming] = None,
    ) -> None:
        """Stream the LLM response directly to ElevenLabs WebSocket TTS for minimal latency.

        Waits for the first LLM token before opening the WebSocket (avoiding the
        20-second idle timeout), then streams the remaining tokens as they arrive.
        """
        if not self.elevenlabs_api_key:
            logger.warning("No ElevenLabs API key, skipping TTS")
            return

        tts_sample_rate = 22050
        ws_url = f"wss://api.elevenlabs.io/v1/text-to-speech/{self.elevenlabs_voice_id}/stream-input?model_id=eleven_flash_v2_5&output_format=pcm_22050"

        full_response = []  # Collect for logging and fallback
        receive_task = None
        ws = None

        # Track timing for TTS audio reception
        first_audio_received = False

        try:
            # Get an async iterator for LLM tokens
            if timing:
                timing.llm_request_start = time.time()

            llm_stream = self._stream_clawdbot(message)

            # Wait for the first token before opening the WebSocket
            logger.info("Waiting for first LLM token...")
            first_token = None
            async for token in llm_stream:
                first_token = token
                full_response.append(token)
                if timing:
                    timing.llm_first_token = time.time()
                logger.debug(f"First token received: {token[:50] if len(token) > 50 else token}")
                break

            if first_token is None:
                logger.warning("No tokens received from LLM")
                return

            # Now open the WebSocket - we have tokens to send
            logger.info("Opening TTS WebSocket...")
            ws = await websockets.connect(ws_url)

            if timing:
                timing.tts_websocket_open = time.time()

            # Initialize the WebSocket connection
            init_message = {
                "text": " ",  # Initial space to start the stream
                "voice_settings": {
                    "stability": 0.5,
                    "similarity_boost": 0.75,
                },
                "xi_api_key": self.elevenlabs_api_key,
                "auto_mode": True,  # Let ElevenLabs handle chunk timing
            }
            await ws.send(json.dumps(init_message))
            logger.debug("ElevenLabs WebSocket initialized")

            # Task to receive audio from the WebSocket and queue it for playback
            async def receive_audio():
                nonlocal first_audio_received
                try:
                    async for msg in ws:
                        try:
                            data = json.loads(msg)
                            audio_b64 = data.get("audio")
                            if audio_b64:
                                # Track first-audio timing
                                if timing and not first_audio_received:
                                    timing.tts_first_audio = time.time()
                                    first_audio_received = True

                                if timing:
                                    timing.tts_audio_chunks += 1
                                    timing.tts_last_audio = time.time()

                                # Decode base64 PCM audio
                                audio_bytes = base64.b64decode(audio_b64)
                                audio_int16 = np.frombuffer(audio_bytes, dtype=np.int16)
                                audio_data = audio_int16.astype(np.float32) / 32768.0

                                # Feed to the head wobbler
                                if self.head_wobbler is not None:
                                    self.head_wobbler.feed(audio_b64)

                                # Queue for playback
                                await self.output_queue.put((tts_sample_rate, audio_data))

                            # Check if the stream is done
                            if data.get("isFinal"):
                                logger.debug("ElevenLabs stream complete")
                                break
                        except json.JSONDecodeError:
                            continue
                except websockets.exceptions.ConnectionClosed as e:
                    logger.debug(f"WebSocket closed during receive: {e}")

            # Start receiving audio in the background
            receive_task = asyncio.create_task(receive_audio())

            # Send the first token
            logger.debug(f"Sending token 1 to TTS: {first_token[:50] if len(first_token) > 50 else first_token}")
            await ws.send(json.dumps({"text": first_token}))

            # Continue streaming the remaining tokens
            token_count = 1
            async for token in llm_stream:
                full_response.append(token)
                token_count += 1
                if timing:
                    timing.llm_last_token = time.time()
                logger.debug(f"Sending token {token_count} to TTS: {token[:50] if len(token) > 50 else token}")
                await ws.send(json.dumps({"text": token}))

            if timing:
timing.llm_token_count = token_count
|
| 573 |
+
if not timing.llm_last_token:
|
| 574 |
+
timing.llm_last_token = time.time()
|
| 575 |
+
|
| 576 |
+
logger.info(f"Sent {token_count} tokens to TTS")
|
| 577 |
+
|
| 578 |
+
# Signal end of text
|
| 579 |
+
await ws.send(json.dumps({"text": ""}))
|
| 580 |
+
|
| 581 |
+
# Wait for audio to finish with timeout
|
| 582 |
+
try:
|
| 583 |
+
await asyncio.wait_for(receive_task, timeout=60.0)
|
| 584 |
+
except asyncio.TimeoutError:
|
| 585 |
+
logger.warning("Timeout waiting for TTS audio, continuing")
|
| 586 |
+
receive_task.cancel()
|
| 587 |
+
|
| 588 |
+
response_text = "".join(full_response)
|
| 589 |
+
if timing:
|
| 590 |
+
timing.assistant_text = response_text
|
| 591 |
+
logger.info(f"Clawdbot response: {response_text[:100]}...")
|
| 592 |
+
|
| 593 |
+
except websockets.exceptions.ConnectionClosedError as e:
|
| 594 |
+
logger.warning(f"WebSocket closed: {e}")
|
| 595 |
+
# Fallback to HTTP streaming with accumulated response
|
| 596 |
+
if full_response:
|
| 597 |
+
if timing:
|
| 598 |
+
timing.assistant_text = "".join(full_response)
|
| 599 |
+
logger.info("Falling back to HTTP streaming TTS")
|
| 600 |
+
await self._generate_tts_fallback("".join(full_response))
|
| 601 |
+
except Exception as e:
|
| 602 |
+
logger.error(f"LLM-to-TTS pipeline error: {e}", exc_info=True)
|
| 603 |
+
# Fallback: if WebSocket fails, try to use accumulated response with HTTP streaming
|
| 604 |
+
if full_response:
|
| 605 |
+
if timing:
|
| 606 |
+
timing.assistant_text = "".join(full_response)
|
| 607 |
+
logger.info("Falling back to HTTP streaming TTS")
|
| 608 |
+
await self._generate_tts_fallback("".join(full_response))
|
| 609 |
+
finally:
|
| 610 |
+
if receive_task and not receive_task.done():
|
| 611 |
+
receive_task.cancel()
|
| 612 |
+
if ws:
|
| 613 |
+
await ws.close()
|
| 614 |
+
|
| 615 |
+
async def _generate_tts_fallback(self, text: str) -> None:
|
| 616 |
+
"""Fallback TTS using HTTP streaming (used if WebSocket fails)."""
|
| 617 |
+
if not self.elevenlabs_api_key or not text:
|
| 618 |
+
return
|
| 619 |
+
|
| 620 |
+
tts_sample_rate = 22050
|
| 621 |
+
|
| 622 |
+
async with httpx.AsyncClient() as client:
|
| 623 |
+
try:
|
| 624 |
+
async with client.stream(
|
| 625 |
+
"POST",
|
| 626 |
+
f"https://api.elevenlabs.io/v1/text-to-speech/{self.elevenlabs_voice_id}/stream",
|
| 627 |
+
params={
|
| 628 |
+
"output_format": "pcm_22050",
|
| 629 |
+
"optimize_streaming_latency": "3",
|
| 630 |
+
},
|
| 631 |
+
json={
|
| 632 |
+
"text": text,
|
| 633 |
+
"model_id": "eleven_flash_v2_5",
|
| 634 |
+
"voice_settings": {
|
| 635 |
+
"stability": 0.5,
|
| 636 |
+
"similarity_boost": 0.75,
|
| 637 |
+
}
|
| 638 |
+
},
|
| 639 |
+
headers={
|
| 640 |
+
"xi-api-key": self.elevenlabs_api_key,
|
| 641 |
+
"Content-Type": "application/json",
|
| 642 |
+
},
|
| 643 |
+
timeout=60.0,
|
| 644 |
+
) as response:
|
| 645 |
+
response.raise_for_status()
|
| 646 |
+
|
| 647 |
+
async for chunk in response.aiter_bytes(chunk_size=4096):
|
| 648 |
+
if not chunk:
|
| 649 |
+
continue
|
| 650 |
+
|
| 651 |
+
audio_int16 = np.frombuffer(chunk, dtype=np.int16)
|
| 652 |
+
audio_data = audio_int16.astype(np.float32) / 32768.0
|
| 653 |
+
|
| 654 |
+
if self.head_wobbler is not None:
|
| 655 |
+
audio_b64 = base64.b64encode(audio_int16.tobytes()).decode()
|
| 656 |
+
self.head_wobbler.feed(audio_b64)
|
| 657 |
+
|
| 658 |
+
await self.output_queue.put((tts_sample_rate, audio_data))
|
| 659 |
+
|
| 660 |
+
except Exception as e:
|
| 661 |
+
logger.error(f"TTS fallback error: {e}")
|
| 662 |
+
|
| 663 |
+
async def emit(self) -> Optional[Tuple[int, NDArray[np.float32]]]:
|
| 664 |
+
"""Get the next audio chunk for playback."""
|
| 665 |
+
try:
|
| 666 |
+
return await asyncio.wait_for(self.output_queue.get(), timeout=0.1)
|
| 667 |
+
except asyncio.TimeoutError:
|
| 668 |
+
return None
|
| 669 |
+
|
| 670 |
+
def stop(self) -> None:
|
| 671 |
+
"""Stop the handler."""
|
| 672 |
+
self._stop_event.set()
|
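Both the WebSocket path and the HTTP fallback above perform the same conversion: base64-encoded 16-bit PCM is decoded and normalized to float32 in [-1.0, 1.0]. A minimal, self-contained sketch of that step (the function name is illustrative, not part of the handler):

```python
import base64

import numpy as np


def decode_pcm16_base64(audio_b64: str) -> np.ndarray:
    """Decode base64-encoded 16-bit PCM into float32 samples in [-1.0, 1.0)."""
    audio_bytes = base64.b64decode(audio_b64)
    audio_int16 = np.frombuffer(audio_bytes, dtype=np.int16)
    return audio_int16.astype(np.float32) / 32768.0


# Round-trip a known buffer through base64 and back.
samples = np.array([0, 32767, -32768], dtype=np.int16)
decoded = decode_pcm16_base64(base64.b64encode(samples.tobytes()).decode())
print(decoded.tolist())  # [0.0, 0.999969482421875, -1.0]
```

Dividing by 32768.0 (not 32767.0) maps the int16 minimum exactly to -1.0, while the positive peak lands just below 1.0; this keeps the scale factor an exact power of two.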
moltbot_body/main.py
ADDED

@@ -0,0 +1,322 @@

"""Main entry point for Moltbot's body control."""

import os
import sys
import time
import asyncio
import logging
import argparse
import threading
from pathlib import Path
from typing import Optional

from dotenv import load_dotenv
from reachy_mini import ReachyMini, ReachyMiniApp

# Load environment from project root (.env next to pyproject.toml)
_project_root = Path(__file__).parent.parent
load_dotenv(_project_root / ".env")

logger = logging.getLogger(__name__)


def setup_logging(debug: bool = False) -> None:
    """Configure logging."""
    level = logging.DEBUG if debug else logging.INFO
    logging.basicConfig(
        level=level,
        format="%(asctime)s [%(levelname)s] %(name)s: %(message)s",
        datefmt="%H:%M:%S",
    )


def parse_args() -> argparse.Namespace:
    """Parse command line arguments."""
    parser = argparse.ArgumentParser(description="Moltbot's body control")
    parser.add_argument("--debug", action="store_true", help="Enable debug logging")
    parser.add_argument("--robot-name", type=str, help="Robot name for connection")
    parser.add_argument(
        "--gateway-url",
        type=str,
        default="http://localhost:18789",
        help="Clawdbot gateway URL",
    )
    parser.add_argument(
        "--profile",
        action="store_true",
        help="Enable timing profiler - prints detailed timing after each turn",
    )
    parser.add_argument(
        "--profile-once",
        action="store_true",
        help="Profile one conversation turn then exit (implies --profile)",
    )
    return parser.parse_args()


class MoltbotBodyCore:
    """Main class controlling Moltbot's physical body."""

    def __init__(
        self,
        gateway_url: str = "http://localhost:18789",
        robot_name: Optional[str] = None,
        profile_mode: bool = False,
        profile_once: bool = False,
        robot: Optional[ReachyMini] = None,
        external_stop_event: Optional[threading.Event] = None,
    ):
        """Initialize Moltbot's body.

        Args:
            gateway_url: Clawdbot gateway URL
            robot_name: Optional robot name for connection
            profile_mode: Enable timing profiler
            profile_once: Exit after one conversation turn (implies profile_mode)
            robot: Optional pre-initialized ReachyMini instance (for app framework)
            external_stop_event: Optional external stop event (for app framework)

        """
        from moltbot_body.clawdbot_handler import ClawdbotHandler
        from moltbot_body.moves import MovementManager
        from moltbot_body.audio.head_wobbler import HeadWobbler

        self.gateway_url = gateway_url
        self.profile_once = profile_once
        self._external_stop_event = external_stop_event
        self._owns_robot = robot is None  # Track if we created the robot

        # Use provided robot or create one
        if robot is not None:
            self.robot = robot
            logger.info("Using provided Reachy Mini instance")
        else:
            # Connect to robot
            logger.info("Connecting to Reachy Mini...")
            robot_kwargs = {}
            if robot_name:
                robot_kwargs["robot_name"] = robot_name

            try:
                self.robot = ReachyMini(**robot_kwargs)
            except TimeoutError as e:
                logger.error(f"Connection timeout: failed to connect to Reachy Mini. Details: {e}")
                logger.error("Check that the robot is powered on and reachable on the network.")
                sys.exit(1)
            except ConnectionError as e:
                logger.error(f"Connection failed: unable to establish connection. Details: {e}")
                sys.exit(1)
            except Exception as e:
                logger.error(f"Unexpected error during robot initialization: {type(e).__name__}: {e}")
                sys.exit(1)

        logger.info(f"Connected to robot: {self.robot.client.get_status()}")

        # Initialize movement system
        logger.info("Initializing movement manager...")
        self.movement_manager = MovementManager(current_robot=self.robot)
        self.head_wobbler = HeadWobbler(set_speech_offsets=self.movement_manager.set_speech_offsets)

        # Initialize handler
        gateway_token = os.getenv("CLAWDBOT_TOKEN")
        if not gateway_token:
            logger.warning("CLAWDBOT_TOKEN not found in environment - auth may fail")
        else:
            logger.debug(f"Gateway token loaded ({len(gateway_token)} chars)")

        # Callback to handle profile completion
        def on_profile_complete(timing):
            if self.profile_once:
                logger.info("Profile complete - scheduling shutdown...")
                self._stop_event.set()

        self.handler = ClawdbotHandler(
            gateway_url=gateway_url,
            gateway_token=gateway_token,
            elevenlabs_api_key=os.getenv("ELEVENLABS_API_KEY"),
            head_wobbler=self.head_wobbler,
            on_listening=self._on_listening,
            on_thinking=self._on_thinking,
            on_speaking=self._on_speaking,
            profile_mode=profile_mode or profile_once,
            on_profile_complete=on_profile_complete if profile_once else None,
        )

        # State
        self._stop_event = asyncio.Event()
        self._tasks: list[asyncio.Task] = []

    def _on_listening(self) -> None:
        """Callback when listening starts."""
        logger.info("Listening...")
        self.movement_manager.set_listening(True)

    def _on_thinking(self) -> None:
        """Callback when thinking/processing."""
        logger.info("Thinking...")
        self.movement_manager.set_listening(False)

    def _on_speaking(self) -> None:
        """Callback when speaking starts."""
        logger.info("Speaking...")
        self.head_wobbler.reset()  # Clear any stale audio from previous utterance

    def _should_stop(self) -> bool:
        """Check if we should stop (internal or external stop event)."""
        if self._stop_event.is_set():
            return True
        if self._external_stop_event is not None and self._external_stop_event.is_set():
            return True
        return False

    async def record_loop(self) -> None:
        """Read audio from robot microphone and send to handler."""
        input_sample_rate = self.robot.media.get_input_audio_samplerate()
        logger.info(f"Recording at {input_sample_rate} Hz")

        while not self._should_stop():
            audio_frame = self.robot.media.get_audio_sample()
            if audio_frame is not None:
                await self.handler.receive((input_sample_rate, audio_frame))
            await asyncio.sleep(0.01)  # ~100 Hz polling

    async def play_loop(self) -> None:
        """Play audio from handler through robot speakers."""
        output_sample_rate = self.robot.media.get_output_audio_samplerate()
        logger.info(f"Playing at {output_sample_rate} Hz")

        while not self._should_stop():
            output = await self.handler.emit()
            if output is not None:
                input_sr, audio_data = output

                # Resample if needed
                if input_sr != output_sample_rate:
                    from scipy.signal import resample
                    num_samples = int(len(audio_data) * output_sample_rate / input_sr)
                    audio_data = resample(audio_data, num_samples).astype("float32")

                self.robot.media.push_audio_sample(audio_data)

            await asyncio.sleep(0.01)

    async def run(self) -> None:
        """Run the main loop."""
        # Start movement system
        logger.info("Starting movement manager...")
        self.movement_manager.start()
        self.head_wobbler.start()

        # Start media
        logger.info("Starting audio...")
        self.robot.media.start_recording()
        self.robot.media.start_playing()
        time.sleep(1)  # Let pipelines initialize

        logger.info("Ready! Speak to me...")

        # Start tasks
        self._tasks = [
            asyncio.create_task(self.record_loop(), name="record-loop"),
            asyncio.create_task(self.play_loop(), name="play-loop"),
        ]

        try:
            await asyncio.gather(*self._tasks)
        except asyncio.CancelledError:
            logger.info("Tasks cancelled")

    def stop(self) -> None:
        """Stop everything."""
        logger.info("Stopping...")
        self._stop_event.set()

        # Cancel tasks
        for task in self._tasks:
            if not task.done():
                task.cancel()

        # Stop movement system (MovementManager resets to neutral on stop)
        self.head_wobbler.stop()
        self.movement_manager.stop()

        # Only manage robot resources if we created the robot
        if self._owns_robot:
            # Close media
            try:
                self.robot.media.close()
            except Exception as e:
                logger.debug(f"Error closing media: {e}")

            # Disconnect
            self.robot.client.disconnect()

        self.handler.stop()

        logger.info("Stopped")


class MoltbotBody(ReachyMiniApp):
    """Reachy Mini Apps entry point for Moltbot Body.

    This class allows Moltbot Body to be installed and run from the
    Reachy Mini dashboard as a Reachy Mini App.
    """

    # No custom settings UI for now
    custom_app_url: str | None = None

    def run(self, reachy_mini: ReachyMini, stop_event: threading.Event) -> None:
        """Run Moltbot Body as a Reachy Mini App.

        Args:
            reachy_mini: Pre-initialized ReachyMini instance from the framework
            stop_event: Threading event to signal when the app should stop

        """
        # Create a new event loop for async operations
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)

        # Get gateway URL from environment
        gateway_url = os.getenv("CLAWDBOT_GATEWAY_URL", "http://localhost:18789")

        # Create the body controller with the provided robot instance
        body = MoltbotBodyCore(
            gateway_url=gateway_url,
            robot=reachy_mini,
            external_stop_event=stop_event,
        )

        try:
            loop.run_until_complete(body.run())
        except Exception as e:
            logger.error(f"Error running Moltbot Body: {e}")
        finally:
            body.stop()
            loop.close()


def main() -> None:
    """Entry point."""
    args = parse_args()
    setup_logging(args.debug)

    if args.profile or args.profile_once:
        logger.info("Profiling mode enabled")

    body = MoltbotBodyCore(
        gateway_url=args.gateway_url,
        robot_name=args.robot_name,
        profile_mode=args.profile,
        profile_once=args.profile_once,
    )

    try:
        asyncio.run(body.run())
    except KeyboardInterrupt:
        logger.info("Interrupted")
    finally:
        body.stop()


if __name__ == "__main__":
    main()

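When the handler's sample rate differs from the robot's output rate, the playback loop scales each chunk's length by the ratio of the two rates before resampling. A small sketch of just that length bookkeeping (the helper name is illustrative; the real loop delegates the signal processing to scipy.signal.resample):

```python
def resampled_length(n_samples: int, input_sr: int, output_sr: int) -> int:
    """Number of samples a chunk has after resampling from input_sr to output_sr."""
    return int(n_samples * output_sr / input_sr)


# A 4096-sample chunk at the ElevenLabs PCM rate (22050 Hz), assuming a
# hypothetical 48 kHz speaker pipeline:
print(resampled_length(4096, 22050, 48000))  # 8916

# The chunk duration is preserved: 4096 / 22050 s and 8916 / 48000 s are
# both roughly 185.8 ms (up to sub-sample truncation from int()).
```

Truncating with int() can drop a fraction of a sample per chunk; for short conversational utterances this drift is negligible, which is presumably why the loop does not accumulate a fractional remainder.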
moltbot_body/moves.py
ADDED

@@ -0,0 +1,849 @@

| 1 |
+
"""Movement system with sequential primary moves and additive secondary moves.
|
| 2 |
+
|
| 3 |
+
Design overview
|
| 4 |
+
- Primary moves (emotions, dances, goto, breathing) are mutually exclusive and run
|
| 5 |
+
sequentially.
|
| 6 |
+
- Secondary moves (speech sway, face tracking) are additive offsets applied on top
|
| 7 |
+
of the current primary pose.
|
| 8 |
+
- There is a single control point to the robot: `ReachyMini.set_target`.
|
| 9 |
+
- The control loop runs near 100 Hz and is phase-aligned via a monotonic clock.
|
| 10 |
+
- Idle behaviour starts an infinite `BreathingMove` after a short inactivity delay
|
| 11 |
+
unless listening is active.
|
| 12 |
+
|
| 13 |
+
Threading model
|
| 14 |
+
- A dedicated worker thread owns all real-time state and issues `set_target`
|
| 15 |
+
commands.
|
| 16 |
+
- Other threads communicate via a command queue (enqueue moves, mark activity,
|
| 17 |
+
toggle listening).
|
| 18 |
+
- Secondary offset producers set pending values guarded by locks; the worker
|
| 19 |
+
snaps them atomically.
|
| 20 |
+
|
| 21 |
+
Units and frames
|
| 22 |
+
- Secondary offsets are interpreted as metres for x/y/z and radians for
|
| 23 |
+
roll/pitch/yaw in the world frame (unless noted by `compose_world_offset`).
|
| 24 |
+
- Antennas and `body_yaw` are in radians.
|
| 25 |
+
- Head pose composition uses `compose_world_offset(primary_head, secondary_head)`;
|
| 26 |
+
the secondary offset must therefore be expressed in the world frame.
|
| 27 |
+
|
| 28 |
+
Safety
|
| 29 |
+
- Listening freezes antennas, then blends them back on unfreeze.
|
| 30 |
+
- Interpolations and blends are used to avoid jumps at all times.
|
| 31 |
+
- `set_target` errors are rate-limited in logs.
|
| 32 |
+
"""
|
| 33 |
+
|
| 34 |
+
from __future__ import annotations
|
| 35 |
+
import time
|
| 36 |
+
import logging
|
| 37 |
+
import threading
|
| 38 |
+
from queue import Empty, Queue
|
| 39 |
+
from typing import Any, Dict, Tuple
|
| 40 |
+
from collections import deque
|
| 41 |
+
from dataclasses import dataclass
|
| 42 |
+
|
| 43 |
+
import numpy as np
|
| 44 |
+
from numpy.typing import NDArray
|
| 45 |
+
|
| 46 |
+
from reachy_mini import ReachyMini
|
| 47 |
+
from reachy_mini.utils import create_head_pose
|
| 48 |
+
from reachy_mini.motion.move import Move
|
| 49 |
+
from reachy_mini.utils.interpolation import (
|
| 50 |
+
compose_world_offset,
|
| 51 |
+
linear_pose_interpolation,
|
| 52 |
+
)
|
| 53 |
+
|
| 54 |
+
|
| 55 |
+
logger = logging.getLogger(__name__)
|
| 56 |
+
|
| 57 |
+
# Configuration constants
|
| 58 |
+
CONTROL_LOOP_FREQUENCY_HZ = 100.0 # Hz - Target frequency for the movement control loop
|
| 59 |
+
|
| 60 |
+
# Type definitions
|
| 61 |
+
FullBodyPose = Tuple[NDArray[np.float32], Tuple[float, float], float] # (head_pose_4x4, antennas, body_yaw)
|
| 62 |
+
|
| 63 |
+
|
| 64 |
+
class BreathingMove(Move): # type: ignore
|
| 65 |
+
"""Breathing move with interpolation to neutral and then continuous breathing patterns."""
|
| 66 |
+
|
| 67 |
+
def __init__(
|
| 68 |
+
self,
|
| 69 |
+
interpolation_start_pose: NDArray[np.float32],
|
| 70 |
+
interpolation_start_antennas: Tuple[float, float],
|
| 71 |
+
interpolation_duration: float = 1.0,
|
| 72 |
+
):
|
| 73 |
+
"""Initialize breathing move.
|
| 74 |
+
|
| 75 |
+
Args:
|
| 76 |
+
interpolation_start_pose: 4x4 matrix of current head pose to interpolate from
|
| 77 |
+
interpolation_start_antennas: Current antenna positions to interpolate from
|
| 78 |
+
interpolation_duration: Duration of interpolation to neutral (seconds)
|
| 79 |
+
|
| 80 |
+
"""
|
| 81 |
+
self.interpolation_start_pose = interpolation_start_pose
|
| 82 |
+
self.interpolation_start_antennas = np.array(interpolation_start_antennas)
|
| 83 |
+
self.interpolation_duration = interpolation_duration
|
| 84 |
+
|
| 85 |
+
# Neutral positions for breathing base
|
| 86 |
+
self.neutral_head_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
|
| 87 |
+
self.neutral_antennas = np.array([0.0, 0.0])
|
| 88 |
+
|
| 89 |
+
# Breathing parameters
|
| 90 |
+
self.breathing_z_amplitude = 0.005 # 5mm gentle breathing
|
| 91 |
+
self.breathing_frequency = 0.1 # Hz (6 breaths per minute)
|
| 92 |
+
self.antenna_sway_amplitude = np.deg2rad(15) # 15 degrees
|
| 93 |
+
self.antenna_frequency = 0.5 # Hz (faster antenna sway)
|
| 94 |
+
|
| 95 |
+
@property
|
| 96 |
+
def duration(self) -> float:
|
| 97 |
+
"""Duration property required by official Move interface."""
|
| 98 |
+
return float("inf") # Continuous breathing (never ends naturally)
|
| 99 |
+
|
| 100 |
+
def evaluate(self, t: float) -> tuple[NDArray[np.float64] | None, NDArray[np.float64] | None, float | None]:
|
| 101 |
+
"""Evaluate breathing move at time t."""
|
| 102 |
+
if t < self.interpolation_duration:
|
| 103 |
+
# Phase 1: Interpolate to neutral base position
|
| 104 |
+
interpolation_t = t / self.interpolation_duration
|
| 105 |
+
|
| 106 |
+
# Interpolate head pose
|
| 107 |
+
head_pose = linear_pose_interpolation(
|
| 108 |
+
self.interpolation_start_pose, self.neutral_head_pose, interpolation_t,
|
| 109 |
+
)
|
| 110 |
+
|
| 111 |
+
# Interpolate antennas
|
| 112 |
+
antennas_interp = (
|
| 113 |
+
1 - interpolation_t
|
| 114 |
+
) * self.interpolation_start_antennas + interpolation_t * self.neutral_antennas
|
| 115 |
+
antennas = antennas_interp.astype(np.float64)
|
| 116 |
+
|
| 117 |
+
else:
|
| 118 |
+
# Phase 2: Breathing patterns from neutral base
|
| 119 |
+
breathing_time = t - self.interpolation_duration
|
| 120 |
+
|
| 121 |
+
# Gentle z-axis breathing
|
| 122 |
+
z_offset = self.breathing_z_amplitude * np.sin(2 * np.pi * self.breathing_frequency * breathing_time)
|
| 123 |
+
head_pose = create_head_pose(x=0, y=0, z=z_offset, roll=0, pitch=0, yaw=0, degrees=True, mm=False)
|
| 124 |
+
|
| 125 |
+
# Antenna sway (opposite directions)
|
| 126 |
+
antenna_sway = self.antenna_sway_amplitude * np.sin(2 * np.pi * self.antenna_frequency * breathing_time)
|
| 127 |
+
antennas = np.array([antenna_sway, -antenna_sway], dtype=np.float64)
|
| 128 |
+
|
| 129 |
+
# Return in official Move interface format: (head_pose, antennas_array, body_yaw)
|
| 130 |
+
return (head_pose, antennas, 0.0)
|
| 131 |
+
|
| 132 |
+
|

def combine_full_body(primary_pose: FullBodyPose, secondary_pose: FullBodyPose) -> FullBodyPose:
    """Combine primary and secondary full body poses.

    Args:
        primary_pose: (head_pose, antennas, body_yaw) - primary move
        secondary_pose: (head_pose, antennas, body_yaw) - secondary offsets

    Returns:
        Combined full body pose (head_pose, antennas, body_yaw)

    """
    primary_head, primary_antennas, primary_body_yaw = primary_pose
    secondary_head, secondary_antennas, secondary_body_yaw = secondary_pose

    # Combine head poses using compose_world_offset; the secondary pose must be an
    # offset expressed in the world frame (T_off_world) applied to the absolute
    # primary transform (T_abs).
    combined_head = compose_world_offset(primary_head, secondary_head, reorthonormalize=True)

    # Sum antennas and body_yaw
    combined_antennas = (
        primary_antennas[0] + secondary_antennas[0],
        primary_antennas[1] + secondary_antennas[1],
    )
    combined_body_yaw = primary_body_yaw + secondary_body_yaw

    return (combined_head, combined_antennas, combined_body_yaw)


def clone_full_body_pose(pose: FullBodyPose) -> FullBodyPose:
    """Create a deep copy of a full body pose tuple."""
    head, antennas, body_yaw = pose
    return (head.copy(), (float(antennas[0]), float(antennas[1])), float(body_yaw))
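
The head fusion above delegates to `compose_world_offset`. Assuming the usual convention for a world-frame offset (left-multiply the offset onto the absolute 4x4 transform), the core operation can be sketched with plain homogeneous matrices; `world_offset` below is a hypothetical stand-in for the real helper, which additionally re-orthonormalizes the rotation:

```python
import numpy as np

def world_offset(t_abs: np.ndarray, t_off_world: np.ndarray) -> np.ndarray:
    """Apply an offset expressed in the world frame to an absolute transform."""
    return t_off_world @ t_abs

t_abs = np.eye(4)
t_abs[2, 3] = 0.10   # primary move: head 10 cm up
t_off = np.eye(4)
t_off[0, 3] = 0.02   # secondary sway: 2 cm along world x
combined = world_offset(t_abs, t_off)
```

For pure translations the composition reduces to adding the translation columns, which is why small speech/face offsets layer cleanly on top of any primary move.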


@dataclass
class MovementState:
    """State tracking for the movement system."""

    # Primary move state
    current_move: Move | None = None
    move_start_time: float | None = None
    last_activity_time: float = 0.0

    # Secondary move state (offsets)
    speech_offsets: Tuple[float, float, float, float, float, float] = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
    face_tracking_offsets: Tuple[float, float, float, float, float, float] = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

    # Status flags
    last_primary_pose: FullBodyPose | None = None

    def update_activity(self) -> None:
        """Update the last activity time."""
        self.last_activity_time = time.monotonic()


@dataclass
class LoopFrequencyStats:
    """Track rolling loop frequency statistics."""

    mean: float = 0.0
    m2: float = 0.0
    min_freq: float = float("inf")
    count: int = 0
    last_freq: float = 0.0
    potential_freq: float = 0.0

    def reset(self) -> None:
        """Reset accumulators while keeping the last potential frequency."""
        self.mean = 0.0
        self.m2 = 0.0
        self.min_freq = float("inf")
        self.count = 0

class MovementManager:
    """Coordinate sequential moves, additive offsets, and robot output at 100 Hz.

    Responsibilities:
    - Own a real-time loop that samples the current primary move (if any), fuses
      secondary offsets, and calls `set_target` exactly once per tick.
    - Start an idle `BreathingMove` after `idle_inactivity_delay` when not
      listening and no moves are queued.
    - Expose thread-safe APIs so other threads can enqueue moves, mark activity,
      or feed secondary offsets without touching internal state.

    Timing:
    - All elapsed-time calculations rely on `time.monotonic()` through `self._now`
      to avoid wall-clock jumps.
    - The loop targets 100 Hz; ticks that overrun simply shorten the next sleep.

    Concurrency:
    - External threads communicate via `_command_queue` messages.
    - Secondary offsets are staged via dirty flags guarded by locks and consumed
      atomically inside the worker loop.
    """

    def __init__(
        self,
        current_robot: ReachyMini,
        camera_worker: "Any" = None,
    ):
        """Initialize the movement manager."""
        self.current_robot = current_robot
        self.camera_worker = camera_worker

        # Single timing source for durations
        self._now = time.monotonic

        # Movement state
        self.state = MovementState()
        self.state.last_activity_time = self._now()
        neutral_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
        self.state.last_primary_pose = (neutral_pose, (0.0, 0.0), 0.0)

        # Move queue (primary moves)
        self.move_queue: deque[Move] = deque()

        # Configuration
        self.idle_inactivity_delay = 0.3  # seconds
        self.target_frequency = CONTROL_LOOP_FREQUENCY_HZ
        self.target_period = 1.0 / self.target_frequency

        self._stop_event = threading.Event()
        self._thread: threading.Thread | None = None
        self._is_listening = False
        self._last_commanded_pose: FullBodyPose = clone_full_body_pose(self.state.last_primary_pose)
        self._listening_antennas: Tuple[float, float] = self._last_commanded_pose[1]
        self._antenna_unfreeze_blend = 1.0
        self._antenna_blend_duration = 0.4  # seconds to blend back after listening
        self._last_listening_blend_time = self._now()
        self._breathing_active = False  # true when a breathing move is running or queued
        self._listening_debounce_s = 0.15
        self._last_listening_toggle_time = self._now()
        self._last_set_target_err = 0.0
        self._set_target_err_interval = 1.0  # seconds between error logs
        self._set_target_err_suppressed = 0

        # Cross-thread signalling
        self._command_queue: "Queue[Tuple[str, Any]]" = Queue()
        self._speech_offsets_lock = threading.Lock()
        self._pending_speech_offsets: Tuple[float, float, float, float, float, float] = (
            0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
        )
        self._speech_offsets_dirty = False

        self._face_offsets_lock = threading.Lock()
        self._pending_face_offsets: Tuple[float, float, float, float, float, float] = (
            0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
        )
        self._face_offsets_dirty = False

        self._shared_state_lock = threading.Lock()
        self._shared_last_activity_time = self.state.last_activity_time
        self._shared_is_listening = self._is_listening
        self._status_lock = threading.Lock()
        self._freq_stats = LoopFrequencyStats()
        self._freq_snapshot = LoopFrequencyStats()

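The constructor above wires up a single-mutator design: other threads never touch movement state directly; they post commands that the worker drains once per tick. Reduced to its essentials (all names here are illustrative, not the real API):

```python
import queue

class Worker:
    """Minimal single-mutator pattern: state is only touched by the worker."""

    def __init__(self) -> None:
        self._commands: "queue.Queue[tuple[str, object]]" = queue.Queue()
        self.moves: list[object] = []  # mutated only from the worker thread

    def queue_move(self, move: object) -> None:
        """Safe to call from any thread: just posts a message."""
        self._commands.put(("queue_move", move))

    def _drain(self) -> None:
        """Called from the worker loop only; applies all pending commands."""
        while True:
            try:
                cmd, payload = self._commands.get_nowait()
            except queue.Empty:
                break
            if cmd == "queue_move":
                self.moves.append(payload)

w = Worker()
w.queue_move("wave")
w._drain()
```

Because `queue.Queue` is thread-safe and all mutation happens in `_drain`, no lock is needed around the move list itself.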
    def queue_move(self, move: Move) -> None:
        """Queue a primary move to run after the currently executing one.

        Thread-safe: the move is enqueued via the worker command queue so the
        control loop remains the sole mutator of movement state.
        """
        self._command_queue.put(("queue_move", move))

    def clear_move_queue(self) -> None:
        """Stop the active move and discard any queued primary moves.

        Thread-safe: executed by the worker thread via the command queue.
        """
        self._command_queue.put(("clear_queue", None))

    def set_speech_offsets(self, offsets: Tuple[float, float, float, float, float, float]) -> None:
        """Update speech-induced secondary offsets (x, y, z, roll, pitch, yaw).

        Offsets are interpreted as metres for translation and radians for
        rotation in the world frame. Thread-safe via a pending snapshot.
        """
        with self._speech_offsets_lock:
            self._pending_speech_offsets = offsets
            self._speech_offsets_dirty = True

    def set_moving_state(self, duration: float) -> None:
        """Mark the robot as actively moving for the provided duration.

        Legacy hook used by goto helpers to keep inactivity and breathing logic
        aware of manual motions. Thread-safe via the command queue.
        """
        self._command_queue.put(("set_moving_state", duration))

    def is_idle(self) -> bool:
        """Return True when the robot has been inactive longer than the idle delay."""
        with self._shared_state_lock:
            last_activity = self._shared_last_activity_time
            listening = self._shared_is_listening

        if listening:
            return False

        return self._now() - last_activity >= self.idle_inactivity_delay

    def set_listening(self, listening: bool) -> None:
        """Enable or disable listening mode without touching shared state directly.

        While listening:
        - Antenna positions are frozen at the last commanded values.
        - Blending is reset so that upon unfreezing the antennas return smoothly.
        - Idle breathing is suppressed.

        Thread-safe: the change is posted to the worker command queue.
        """
        with self._shared_state_lock:
            if self._shared_is_listening == listening:
                return
        self._command_queue.put(("set_listening", listening))

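`set_speech_offsets` stages a snapshot under a lock and marks it dirty; the worker later consumes it atomically, so the producer can publish at any rate while the loop only ever sees the latest complete tuple. The handoff, isolated (illustrative names, not the real class):

```python
import threading

class OffsetBuffer:
    """Latest-value handoff: producer overwrites, consumer takes once."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: tuple = (0.0,) * 6
        self._dirty = False

    def set(self, offsets: tuple) -> None:
        """Producer thread: stage a full snapshot and flag it."""
        with self._lock:
            self._pending = offsets
            self._dirty = True

    def consume(self):
        """Worker thread, once per tick: take the snapshot or None."""
        with self._lock:
            if not self._dirty:
                return None
            self._dirty = False
            return self._pending

buf = OffsetBuffer()
buf.set((0.01, 0.0, 0.0, 0.0, 0.0, 0.0))
first = buf.consume()   # the staged snapshot
second = buf.consume()  # nothing new since
```

Intermediate values are intentionally dropped; for 100 Hz pose offsets only the newest sample matters.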
    def _poll_signals(self, current_time: float) -> None:
        """Apply queued commands and pending offset updates."""
        self._apply_pending_offsets()

        while True:
            try:
                command, payload = self._command_queue.get_nowait()
            except Empty:
                break
            self._handle_command(command, payload, current_time)

    def _apply_pending_offsets(self) -> None:
        """Apply the most recent speech/face offset updates."""
        speech_offsets: Tuple[float, float, float, float, float, float] | None = None
        with self._speech_offsets_lock:
            if self._speech_offsets_dirty:
                speech_offsets = self._pending_speech_offsets
                self._speech_offsets_dirty = False

        if speech_offsets is not None:
            self.state.speech_offsets = speech_offsets
            self.state.update_activity()

        face_offsets: Tuple[float, float, float, float, float, float] | None = None
        with self._face_offsets_lock:
            if self._face_offsets_dirty:
                face_offsets = self._pending_face_offsets
                self._face_offsets_dirty = False

        if face_offsets is not None:
            self.state.face_tracking_offsets = face_offsets
            self.state.update_activity()

    def _handle_command(self, command: str, payload: Any, current_time: float) -> None:
        """Handle a single cross-thread command."""
        if command == "queue_move":
            if isinstance(payload, Move):
                self.move_queue.append(payload)
                self.state.update_activity()
                duration = getattr(payload, "duration", None)
                if duration is not None:
                    try:
                        duration_str = f"{float(duration):.2f}"
                    except (TypeError, ValueError):
                        duration_str = str(duration)
                else:
                    duration_str = "?"
                logger.debug(
                    "Queued move with duration %ss, queue size: %s",
                    duration_str,
                    len(self.move_queue),
                )
            else:
                logger.warning("Ignored queue_move command with invalid payload: %s", payload)
        elif command == "clear_queue":
            self.move_queue.clear()
            self.state.current_move = None
            self.state.move_start_time = None
            self._breathing_active = False
            logger.info("Cleared move queue and stopped current move")
        elif command == "set_moving_state":
            try:
                duration = float(payload)  # validate only; the activity timestamp drives idle logic
            except (TypeError, ValueError):
                logger.warning("Invalid moving state duration: %s", payload)
                return
            self.state.update_activity()
        elif command == "mark_activity":
            self.state.update_activity()
        elif command == "set_listening":
            desired_state = bool(payload)
            now = self._now()
            if now - self._last_listening_toggle_time < self._listening_debounce_s:
                return
            self._last_listening_toggle_time = now

            if self._is_listening == desired_state:
                return

            self._is_listening = desired_state
            self._last_listening_blend_time = now
            if desired_state:
                # Freeze: snapshot current commanded antennas and reset blend
                self._listening_antennas = (
                    float(self._last_commanded_pose[1][0]),
                    float(self._last_commanded_pose[1][1]),
                )
                self._antenna_unfreeze_blend = 0.0
            else:
                # Unfreeze: restart blending from the frozen pose
                self._antenna_unfreeze_blend = 0.0
                self.state.update_activity()
        else:
            logger.warning("Unknown command received by MovementManager: %s", command)

    def _publish_shared_state(self) -> None:
        """Expose idle-related state for external threads."""
        with self._shared_state_lock:
            self._shared_last_activity_time = self.state.last_activity_time
            self._shared_is_listening = self._is_listening

    def _manage_move_queue(self, current_time: float) -> None:
        """Manage the primary move queue (sequential execution)."""
        if self.state.current_move is None or (
            self.state.move_start_time is not None
            and current_time - self.state.move_start_time >= self.state.current_move.duration
        ):
            self.state.current_move = None
            self.state.move_start_time = None

            if self.move_queue:
                self.state.current_move = self.move_queue.popleft()
                self.state.move_start_time = current_time
                # Any real move cancels breathing mode; keep the flag only for BreathingMove
                self._breathing_active = isinstance(self.state.current_move, BreathingMove)
                logger.debug(f"Starting new move, duration: {self.state.current_move.duration}s")

    def _manage_breathing(self, current_time: float) -> None:
        """Manage automatic breathing when idle."""
        if (
            self.state.current_move is None
            and not self.move_queue
            and not self._is_listening
            and not self._breathing_active
        ):
            idle_for = current_time - self.state.last_activity_time
            if idle_for >= self.idle_inactivity_delay:
                try:
                    # These two calls return the latest available sensor data from the
                    # robot without performing synchronous I/O, so calling them inside
                    # the control loop is acceptable.
                    _, current_antennas = self.current_robot.get_current_joint_positions()
                    current_head_pose = self.current_robot.get_current_head_pose()

                    self._breathing_active = True
                    self.state.update_activity()

                    breathing_move = BreathingMove(
                        interpolation_start_pose=current_head_pose,
                        interpolation_start_antennas=current_antennas,
                        interpolation_duration=1.0,
                    )
                    self.move_queue.append(breathing_move)
                    logger.debug("Started breathing after %.1fs of inactivity", idle_for)
                except Exception as e:
                    self._breathing_active = False
                    logger.error("Failed to start breathing: %s", e)

        if isinstance(self.state.current_move, BreathingMove) and self.move_queue:
            self.state.current_move = None
            self.state.move_start_time = None
            self._breathing_active = False
            logger.debug("Stopping breathing due to new move activity")

        if self.state.current_move is not None and not isinstance(self.state.current_move, BreathingMove):
            self._breathing_active = False

+
def _get_primary_pose(self, current_time: float) -> FullBodyPose:
|
| 532 |
+
"""Get the primary full body pose from current move or neutral."""
|
| 533 |
+
# When a primary move is playing, sample it and cache the resulting pose
|
| 534 |
+
if self.state.current_move is not None and self.state.move_start_time is not None:
|
| 535 |
+
move_time = current_time - self.state.move_start_time
|
| 536 |
+
head, antennas, body_yaw = self.state.current_move.evaluate(move_time)
|
| 537 |
+
|
| 538 |
+
if head is None:
|
| 539 |
+
head = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
|
| 540 |
+
if antennas is None:
|
| 541 |
+
antennas = np.array([0.0, 0.0])
|
| 542 |
+
if body_yaw is None:
|
| 543 |
+
body_yaw = 0.0
|
| 544 |
+
|
| 545 |
+
antennas_tuple = (float(antennas[0]), float(antennas[1]))
|
| 546 |
+
head_copy = head.copy()
|
| 547 |
+
primary_full_body_pose = (
|
| 548 |
+
head_copy,
|
| 549 |
+
antennas_tuple,
|
| 550 |
+
float(body_yaw),
|
| 551 |
+
)
|
| 552 |
+
|
| 553 |
+
self.state.last_primary_pose = clone_full_body_pose(primary_full_body_pose)
|
| 554 |
+
# Otherwise reuse the last primary pose so we avoid jumps between moves
|
| 555 |
+
elif self.state.last_primary_pose is not None:
|
| 556 |
+
primary_full_body_pose = clone_full_body_pose(self.state.last_primary_pose)
|
| 557 |
+
else:
|
| 558 |
+
neutral_head_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
|
| 559 |
+
primary_full_body_pose = (neutral_head_pose, (0.0, 0.0), 0.0)
|
| 560 |
+
self.state.last_primary_pose = clone_full_body_pose(primary_full_body_pose)
|
| 561 |
+
|
| 562 |
+
return primary_full_body_pose
|
| 563 |
+
|
| 564 |
+
def _get_secondary_pose(self) -> FullBodyPose:
|
| 565 |
+
"""Get the secondary full body pose from speech and face tracking offsets."""
|
| 566 |
+
# Combine speech sway offsets + face tracking offsets for secondary pose
|
| 567 |
+
secondary_offsets = [
|
| 568 |
+
self.state.speech_offsets[0] + self.state.face_tracking_offsets[0],
|
| 569 |
+
self.state.speech_offsets[1] + self.state.face_tracking_offsets[1],
|
| 570 |
+
self.state.speech_offsets[2] + self.state.face_tracking_offsets[2],
|
| 571 |
+
self.state.speech_offsets[3] + self.state.face_tracking_offsets[3],
|
| 572 |
+
self.state.speech_offsets[4] + self.state.face_tracking_offsets[4],
|
| 573 |
+
self.state.speech_offsets[5] + self.state.face_tracking_offsets[5],
|
| 574 |
+
]
|
| 575 |
+
|
| 576 |
+
secondary_head_pose = create_head_pose(
|
| 577 |
+
x=secondary_offsets[0],
|
| 578 |
+
y=secondary_offsets[1],
|
| 579 |
+
z=secondary_offsets[2],
|
| 580 |
+
roll=secondary_offsets[3],
|
| 581 |
+
pitch=secondary_offsets[4],
|
| 582 |
+
yaw=secondary_offsets[5],
|
| 583 |
+
degrees=False,
|
| 584 |
+
mm=False,
|
| 585 |
+
)
|
| 586 |
+
return (secondary_head_pose, (0.0, 0.0), 0.0)
|
| 587 |
+
|
| 588 |
+
def _compose_full_body_pose(self, current_time: float) -> FullBodyPose:
|
| 589 |
+
"""Compose primary and secondary poses into a single command pose."""
|
| 590 |
+
primary = self._get_primary_pose(current_time)
|
| 591 |
+
secondary = self._get_secondary_pose()
|
| 592 |
+
return combine_full_body(primary, secondary)
|
| 593 |
+
|
| 594 |
+
def _update_primary_motion(self, current_time: float) -> None:
|
| 595 |
+
"""Advance queue state and idle behaviours for this tick."""
|
| 596 |
+
self._manage_move_queue(current_time)
|
| 597 |
+
self._manage_breathing(current_time)
|
| 598 |
+
|
    def _calculate_blended_antennas(self, target_antennas: Tuple[float, float]) -> Tuple[float, float]:
        """Blend target antennas with the listening freeze state and update blending."""
        now = self._now()
        listening = self._is_listening
        listening_antennas = self._listening_antennas
        blend = self._antenna_unfreeze_blend
        blend_duration = self._antenna_blend_duration
        last_update = self._last_listening_blend_time
        self._last_listening_blend_time = now

        if listening:
            antennas_cmd = listening_antennas
            new_blend = 0.0
        else:
            dt = max(0.0, now - last_update)
            if blend_duration <= 0:
                new_blend = 1.0
            else:
                new_blend = min(1.0, blend + dt / blend_duration)
            antennas_cmd = (
                listening_antennas[0] * (1.0 - new_blend) + target_antennas[0] * new_blend,
                listening_antennas[1] * (1.0 - new_blend) + target_antennas[1] * new_blend,
            )

        if listening:
            self._antenna_unfreeze_blend = 0.0
        else:
            self._antenna_unfreeze_blend = new_blend
            if new_blend >= 1.0:
                self._listening_antennas = (
                    float(target_antennas[0]),
                    float(target_antennas[1]),
                )

        return antennas_cmd

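The blend-back above is a time-based linear interpolation: the blend factor ramps from 0 to 1 over `_antenna_blend_duration`, moving from the frozen listening pose toward the live target. The core lerp, isolated as a hypothetical helper:

```python
def blend_antennas(frozen: tuple[float, float], target: tuple[float, float],
                   blend: float) -> tuple[float, float]:
    """Linear blend: blend=0.0 holds the frozen pose, blend=1.0 follows the target."""
    return (
        frozen[0] * (1.0 - blend) + target[0] * blend,
        frozen[1] * (1.0 - blend) + target[1] * blend,
    )
```

Driving `blend` from elapsed time (rather than per-tick steps) makes the transition duration independent of loop jitter.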
    def _issue_control_command(self, head: NDArray[np.float64], antennas: Tuple[float, float], body_yaw: float) -> None:
        """Send the fused pose to the robot with throttled error logging."""
        try:
            self.current_robot.set_target(head=head, antennas=antennas, body_yaw=body_yaw)
        except Exception as e:
            now = self._now()
            if now - self._last_set_target_err >= self._set_target_err_interval:
                msg = f"Failed to set robot target: {e}"
                if self._set_target_err_suppressed:
                    msg += f" (suppressed {self._set_target_err_suppressed} repeats)"
                    self._set_target_err_suppressed = 0
                logger.error(msg)
                self._last_set_target_err = now
            else:
                self._set_target_err_suppressed += 1
        else:
            with self._status_lock:
                self._last_commanded_pose = clone_full_body_pose((head, antennas, body_yaw))

    def _update_frequency_stats(
        self, loop_start: float, prev_loop_start: float, stats: LoopFrequencyStats,
    ) -> LoopFrequencyStats:
        """Update frequency statistics based on the current loop start time."""
        period = loop_start - prev_loop_start
        if period > 0:
            stats.last_freq = 1.0 / period
            stats.count += 1
            delta = stats.last_freq - stats.mean
            stats.mean += delta / stats.count
            stats.m2 += delta * (stats.last_freq - stats.mean)
            stats.min_freq = min(stats.min_freq, stats.last_freq)
        return stats

    def _schedule_next_tick(self, loop_start: float, stats: LoopFrequencyStats) -> Tuple[float, LoopFrequencyStats]:
        """Compute sleep time to maintain the target frequency and update potential freq."""
        computation_time = self._now() - loop_start
        stats.potential_freq = 1.0 / computation_time if computation_time > 0 else float("inf")
        sleep_time = max(0.0, self.target_period - computation_time)
        return sleep_time, stats

|
| 675 |
+
def _record_frequency_snapshot(self, stats: LoopFrequencyStats) -> None:
|
| 676 |
+
"""Store a thread-safe snapshot of current frequency statistics."""
|
| 677 |
+
with self._status_lock:
|
| 678 |
+
self._freq_snapshot = LoopFrequencyStats(
|
| 679 |
+
mean=stats.mean,
|
| 680 |
+
m2=stats.m2,
|
| 681 |
+
min_freq=stats.min_freq,
|
| 682 |
+
count=stats.count,
|
| 683 |
+
last_freq=stats.last_freq,
|
| 684 |
+
potential_freq=stats.potential_freq,
|
| 685 |
+
)
|
| 686 |
+
|
| 687 |
+
def _maybe_log_frequency(self, loop_count: int, print_interval_loops: int, stats: LoopFrequencyStats) -> None:
|
| 688 |
+
"""Emit frequency telemetry when enough loops have elapsed."""
|
| 689 |
+
if loop_count % print_interval_loops != 0 or stats.count == 0:
|
| 690 |
+
return
|
| 691 |
+
|
| 692 |
+
variance = stats.m2 / stats.count if stats.count > 0 else 0.0
|
| 693 |
+
lowest = stats.min_freq if stats.min_freq != float("inf") else 0.0
|
| 694 |
+
logger.debug(
|
| 695 |
+
"Loop freq - avg: %.2fHz, variance: %.4f, min: %.2fHz, last: %.2fHz, potential: %.2fHz, target: %.1fHz",
|
| 696 |
+
stats.mean,
|
| 697 |
+
variance,
|
| 698 |
+
lowest,
|
| 699 |
+
stats.last_freq,
|
| 700 |
+
stats.potential_freq,
|
| 701 |
+
self.target_frequency,
|
| 702 |
+
)
|
| 703 |
+
stats.reset()
|
| 704 |
+
|
| 705 |
+
def _update_face_tracking(self, current_time: float) -> None:
|
| 706 |
+
"""Get face tracking offsets from camera worker thread."""
|
| 707 |
+
if self.camera_worker is not None:
|
| 708 |
+
# Get face tracking offsets from camera worker thread
|
| 709 |
+
offsets = self.camera_worker.get_face_tracking_offsets()
|
| 710 |
+
self.state.face_tracking_offsets = offsets
|
| 711 |
+
else:
|
| 712 |
+
# No camera worker, use neutral offsets
|
| 713 |
+
self.state.face_tracking_offsets = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
|
| 714 |
+
|
| 715 |
+
def start(self) -> None:
|
| 716 |
+
"""Start the worker thread that drives the 100 Hz control loop."""
|
| 717 |
+
if self._thread is not None and self._thread.is_alive():
|
| 718 |
+
logger.warning("Move worker already running; start() ignored")
|
| 719 |
+
return
|
| 720 |
+
self._stop_event.clear()
|
| 721 |
+
self._thread = threading.Thread(target=self.working_loop, daemon=True)
|
| 722 |
+
self._thread.start()
|
| 723 |
+
logger.debug("Move worker started")
|
| 724 |
+
|
| 725 |
+
def stop(self) -> None:
|
| 726 |
+
"""Request the worker thread to stop and wait for it to exit.
|
| 727 |
+
|
| 728 |
+
Before stopping, resets the robot to a neutral position.
|
| 729 |
+
"""
|
| 730 |
+
if self._thread is None or not self._thread.is_alive():
|
| 731 |
+
logger.debug("Move worker not running; stop() ignored")
|
| 732 |
+
return
|
| 733 |
+
|
| 734 |
+
logger.info("Stopping movement manager and resetting to neutral position...")
|
| 735 |
+
|
| 736 |
+
# Clear any queued moves and stop current move
|
| 737 |
+
self.clear_move_queue()
|
| 738 |
+
|
| 739 |
+
# Stop the worker thread first so it doesn't interfere
|
| 740 |
+
self._stop_event.set()
|
| 741 |
+
if self._thread is not None:
|
| 742 |
+
self._thread.join()
|
| 743 |
+
self._thread = None
|
| 744 |
+
logger.debug("Move worker stopped")
|
| 745 |
+
|
| 746 |
+
# Reset to neutral position using goto_target (same approach as wake_up)
|
| 747 |
+
try:
|
| 748 |
+
neutral_head_pose = create_head_pose(0, 0, 0, 0, 0, 0, degrees=True)
|
| 749 |
+
neutral_antennas = [0.0, 0.0]
|
| 750 |
+
neutral_body_yaw = 0.0
|
| 751 |
+
|
| 752 |
+
# Use goto_target directly on the robot
|
| 753 |
+
self.current_robot.goto_target(
|
| 754 |
+
head=neutral_head_pose,
|
| 755 |
+
antennas=neutral_antennas,
|
| 756 |
+
duration=2.0,
|
| 757 |
+
body_yaw=neutral_body_yaw,
|
| 758 |
+
)
|
| 759 |
+
|
| 760 |
+
logger.info("Reset to neutral position completed")
|
| 761 |
+
|
| 762 |
+
except Exception as e:
|
| 763 |
+
logger.error(f"Failed to reset to neutral position: {e}")
|
| 764 |
+
|
| 765 |
+
def get_status(self) -> Dict[str, Any]:
|
| 766 |
+
"""Return a lightweight status snapshot for observability."""
|
| 767 |
+
with self._status_lock:
|
| 768 |
+
pose_snapshot = clone_full_body_pose(self._last_commanded_pose)
|
| 769 |
+
freq_snapshot = LoopFrequencyStats(
|
| 770 |
+
mean=self._freq_snapshot.mean,
|
| 771 |
+
m2=self._freq_snapshot.m2,
|
| 772 |
+
min_freq=self._freq_snapshot.min_freq,
|
| 773 |
+
count=self._freq_snapshot.count,
|
| 774 |
+
last_freq=self._freq_snapshot.last_freq,
|
| 775 |
+
potential_freq=self._freq_snapshot.potential_freq,
|
| 776 |
+
)
|
| 777 |
+
|
| 778 |
+
head_matrix = pose_snapshot[0].tolist() if pose_snapshot else None
|
| 779 |
+
antennas = pose_snapshot[1] if pose_snapshot else None
|
| 780 |
+
body_yaw = pose_snapshot[2] if pose_snapshot else None
|
| 781 |
+
|
| 782 |
+
return {
|
| 783 |
+
"queue_size": len(self.move_queue),
|
| 784 |
+
"is_listening": self._is_listening,
|
| 785 |
+
"breathing_active": self._breathing_active,
|
| 786 |
+
"last_commanded_pose": {
|
| 787 |
+
"head": head_matrix,
|
| 788 |
+
"antennas": antennas,
|
| 789 |
+
"body_yaw": body_yaw,
|
| 790 |
+
},
|
| 791 |
+
"loop_frequency": {
|
| 792 |
+
"last": freq_snapshot.last_freq,
|
| 793 |
+
"mean": freq_snapshot.mean,
|
| 794 |
+
"min": freq_snapshot.min_freq,
|
| 795 |
+
"potential": freq_snapshot.potential_freq,
|
| 796 |
+
"samples": freq_snapshot.count,
|
| 797 |
+
},
|
| 798 |
+
}
|
| 799 |
+
|
| 800 |
+
def working_loop(self) -> None:
|
| 801 |
+
"""Control loop main movements - reproduces main_works.py control architecture.
|
| 802 |
+
|
| 803 |
+
Single set_target() call with pose fusion.
|
| 804 |
+
"""
|
| 805 |
+
logger.debug("Starting enhanced movement control loop (100Hz)")
|
| 806 |
+
|
| 807 |
+
loop_count = 0
|
| 808 |
+
prev_loop_start = self._now()
|
| 809 |
+
print_interval_loops = max(1, int(self.target_frequency * 2))
|
| 810 |
+
freq_stats = self._freq_stats
|
| 811 |
+
|
| 812 |
+
while not self._stop_event.is_set():
|
| 813 |
+
loop_start = self._now()
|
| 814 |
+
loop_count += 1
|
| 815 |
+
|
| 816 |
+
if loop_count > 1:
|
| 817 |
+
freq_stats = self._update_frequency_stats(loop_start, prev_loop_start, freq_stats)
|
| 818 |
+
prev_loop_start = loop_start
|
| 819 |
+
|
| 820 |
+
# 1) Poll external commands and apply pending offsets (atomic snapshot)
|
| 821 |
+
self._poll_signals(loop_start)
|
| 822 |
+
|
| 823 |
+
# 2) Manage the primary move queue (start new move, end finished move, breathing)
|
| 824 |
+
self._update_primary_motion(loop_start)
|
| 825 |
+
|
| 826 |
+
# 3) Update vision-based secondary offsets
|
| 827 |
+
self._update_face_tracking(loop_start)
|
| 828 |
+
|
| 829 |
+
# 4) Build primary and secondary full-body poses, then fuse them
|
| 830 |
+
head, antennas, body_yaw = self._compose_full_body_pose(loop_start)
|
| 831 |
+
|
| 832 |
+
# 5) Apply listening antenna freeze or blend-back
|
| 833 |
+
antennas_cmd = self._calculate_blended_antennas(antennas)
|
| 834 |
+
|
| 835 |
+
# 6) Single set_target call - the only control point
|
| 836 |
+
self._issue_control_command(head, antennas_cmd, body_yaw)
|
| 837 |
+
|
| 838 |
+
# 7) Adaptive sleep to align to next tick, then publish shared state
|
| 839 |
+
sleep_time, freq_stats = self._schedule_next_tick(loop_start, freq_stats)
|
| 840 |
+
self._publish_shared_state()
|
| 841 |
+
self._record_frequency_snapshot(freq_stats)
|
| 842 |
+
|
| 843 |
+
# 8) Periodic telemetry on loop frequency
|
| 844 |
+
self._maybe_log_frequency(loop_count, print_interval_loops, freq_stats)
|
| 845 |
+
|
| 846 |
+
if sleep_time > 0:
|
| 847 |
+
time.sleep(sleep_time)
|
| 848 |
+
|
| 849 |
+
logger.debug("Movement control loop stopped")
|
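The pacing in step 7 subtracts each tick's computation time from the target period before sleeping, so the steady-state rate tracks the target instead of target-period-plus-overhead. The scheduling arithmetic, isolated:

```python
import time

TARGET_HZ = 100.0
PERIOD = 1.0 / TARGET_HZ

def sleep_for_next_tick(loop_start: float, now=time.monotonic) -> float:
    """Return how long to sleep so the whole tick occupies one period."""
    computation_time = now() - loop_start
    return max(0.0, PERIOD - computation_time)

start = time.monotonic()
# ... tick work would happen here ...
delay = sleep_for_next_tick(start)   # close to PERIOD when work is cheap
late = sleep_for_next_tick(time.monotonic() - 1.0)  # overrun: no sleep at all
```

When a tick overruns, the sleep clamps to zero rather than going negative; the loop simply runs the next tick immediately, which is why the telemetry tracks the minimum observed frequency.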
pyproject.toml
ADDED
|
@@ -0,0 +1,49 @@
+[build-system]
+requires = ["setuptools"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "moltbot-body"
+version = "0.1.0"
+description = "Moltbot's physical body - Reachy Mini integration with Clawdbot"
+readme = "README.md"
+requires-python = ">=3.12"
+dependencies = [
+    # Reachy Mini SDK
+    "reachy-mini>=1.2.13",
+    "reachy_mini_dances_library",
+    "reachy_mini_toolbox",
+
+    # Audio
+    "numpy",
+    "scipy",
+    "soundfile",
+
+    # Whisper STT (faster-whisper uses CTranslate2, no numba dependency)
+    "faster-whisper",
+
+    # HTTP client for Clawdbot gateway
+    "httpx",
+    "httpx-sse>=0.4.0",
+
+    # WebSocket for streaming TTS
+    "websockets>=12.0",
+
+    # Environment
+    "python-dotenv",
+]
+
+[project.optional-dependencies]
+dev = [
+    "pytest",
+    "ruff",
+]
+
+[project.scripts]
+moltbot-body = "moltbot_body.main:main"
+
+[project.entry-points."reachy_mini_apps"]
+moltbot-body = "moltbot_body.main:MoltbotBody"
+
+[tool.setuptools.packages.find]
+where = ["."]
style.css ADDED
@@ -0,0 +1,395 @@
+:root {
+  --bg: #060c1d;
+  --panel: #0c172b;
+  --glass: rgba(17, 27, 48, 0.7);
+  --card: rgba(255, 255, 255, 0.04);
+  --accent: #7af5c4;
+  --accent-2: #f6c452;
+  --text: #e8edf7;
+  --muted: #9fb3ce;
+  --border: rgba(255, 255, 255, 0.08);
+  --shadow: 0 25px 70px rgba(0, 0, 0, 0.45);
+  font-family: "Space Grotesk", "Manrope", system-ui, -apple-system, sans-serif;
+}
+
+* {
+  margin: 0;
+  padding: 0;
+  box-sizing: border-box;
+}
+
+body {
+  background: radial-gradient(circle at 20% 20%, rgba(122, 245, 196, 0.12), transparent 30%),
+    radial-gradient(circle at 80% 0%, rgba(246, 196, 82, 0.14), transparent 32%),
+    radial-gradient(circle at 50% 70%, rgba(124, 142, 255, 0.1), transparent 30%),
+    var(--bg);
+  color: var(--text);
+  min-height: 100vh;
+  line-height: 1.6;
+  padding-bottom: 3rem;
+}
+
+a {
+  color: inherit;
+  text-decoration: none;
+}
+
+.hero {
+  padding: 3.5rem clamp(1.5rem, 3vw, 3rem) 2.5rem;
+  position: relative;
+  overflow: hidden;
+}
+
+.hero::after {
+  content: "";
+  position: absolute;
+  inset: 0;
+  background: linear-gradient(120deg, rgba(122, 245, 196, 0.12), rgba(246, 196, 82, 0.08), transparent);
+  pointer-events: none;
+}
+
+.topline {
+  display: flex;
+  align-items: center;
+  justify-content: space-between;
+  max-width: 1200px;
+  margin: 0 auto 2rem;
+  position: relative;
+  z-index: 2;
+}
+
+.brand {
+  display: flex;
+  align-items: center;
+  gap: 0.5rem;
+  font-weight: 700;
+  letter-spacing: 0.5px;
+  color: var(--text);
+}
+
+.logo {
+  display: inline-flex;
+  align-items: center;
+  justify-content: center;
+  width: 2.2rem;
+  height: 2.2rem;
+  border-radius: 10px;
+  background: linear-gradient(145deg, rgba(122, 245, 196, 0.15), rgba(124, 142, 255, 0.15));
+  box-shadow: 0 10px 30px rgba(0, 0, 0, 0.25);
+}
+
+.brand-name {
+  font-size: 1.1rem;
+}
+
+.pill {
+  background: rgba(255, 255, 255, 0.06);
+  border: 1px solid var(--border);
+  padding: 0.6rem 1rem;
+  border-radius: 999px;
+  color: var(--muted);
+  font-size: 0.9rem;
+  box-shadow: 0 12px 30px rgba(0, 0, 0, 0.2);
+}
+
+.hero-grid {
+  display: grid;
+  grid-template-columns: repeat(auto-fit, minmax(320px, 1fr));
+  gap: clamp(1.5rem, 2.5vw, 2.5rem);
+  max-width: 1200px;
+  margin: 0 auto;
+  position: relative;
+  z-index: 2;
+  align-items: center;
+}
+
+.hero-copy h1 {
+  font-size: clamp(2.6rem, 4vw, 3.6rem);
+  margin-bottom: 1rem;
+  line-height: 1.1;
+  letter-spacing: -0.5px;
+}
+
+.eyebrow {
+  display: inline-flex;
+  align-items: center;
+  gap: 0.5rem;
+  text-transform: uppercase;
+  letter-spacing: 1px;
+  font-size: 0.8rem;
+  color: var(--muted);
+  margin-bottom: 0.75rem;
+}
+
+.eyebrow::before {
+  content: "";
+  display: inline-block;
+  width: 24px;
+  height: 2px;
+  background: linear-gradient(90deg, var(--accent), var(--accent-2));
+  border-radius: 999px;
+}
+
+.lede {
+  font-size: 1.1rem;
+  color: var(--muted);
+  max-width: 620px;
+}
+
+.hero-actions {
+  display: flex;
+  gap: 1rem;
+  align-items: center;
+  margin: 1.6rem 0 1.2rem;
+  flex-wrap: wrap;
+}
+
+.btn {
+  display: inline-flex;
+  align-items: center;
+  justify-content: center;
+  gap: 0.6rem;
+  padding: 0.85rem 1.4rem;
+  border-radius: 12px;
+  font-weight: 700;
+  border: 1px solid transparent;
+  cursor: pointer;
+  transition: transform 0.2s ease, box-shadow 0.2s ease, background 0.2s ease, border-color 0.2s ease;
+}
+
+.btn.primary {
+  background: linear-gradient(135deg, #7af5c4, #7c8eff);
+  color: #0a0f1f;
+  box-shadow: 0 15px 30px rgba(122, 245, 196, 0.25);
+}
+
+.btn.primary:hover {
+  transform: translateY(-2px);
+  box-shadow: 0 25px 45px rgba(122, 245, 196, 0.35);
+}
+
+.btn.ghost {
+  background: rgba(255, 255, 255, 0.05);
+  border-color: var(--border);
+  color: var(--text);
+}
+
+.btn.ghost:hover {
+  border-color: rgba(255, 255, 255, 0.3);
+  transform: translateY(-2px);
+}
+
+.btn.wide {
+  width: 100%;
+  justify-content: center;
+}
+
+.hero-badges {
+  display: flex;
+  flex-wrap: wrap;
+  gap: 0.6rem;
+  color: var(--muted);
+  font-size: 0.9rem;
+}
+
+.hero-badges span {
+  padding: 0.5rem 0.8rem;
+  border-radius: 10px;
+  border: 1px solid var(--border);
+  background: rgba(255, 255, 255, 0.04);
+}
+
+.hero-visual .glass-card {
+  background: rgba(255, 255, 255, 0.03);
+  border: 1px solid var(--border);
+  border-radius: 18px;
+  padding: 1.2rem;
+  box-shadow: var(--shadow);
+  backdrop-filter: blur(10px);
+}
+
+.architecture-preview {
+  background: rgba(0, 0, 0, 0.3);
+  border-radius: 14px;
+  border: 1px solid var(--border);
+  padding: 1.5rem;
+  overflow-x: auto;
+}
+
+.architecture-preview pre {
+  font-family: "SF Mono", "Fira Code", "Consolas", monospace;
+  font-size: 0.85rem;
+  color: var(--accent);
+  white-space: pre;
+  margin: 0;
+  line-height: 1.5;
+}
+
+.caption {
+  margin-top: 0.75rem;
+  color: var(--muted);
+  font-size: 0.95rem;
+}
+
+.section {
+  max-width: 1200px;
+  margin: 0 auto;
+  padding: clamp(2rem, 4vw, 3.5rem) clamp(1.5rem, 3vw, 3rem);
+}
+
+.section-header {
+  text-align: center;
+  max-width: 780px;
+  margin: 0 auto 2rem;
+}
+
+.section-header h2 {
+  font-size: clamp(2rem, 3vw, 2.6rem);
+  margin-bottom: 0.5rem;
+}
+
+.intro {
+  color: var(--muted);
+  font-size: 1.05rem;
+}
+
+.feature-grid {
+  display: grid;
+  grid-template-columns: repeat(auto-fit, minmax(240px, 1fr));
+  gap: 1rem;
+}
+
+.feature-card {
+  background: rgba(255, 255, 255, 0.03);
+  border: 1px solid var(--border);
+  border-radius: 16px;
+  padding: 1.25rem;
+  box-shadow: 0 10px 30px rgba(0, 0, 0, 0.2);
+  transition: transform 0.2s ease, border-color 0.2s ease, box-shadow 0.2s ease;
+}
+
+.feature-card:hover {
+  transform: translateY(-4px);
+  border-color: rgba(122, 245, 196, 0.3);
+  box-shadow: 0 18px 40px rgba(0, 0, 0, 0.3);
+}
+
+.feature-card .icon {
+  width: 48px;
+  height: 48px;
+  border-radius: 12px;
+  display: grid;
+  place-items: center;
+  background: rgba(122, 245, 196, 0.14);
+  margin-bottom: 0.8rem;
+  font-size: 1.4rem;
+}
+
+.feature-card h3 {
+  margin-bottom: 0.35rem;
+}
+
+.feature-card p {
+  color: var(--muted);
+}
+
+.story {
+  padding-top: 1rem;
+}
+
+.story-grid {
+  display: grid;
+  grid-template-columns: repeat(auto-fit, minmax(280px, 1fr));
+  gap: 1rem;
+}
+
+.story-card {
+  background: rgba(255, 255, 255, 0.03);
+  border: 1px solid var(--border);
+  border-radius: 18px;
+  padding: 1.5rem;
+  box-shadow: var(--shadow);
+}
+
+.story-card.secondary {
+  background: linear-gradient(145deg, rgba(124, 142, 255, 0.08), rgba(122, 245, 196, 0.06));
+}
+
+.story-card h3 {
+  margin-bottom: 0.8rem;
+}
+
+.story-list {
+  list-style: none;
+  display: grid;
+  gap: 0.7rem;
+  color: var(--muted);
+  font-size: 0.98rem;
+}
+
+.story-list li {
+  display: flex;
+  gap: 0.7rem;
+  align-items: flex-start;
+}
+
+.story-text {
+  color: var(--muted);
+  line-height: 1.7;
+  margin-bottom: 1rem;
+}
+
+.chips {
+  display: flex;
+  flex-wrap: wrap;
+  gap: 0.5rem;
+}
+
+.chip {
+  padding: 0.45rem 0.8rem;
+  border-radius: 12px;
+  background: rgba(0, 0, 0, 0.2);
+  border: 1px solid var(--border);
+  color: var(--text);
+  font-size: 0.9rem;
+}
+
+.footer {
+  text-align: center;
+  color: var(--muted);
+  padding: 2rem 1.5rem 0;
+}
+
+.footer a {
+  color: var(--text);
+  border-bottom: 1px solid transparent;
+}
+
+.footer a:hover {
+  border-color: rgba(255, 255, 255, 0.5);
+}
+
+@media (max-width: 768px) {
+  .hero {
+    padding-top: 2.5rem;
+  }
+
+  .topline {
+    flex-direction: column;
+    gap: 0.8rem;
+    align-items: flex-start;
+  }
+
+  .hero-actions {
+    width: 100%;
+  }
+
+  .btn {
+    width: 100%;
+    justify-content: center;
+  }
+
+  .hero-badges {
+    gap: 0.4rem;
+  }
+}
uv.lock ADDED
The diff for this file is too large to render. See raw diff.