A plugin-based text-to-speech (TTS) system for ROS 2.

speak_ros provides a flexible TTS framework built on:
- Action-based interface - Asynchronous speech synthesis with progress feedback and cancellation support
- Plugin architecture - Switchable TTS engines via pluginlib without code changes
- Streaming support - Chunked audio synthesis and playback for low-latency response
- PortAudio integration - Cross-platform audio playback
The core package loads TTS plugins dynamically and manages the synthesis pipeline, from text input through audio generation to speaker output.
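To make the pipeline concrete, here is a minimal, ROS-free sketch in Python of the flow described above: a plugin is looked up by name, produces audio in chunks, and each chunk is handed to playback while progress is reported. This is an illustration only and not the speak_ros API. The real plugins are loaded through pluginlib, and every name below (`PLUGINS`, `register`, `speak`, `demo::EchoPlugin`) is hypothetical.

```python
from typing import Callable, Dict, Iterator, List, Optional

# Hypothetical plugin signature: text in, audio chunks out.
SynthFn = Callable[[str], Iterator[bytes]]

# Hypothetical registry standing in for dynamic plugin loading.
PLUGINS: Dict[str, SynthFn] = {}


def register(name: str) -> Callable[[SynthFn], SynthFn]:
    """Register a synthesis function under a plugin name."""
    def wrap(fn: SynthFn) -> SynthFn:
        PLUGINS[name] = fn
        return fn
    return wrap


@register("demo::EchoPlugin")
def echo_plugin(text: str) -> Iterator[bytes]:
    """Stand-in engine: yields one fake 'audio' chunk per word."""
    for word in text.split():
        yield word.encode("utf-8")


def speak(
    text: str,
    plugin: str,
    play: Callable[[bytes], None],
    on_feedback: Optional[Callable[[int], None]] = None,
) -> int:
    """Stream chunks from the selected plugin to the playback callback.

    Each chunk is played as soon as it is produced, so playback can
    start before synthesis of the whole text has finished.
    Returns the number of chunks played.
    """
    synth = PLUGINS[plugin]  # switching engines only changes this lookup key
    count = 0
    for chunk in synth(text):
        play(chunk)
        count += 1
        if on_feedback is not None:
            on_feedback(count)
    return count


# Usage: collect "played" chunks and progress updates in lists.
played: List[bytes] = []
progress: List[int] = []
total = speak("hello streaming world", "demo::EchoPlugin",
              played.append, progress.append)
```

Because `speak` consumes a generator, the caller never waits for the full utterance, which is the idea behind chunked synthesis for low latency. In the real system, playback would go to PortAudio rather than a list.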
| Package | Description |
|---|---|
| speak_ros | Core TTS action server implementation |
| speak_ros_interfaces | Custom message, service, and action definitions |
| speak_ros_aivis_plugin | Aivis Speech Engine TTS plugin (Japanese) |
| speak_ros_open_jtalk_plugin | Open JTalk TTS plugin (Japanese) |
| speak_ros_openai_tts_plugin | OpenAI TTS API plugin |
| speak_ros_voicevox_plugin | VOICEVOX TTS plugin (Japanese) |
Supported TTS engines:

- Aivis Speech Engine - VOICEVOX-compatible Japanese synthesis with emotion control
- Open JTalk - Offline Japanese speech synthesis
- OpenAI TTS - High-quality cloud-based synthesis via OpenAI API
- VOICEVOX - Japanese synthesis with diverse character voices

Requirements:

- ROS 2 (Humble or later)
- PortAudio development libraries

cd <your_ros2_workspace>
rosdep install --from-paths src --ignore-src -r -y
colcon build --packages-up-to speak_ros speak_ros_aivis_plugin speak_ros_open_jtalk_plugin speak_ros_openai_tts_plugin speak_ros_voicevox_plugin
source install/setup.bash

Start the TTS action server with the default plugin (Open JTalk):

ros2 run speak_ros speak_ros_node

To use a different plugin, specify the plugin parameter:

# OpenAI TTS
ros2 run speak_ros speak_ros_node --ros-args -p plugin:=speak_ros_openai_tts_plugin::OpenAITTSPlugin
# Aivis Speech Engine
ros2 run speak_ros speak_ros_node --ros-args -p plugin:=aivis_plugin::AivisPlugin
# VOICEVOX
ros2 run speak_ros speak_ros_node --ros-args -p plugin:=voicevox_plugin::VoiceVoxPlugin

Use the provided test client to send text for synthesis:

# In another terminal
ros2 run speak_ros test_client

The test client sends a sample text to the /speak action server and displays feedback during synthesis and playback.

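When scripting these steps instead of typing them, quoting the goal by hand gets fragile as soon as the text contains an apostrophe. The sketch below builds the argument lists for starting the server with a chosen plugin and for sending a goal to /speak with `ros2 action send_goal`. It assumes the goal has a `text` field and uses the `speak_ros_interfaces/action/Speak` type, and it relies on JSON being valid YAML for the goal string. The helper names `server_command` and `goal_command` are hypothetical and not part of speak_ros.

```python
import json
import shlex
from typing import List, Optional


def server_command(plugin: Optional[str] = None) -> List[str]:
    """Build the argv for starting the TTS action server.

    With plugin=None the node starts with its default plugin;
    otherwise the 'plugin' parameter is set on the command line.
    """
    cmd = ["ros2", "run", "speak_ros", "speak_ros_node"]
    if plugin is not None:
        cmd += ["--ros-args", "-p", f"plugin:={plugin}"]
    return cmd


def goal_command(text: str, feedback: bool = True) -> List[str]:
    """Build the argv for sending one goal to the /speak action.

    json.dumps handles quotes and non-ASCII text, and the result is
    also valid YAML, which is what the ros2 CLI parses.
    """
    cmd = ["ros2", "action", "send_goal"]
    if feedback:
        cmd.append("--feedback")  # print feedback messages while the goal runs
    cmd += ["/speak", "speak_ros_interfaces/action/Speak",
            json.dumps({"text": text})]
    return cmd


if __name__ == "__main__":
    # Print shell-safe command lines; pass the lists to subprocess.run to execute.
    print(shlex.join(server_command("voicevox_plugin::VoiceVoxPlugin")))
    print(shlex.join(goal_command("It's working")))
```

Passing the lists directly to `subprocess.run` avoids a shell entirely, so no extra escaping is needed beyond what `json.dumps` already does.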
To see all installed TTS plugins:
ros2 run speak_ros list_plugins

The /speak action accepts text and optional parameter overrides:

ros2 action send_goal /speak speak_ros_interfaces/action/Speak "{text: 'Hello world'}"

License: Apache 2.0