Is it possible to switch from ElevenLabs to OpenAI TTS in Interactive Avatar sessions?
5 months ago by Junya Ishihara
Hi HeyGen team and community,
I'm currently using the HeyGen Interactive Avatar SDK (@heygen/streaming-avatar v2.0.13) and would like to use OpenAI's TTS model (specifically gpt-4o-mini-tts) instead of ElevenLabs for text-to-speech synthesis.
Current Setup:
- Using the Next.js demo from the official repository
- Currently configured with ElevenLabs Model (eleven_flash_v2_5)
- Configuration in StartAvatarRequest:
  voice: {
    rate: 1.5,
    emotion: VoiceEmotion.EXCITED,
    model: ElevenLabsModel.eleven_flash_v2_5,
  }
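For readers without the SDK at hand, the voice settings above can be reproduced as a self-contained sketch. This is not the SDK's actual code: the two enums are local stand-ins that include only the members named in this post, their string values are assumed, and `VoiceSettingSketch`, `SUPPORTED_MODELS`, and `isSupportedModel` are hypothetical names introduced for illustration. It shows why a non-ElevenLabs model name has no place to go when `model` is typed by a single-provider enum.

```typescript
// Local stand-ins for the SDK enums. Assumption: only the members named in
// this post are reproduced, and the string values are illustrative.
enum VoiceEmotion {
  EXCITED = "excited",
}

enum ElevenLabsModel {
  eleven_flash_v2_5 = "eleven_flash_v2_5",
}

// Hypothetical shape of the voice settings, limited to the fields used above.
interface VoiceSettingSketch {
  rate?: number;
  emotion?: VoiceEmotion;
  model?: ElevenLabsModel; // only ElevenLabs models are expressible here
}

// The configuration from the post, expressed against the stand-in types.
const voice: VoiceSettingSketch = {
  rate: 1.5,
  emotion: VoiceEmotion.EXCITED,
  model: ElevenLabsModel.eleven_flash_v2_5,
};

// Hypothetical helper: the model names the stand-in enum can represent.
const SUPPORTED_MODELS: string[] = [ElevenLabsModel.eleven_flash_v2_5];

// Returns true only when the name is one of the stand-in enum's values.
function isSupportedModel(name: string): boolean {
  return SUPPORTED_MODELS.indexOf(name) !== -1;
}
```

Under these assumptions, `isSupportedModel("gpt-4o-mini-tts")` returns `false`, and assigning that string to `voice.model` would be rejected by the type checker, which matches the observation that only the ElevenLabs enum is exposed.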
What I've Tried:
- Reviewed the SDK TypeScript definitions - only the ElevenLabsModel enum is available
- Checked the v2/voices API endpoint - no provider information is exposed in the response
- Searched the documentation for OpenAI TTS integration (found info about OpenAI Assistant API integration, but not TTS provider switching)
Questions:
- Is it possible to specify OpenAI as the TTS provider when starting an Interactive Avatar session?
- Are there any limitations or considerations when using OpenAI TTS compared to ElevenLabs?
Any guidance would be greatly appreciated! Thanks in advance.