Is it possible to switch from ElevenLabs to OpenAI TTS in Interactive Avatar sessions?

Hi HeyGen team and community,

I'm currently using the HeyGen Interactive Avatar SDK (@heygen/streaming-avatar v2.0.13) and would like to use OpenAI's TTS model (specifically gpt-4o-mini-tts) instead of ElevenLabs for text-to-speech synthesis.

Current Setup:

  • Using the Next.js demo from the official repository
  • Currently configured with ElevenLabs Model (eleven_flash_v2_5)
  • Configuration in StartAvatarRequest:
    voice: {
      rate: 1.5,
      emotion: VoiceEmotion.EXCITED,
      model: ElevenLabsModel.eleven_flash_v2_5,
    }
    

What I've Tried:

  1. Reviewed the SDK TypeScript definitions - only the ElevenLabsModel enum is available
  2. Checked the v2/voices API endpoint - no provider information is exposed in the response
  3. Searched documentation for OpenAI TTS integration (found info about OpenAI Assistant
    API integration, but not TTS provider switching)
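
For reference, the check in step 2 amounted to something like the sketch below. It scans voice entries for any field whose name hints at a TTS provider. The helper name `findProviderLikeKeys`, the keyword pattern, and the sample entries are all illustrative assumptions on my part, not the actual response schema; in practice the array would come from the parsed v2/voices response.

```typescript
// Hypothetical helper: collect field names that look like they might
// identify a TTS provider. The keyword pattern is a guess, not an
// official list of field names.
type VoiceEntry = Record<string, unknown>;

function findProviderLikeKeys(voices: VoiceEntry[]): string[] {
  const pattern = /provider|vendor|engine|tts/i;
  const hits = new Set<string>();
  for (const voice of voices) {
    for (const key of Object.keys(voice)) {
      if (pattern.test(key)) {
        hits.add(key);
      }
    }
  }
  // Return a sorted, de-duplicated list of matching field names.
  return Array.from(hits).sort();
}

// Illustrative entries only; the real field names may differ.
const sampleVoices: VoiceEntry[] = [
  { voice_id: "voice-a", name: "Sample Voice A", language: "English" },
  { voice_id: "voice-b", name: "Sample Voice B", language: "Spanish" },
];

console.log(findProviderLikeKeys(sampleVoices));
```

An empty result here is what I mean by "no provider information is exposed": nothing in the entries tells me which TTS backend a given voice uses.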

Questions:

  1. Is it possible to specify OpenAI as the TTS provider when starting an Interactive
    Avatar session?
  2. Are there any limitations or considerations when using OpenAI TTS compared to ElevenLabs?

Any guidance would be greatly appreciated! Thanks in advance.