Tutorial · ElevenLabs · TTS · Jun 10, 2026 · 8 minute read

ElevenLabs Text to Speech — Complete Guide for Creators

Everything you need to know about ElevenLabs on Sonna: Eleven v3, Multilingual v2, Flash v2.5 — which model to pick, credit costs, and real-world use cases.


ElevenLabs is one of the world's leading text-to-speech (TTS) providers, and Sonna integrates three of their best models directly into the platform — without requiring your own ElevenLabs account. Simply select a voice, type your text, and generate studio-quality audio in seconds.

This guide explains the three ElevenLabs models available on Sonna, the differences between them, and when to use each one.


ElevenLabs Models on Sonna

Sonna provides three ElevenLabs models:

Model | Model ID | Languages | Character Limit | Cost per Character
Eleven v3 | eleven-v3 | 70+ | 5,000 | 2.10 credits
Multilingual v2 | eleven-multilingual-v2 | 29 | 10,000 | 2.10 credits
Flash v2.5 | eleven-flash-v2-5 | 32 | 40,000 | 1.05 credits

All ElevenLabs models require a Pro/Max plan or PAYG credits. Free users can only use Google Cloud TTS (Neural2 and WaveNet).

If you use the Sonna Developer API, all ElevenLabs models automatically receive a 10% discount.
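
To make the pricing concrete, here is a minimal Python sketch that estimates credit usage from the per-character rates in the table. The function name estimate_credits is hypothetical, and the sketch assumes the 10% Developer API discount is applied to the total and that no rounding takes place; check your Sonna Console for the actual billing rules.

```python
from decimal import Decimal

# Per-character rates copied from the model table (credits per character).
RATES = {
    "eleven-v3": Decimal("2.10"),
    "eleven-multilingual-v2": Decimal("2.10"),
    "eleven-flash-v2-5": Decimal("1.05"),
}

# Assumption: the 10% Developer API discount is applied to the total cost.
API_DISCOUNT = Decimal("0.10")


def estimate_credits(model_id: str, characters: int, via_api: bool = False) -> Decimal:
    """Return the estimated credit cost for a given number of characters."""
    if characters < 0:
        raise ValueError("characters must be non-negative")
    total = RATES[model_id] * characters
    if via_api:
        total *= 1 - API_DISCOUNT
    return total
```

Under these assumptions, estimate_credits("eleven-flash-v2-5", 100_000, via_api=True) evaluates to 94,500 credits, compared with 105,000 without the discount. Decimal is used instead of float so that rates such as 2.10 stay exact.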


Eleven v3 — The Most Expressive Model

Eleven v3 is the latest and most advanced ElevenLabs model. Released to general availability in March 2026, it brings unprecedented expressiveness to TTS.

Key Advantages of Eleven v3

Audio Tags — a unique Eleven v3 feature that allows you to control emotions and styles directly within the text:

[excited] Welcome to Sonna! [whispers] Your audio generation is ready.
[sighs] This is not easy... [confidently] But we can do it.
[laughs] Absolutely incredible! [slowly] Let us start from the beginning.

Available tags include: [excited], [whispers], [sighs], [laughs], [slowly], [angry], [sad], [surprised], and many more.
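
If you assemble scripts programmatically, a small helper can catch misspelled tags before you spend credits. The sketch below is only an illustration: tagged and unknown_tags are hypothetical helper names, and KNOWN_TAGS contains just the tags mentioned in this guide, so a tag reported as unknown may still be valid.

```python
import re

# Tags named in this guide (including [confidently] from the example above).
# The guide says more tags exist, so this set is not exhaustive.
KNOWN_TAGS = {
    "excited", "whispers", "sighs", "laughs", "slowly",
    "angry", "sad", "surprised", "confidently",
}

# Matches a lowercase word in square brackets, e.g. [excited].
TAG_PATTERN = re.compile(r"\[([a-z]+)\]")


def tagged(tag: str, text: str) -> str:
    """Prefix a piece of text with an audio tag."""
    return f"[{tag}] {text}"


def unknown_tags(script: str) -> list[str]:
    """List tags in the script that are not in KNOWN_TAGS (possible typos)."""
    return [t for t in TAG_PATTERN.findall(script) if t not in KNOWN_TAGS]
```

For example, unknown_tags applied to a script containing [whispres] would report that tag, while the three example lines above pass without warnings.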

Complex Text Accuracy — Eleven v3 features a 68% reduction in errors for text containing numbers, URLs, formulas, and code. It is ideal for technical and educational content.

70+ Languages — supports English, Indonesian, Spanish, Mandarin, Arabic, Japanese, Korean, and dozens of others.

When to Use Eleven v3

Use Eleven v3 when:

  • You need detailed emotional control within the narration
  • The content contains many numbers, URLs, or technical terms
  • You are creating dramatic content like podcasts, audiobooks, or ads
  • Expressive quality is more important than cost

Limitations of Eleven v3

  • 5,000 character limit per request — the smallest among the three models
  • Costs 2.10 credits per character, the same as Multilingual v2 and twice the rate of Flash v2.5
  • Requires more prompt engineering for optimal results

Multilingual v2 — The Top Choice for Long Narration

Multilingual v2 is ElevenLabs' "workhorse" model — mature, stable, and excellent for professional content production. This model is trusted by millions of creators worldwide.

Key Advantages of Multilingual v2

  • Rich Emotional Expression — voices sound natural and lifelike, not robotic
  • 10,000 Character Limit — double that of Eleven v3, perfect for longer narrations
  • 29 Languages including English, Spanish, French, German, Indonesian, Portuguese, Japanese, and Korean
  • High Stability — delivers consistent results for bulk content production
  • Best for voice-overs, audiobooks, e-learning, and post-production

When to Use Multilingual v2

Use Multilingual v2 when:

  • You need long narration exceeding 5,000 characters
  • Your content consists of audiobooks, online courses, or long-form videos
  • You want high quality without learning audio tags
  • You are producing in bulk where consistency is more important than variation

Example Usage

{
  "text": "Welcome to the Python programming course for beginners. In this first module, we will learn the basics of the Python language, starting from variables and data types to control structures.",
  "voice": "YOUR_VOICE_ID",
  "ttsModel": "eleven-multilingual-v2",
  "stability": 0.6,
  "similarity_boost": 0.75
}

Flash v2.5 — The Fastest and Most Affordable

Flash v2.5 is designed for speed and efficiency. With a latency of around 75ms, it is the premier model for real-time applications and voice chatbots.

Key Advantages of Flash v2.5

  • 50% Lower Cost than Eleven v3 and Multilingual v2 — only 1.05 credits per character
  • 40,000 Character Limit — the largest among all ElevenLabs models on Sonna
  • ~75ms Latency — optimized for conversational and real-time applications
  • 32 Languages, covering the major global languages and three more in total than Multilingual v2

When to Use Flash v2.5

Use Flash v2.5 when:

  • Cost is your priority — high-volume content, bulk generation
  • You require extremely long text (over 10,000 characters) in a single request
  • Your application requires rapid responses (chatbots, voice assistants, instant alerts)
  • "Excellent" quality is sufficient and you do not need "outstanding"
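
When a script is longer than a model's per-request limit, one option is to split it yourself and submit the pieces as separate requests. The following is a rough sketch of that idea, not a Sonna feature: split_text is a hypothetical helper, the sentence detection is a simple punctuation heuristic, and it assumes the limit is counted in the same way as Python's len().

```python
import re


def split_text(text: str, limit: int) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Split after ., ! or ? followed by whitespace (a heuristic, not a full tokenizer).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is longer than the limit on its own.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Calling split_text(script, 40_000) would prepare chunks for Flash v2.5, and split_text(script, 5_000) for Eleven v3. Joining the resulting audio files is left to your own tooling.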

Cost Comparison

For 100,000 characters (about 15,000 words or a short book):

Model | Total Credits
Eleven v3 | 210,000 credits
Multilingual v2 | 210,000 credits
Flash v2.5 | 105,000 credits

Flash v2.5 saves you 50% for high-volume content.


Model Selection Guide

Questions to Help You Choose:

1. How long is your text?

  • Up to 5,000 characters → Any model will work
  • 5,001 – 10,000 characters → Multilingual v2 or Flash v2.5
  • 10,001 – 40,000 characters → Flash v2.5 only

2. Do you need detailed emotional control?

  • Yes → Eleven v3 (use audio tags)
  • No → Multilingual v2 or Flash v2.5

3. Is cost a primary consideration?

  • Yes → Flash v2.5
  • No, quality is more important → Eleven v3 or Multilingual v2

4. Is this for real-time or batch processing?

  • Real-time/conversational → Flash v2.5
  • Batch/production → Eleven v3 or Multilingual v2
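
The four questions can also be expressed as a small decision function. This is one possible reading of the guide rather than an official rule set: choose_model is a hypothetical name, the order in which the questions are applied is an assumption, and Multilingual v2 is used as the fallback because this guide recommends it as the safest default.

```python
def choose_model(characters: int, needs_audio_tags: bool = False,
                 cost_sensitive: bool = False, real_time: bool = False) -> str:
    """Pick a model ID from the answers to the four questions in this guide."""
    if needs_audio_tags:
        # Audio tags are an Eleven v3 feature, and v3 caps requests at 5,000 characters.
        if characters > 5_000:
            raise ValueError("Eleven v3 accepts at most 5,000 characters; split the text first")
        return "eleven-v3"
    if characters > 40_000:
        raise ValueError("text exceeds the 40,000-character limit; split the text first")
    if characters > 10_000 or cost_sensitive or real_time:
        # Flash v2.5 is the only model for long requests, and the cheapest and fastest.
        return "eleven-flash-v2-5"
    # Assumption: fall back to the guide's recommended default.
    return "eleven-multilingual-v2"
```

For instance, an 8,000-character lesson script with no special requirements maps to "eleven-multilingual-v2", while the same script for a cost-sensitive project maps to "eleven-flash-v2-5".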

How to Use in Sonna Creative

  1. Go to Sonna Creative and select Text to Speech
  2. In the right panel, select your desired voice
  3. Scroll down the voice settings panel to find Model — choose Eleven v3, Multilingual v2, or Flash v2.5
  4. Enter your text in the main input area
  5. Click Generate — your audio will be ready in 1–3 seconds

For Eleven v3, embed audio tags directly inside your text to guide the voice expressions.


How to Use via API

All ElevenLabs models are available via the Sonna Developer API. The endpoint remains the same for all models — only the ttsModel parameter varies:

curl -X POST https://sonnalabs.app/api/v1/tts/synthesize \
  -H "Authorization: Bearer sona_sk_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Text to convert into speech.",
    "voice": "VOICE_ID",
    "ttsModel": "eleven-v3"
  }'

Replace ttsModel with:

  • "eleven-v3" for Eleven v3
  • "eleven-multilingual-v2" for Multilingual v2
  • "eleven-flash-v2-5" for Flash v2.5

You can generate API keys in the Sonna Console → API Keys. Access requires a Pro/Max plan or PAYG credits.
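
For Python users, the curl call above might be translated as follows using only the standard library. Treat it as a sketch under stated assumptions: build_request and synthesize are hypothetical names, the length check assumes Sonna counts characters the way Python's len() does, and the response is returned as raw bytes because this guide does not describe the response format.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this guide.
API_URL = "https://sonnalabs.app/api/v1/tts/synthesize"

# Per-request character limits from the model table in this guide.
CHARACTER_LIMITS = {
    "eleven-v3": 5_000,
    "eleven-multilingual-v2": 10_000,
    "eleven-flash-v2-5": 40_000,
}


def build_request(api_key: str, text: str, voice: str, tts_model: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    limit = CHARACTER_LIMITS[tts_model]
    if len(text) > limit:
        raise ValueError(f"{tts_model} accepts at most {limit} characters per request")
    body = json.dumps({"text": text, "voice": voice, "ttsModel": tts_model}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def synthesize(request: urllib.request.Request) -> bytes:
    """Send the request and return the raw response body.

    Assumption: the response format is not documented in this guide,
    so the bytes are returned unparsed.
    """
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()
```

Separating build_request from synthesize lets you validate the text length and inspect the payload locally before any credits are spent.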


Summary

Feature | Eleven v3 | Multilingual v2 | Flash v2.5
Expression Quality | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐
Character Limit | 5,000 | 10,000 | 40,000
Languages | 70+ | 29 | 32
Cost per Character | 2.10 cr | 2.10 cr | 1.05 cr
Latency | Fast | Fast | Ultra-fast
Audio Tags | ✅ | ❌ | ❌
Best For | Expressive content | Long narrations | High-volume workloads

Not sure where to start? Multilingual v2 is the safest choice for most use cases. Upgrade to Eleven v3 when you need highly detailed expressions, or switch to Flash v2.5 when cost efficiency and latency are your main priorities.


See all generative models available on Sonna on the Models page. For developer API integration guides, visit the Sonna Console.
