Text to speech
ElevenLabs Text-to-Speech converts written text into natural, expressive audio files. Its voice models provide high-quality, customizable speech synthesis for a range of applications.
Features
- Supports multiple voices, each selected with the voiceId parameter.
- Offers control over speech characteristics, including stability and similarity boost.
- Uses models such as eleven_monolingual_v1 for natural audio rendering.
- Integrates via API, with documentation for developers.
- Outputs audio files that can be used in various contexts such as podcasts, audiobooks, and accessibility tools.
Benefits
- Enhances accessibility by converting text to clear, natural-sounding audio.
- Saves time and resources compared to manual voice recordings.
- Improves user engagement with expressive and customizable speech outputs.
- Adjustable settings help reduce background artifacts and tune voice consistency.
- Supports diverse use cases from content creation to customer support automation.
Description
Retrieve an audio file generated from the supplied text. See the ElevenLabs documentation for details.
Parameters
| Name | Type | Description |
|---|---|---|
| voiceId (required) | string | Identifier of the voice that will be used. |
| modelId | string | Identifier of the model that will be used. Default: |
| text (required) | string | The text that will get converted into speech. |
| similarityBoost | string | Low values are recommended if background artifacts are present in generated speech. High enhancement boosts overall voice clarity and target speaker similarity. Very high values can cause artifacts, so adjusting this setting to find the optimal value is encouraged. It goes from |
| stability | string | Decreasing stability can make speech more expressive, with output varying between re-generations. It can also lead to instabilities. Increasing stability will make the voice more consistent between re-generations, but it can also make it sound a bit monotone. On longer text fragments, we recommend lowering this value. It goes from |
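
The parameters above can be assembled into a request along the following lines. This is a minimal sketch, not the integration's own implementation. It assumes the public ElevenLabs REST endpoint (`https://api.elevenlabs.io/v1/text-to-speech/{voice_id}`), the `xi-api-key` header, and a mapping from the camelCase names in the table to snake_case JSON fields (`model_id`, `voice_settings.stability`, `voice_settings.similarity_boost`). The function names `build_tts_request` and `synthesize_to_file` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the ElevenLabs REST endpoint (not stated on this page).
API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(voice_id, text, api_key, model_id=None,
                      stability=None, similarity_boost=None):
    """Build, without sending, a POST request from the table's parameters."""
    body = {"text": text}
    if model_id is not None:
        body["model_id"] = model_id

    # Include only the voice settings the caller actually supplied.
    # The table types these as strings; converting them to floats is an assumption.
    settings = {}
    if stability is not None:
        settings["stability"] = float(stability)
    if similarity_boost is not None:
        settings["similarity_boost"] = float(similarity_boost)
    if settings:
        body["voice_settings"] = settings

    return urllib.request.Request(
        url=f"{API_BASE}/{urllib.parse.quote(voice_id, safe='')}",
        data=json.dumps(body).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(request, path):
    """Send the request and write the returned audio bytes to `path`."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

Keeping request construction separate from the network call lets the payload be inspected or logged before any audio is generated. A call such as `synthesize_to_file(build_tts_request(voice_id, "Hello", api_key), "hello.mp3")` would then perform the actual synthesis, given a valid voice identifier and API key. The output file name here is only illustrative.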