MiniMax Speech 2.8 HD
Text-to-Speech • MiniMax • Proxied

MiniMax Speech 2.8 HD focuses on studio-grade audio generation with emotion control, multilingual support (40+ languages), and voice cloning.

| Model Info | |
|---|---|
| Terms and License | link ↗ |
| More information | link ↗ |
| Pricing | View pricing in the Cloudflare dashboard ↗ |
Usage
```js
const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    text: 'Hello! Welcome to Cloudflare AI Gateway. Let me show you what we can do.',
    voice_id: 'English_expressive_narrator',
    speed: 1,
    volume: 1,
    pitch: 0,
    format: 'mp3',
  },
  {
    gateway: { id: 'default' },
  },
)
console.log(response)
```

```json
{
  "state": "Completed",
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/simple-speech.mp3"
  },
  "gatewayMetadata": {
    "keySource": "Unified"
  }
}
```

Examples
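The responses shown on this page share one shape: a `state` field and a `result.audio` URL. As a minimal sketch of reading that shape defensively, the helper below returns the URL or throws. The function name `extractAudioUrl` is hypothetical, and treating any state other than `"Completed"` as a failure is an assumption, since no other state values are documented here.

```javascript
// Hypothetical helper (not part of the Workers AI API): pulls the audio URL
// out of a response shaped like the examples on this page.
function extractAudioUrl(response) {
  // Assumption: only "Completed" means success; other states are not documented here.
  if (!response || response.state !== 'Completed') {
    throw new Error(`Speech generation did not complete (state: ${response?.state})`)
  }

  const audio = response.result?.audio
  if (typeof audio !== 'string' || audio.length === 0) {
    throw new Error('Response has no result.audio URL')
  }

  return audio
}
```

In a Worker, the returned URL could then be passed to `fetch` to download the file or be sent back to the client as-is.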
Custom Voice: Use a specific voice and adjust speed

```js
const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    text: 'The weather today is sunny with a high of 72 degrees. Perfect for a walk in the park.',
    voice_id: 'English_expressive_narrator',
    speed: 0.9,
    volume: 1,
    pitch: 0,
    format: 'mp3',
  },
  {
    gateway: { id: 'default' },
  },
)
console.log(response)
```

```json
{
  "state": "Completed",
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/custom-voice.mp3"
  },
  "gatewayMetadata": {
    "keySource": "Unified"
  }
}
```

With Emotion: Apply emotional tone to speech
```js
const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    text: "Congratulations! You've just won the grand prize! This is absolutely incredible news!",
    voice_id: 'English_expressive_narrator',
    speed: 1,
    volume: 1,
    pitch: 0,
    emotion: 'happy',
    format: 'mp3',
  },
  {
    gateway: { id: 'default' },
  },
)
console.log(response)
```

```json
{
  "state": "Completed",
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/with-emotion.mp3"
  },
  "gatewayMetadata": {
    "keySource": "Unified"
  }
}
```

High Sample Rate: Studio quality at 44.1 kHz sample rate
```js
const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    text: 'This recording is generated at studio quality sample rate for the highest possible audio fidelity.',
    voice_id: 'English_expressive_narrator',
    speed: 1,
    volume: 1,
    pitch: 0,
    format: 'mp3',
    sample_rate: 44100,
  },
  {
    gateway: { id: 'default' },
  },
)
console.log(response)
```

```json
{
  "state": "Completed",
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/high-sample-rate.mp3"
  },
  "gatewayMetadata": {
    "keySource": "Unified"
  }
}
```

Parameters
| Parameter | Type | Required | Default | Constraints | Description |
|---|---|---|---|---|---|
| `text` | string | yes | | maxLength: 10000 | The text to convert to speech. Maximum 10,000 characters. |
| `voice_id` | string | yes | `English_expressive_narrator` | | The voice ID to use for synthesis |
| `speed` | number | yes | 1 | minimum: 0.5, maximum: 2 | Speech speed (0.5 to 2) |
| `volume` | number | yes | 1 | minimum: 0, maximum: 10 | Speech volume (0 to 10) |
| `pitch` | integer | yes | 0 | minimum: -12, maximum: 12 | Pitch adjustment (-12 to 12) |
| `emotion` | string | no | | enum: happy, sad, angry, fearful, disgusted, surprised, calm, fluent | Emotion control for synthesized speech |
| `format` | string | yes | `mp3` | enum: mp3, flac, wav | Output audio format |

Output: `audio` is one of the following: a string containing the URL to the generated audio file.
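
The limits in the parameter list above can be checked before a request is sent. The sketch below fills in the documented defaults and rejects out-of-range values. `buildSpeechInput` is a hypothetical helper, not part of the Workers AI binding. It assumes the parameter names used in the examples on this page and passes any other keys (such as `sample_rate`) through unchanged.

```javascript
// Hypothetical client-side validator; the limits mirror the parameter list above.
const EMOTIONS = ['happy', 'sad', 'angry', 'fearful', 'disgusted', 'surprised', 'calm', 'fluent']
const FORMATS = ['mp3', 'flac', 'wav']

// Documented defaults for the required fields other than `text`.
const DEFAULTS = {
  voice_id: 'English_expressive_narrator',
  speed: 1,
  volume: 1,
  pitch: 0,
  format: 'mp3',
}

// True when `value` is a finite number within [min, max].
function inRange(value, min, max) {
  return Number.isFinite(value) && value >= min && value <= max
}

function buildSpeechInput(options) {
  // Caller-supplied values override the defaults; unknown keys pass through.
  const input = { ...DEFAULTS, ...options }
  const errors = []

  if (typeof input.text !== 'string') {
    errors.push('text is required and must be a string')
  } else if (input.text.length > 10000) {
    // Assumption: length in UTF-16 code units approximates the service's character count.
    errors.push('text exceeds 10,000 characters')
  }
  if (typeof input.voice_id !== 'string') {
    errors.push('voice_id must be a string')
  }
  if (!inRange(input.speed, 0.5, 2)) {
    errors.push('speed must be between 0.5 and 2')
  }
  if (!inRange(input.volume, 0, 10)) {
    errors.push('volume must be between 0 and 10')
  }
  if (!Number.isInteger(input.pitch) || !inRange(input.pitch, -12, 12)) {
    errors.push('pitch must be an integer between -12 and 12')
  }
  if (input.emotion !== undefined && !EMOTIONS.includes(input.emotion)) {
    errors.push(`emotion must be one of: ${EMOTIONS.join(', ')}`)
  }
  if (!FORMATS.includes(input.format)) {
    errors.push(`format must be one of: ${FORMATS.join(', ')}`)
  }

  if (errors.length > 0) {
    throw new Error(`Invalid speech input: ${errors.join('; ')}`)
  }
  return input
}
```

The result can be passed as the second argument to `env.AI.run('minimax/speech-2.8-hd', ...)`, so that a typo such as `pitch: 13` fails locally with a readable message instead of at the API.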