MiniMax Speech 2.8 HD

Text-to-Speech • MiniMax • Proxied

MiniMax Speech 2.8 HD focuses on studio-grade audio generation with emotion control, multilingual support (40+ languages), and voice cloning.

Model Info
Terms and License: link
More information: link
Pricing: View pricing in the Cloudflare dashboard

Usage

const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    text: 'Hello! Welcome to Cloudflare AI Gateway. Let me show you what we can do.',
    voice_id: 'English_expressive_narrator',
    speed: 1,
    volume: 1,
    pitch: 0,
    format: 'mp3',
  },
  {
    gateway: { id: 'default' },
  }
)
console.log(response)

Output
Raw response

{
  "state": "Completed",
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/simple-speech.mp3"
  },
  "gatewayMetadata": {
    "keySource": "Unified"
  }
}
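The raw response carries the audio as a URL rather than as bytes, so calling code usually needs to pull that URL out before doing anything else. The following is a minimal sketch of that step. `extractAudioUrl` is a hypothetical helper name, and treating any `state` other than "Completed" as a failure is an assumption, since the page only shows the "Completed" case.

```javascript
// Hypothetical helper (not part of any Cloudflare or MiniMax API).
// Assumes the response shape shown in the raw output above:
// { state: "Completed", result: { audio: "<url>" }, ... }
function extractAudioUrl(response) {
  // Assumption: any state other than "Completed" means no audio is available.
  if (!response || response.state !== 'Completed') {
    throw new Error(`Speech generation not completed (state: ${response?.state})`);
  }
  const url = response.result?.audio;
  if (typeof url !== 'string' || url.length === 0) {
    throw new Error('Response did not include result.audio');
  }
  return url;
}
```

In a Worker, the returned URL could then be passed to `fetch` to retrieve the audio, or handed back to the client as-is.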

Examples

Custom Voice — Use a specific voice and adjust speed

const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    text: 'The weather today is sunny with a high of 72 degrees. Perfect for a walk in the park.',
    voice_id: 'English_expressive_narrator',
    speed: 0.9,
    volume: 1,
    pitch: 0,
    format: 'mp3',
  },
  {
    gateway: { id: 'default' },
  }
)
console.log(response)

Output
Raw response

{
  "state": "Completed",
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/custom-voice.mp3"
  },
  "gatewayMetadata": {
    "keySource": "Unified"
  }
}

With Emotion — Apply emotional tone to speech

const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    text: "Congratulations! You've just won the grand prize! This is absolutely incredible news!",
    voice_id: 'English_expressive_narrator',
    speed: 1,
    volume: 1,
    pitch: 0,
    emotion: 'happy',
    format: 'mp3',
  },
  {
    gateway: { id: 'default' },
  }
)
console.log(response)

Output
Raw response

{
  "state": "Completed",
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/with-emotion.mp3"
  },
  "gatewayMetadata": {
    "keySource": "Unified"
  }
}

High Sample Rate — Studio quality at 44.1kHz sample rate

const response = await env.AI.run(
  'minimax/speech-2.8-hd',
  {
    text: 'This recording is generated at studio quality sample rate for the highest possible audio fidelity.',
    voice_id: 'English_expressive_narrator',
    speed: 1,
    volume: 1,
    pitch: 0,
    format: 'mp3',
    sample_rate: 44100,
  },
  {
    gateway: { id: 'default' },
  }
)
console.log(response)

Output
Raw response

{
  "state": "Completed",
  "result": {
    "audio": "https://pub-04a6d208d361438ea01b797e6973bd19.r2.dev/catalog/minimax__speech-2.8-hd/high-sample-rate.mp3"
  },
  "gatewayMetadata": {
    "keySource": "Unified"
  }
}

text

string, required, maxLength: 10000. The text to convert to speech. Maximum 10,000 characters.

voice_id

string, required, default: English_expressive_narrator. The voice ID to use for synthesis.

speed

number, required, default: 1, minimum: 0.5, maximum: 2. Speech speed (0.5 to 2).

volume

number, required, default: 1, minimum: 0, maximum: 10. Speech volume (0 to 10).

pitch

integer, required, default: 0, minimum: -12, maximum: 12. Pitch adjustment (-12 to 12).

emotion

string, enum: happy, sad, angry, fearful, disgusted, surprised, calm, fluent. Emotion control for synthesized speech.

format

string, required, default: mp3, enum: mp3, flac, wav. Output audio format.

sample_rate

one of
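The ranges, defaults, and enums listed for the input parameters can be checked on the client before a request is sent. The sketch below shows one way to do that. `buildSpeechInput` and `inRange` are hypothetical names, the limits are copied from the parameter list, and `sample_rate` is passed through unchecked because its allowed values are not listed here. Counting characters with JavaScript's `String.length` is also an assumption about how the 10,000-character limit is measured.

```javascript
// Allowed values copied from the parameter list on this page.
const FORMATS = ['mp3', 'flac', 'wav'];
const EMOTIONS = ['happy', 'sad', 'angry', 'fearful', 'disgusted', 'surprised', 'calm', 'fluent'];

// Throws if value is not a number within [min, max].
function inRange(name, value, min, max) {
  if (typeof value !== 'number' || Number.isNaN(value) || value < min || value > max) {
    throw new RangeError(`${name} must be a number between ${min} and ${max}`);
  }
}

// Hypothetical helper: applies the documented defaults, then validates.
function buildSpeechInput(options) {
  const input = {
    voice_id: 'English_expressive_narrator',
    speed: 1,
    volume: 1,
    pitch: 0,
    format: 'mp3',
    ...options, // caller-supplied values override the defaults
  };
  if (typeof input.text !== 'string') {
    throw new TypeError('text is required and must be a string');
  }
  // Assumption: the limit is compared against String.length (UTF-16 units).
  if (input.text.length > 10000) {
    throw new RangeError('text must be at most 10,000 characters');
  }
  inRange('speed', input.speed, 0.5, 2);
  inRange('volume', input.volume, 0, 10);
  inRange('pitch', input.pitch, -12, 12);
  if (!Number.isInteger(input.pitch)) {
    throw new RangeError('pitch must be an integer');
  }
  if (!FORMATS.includes(input.format)) {
    throw new RangeError(`format must be one of: ${FORMATS.join(', ')}`);
  }
  if (input.emotion !== undefined && !EMOTIONS.includes(input.emotion)) {
    throw new RangeError(`emotion must be one of: ${EMOTIONS.join(', ')}`);
  }
  return input;
}
```

The returned object can be passed as the second argument to `env.AI.run('minimax/speech-2.8-hd', ...)`, for example `buildSpeechInput({ text: 'Hello', emotion: 'calm' })`.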

audio

string. URL to the generated audio file.
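Because `text` is capped at 10,000 characters and each call returns a single audio URL, longer input has to be split across several requests. The sketch below shows one possible splitting strategy: cut at the last sentence end inside the limit, fall back to the last space, and hard-cut only when neither exists. `splitText` and `MAX_TEXT_LENGTH` are hypothetical names, and the sentence detection is deliberately naive.

```javascript
// Limit taken from the `text` parameter's maxLength on this page.
const MAX_TEXT_LENGTH = 10000;

// Hypothetical helper: split text into chunks of at most maxLength characters.
function splitText(text, maxLength = MAX_TEXT_LENGTH) {
  const chunks = [];
  let rest = text.trim();
  while (rest.length > maxLength) {
    const head = rest.slice(0, maxLength);
    // Prefer the last sentence-ending punctuation followed by a space.
    const sentenceEnd = Math.max(
      head.lastIndexOf('. '),
      head.lastIndexOf('! '),
      head.lastIndexOf('? ')
    );
    let cut;
    if (sentenceEnd > 0) {
      cut = sentenceEnd + 1; // keep the punctuation in this chunk
    } else {
      const lastSpace = head.lastIndexOf(' ');
      cut = lastSpace > 0 ? lastSpace : maxLength; // hard cut as a last resort
    }
    chunks.push(rest.slice(0, cut).trim());
    rest = rest.slice(cut).trim();
  }
  if (rest.length > 0) {
    chunks.push(rest);
  }
  return chunks;
}
```

Each chunk would then go to its own `env.AI.run` call, yielding one audio URL per chunk. Joining the resulting audio files is outside the scope of this sketch.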
