Tips & Tricks for Dia
- Use **Predefined Voices** for consistent, high-quality output based on provided samples.
- For **Voice Clone**, upload clean reference audio (`.wav`/`.mp3`). Crucially, save the exact transcript of the reference audio as a `.txt` file with the same name as the audio file. Format the transcript as `[S1] First speaker [S2] Second speaker`, or as `[S1] First speaker` if the reference audio file has only one speaker.
- Use **Random / Dialogue** for multi-speaker text (`[S1]`/`[S2]`) or single-speaker generation without cloning.
- Experiment with **CFG Scale** (higher = more adherence) and **Temperature** (higher = more varied).
- Set **Generation Seed** to a fixed integer (e.g. 1, 42, 901) for reproducible results.
- Enable **Split text** for long inputs (> ~200-300 chars). Note: Using Random/Dialogue mode with splitting and a random seed (-1) may result in different voices per chunk. Use Predefined/Clone or a fixed seed for consistency across chunks.
- Use the `/v1/audio/speech` endpoint for OpenAI compatibility.
- Use the custom `/tts` endpoint for maximum flexibility, including configuring all Dia generation parameters and passing reference audio and transcript information.
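
The Voice Clone and endpoint tips above can be sketched in Python. This is a minimal illustration, not the server's own code. The helper names `write_transcript_sidecar` and `build_speech_request`, the base URL `http://localhost:8003`, and the `model`, `voice`, and `response_format` values are all assumptions made for the example; only the same-name `.txt` convention, the `[S1]`/`[S2]` tags, and the `/v1/audio/speech` path come from the tips. The request body uses field names from OpenAI's speech API; check which values your server actually accepts.

```python
import json
import re
import urllib.request
from pathlib import Path

# Matches a leading speaker tag such as [S1] or [S2].
SPEAKER_TAG = re.compile(r"\[S[12]\]")


def write_transcript_sidecar(audio_path, transcript):
    """Save the transcript next to the audio file as <same name>.txt (hypothetical helper)."""
    text = transcript.strip()
    if not SPEAKER_TAG.match(text):
        raise ValueError("transcript should start with a speaker tag such as [S1]")
    txt_path = Path(audio_path).with_suffix(".txt")
    txt_path.write_text(text + "\n", encoding="utf-8")
    return txt_path


def build_speech_request(text, voice, base_url="http://localhost:8003"):
    """Build (but do not send) a POST request for the OpenAI-compatible endpoint.

    base_url and the payload values below are assumptions, not documented defaults.
    """
    payload = {
        "model": "dia",            # assumed placeholder value
        "input": text,             # text to synthesize, may contain [S1]/[S2] tags
        "voice": voice,            # assumed: name of a predefined or reference voice
        "response_format": "wav",  # assumed output format
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and write the response bytes to a file. Building the request separately from sending it makes the payload easy to inspect before it reaches the server.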