🔮The Codex

Text-to-Speech (TTS)

AI technology that converts written text into natural-sounding spoken audio.

📖 Apprentice Explanation

Text-to-speech AI reads text aloud in a human-like voice. Modern TTS can sound almost indistinguishable from a real person, with natural intonation, emotion, and even different accents.

🧙 Archmage Notes

Modern TTS uses neural architectures such as Tacotron, VITS, and transformer-based models. ElevenLabs and other providers offer voice cloning, multilingual synthesis, and real-time streaming. Quality is most often reported as MOS (Mean Opinion Score), the average of listener ratings, typically on a 1-to-5 scale, of qualities such as naturalness and prosody.
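As a rough illustration of how a MOS figure is derived, the sketch below averages listener ratings and attaches a normal-approximation 95% confidence interval. The function name `mean_opinion_score`, the 1-to-5 scale check, and the sample ratings are assumptions made up for this example; real listening tests follow more detailed protocols.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (mos, ci95_half_width) for listener ratings on a 1-5 scale.

    Hypothetical helper for illustration; not part of any TTS library.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")

    mos = statistics.mean(ratings)
    if len(ratings) < 2:
        # A single rating gives no estimate of spread.
        return mos, 0.0

    # Normal-approximation 95% interval: 1.96 * standard error of the mean.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Made-up ratings from eight listeners for one synthesized clip.
ratings = [4, 5, 4, 3, 5, 4, 4, 5]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean matters because MOS differences between two systems are often smaller than the listener-to-listener spread.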

Related Enchantments

Multimodal AI

Neural Network