ZONOS2
Official repository and model card - 2026
Official ZONOS2 resources document the Apache-2.0 release, multilingual TTS API, more than 6M hours of training speech, high-fidelity voice cloning, and 44.1 kHz PCM output.
Input
Reference voice audio plus multilingual text for ZONOS2 voice cloning TTS
Audio formats
Output
44.1 kHz WAV speech generated with the uploaded voice reference and highest-quality defaults
Best for
Creators comparing expressive zero-shot voice cloning, multilingual narration, and local TTS quality
ZONOS2 Voice Cloning TTS generates multilingual speech from a short reference voice clip. TelkNet offers it as a server-side deployment of the official ZONOS2 release, run with its highest-quality defaults. The workflow has three steps: upload a reference clip, enter the text, and download the generated audio.
ZONOS2 is Zyphra's Apache-2.0 zero-shot TTS model with a mixture-of-experts (MoE) architecture and multilingual cloning support.
ZONOS2 returns 44.1 kHz PCM audio through its DAC codec path.
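The 44.1 kHz claim can be checked on any downloaded result. The sketch below is a minimal illustration and not part of ZONOS2 or TelkNet: it uses only Python's standard `wave` module, and the helper names `describe_wav` and `write_test_tone` are hypothetical. Because no real output file is bundled here, it writes a synthetic tone as a stand-in; the mono, 16-bit layout of that tone is an assumption for the demo only, since this page states the sample rate but not the bit depth or channel count.

```python
import math
import struct
import tempfile
import wave
from pathlib import Path

EXPECTED_RATE_HZ = 44_100  # sample rate stated on this page


def describe_wav(path):
    """Return basic PCM properties of a WAV file (hypothetical helper)."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": frames / rate if rate else 0.0,
        }


def write_test_tone(path, rate=EXPECTED_RATE_HZ, seconds=1.0, freq=440.0):
    """Write a mono 16-bit sine tone as a stand-in for a downloaded result.

    The mono/16-bit layout is an assumption made for this demo only.
    """
    n = int(rate * seconds)
    samples = (
        int(0.3 * 32767 * math.sin(2 * math.pi * freq * i / rate))
        for i in range(n)
    )
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo.wav"
        write_test_tone(demo)
        info = describe_wav(demo)
        print(info)
        print("matches 44.1 kHz:", info["sample_rate_hz"] == EXPECTED_RATE_HZ)
```

To inspect a real download, pass its path to `describe_wav` in place of the synthetic file. The `wave` module reads integer PCM only, so a file stored as floating-point WAV would raise `wave.Error` and need a different reader.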
No public paper link is listed; use the official repo or adapter implementation as the source.