CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models

CosyVoice 2 is a multilingual speech synthesis model that achieves high naturalness and low latency in streaming applications.