A mysterious AI video model that has ascended global leaderboards has been confirmed as a project under Alibaba.
Open-source TTS models Kokoro, Orpheus, and Piper are tested on symbols, abbreviations, and prosody, with CER and MOS results.
Kokoro 82M is an 82-million-parameter text-to-speech model that beats many TTS APIs while running locally on CPUs, including ...
Creators can master a full creative workflow and modify any video dimension with natural-language instructions. A single ...
Mistral launches Voxtral TTS, extending its model family into speech generation and enabling end-to-end voice workflows.
Alibaba's move comes after TikTok parent ByteDance released Seedance 2.0, its video-generation model, earlier this year as Chinese internet companies vie to capture early demand in AI video. Industry ...
San Francisco, California, United States, April 17, 2026 -- fal has announced the official launch of the Seedance 2.0 API on ...
DeepL says its tech could be used for real-time translation with meeting tools like Zoom and Microsoft Teams ...
Google is launching Veo 3.1 Lite, a cost-effective version of its Veo 3 AI video generator, the company announced Tuesday, highlighting Google's renewed commitment to all forms of generative AI.
A study on visual language models explores how shared semantic frameworks improve image–text understanding across multimodal tasks. By ...
Google LLC’s DeepMind artificial intelligence unit today rolled out a new text-to-speech model called Gemini 3.1 Flash TTS.