Text-To-Video Model Applications

8d on MSN

Alibaba just revealed it’s behind a viral AI video model dominating leaderboards

A mysterious AI video model that has ascended global leaderboards has been confirmed as a project under Alibaba.

Top Text-to-Speech Models of 2026: Proprietary vs Open Source Compared

Open source TTS models Kokoro, Orpheus, and Piper are tested on symbols, abbreviations, and prosody with CER and MOS results.

Why Developers Are Dropping Cloud APIs for This Tiny 82M Speech Model

Kokoro 82M is an 82-million-parameter text-to-speech model that beats many TTS APIs while running locally on CPUs, including ...

Manila Standard

Alibaba unveils Wan2.7-Video to elevate creators from executors to directors

Creators can master a full creative workflow and modify any video dimension with natural language instructions. A single ...

Slator

Mistral Completes Voxtral Speech Stack With Launch of Text-to-Speech Model

Mistral launches Voxtral TTS, extending its model family into speech generation and enabling end-to-end voice workflows.

Alibaba's New AI Video-Generation Model Tops Global Ranking After Debut

Alibaba's move comes after TikTok parent ByteDance released Seedance 2.0, its video-generation model, earlier this year as Chinese internet companies vie to capture early demand in AI video. Industry ...

Seedance 2.0 API Goes Live on fal, Expanding Access to Next-Generation AI Video Generation Infrastructure

San Francisco, California, United States, April 17, 2026 -- fal has announced the official launch of the Seedance 2.0 API on ...

2d on MSN

DeepL, known for text translation, now wants to translate your voice

DeepL says its tech could be used for real-time translation with meeting tools like Zoom and Microsoft Teams ...

CNET

Google Launches Veo 3.1 Lite, a More Cost-Effective AI Video Generator Model

Google is launching a cost-effective version of its Veo 3 AI video generator, Veo 3.1 Lite, the company announced Tuesday, highlighting Google's renewed commitment to all forms of generative AI.

Cross-Modal Data Understanding Advances Through Bukun Ren’s Review of Visual Language Models

A study on visual language models explores how shared semantic frameworks improve image–text understanding across multimodal tasks. By ...

Google’s Gemini 3.1 Flash TTS model offers unparalleled control over AI voices

Google LLC’s DeepMind artificial intelligence unit today rolled out a new text-to-speech model called Gemini 3.1 Flash TTS.
