Microsoft Launches Three New In-House AI Models for Transcription, Voice, and Image Generation
Microsoft unveils three proprietary AI models targeting transcription, voice synthesis, and image generation, directly challenging OpenAI and Google.
Microsoft unveils three proprietary AI models targeting transcription, voice synthesis, and image generation, directly challenging OpenAI and Google.
Google has released Gemini 3.1 Flash Live, its highest-quality real-time audio and voice model to date, featuring reduced latency, improved emotional tone recognition, and mandatory SynthID watermarking on all AI-generated audio. Search Live simultaneously expands to 200+ countries.
ElevenLabs and IBM announced a collaboration to integrate ElevenLabs' Text-to-Speech and Speech-to-Text technology into IBM watsonx Orchestrate, enabling enterprises to deploy natural, multilingual voice-enabled AI agents across 70 languages.
Voice AI startup ElevenLabs triples valuation to $11B with Sequoia-led Series D, reaching $330M ARR as it prepares for IPO.
Voice AI leader ElevenLabs secures $500M Series D funding led by Sequoia Capital, reaching $11B valuation with $330M ARR from enterprise adoption.
Voice AI startup ElevenLabs secures $500M in Series D funding at $11B valuation, tripling its value in one year with enterprise adoption surging.
Google DeepMind has hired the CEO and top engineers from Hume AI, a startup specializing in emotionally intelligent voice AI, to enhance the capabilities of its Gemini models and advance voice-based interaction.
Voice AI infrastructure provider LiveKit has secured $100 million in new funding, reaching a $1 billion valuation. The company powers OpenAI's ChatGPT voice features and is expanding its real-time voice and video solutions.