CSM-1B

Sesame AI's Conversational Speech Model (CSM) is a voice model designed to produce natural, human-like conversation. Launched in 2025 and available on GitHub and Hugging Face, it aims for what Sesame calls "voice presence," in contrast to traditional text-to-speech. Built on a dual-transformer architecture, CSM conveys emotion and context in its speech, with a reported response time of about 500 milliseconds. Trained on a large dataset, it targets applications such as empathetic customer support, dynamic language lessons, engaging AR experiences, improved accessibility for visually impaired users, and personalized podcast narration. The model is open-source, so developers can explore it directly. A Python setup guide walks through generating a "Hello World" audio file with CSM as a starting point for further experimentation. CSM is positioned to change how people interact with AI across these fields.

R1-Omni

Alibaba's R1-Omni is an AI model that recognizes human emotions from video and audio, with the aim of making AI interactions more empathetic. Released on March 12, 2025, it could make products such as chatbots and entertainment apps more responsive to users' emotions. Because it is open-source, developers can build on it and integrate affordable emotion-aware features into their own applications. It uses Reinforcement Learning with Verifiable Rewards (RLVR) for emotion detection and reports strong performance on emotion-recognition datasets. Potential applications include improved customer service, mood-based content suggestions, mental health support, and adaptive educational tools. The model strengthens Alibaba's competitive position in AI, and its open-source release should speed up innovation. R1-Omni is available on platforms such as Hugging Face, where the community can contribute to its development and to future consumer applications.
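To make the idea of a "verifiable" reward concrete, here is a minimal sketch of how such a reward could be scored for emotion recognition. It is an illustration only, not R1-Omni's training code: the `<think>`/`<answer>` output template, the function names, and the choice to add an accuracy term to a format term are all assumptions made for this example.

```python
import re

# Hypothetical output template (an assumption for this sketch):
# reasoning inside <think>...</think>, predicted emotion inside <answer>...</answer>.
_TEMPLATE = re.compile(
    r"\s*<think>(.+?)</think>\s*<answer>(.+?)</answer>\s*", re.DOTALL
)
_ANSWER = re.compile(r"<answer>(.+?)</answer>", re.DOTALL)


def format_reward(output: str) -> float:
    """Return 1.0 if the whole output follows the assumed template, else 0.0."""
    return 1.0 if _TEMPLATE.fullmatch(output) else 0.0


def accuracy_reward(output: str, label: str) -> float:
    """Return 1.0 if the emotion in <answer> equals the ground-truth label."""
    match = _ANSWER.search(output)
    if match is None:
        return 0.0
    predicted = match.group(1).strip().lower()
    return 1.0 if predicted == label.strip().lower() else 0.0


def verifiable_reward(output: str, label: str) -> float:
    """Sum of both checks; each is a rule, so no learned reward model is needed."""
    return accuracy_reward(output, label) + format_reward(output)


sample = "<think>The speaker frowns and raises their voice.</think><answer>angry</answer>"
print(verifiable_reward(sample, "angry"))  # 2.0
```

The point of the design is that every term can be checked mechanically against a labeled example, which is what distinguishes a verifiable reward from one produced by a separate learned judge.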