Show HN: Speech-to-speech playground for OpenAI's new Realtime API https://ift.tt/V12ET6P

Hi there - Ben from LiveKit here! If you’re curious about OpenAI’s brand-new Realtime API and speech-to-speech model, check out this hosted playground and play with the model yourself. If you’d like to learn more about how this came together, read on.

If you’re like me, you’ve probably been wondering what novel things a model like this can do in an API setting with unfettered access to the system prompt and other parameters. I’ve been fortunate to have had early access through my work at LiveKit, where we’ve built open-source developer tooling that makes deploying this model in a production app as simple as possible. I thought it would also be fun to build a “playground” environment, partially to dogfood our own tooling but largely because I just wanted to play with the model.

This playground is freely available for anyone to try, and comes loaded with a bunch of fun demos of the model’s unique capabilities that I’ve put together. What blew my mind is how much mileage you can get out of the system prompt alone in this API.

Here are some use-cases that are at least halfway to a complete MVP:

- "Customer Support": A complete phone support agent for the playground
- "Spanish Tutor": A bilingual language-learning demo
- "Meditation Coach": It can actually pause and resume speech all on its own as it guides you through a meditation routine

Also some fun (and a bit irreverent…) demos of its style and non-verbal capabilities:

- "Smoker’s Rasp": It can cough and speak like it’s been smoking three packs a day for 30 years (my favorite, lol)
- "Unconfident Assistant": Umms, buts, and more - surprisingly lifelike
- "Opera Singer": The best singing demo I’ve been able to compose (but still not quite what they showed off back in May…)

The playground doesn’t store anything anywhere besides your browser, but you can share anything fun you put together with a link that encodes your config into URL params.
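
For readers curious what "a link that encodes your config into URL params" can look like, here is a minimal sketch in Python. It is not the playground's actual code: the base URL, the field names (`instructions`, `voice`, `temperature`), and both helper functions are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical base URL, for illustration only (not the real playground).
BASE_URL = "https://playground.example.com/"


def encode_share_link(config: dict[str, str]) -> str:
    """Pack a flat config dict into the query string of a shareable URL."""
    return f"{BASE_URL}?{urlencode(config)}"


def decode_share_link(url: str) -> dict[str, str]:
    """Recover the config dict from a share link's query string."""
    query = urlsplit(url).query
    # parse_qs returns a list of values per key; keep the first one.
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical config fields; the real playground may use different ones.
config = {
    "instructions": "You are a Spanish tutor. Speak slowly & clearly.",
    "voice": "alloy",
    "temperature": "0.8",
}
link = encode_share_link(config)
assert decode_share_link(link) == config
```

Because everything lives in the query string, nothing has to be stored on a server: whoever opens the link rebuilds the same config locally. One caveat of this sketch is that `parse_qs` drops keys with empty values by default, so an empty field would not survive the round trip.
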

For now - anyone can use this playground to access the model and give it a spin (session limit 5min). In the coming days when more people have access to the underlying API, I’ll update it to require you to bring your own OpenAI API key.

Lastly - if you’re even more curious how this was built or want to tweak or adapt it for yourself, the whole project and every dependency is open-source (link in footer!).

https://ift.tt/I6AHLiV

October 2, 2024 at 04:59AM
