Spotify has unveiled a remarkable new feature powered by artificial intelligence (AI) that translates a podcast into other languages while keeping the voices of the people in the show.
It’s powered partly by OpenAI’s just-released voice generation technology that needs just a few seconds of listening to replicate a voice.
Spotify said the feature makes for “a more authentic listening experience that sounds more personal and natural than traditional dubbing,” adding: “A podcast episode originally recorded in English can now be available in other languages while keeping the speaker’s distinctive speech characteristics.”
Starting today, Spotify is making available voice-translated episodes from select creators, with shows translated from English to Spanish. In the coming days, French and German episodes will also be made available.
To find the available episodes in Spanish, head to the Voice Translation hub in Spotify’s app, which lists all of the translated shows that are currently available, with new shows arriving in the weeks and months to come.
Spotify said that around 100 million people “regularly” listen to podcasts on its platform, and its new AI-powered voice translation offering could notch up millions more listens for shows that suddenly find themselves in enormous new markets.
One of the tests of Spotify’s new feature will be whether the translation element manages to accurately capture the nuances of the original dialogue.
Either way, it looks like another setback for voice actors, as the technology could easily be transferred to movies and TV shows, replicating the voices of an entire cast for international versions of the content.
Voice cloning technology has been around for some time and keeps improving thanks to advances in AI. Not surprisingly, it’s already being used for nefarious purposes, too, with a growing number of related scams coming to the attention of law enforcement agencies. It’s feared that the technology could also fuel a rise in more convincing misinformation as bad actors use it to create audio of politicians or other leading figures appearing to say things that they didn’t.