What’s become one of the internet’s go-to companies for creating realistic enough audio deepfakes can now clone your voice and force it to speak in a growing variety of tongues. ElevenLabs announced Tuesday that its new voice cloning model supports 22 more languages than it did previously, including Ukrainian, Korean, Swedish, and Arabic.
According to ElevenLabs, the new Multilingual v2 model can produce “emotionally rich” audio in a total of 30 languages. The company offers two AI voice tools: one is a text-to-speech model, and the other is the “VoiceLab,” which lets paying users clone a voice by feeding fragments of their (or others’) speech into the model to create a kind of voice clone. With the v2 model, users can get these generated voices to start speaking in Greek, Malay, or Turkish.
The service went live on the company’s site around midday ET Tuesday. Users need only type text in the target language to hear the voice speak it, and it should work with any voice clone created by the company or by users. As a primarily English speaker, I find it hard to gauge how well each accented voice represents each language, but the speech does take the time to seem naturalistic, with the occasional breathless pause between sentences and quotes.
The ElevenLabs platform has seen its share of controversy since it launched last year. The company’s initial beta platform saw 4chan users abusing its systems to impersonate celebrities, forcing them to say racist, misogynistic, and transphobic scripts. It was also used by AI evangelists to attack voice actors who complained about the widespread use of voice cloning tech. Since then, ElevenLabs claims it’s integrated new measures to ensure users can only clone their own voice. Users need to verify their speech with a text captcha prompt, which is then compared to the original voice sample.
Company co-founder and ex-Palantir executive Mati Staniszewski said in a release, “Eventually we hope to cover even more languages and voices with help of AI and eliminate the linguistic barriers to content.”
Out of Beta, ElevenLabs is Trying to Push AI Voices on Media
Alongside the new language capabilities, ElevenLabs said this release marks the end of the beta phase for its AI voice cloning tech, just as the company is drilling deeper into making the tech available to media companies. Back in June, ElevenLabs received $19 million in Series A funding from the likes of tech kingmakers Andreessen Horowitz alongside former DeepMind head, now Inflection AI co-founder Mustafa Suleyman.
ElevenLabs promotes its voice cloning tech as a way for companies to create audiobooks, videos, and even voice NPCs in video games. The company claims it’s struck a deal with Paradox Interactive, the publisher behind games like the Hearts of Iron series and the upcoming The Lamplighters League. The company’s voice cloning tech has been explicitly cited by gaming voice-over actors who are concerned the tech is being used to undercut their work.
Gizmodo reached out to Paradox for comment, but we did not immediately hear back.
On the books front, tech giants like Google and Apple have tried pushing AI-narrated audiobooks. Apple’s Books app started featuring narrators with bland names like “Archie” and “Warren” to voice some content. Those who listen to audiobooks have noted these voices are, for lack of a better term, lifeless compared to the stock of professional voice actors who can actually pay attention to the rise and fall of a narrative. The actors’ union SAG-AFTRA and the Writers Guild of America are currently on strike, and a big part of the current negotiations with the entertainment industry has centered on AI.
However, ElevenLabs is pitching AI voices as a way for publishing companies to save both time and money creating audiobooks. In a Monday blog post, the company said it worked with Lukeman Literary, a literary agency and small indie publishing company, to fine-tune its audiobook processing. The company claimed it used to take agencies “weeks” to produce a single audiobook, but with AI that’s shortened to mere hours.
Lukeman Literary has helped publish books by big-name public figures like Rutger Hauer and the Dalai Lama alongside other fiction works. In an email sent to Gizmodo, Lukeman stressed that his agency and publishing arms were distinct, so there weren’t any plans to convert the agency’s represented titles to AI narration. Still, as for his publishing business, he said that he had never embraced AI narration because the “quality” wasn’t there, but since testing ElevenLabs’ features he said he’s “finally impressed” enough to potentially use it. He further said that “AI narration is a godsend” for independent writers because it’s far cheaper than human narration.
Despite saying AI voice is finally good enough for prime time, Lukeman agreed that AI “will definitely pose a challenge” for voice actors, but he suggested that “some” authors and publishers will still want audiobooks voiced by a real human.
There’s also the potential for licensing voices, though “the big questions are how prevalent that work will be, how much new revenue it may add, and whether that results in an ultimate revenue loss or gain for narrators,” he said.
Whether or not voice actors will eventually be able to license their voices to AI for residuals, those sorts of agreements are still foreign to a publishing industry that’s becoming more and more enamored with AI. With the strike still ongoing, it may take time to learn how actors at large respond to an industry that’s looking for a way to cash in on the audiobooks trend without real human audio.