Microsoft's New AI Can Mimic Anyone's Voice! Here's Why VALL-E Concerns Experts

Here are the things you need to know about VALL-E.

Microsoft's new AI can easily copy people's voices.

The shadow of Uruguayan developer Tammara Leites poses in front of a text generated by (digital Simon) thanks to artificial intelligence ahead of the (dSimon) performance at the Avignon fringe festival, in Avignon on July 14, 2022. Photo by CLEMENT MAHOUDEAU/AFP via Getty Images

The tech giant quietly introduced the new artificial intelligence, with no press release or other big announcement for the technology.

But the quiet launch is not the problem. Experts have raised concerns about VALL-E's ability to mimic anyone's voice.

If you are wondering why some individuals are alarmed by the arrival of VALL-E, here's what the AI is really capable of.

Microsoft's New AI, VALL-E

According to ZDNet's latest report, Microsoft's VALL-E is a new TTS (text-to-speech) system.

A Microsoft logo is pictured during the presentation of the Xbox One in Shanghai on July 30, 2014. A Chinese probe into Microsoft is probably targeting its "monopoly" of the country's operating system market, state media said, after the US software giant became the latest foreign firm to earn Beijing's scrutiny. Photo credit should read JOHANNES EISELE/AFP via Getty Images

The software giant said it is specifically a new neural codec language model that uses discrete codes derived from the neural audio codec.

Via its GitHub demo page, Microsoft wrote that the new AI could copy an individual's speaking style.

VALL-E can do this by listening to a three-second audio recording.

"VALL-E emerges in-context learning capabilities and can be used to synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as an acoustic prompt," said Microsoft.
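To give a rough sense of what "discrete codes" means here, the toy sketch below maps each short frame of audio features to the index of its nearest entry in a small codebook. This is only an illustration and not Microsoft's implementation: the codebook values, the two-number frames, and the function names are all made up for the example, and a real neural audio codec learns its encoder and codebooks from data.

```python
# Toy illustration of turning continuous audio features into discrete codes.
# All names and numbers here are hypothetical; this is not VALL-E's code.

def nearest_code(frame, codebook):
    """Return the index of the codebook entry closest to the frame."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(frame, codebook[k]))

def encode(frames, codebook):
    """Replace every frame with a single integer code."""
    return [nearest_code(frame, codebook) for frame in frames]

def decode(codes, codebook):
    """Look the codes back up to get an approximation of the frames."""
    return [codebook[c] for c in codes]

# A made-up codebook with four entries and four made-up feature frames.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
frames = [[0.1, -0.1], [0.9, 0.2], [0.2, 0.8], [0.95, 1.1]]

codes = encode(frames, codebook)
print(codes)  # [0, 1, 2, 3]
```

A language model can then treat a sequence of such integer codes much like a sequence of words, which is the general idea behind calling the system a "codec language model."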

VALL-E's Efficiency

Microsoft trained the new VALL-E TTS system on 60,000 hours of English language speech.

The tech firm used Meta's LibriLight audio library, which contains speech from more than 7,000 speakers.

Based on Microsoft's samples, VALL-E could copy the voices of different speakers from LibriLight efficiently.

The software giant provided the original three-second audio recordings and compared VALL-E's versions.

Surprisingly, the TTS tech could copy the speakers' diction and manner of speaking. Most of VALL-E's audio is so similar to the originals that you won't notice any differences.

This is where the problem starts. Many experts said that if scammers and other cybercriminals create similar technology, they could easily copy the voices of their victims.

Just imagine answering a call from a stranger. After that, a short recording of your voice could already be used for malicious campaigns.


The new VALL-E is not the only tech that Microsoft is busy with.

Recently, the new Microsoft Surface Duo 3 was reported to feature a foldable display.

We also reported that Microsoft's Windows 7 secure boot has been rolled out.

For more news updates about VALL-E and other new AIs, keep your tabs open here at TechTimes.

Tech Times
Article owned by Tech Times | Written by Griffin Davis Photo owned by Tech Times
ⓒ 2024 TECHTIMES.com All rights reserved. Do not reproduce without permission.