Meta’s AI-powered audio codec promises 10x compression over MP3

An illustrated depiction of data in an audio wave.

Meta AI

Last week, Meta announced an AI-powered audio compression method called “EnCodec” that can reportedly compress audio 10 times smaller than the MP3 format at 64kbps with no loss in quality. Meta says this technique could dramatically improve the sound quality of speech on low-bandwidth connections, such as phone calls in areas with spotty service. The technique also works for music.

Meta debuted the technology on October 25 in a paper titled “High Fidelity Neural Audio Compression,” authored by Meta AI researchers Alexandre Défossez, Jade Copet, Gabriel Synnaeve, and Yossi Adi. Meta also summarized the research on its blog devoted to EnCodec.

Meta claims its new audio encoder/decoder can compress audio 10x smaller than MP3.

Meta AI

Meta describes its method as a three-part system trained to compress audio to a desired target size. First, the encoder transforms uncompressed data into a lower frame rate “latent space” representation. The “quantizer” then compresses the representation to the target size while keeping track of the most important information that will later be used to rebuild the original signal. (This compressed signal is what gets sent through a network or saved to disk.) Finally, the decoder turns the compressed data back into audio in real time using a neural network on a single CPU.
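To make the three stages concrete, here is a toy sketch of an encode, quantize, decode round trip. It is not Meta's implementation: EnCodec uses trained neural networks, while this stand-in uses block averaging as the "encoder," uniform rounding as the "quantizer," and sample repetition as the "decoder." The function names, the 8 kHz sample rate, the 16-bit raw depth, and the `hop` and `bits` values are all assumptions chosen for illustration.

```python
import math

def encode(samples, hop):
    """Toy 'encoder': average each block of `hop` samples into one latent value,
    which lowers the frame rate by a factor of `hop`."""
    blocks = [samples[i:i + hop] for i in range(0, len(samples), hop)]
    return [sum(block) / len(block) for block in blocks]

def quantize(latents, bits):
    """Toy 'quantizer': clamp each latent to [-1, 1] and map it to an integer code
    that fits in `bits` bits. These codes are what would be sent or stored."""
    levels = 2 ** bits - 1
    return [round((min(1.0, max(-1.0, x)) + 1.0) / 2.0 * levels) for x in latents]

def decode(codes, bits, hop):
    """Toy 'decoder': map codes back to values in [-1, 1] and repeat each value
    `hop` times to restore the original frame rate."""
    levels = 2 ** bits - 1
    out = []
    for code in codes:
        out.extend([code / levels * 2.0 - 1.0] * hop)
    return out

# One second of a 440 Hz tone at a hypothetical 8 kHz sample rate.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

hop, bits = 4, 4  # illustrative values, not taken from the paper
codes = quantize(encode(signal, hop), bits)
rebuilt = decode(codes, bits, hop)

raw_bits = len(signal) * 16          # assuming 16-bit raw samples
compressed_bits = len(codes) * bits  # size of the transmitted codes
print(f"raw: {raw_bits} bits, compressed: {compressed_bits} bits")
```

The rebuilt signal has the same length as the input but is only an approximation of it, which is the sense in which the scheme is lossy. A learned encoder and decoder aim to make that approximation much harder to hear at the same code size.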


A block diagram illustrating how Meta’s EnCodec compression works.

Meta AI

Meta’s use of discriminators proves key to creating a method for compressing the audio as much as possible without losing key elements of a signal that make it distinctive and recognizable:

“The key to lossy compression is to identify changes that will not be perceivable by humans, as perfect reconstruction is impossible at low bit rates. To do so, we use discriminators to improve the perceptual quality of the generated samples. This creates a cat-and-mouse game where the discriminator’s job is to differentiate between real samples and reconstructed samples. The compression model attempts to generate samples to fool the discriminators by pushing the reconstructed samples to be more perceptually similar to the original samples.”
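The cat-and-mouse setup in the quote can be sketched with a pair of loss functions. This is a minimal illustration that assumes a hinge-style adversarial loss, a common choice for GAN-based audio models; the article does not say which loss Meta uses. The function names and the example scores are hypothetical.

```python
def discriminator_hinge_loss(real_scores, fake_scores):
    """Loss the discriminator minimizes: it is penalized when real samples
    score below +1 or reconstructed samples score above -1."""
    real_term = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    fake_term = sum(max(0.0, 1.0 + s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term

def generator_hinge_loss(fake_scores):
    """Loss the compression model minimizes: it is penalized when its
    reconstructions score below +1, i.e. when they fail to look 'real'."""
    return sum(max(0.0, 1.0 - s) for s in fake_scores) / len(fake_scores)

# Hypothetical discriminator scores (higher means "looks like real audio").
real_scores = [2.0, 0.5]
fake_scores = [-2.0, 0.0]

d_loss = discriminator_hinge_loss(real_scores, fake_scores)
g_loss = generator_hinge_loss(fake_scores)
print(d_loss, g_loss)
```

The two losses pull in opposite directions on the same reconstructed samples: the discriminator wants their scores low, the compression model wants them high. Training alternates between the two.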

It’s worth noting that using a neural network for audio compression and decompression is far from new, especially for speech compression, but Meta’s researchers claim they are the first group to apply the technology to 48 kHz stereo audio (slightly better than CD’s 44.1 kHz sampling rate), which is typical for music files distributed on the Internet.
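For a sense of scale, the numbers in this article can be turned into rough bitrates. The sketch below assumes 16-bit samples (the article gives sampling rates but not bit depth) and reads “10 times smaller than MP3 at 64kbps” as one-tenth of that bitrate. Both are assumptions for illustration, not figures from Meta.

```python
def pcm_bitrate_bps(sample_rate_hz, channels, bit_depth):
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * channels * bit_depth

BIT_DEPTH = 16  # assumption: the article does not state a bit depth

raw_48k_bps = pcm_bitrate_bps(48_000, 2, BIT_DEPTH)  # 48 kHz stereo
raw_cd_bps = pcm_bitrate_bps(44_100, 2, BIT_DEPTH)   # CD-rate stereo

mp3_bps = 64_000             # the MP3 baseline cited in the article
claimed_bps = mp3_bps // 10  # "10x smaller", read as one-tenth the bitrate

print(f"48 kHz stereo PCM: {raw_48k_bps / 1000} kbps")
print(f"CD-rate stereo PCM: {raw_cd_bps / 1000} kbps")
print(f"Claimed target: {claimed_bps / 1000} kbps")
print(f"Reduction vs. 48 kHz PCM: {raw_48k_bps / claimed_bps:.0f}x")
```

Under these assumptions, the claimed target is a few kilobits per second for a source that starts at well over a megabit per second.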

As for applications, Meta says this AI-powered “hypercompression of audio” could support “faster, better-quality calls” in bad network conditions. And, of course, being Meta, the researchers also mention EnCodec’s metaverse implications, saying the technology could eventually deliver “rich metaverse experiences without requiring major bandwidth improvements.”

Beyond that, maybe we’ll also get really small music audio files out of it someday. For now, Meta’s new tech remains in the research phase, but it points toward a future where high-quality audio can use less bandwidth, which would be great news for mobile broadband providers whose networks are overburdened by streaming media.

