
Meta Introduces Generative AI Model With Speech Generation Capabilities

Meta Platforms has introduced a generative artificial intelligence (AI) model that performs speech-generation tasks.

The new Voicebox can help with tasks like audio editing, sampling and styling, Meta said in a Friday (June 16) press release.

“Voicebox can produce high-quality audio clips and edit prerecorded audio — like removing car horns or a dog barking — all while preserving the content and style of the audio,” Meta said in the release. “The model is also multilingual and can produce speech in six languages.”

The capabilities of Voicebox include text-to-speech generation using audio samples as short as two seconds, recreating portions of speech for editing and noise reduction, and producing a reading of text in an individual’s voice in any of those six languages, according to the press release.

The languages Voicebox currently includes are English, French, German, Spanish, Polish and Portuguese, the release said.

Future use cases for this new generative AI tool could include giving natural-sounding voices to virtual assistants and non-player characters in the metaverse, allowing visually impaired people to hear written messages in the voices of their friends, creating and editing audio tracks, and helping people communicate in other languages in their own voice, per the release.

“Voicebox is an important step forward in our generative AI research, and we look forward to continuing our exploration in the audio space and seeing how other researchers build on our work,” Meta said in the release.

PYMNTS reported Tuesday (June 13) that generative AI is bringing brands’ customer service to its next horizon.

The technology can detect emotion, offer advice and complete entire transactions.

Already, 61% of consumers say that voice assistants will become as smart and reliable as human assistants, and 41% say that will happen within five years, according to the PYMNTS report “How Consumers Want to Live In the Voice Economy.”

Google parent Alphabet and Microsoft have also talked up generative AI for voice applications.

In April, both companies described how they are developing and rolling out generative AI tools across the enterprise, including those that help with content creation, collaboration and better, more personalized search results.
