Meta to Add New AI-Powered Video Generation Capabilities to Apps


Meta has unveiled generative artificial intelligence (AI) research showing how simple text inputs can be used to create custom videos and sounds, and to edit existing videos.

Dubbed “Meta Movie Gen,” the new model builds on Meta’s earlier generative AI models Make-A-Scene and Llama Image, the company said in a Friday (Oct. 4) blog post.

Movie Gen combines the modalities of those earlier models and allows for finer-grained control, according to the post.

Meta is making the new model available only to some employees and outside partners for now, but it plans to integrate the technology into some of its existing apps next year, Bloomberg reported Friday, citing an interview with Connor Hayes, a Meta vice president focused on generative AI products.

Before making Movie Gen available to the public, Meta aims to prevent the technology from being used to make videos of people without their consent, according to the report.

Meta said in an April blog post that three of its platforms (Facebook, Instagram and Threads) would label AI-manipulated content as “Made with AI.”

In Meta’s Friday blog post, the company said: “While there are many exciting use cases for these foundation models, it’s important to note that generative AI isn’t a replacement for the work of artists and animators. We’re sharing this research because we believe in the power of this technology to help people express themselves in new ways and to provide opportunities to people who might not otherwise have them.”

Movie Gen’s video generation capabilities allow it to produce videos of up to 16 seconds at 16 frames per second, while accounting for object motion, subject-object interactions and camera motion, according to the post.
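Those figures imply a fixed frame budget: A 16-second clip at 16 frames per second works out to 256 generated frames. The short Python sketch below is purely illustrative arithmetic based on those stated numbers; the function name and parameters are hypothetical, not part of any Meta API.

```python
# Illustrative arithmetic only, not Meta's code: the number of frames a
# text-to-video model must fill for a clip of a given length.
def frame_budget(duration_s: float, fps: int) -> int:
    """Frames needed for a clip of duration_s seconds at fps frames per second."""
    return round(duration_s * fps)

# Movie Gen's stated maximum: 16 seconds at 16 frames per second.
print(frame_budget(16, 16))  # 256 frames per clip
```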

The foundation model can also generate personalized videos based on a person’s image and a text prompt, the post said.

With its video editing capabilities, Movie Gen can add, remove or replace elements and modify a video’s background or style, per the post. It targets only the relevant pixels, preserving the rest of the original content.
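Conceptually, that kind of localized edit resembles mask-based compositing, in which only pixels inside an edit mask change while everything outside is copied from the original frame. The sketch below illustrates that general technique under that assumption; it is not Meta’s implementation, and all names in it are hypothetical.

```python
import numpy as np

# Conceptual sketch of a localized edit, not Meta's implementation.
def composite_edit(original: np.ndarray, edited: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Blend an edited frame into the original using a binary mask.

    original, edited: (H, W, 3) frames; mask: (H, W, 1) array of 0s and 1s
    marking the region the edit is allowed to change.
    """
    # Pixels inside the mask take the edited values; everything outside
    # is copied unchanged from the original frame.
    return (mask * edited + (1 - mask) * original).astype(original.dtype)

# Toy usage: only the masked 2x2 region of the frame changes.
original = np.zeros((4, 4, 3), dtype=np.uint8)
edited = np.full((4, 4, 3), 255, dtype=np.uint8)
mask = np.zeros((4, 4, 1), dtype=np.uint8)
mask[1:3, 1:3] = 1
result = composite_edit(original, edited, mask)
```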

The model can also generate up to 45 seconds of audio based on a video and text prompts, adding ambient sound, sound effects and background music that are synced to the video content, according to the post.
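Syncing audio to video amounts to aligning audio samples with video frames. Assuming a common 48 kHz sample rate (a standard figure, not one stated by Meta) and the model’s 16 frames per second, each frame corresponds to 3,000 audio samples, as the illustrative sketch below shows; the function name is hypothetical.

```python
# Illustrative arithmetic only, not Meta's code: how many audio samples
# line up with each video frame at a given sample rate and frame rate.
def samples_per_frame(sample_rate_hz: int, fps: int) -> float:
    """Audio samples that correspond to a single video frame."""
    return sample_rate_hz / fps

# Assumed 48 kHz audio over 16 fps video.
print(samples_per_frame(48_000, 16))  # 3000.0 samples per frame
```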

Potential future uses of this technology include editing videos to share on Reels and creating animated birthday greetings to send via WhatsApp, per the post.

“As we continue to improve our models and move toward a potential future release, we’ll work closely with filmmakers and creators to integrate their feedback,” the post said. “By taking a collaborative approach, we want to ensure we’re creating tools that help people enhance their inherent creativity in new ways they may have never dreamed would be possible.”
