MAV3D by Meta AI

Developed by Meta AI, MAV3D (Make-A-Video3D) is a pioneering method for generating three-dimensional dynamic scenes from text descriptions. It optimizes a 4D dynamic Neural Radiance Field (NeRF) for scene appearance, density, and motion consistency by querying a text-to-video diffusion model. The resulting dynamic output can be rendered from any camera location and angle and composited into any 3D environment. Notably, MAV3D requires no 3D or 4D data for training: the text-to-video model is trained solely on text-image pairs and unlabeled videos. This approach marks a significant advance in AI-driven content creation, enabling dynamic 3D scenes to be generated from simple text prompts.
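To make the optimization idea more concrete, here is a minimal sketch, not Meta's implementation, of the general pattern the description outlines: a 4D field over (x, y, z, t) is optimized so that videos rendered from it are favored by a pretrained text-to-video diffusion model (a score-distillation-style loss). The names `Dynamic4DField`, `render_video`, and `video_score_distillation_loss` are hypothetical placeholders, and the renderer and loss are dummies standing in for a real volumetric renderer and diffusion guidance.

```python
# Hypothetical sketch of text-to-4D optimization; not the MAV3D codebase.
import torch
import torch.nn as nn

class Dynamic4DField(nn.Module):
    """Toy dynamic NeRF: maps (x, y, z, t) to (density, RGB)."""
    def __init__(self, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 1 density channel + 3 color channels
        )

    def forward(self, xyzt):
        out = self.mlp(xyzt)
        density = torch.relu(out[..., :1])
        rgb = torch.sigmoid(out[..., 1:])
        return density, rgb


def render_video(field, timesteps, height=32, width=32):
    """Placeholder renderer returning a (T, 3, H, W) video tensor.
    A real implementation would ray-march the field per frame and camera pose."""
    T = len(timesteps)
    xyzt = torch.rand(T, height * width, 4)       # dummy sample points
    xyzt[..., 3] = timesteps.view(T, 1)           # attach the frame time
    _, rgb = field(xyzt)
    return rgb.permute(0, 2, 1).reshape(T, 3, height, width)


def video_score_distillation_loss(video, prompt):
    """Stand-in for guidance from a pretrained text-to-video diffusion model.
    MAV3D uses a score-distillation term here; this is a dummy loss."""
    return video.pow(2).mean()


field = Dynamic4DField()
optimizer = torch.optim.Adam(field.parameters(), lr=1e-3)
prompt = "a corgi playing with a ball"            # example text prompt

for step in range(100):
    timesteps = torch.rand(8)                     # random frame times in [0, 1)
    video = render_video(field, timesteps)        # render a short clip
    loss = video_score_distillation_loss(video, prompt)
    optimizer.zero_grad()
    loss.backward()                               # gradients flow into the 4D field
    optimizer.step()
```

The point of the sketch is the training loop's shape: no 3D or 4D supervision appears anywhere; the only learning signal is how well rendered clips satisfy the (frozen) text-conditioned video model.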