Stability AI, the company known for developing Stable Diffusion, is teasing a new generative AI that can produce short-form videos from a text prompt.
Dubbed Stable Video Diffusion, the system consists of two models, SVD and SVD-XT, and can create clips at a resolution of 576 x 1,024 pixels. Users can also set the frame rate anywhere between 3 and 30 FPS. Clip length depends on the model chosen: SVD generates 14 frames, while SVD-XT extends that to 25 frames. Whichever model is used, the rendered clips play for only about four seconds before ending, according to the official listing on Hugging Face.
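To see how frame count and frame rate interact, the rough arithmetic can be sketched in Python. This is only an illustration: it assumes playback time is simply the number of frames divided by the frame rate, and the names `clip_duration_seconds`, `MODEL_FRAMES`, and `duration_table` are hypothetical, not part of any Stability AI tooling.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Playback time, assuming each frame is shown once at a constant rate."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


# Frame counts per model as reported in the article.
MODEL_FRAMES = {"SVD": 14, "SVD-XT": 25}


def duration_table(fps_values=(3, 6, 30)):
    """Map (model, fps) to clip duration in seconds, rounded to 2 decimals."""
    return {
        (model, fps): round(clip_duration_seconds(frames, fps), 2)
        for model, frames in MODEL_FRAMES.items()
        for fps in fps_values
    }


if __name__ == "__main__":
    for (model, fps), seconds in duration_table().items():
        print(f"{model} at {fps} FPS: {seconds} s")
```

Under that assumption, a 25-frame SVD-XT clip runs for roughly four seconds at about 6 FPS, and for less than a second at 30 FPS.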
The company has released a video on its YouTube channel showcasing the capabilities of Stable Video Diffusion. Test footage shows high-quality content, which is quite a departure from the low-quality results seen in other AI-generated videos. One of the most impressive demonstrations features an Ice Dragon with remarkable detail in its scales and panoramic mountain views. However, the animation is limited to basic movements such as a slow head bob or panning shots.
Despite its promising features, Stable Video Diffusion has some limitations. It reportedly cannot achieve perfect photorealism or generate legible text, and it struggles to render faces. However, a demonstration on Stability AI’s website shows the model rendering a man’s face without any noticeable flaws, suggesting that its performance varies from case to case.
For now, the project is still in its early stages and not ready for a wide release. Stability AI has emphasized that Stable Video Diffusion is currently intended for research purposes only and is not meant for real-world or commercial applications. This cautious approach is likely due to a previous incident in which the Stable Diffusion model leaked online and was misused to create deepfake images.
To join the waitlist for Stable Video Diffusion, interested individuals can fill out a form on the company’s website. While there is no official launch date, the preview will include a Text-To-Video interface. In the meantime, the AI’s white paper is available for those who want to delve into the technical details behind the project.
Moreover, the white paper mentions the use of publicly accessible video datasets for training, which is noteworthy given the previous lawsuit from Getty Images over allegations of data scraping. This suggests the team is being cautious to avoid any further disputes.
While the release date for Stable Video Diffusion remains unknown, there are other AI video maker options available. Interested readers can refer to TechRadar’s list of the best AI video makers for 2023.