Nvidia’s text-to-video technology will take your GIF game to the next level

Now that ChatGPT and Midjourney are pretty popular, the next big AI race is text-to-video generators – and Nvidia just showed off some impressive tech demos that could soon take your GIFs to a new level.

A new research paper and microsite from Nvidia’s Toronto AI Lab, called “High-Resolution Video Synthesis with Latent Diffusion Models,” gives us a taste of the amazing video creation tools that will soon join the ever-growing list of the best AI art generators.

Latent diffusion models (or LDMs) are a type of AI that can generate videos without requiring massive computing power. Nvidia says its technology does this by building on the work of text-to-image generators, in this case Stable Diffusion, and adding a “temporal dimension to the latent space diffusion model.”
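
For a rough feel of why working in a compressed latent space saves computing power, here is a minimal Python sketch. It assumes Stable Diffusion's usual setup of a 4-channel latent at one-eighth of the image's height and width; the function names and the 8-frame clip length are made up for illustration and are not taken from Nvidia's paper.

```python
# Illustrative sketch only: compares the size of a frame in pixel space with
# its size in latent space, then adds a time axis for video.
# Assumptions (not from the article): 8x spatial downsampling and 4 latent
# channels, as in Stable Diffusion's autoencoder; an 8-frame clip.

def latent_shape(height, width, frames=1, downsample=8, latent_channels=4):
    """Return (frames, channels, height, width) of the latent for a clip."""
    if height % downsample or width % downsample:
        raise ValueError("height and width must be divisible by downsample")
    return (frames, latent_channels, height // downsample, width // downsample)


def count(shape):
    """Number of values in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


pixel_shape = (1, 3, 512, 1024)                    # one RGB frame at 512x1024
image_latent = latent_shape(512, 1024)             # the same frame in latent space
video_latent = latent_shape(512, 1024, frames=8)   # extra leading axis = time

ratio = count(pixel_shape) // count(image_latent)
print("pixel values per frame: ", count(pixel_shape))
print("latent values per frame:", count(image_latent))
print("reduction factor:       ", ratio)
print("video latent shape:     ", video_latent)
```

Under these assumptions each frame shrinks by a factor of 48 before the diffusion model ever sees it, and the "temporal dimension" simply shows up as an extra axis that groups the per-frame latents into a clip.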

(Image credit: Nvidia)

In other words, its generative AI can make still images move in a realistic way and upscale them using super-resolution techniques. This means it can generate short, 4.7-second videos at a resolution of 1280×2048, or longer videos of driving scenes at a lower resolution of 512×1024.

Our immediate thought after seeing the early demos (like the ones above and below) was how much this could improve our GIF game. Okay, there are bigger ramifications like the democratization of video creation and the prospect of automated film adaptations, but at this stage turning text into a GIF seems to be the most exciting use case.

A bear playing the electric guitar

(Image credit: Nvidia)

Simple prompts such as “stormtrooper vacuuming on the beach” or “bear playing electric guitar, high definition, 4K” produce quite usable results, even if some creations naturally contain artifacts and morphing.
