Google DeepMind's Genie 2 can generate interactive 3D worlds

World models (AI algorithms that can generate simulated environments in real time) are one of the more impressive applications of machine learning. There has been a lot of movement in this space over the last year, and on Wednesday Google DeepMind announced Genie 2. While its predecessor was limited to generating 2D worlds, the new model can create 3D worlds and sustain them for significantly longer.

Genie 2 is not a game engine. Instead, it is a diffusion model that generates images as the player (a human or another AI agent) moves through the world the software is simulating. As it generates frames, Genie 2 infers ideas about the environment, which gives it the ability to model water, smoke, and physics effects, although some of those interactions can look very game-like. The model is also not limited to rendering scenes from a third-person perspective; it can handle first-person and isometric views as well. All you need to get started is a single image prompt, either one generated by Google’s own Imagen 3 model or a real-world image.

Introducing Genie 2: an AI model that lets you create an infinite variety of playable 3D worlds, all from a single image. 🖼️

Such large-scale underlying world models could allow future agents to be trained and evaluated in countless virtual environments. →… pic.twitter.com/qHCT6jqb1W

— Google DeepMind (@GoogleDeepMind) December 4, 2024

Notably, Genie 2 remembers parts of a simulated scene even after they leave the player’s field of view, allowing it to accurately reconstruct those elements when they come back into sight. This is in contrast to other world models like Oasis, which, at least in the version Decart released to the public in October, had trouble remembering the layouts of the Minecraft levels it generated in real time.

However, there are limits to what even Genie 2 can do in this regard. DeepMind says its model can generate a “consistent” world for up to 60 seconds, and most of the examples the company shared on Wednesday run for significantly less time: most of the videos are around 10 to 20 seconds long. Additionally, the longer Genie 2 needs to maintain the illusion of a consistent world, the more artifacts appear and the more image quality degrades.

DeepMind did not elaborate on how it trained Genie 2, other than to say it relied on “large-scale video datasets.” Don’t expect DeepMind to make Genie 2 publicly available any time soon. For now, the company is primarily using the model as a tool for training and evaluating other AI agents, including its own SIMA algorithm, and as something that artists and designers can use to prototype and quickly try out ideas. Looking ahead, DeepMind suggests that world models like Genie 2 are likely to play a key role on the path to artificial general intelligence.

“Training more general embodied agents has traditionally been bottlenecked by the availability of sufficiently rich and diverse training environments,” DeepMind said. “As we show, Genie 2 could enable future agents to be trained and evaluated on a limitless curriculum of novel worlds.”
