OpenAI experts have trained a neural network to play Minecraft Equally high quality as human players.
The Neural Network was trained on 70,000 hours of miscellaneous in-game footage, complemented by a short Database Among the videos where contractors performed specific in-game tasks, along with Keyboard And Mouse Inputs were also recorded.
After fine-tuning, OpenAI discovered that the model is capable of performing all sorts of complex skills, from swimming to hunting animals and eating their meat. It also captures the “jump of the column”, a move where the player places a block of material below the mid-jump to achieve height.
Perhaps most impressive, AI was able to create diamond tools (requiring a long process in sequence), which OpenAI described as an “unprecedented” achievement for a computer agent.
An AI breakthrough?
The significance of the Minecraft project is that it demonstrates the effectiveness of a new strategy employed by OpenAI to train AI models – called video pretraining (VPT) – which the company says could accelerate the development of “common computer-user agents.”
Historically, AI model had difficulty using raw video as a source for training. What Easy enough to understand what happened, but not necessarily How. Indeed, the AI model will absorb the desired results, but there is no realization of the input combinations required to reach them.
With VPT, however, OpenAI combines a large video dataset with a large video dataset obtained from a public web source with a carefully curated pool of footage laying the basic model with relevant keyboard and mouse movements.
To fine-tune the base model, the team then plugs in small datasets designed to teach specific tasks. In this context, OpenAI has used footage of players performing early game operations, such as cutting down trees and creating craft tables, which is said to have made a “massive improvement” in the reliability of the model being able to perform these tasks.
Another strategy involves “rewarding” the AI model in order to achieve each step in the work sequence, an exercise known as empowerment learning. This process allows the neural network to collect all the components for a diamond pickaxe with a human-level success rate.
“VPT paves the way for agents to learn to work by watching large numbers of videos on the Internet. Compared to generative video modeling or reverse approaches that only give representative results, VPT offers exciting opportunities to learn large behavioral priorities directly in more domains than just language,” OpenAI Explained Blog post (Opens in new tab).
“Although we only test on Minecraft, the game is very open and the native human interface (mouse and keyboard) is very common, so we believe our results are good for other similar domains, such as for computer use.”
To encourage more experimentation in space, has partnered with OpenAI MineRL NeurIPS Contest, Donates its contractor data and model code to competitors trying to use AI to solve complex Minecraft tasks. Grand Prize: 100,000.