Posted by Jing Yu Koh, Research Engineer and Peter Anderson, Senior Research Scientist, Google Research

When a person navigates around an unfamiliar building, they take advantage of many visual, spatial and semantic cues to help them efficiently reach their goal. For example, even in an unfamiliar house, if they see a dining area, they can make intelligent predictions about the likely location of the kitchen and lounge areas, and therefore the expected location of common household objects. For robotic agents, taking advantage of semantic cues and statistical regularities in novel buildings is challenging. A typical approach is to implicitly learn what these cues are, and how to use them for navigation tasks, in an end-to-end manner via model-free reinforcement learning. However, navigation cues learned in this way are expensive to learn, hard to inspect, and difficult to re-use in another agent without learning again from scratch. An appealing alternative for robotic navigation and planning agents is to use a world model to encapsulate rich and meaningful information about their surroundings, which enables an agent to make specific predictions about actionable outcomes within their environment. Such models have achieved widespread interest because, much like people navigating in unfamiliar buildings, they can take advantage of visual, spatial and semantic cues to predict what’s around a corner. A computational model with this capability is a visual world model.