Your Brain Instantly Sees What You Can Do, AI Still Can’t

Summary: New research shows that the human brain automatically recognizes what actions an environment affords, like walking, climbing, or swimming, even without conscious thought. Using MRI scans, researchers found unique activity in visual brain regions that went beyond simply processing objects or colors, revealing deep neural encoding of “affordances,” or possible actions.

When compared to AI models, including GPT-4, humans significantly outperformed machines at identifying what could be done in a scene. This work highlights how perception and action are tightly linked in the brain, and how AI still has much to learn from human cognition.

Key Facts:

  • Automatic Action Mapping: The brain encodes possible actions (affordances) even without being asked to.
  • Distinct Neural Signature: Visual brain regions activate in unique patterns that reflect what you can do, not just what you see.
  • AI Gap: Even advanced AI models struggle to match human intuition about environmental action opportunities.

Source: University of Amsterdam

When we see a picture of an unfamiliar environment – a mountain path, a busy street, or a river – we immediately know how we could move around in it: walk, cycle, swim or not go any further. That sounds simple, but how does your brain actually determine these action opportunities?

PhD student Clemens Bartnik and a team of co-authors show how we make estimates of possible actions thanks to unique brain patterns. The team, led by computational neuroscientist Iris Groen, also compared this human ability with a large number of AI models, including ChatGPT.

‘AI models turned out to be less good at this and still have a lot to learn from the efficient human brain,’ Groen concludes.

Viewing images in the MRI scanner

Using an MRI scanner, the team investigated what happens in the brain when people look at various photos of indoor and outdoor environments. The participants used a button to indicate whether the image invited them to walk, cycle, drive, swim, boat or climb. At the same time, their brain activity was measured.

‘We wanted to know: when you look at a scene, do you mainly see what is there – such as objects or colours – or do you also automatically see what you can do with it,’ says Groen.

‘Psychologists call the latter “affordances” – opportunities for action; imagine a staircase that you can climb, or an open field that you can run through.’

Unique processes in the brain

The team discovered that certain areas in the visual cortex become active in a way that cannot be explained by visible objects in the image.

‘What we saw was unique,’ says Groen. ‘These brain areas not only represent what can be seen, but also what you can do with it.’

The brain did this even when participants were not given an explicit action instruction.

‘These action possibilities are therefore processed automatically,’ says Groen.

‘Even if you do not consciously think about what you can do in an environment, your brain still registers it.’

The research thus demonstrates for the first time that affordances are not only a psychological concept, but also a measurable property of our brains.

What AI doesn’t understand yet

The team also compared how well AI algorithms – such as image recognition models or GPT-4 – can estimate what you can do in a given environment. The models were worse than humans at predicting possible actions.

‘When trained specifically for action recognition, they could somewhat approximate human judgments, but the human brain patterns didn’t match the models’ internal calculations,’ Groen explains.

‘Even the best AI models don’t give exactly the same answers as humans, even though it’s such a simple task for us,’ Groen says.

‘This shows that our way of seeing is deeply intertwined with how we interact with the world. We connect our perception to our experience in a physical world. AI models can’t do that because they only exist in a computer.’

AI can still learn from the human brain

The research thus touches on larger questions about the development of reliable and efficient AI.

‘As more sectors – from healthcare to robotics – use AI, it is becoming important that machines not only recognise what something is, but also understand what it can do,’ Groen explains.

‘For example, a robot that has to find its way in a disaster area, or a self-driving car that can tell apart a bike path from a driveway.’

Groen also raises the question of how sustainable AI is.

‘Current AI training methods use a huge amount of energy and are often only accessible to large tech companies. More knowledge about how our brain works, and how the human brain processes certain information very quickly and efficiently, can help make AI smarter, more economical and more human-friendly.’

About this AI research news

Author: Laura Erdtsieck
Source: University of Amsterdam
Contact: Laura Erdtsieck – University of Amsterdam
Image: The image is credited to Neuroscience News

Original Research: Closed access.
“Representation of locomotive action affordances in human behavior, brains, and deep neural networks” by Iris Groen et al. PNAS


Abstract

Representation of locomotive action affordances in human behavior, brains, and deep neural networks

To decide how to move around the world, we must determine which locomotive actions (e.g., walking, swimming, or climbing) are afforded by the immediate visual environment. The neural basis of our ability to recognize locomotive affordances is unknown.

Here, we compare human behavioral annotations, functional MRI (fMRI) measurements, and deep neural network (DNN) activations to both indoor and outdoor real-world images to demonstrate that the human visual cortex represents locomotive action affordances in complex visual scenes.

Hierarchical clustering of behavioral annotations of six possible locomotive actions shows that humans group environments into distinct affordance clusters using at least three separate dimensions.
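To make the clustering step concrete, here is a minimal sketch of agglomerative (hierarchical) clustering over action ratings. It is not the authors' analysis: the scene names, the rating values, the Euclidean distance, the average-linkage rule and the choice of three clusters are all assumptions made for illustration. Only the six actions come from the study.

```python
from math import dist

# The six locomotive actions participants could choose from in the study.
ACTIONS = ["walk", "cycle", "drive", "swim", "boat", "climb"]

# Hypothetical scenes and made-up ratings (one value per action, 0 to 1).
SCENES = ["forest path", "city street", "lake", "river", "cliff face"]
RATINGS = [
    [0.9, 0.7, 0.1, 0.0, 0.0, 0.2],
    [0.8, 0.8, 0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.9, 0.8, 0.0],
    [0.1, 0.0, 0.0, 0.8, 0.9, 0.0],
    [0.1, 0.0, 0.0, 0.0, 0.0, 0.9],
]


def average_linkage(a, b, points):
    """Mean Euclidean distance over all cross-cluster pairs of points."""
    total = sum(dist(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        candidates = [
            (average_linkage(clusters[i], clusters[j], points), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(candidates)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


clusters = agglomerate(RATINGS, n_clusters=3)
for members in clusters:
    print([SCENES[i] for i in members])
```

With these toy ratings the land scenes, the water scenes and the climbing scene end up in separate groups. Real analyses typically rely on a library routine and inspect the full merge tree rather than fixing the number of clusters in advance.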

Representational similarity analysis of multivoxel fMRI responses in the scene-selective visual cortex shows that perceived locomotive affordances are represented independently from other scene properties such as objects, surface materials, scene category, or global properties and independent of the task performed in the scanner.
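For readers unfamiliar with representational similarity analysis, the sketch below shows the general idea on toy data: build a dissimilarity matrix over image pairs for each data source, then correlate the two matrices. This is a generic illustration, not the study's pipeline. The numbers are invented, and the choice of 1 minus Pearson correlation as the dissimilarity and Spearman correlation as the comparison are assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm(patterns):
    """Upper triangle of the dissimilarity matrix: 1 - r for each image pair."""
    pairs = combinations(range(len(patterns)), 2)
    return [1.0 - pearson(patterns[i], patterns[j]) for i, j in pairs]


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            out[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return out


def rsa_score(patterns_a, patterns_b):
    """Spearman correlation between the two dissimilarity matrices."""
    return pearson(ranks(rdm(patterns_a)), ranks(rdm(patterns_b)))


# Made-up data for four images: voxel responses and affordance ratings.
voxels = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [3.9, 3.2, 1.8, 1.1],
]
affordances = [
    [1.0, 0.0, 0.2],
    [0.9, 0.1, 0.3],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
]

score = rsa_score(voxels, affordances)
print(round(score, 3))
```

A score near 1 means that image pairs that look alike in one data source also look alike in the other. Showing that affordances are represented independently of objects or materials, as the abstract reports, requires further steps that this sketch does not cover.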

Visual feature activations from DNNs trained on object or scene classification, as well as a range of other visual understanding tasks, correlate less strongly with behavioral and neural representations of locomotive affordances than with object representations.

Training DNNs directly on affordance labels or using affordance-centered language embeddings increases alignment with human behavior, but none of the tested models fully captures locomotive action affordance perception.

These results uncover a type of representation in the human brain that reflects locomotive action affordances.