AI Needs Spatial Intelligence for Next Leap, Experts Say

DATE POSTED: November 12, 2025

Fei-Fei Li, known as the “godmother of AI,” argues that the current era of artificial intelligence (AI), dominated by large language models and image classifiers, has reached its limits.

In her new essay, “From Words to Worlds,” she writes that while machines have mastered text and images, they still lack the grounding needed to understand space, motion and consequence. “We’ve built machines that can read and write,” she wrote, “but not ones that can see, move and live in the world.”

Why AI Needs Spatial Intelligence

Li’s central point is that AI cannot advance much further without spatial intelligence: the ability to perceive geometry, depth, movement and the relationships between objects. Large language models generate and organize information but operate entirely in abstraction, with no awareness of how things change in real space. Computer vision systems can identify objects but cannot reason about how those objects interact, move or evolve. Recent PYMNTS reporting noted that teaching machines to see like humans will separate the next generation of autonomous systems from static pattern recognizers.

Spatial intelligence allows machines to understand not only what is in a scene but how that scene will change when an object moves or when conditions shift. Without this capability, robots and smart systems remain limited to narrow tasks and struggle to adapt in unpredictable environments. Li and others believe that giving machines the ability to model and predict real-world dynamics will allow them to function more safely and effectively where conditions shift from moment to moment.

What Current AI Misses

Today’s AI can analyze and classify, but it lacks the ability to reason about action and consequence. Language models may know what a door is, but they cannot determine whether it is open or closed, or what will happen if it swings shut. Vision systems detect motion or objects but cannot infer intent or anticipate what happens next.

A PYMNTS article described the problem clearly: AI can label, sort and describe the world but cannot live in it. Systems trained on static images or text struggle when reality differs from their training data.

This gap limits how AI is used outside controlled environments. Factory robots still rely on preprogrammed layouts, autonomous vehicles struggle with rare or unexpected events, and many AI systems remain observers rather than active participants in the environments they are meant to serve. Li writes that until AI gains spatial awareness, its role will be restricted to narrow predictions rather than broad autonomous behavior.

Building World Models

To address these limitations, researchers are developing what are known as world models, systems designed to understand how the world behaves rather than just how it looks. These models integrate perception, simulation, spatial reasoning and prediction so that machines build an internal model of cause and effect. Instead of learning only from text and static images, they learn from environments, simulations and sensor inputs to understand how objects move, interact and change over time.
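The idea can be made concrete with a toy sketch. The snippet below is a minimal, hypothetical illustration of the perceive-predict-act loop described here, not an implementation of Genie 3, Cosmos or any other system named in this article: an agent consults an internal model of cause and effect to imagine the outcomes of its actions before committing to one. All names (ToyWorld, WorldModel, plan) are invented for illustration.

```python
# Minimal conceptual sketch of a "world model" loop: the agent keeps an
# internal model that predicts how the environment changes when it acts,
# and chooses actions by imagining their consequences before committing.
# Class and function names are illustrative, not from any real system.

import random


class ToyWorld:
    """A 1-D environment: the agent must reach position `goal`."""

    def __init__(self, goal=5):
        self.position = 0
        self.goal = goal

    def step(self, action):          # action is -1 or +1
        self.position += action
        return self.position


class WorldModel:
    """Internal model of cause and effect (hand-coded here)."""

    def predict(self, position, action):
        # In a real system this mapping would be learned from sensors
        # and simulation; here it simply mirrors the toy physics.
        return position + action

    def plan(self, position, goal, horizon=3):
        # Imagine short action sequences and pick the first action of
        # the sequence whose predicted end state lands closest to goal.
        best_action, best_dist = random.choice([-1, 1]), float("inf")
        for _ in range(50):
            seq = [random.choice([-1, 1]) for _ in range(horizon)]
            pos = position
            for a in seq:
                pos = self.predict(pos, a)      # mental simulation
            if abs(goal - pos) < best_dist:
                best_action, best_dist = seq[0], abs(goal - pos)
        return best_action


world, model = ToyWorld(), WorldModel()
state = world.position
for _ in range(10):
    action = model.plan(state, world.goal)   # decide by prediction
    state = world.step(action)               # act in the real world
print("final position:", state)
```

The point of the sketch is the division of labor: the environment is only touched after the model has rehearsed the consequences internally, which is the behavior world-model research aims to learn from rich sensory data rather than hand-code.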

Google DeepMind’s Genie 3 can generate 3D environments governed by physics, where AI agents learn by exploring virtual worlds rather than static datasets. Nvidia’s Cosmos platform follows a similar path, training robots in simulated environments that mirror real-world physics. These developments reflect a broader industry shift away from pattern recognition toward grounded understanding.

The World Economic Forum recently described this shift as the next major frontier for AI. In its 2025 report on spatial computing, wearables and robotics, the organization said spatial technologies are beginning to merge the digital and physical worlds, creating “a persistent layer of intelligence that understands context, movement and interaction.”

The Forum noted that the convergence of sensors, computer vision and real-time mapping will underpin the next decade of human-machine collaboration, where AI does not just process data but interprets the world as it unfolds. It highlighted how advances in spatial computing could enable more intuitive interfaces, smarter industrial robots and adaptive urban systems that respond dynamically to how people and machines move through space.

These insights echo Li’s view that true progress in AI depends on grounding intelligence in the physical world. Spatial computing gives AI the sensory inputs and situational awareness required to interact safely and effectively with its surroundings, turning static models into active participants in the environments they serve.

How Things Will Change

Li believes spatial intelligence will redefine what AI can do and how it behaves. With world models, machines will shift from passive analysis to active planning and adaptation. A warehouse robot could plan a path around shifting inventory rather than stopping when blocked. An autonomous car could anticipate pedestrian movement instead of waiting for explicit signals. Even digital assistants could one day interpret gestures, spatial context or shared visual frames.
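As a hedged illustration of the replanning behavior described above (not any vendor's actual navigation stack), the sketch below runs a plain breadth-first search over a toy warehouse grid and simply recomputes the route when the map changes; the grid layout and function names are hypothetical.

```python
# Hypothetical sketch of "plan around shifting inventory": a robot on a
# small grid replans its route whenever its map of the world changes,
# instead of halting when the original path is blocked.

from collections import deque


def plan_path(grid, start, goal):
    """Breadth-first search over free cells; returns a list of cells."""
    rows, cols = len(grid), len(grid[0])
    queue, came_from = deque([start]), {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # no route exists with the current map


grid = [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]]
print("initial plan:", plan_path(grid, (0, 0), (2, 2)))

grid[1][1] = 1  # a pallet is moved into the aisle
print("replanned route:", plan_path(grid, (0, 0), (2, 2)))
```

A spatially aware system would build and update that map from perception rather than receive it as a hard-coded array, but the replan-on-change loop is the same.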

The industrial implications are significant. A recent PYMNTS article on spatial computing and digital twins noted that enterprises are pairing AI with real-world mapping and sensor data to test decisions virtually before executing them in the real world. Digital replicas of real environments are becoming training grounds for spatially aware AI, enabling high-risk simulations in low-risk settings.
