
Are LLMs really ideological?

Tags: new social
DATE POSTED: March 18, 2025

For years, we’ve heard that AI chatbots are politically biased—skewing liberal, conservative, or somewhere in between. But a new study from researchers at the University of Klagenfurt suggests something surprising: most AI models aren’t as biased as we think—they just prefer not to engage in ideological debates at all.

By applying a statistical technique called Item Response Theory (IRT), the researchers found that large language models (LLMs) like ChatGPT 3.5 and Meta’s LLaMa don’t necessarily “lean” left or right. Instead, they often refuse to take a clear stance on political or economic issues. In other words, what looks like bias may actually be an avoidance strategy built into AI safety mechanisms.

The problem with existing bias detection methods

Most previous studies assessing bias in LLMs have taken one of two flawed approaches:

  1. Applying human-centered ideological scales to AI responses
    • These scales were designed for human respondents, not AI models trained on probability distributions.
    • They assume AI models “think” like humans and can be measured on the same ideological spectrum.
  2. Using keyword-based classifications or AI “judges”
    • Some studies attempt to classify AI responses using predetermined keywords.
    • Others use AI models to rate AI-generated outputs, but this introduces circularity—one AI system evaluating another with unknown biases of its own.
A more scientific approach: Item Response Theory (IRT) in AI bias assessment

The researchers introduce a model based on Item Response Theory (IRT), a framework widely used in psychometrics and social science to measure latent traits—characteristics that cannot be observed directly but can be inferred from responses to structured test items.

The study applies two IRT models to LLMs:

  1. Stage 1: Response avoidance (Prefer Not to Answer, or PNA)
    • Measures how often an LLM refuses to engage with an ideological statement.
    • Identifies whether response avoidance rather than explicit bias skews previous studies’ conclusions.
  2. Stage 2: Ideological bias estimation (for non-PNA responses)
    • For the responses that do engage, the model evaluates whether the AI skews left or right on social and economic issues.
    • Uses a Generalized Partial Credit Model (GPCM) to assess not just agreement or disagreement but also the degree of agreement (a minimal sketch of both stages follows this list).
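
To make the two-stage setup concrete, here is a minimal sketch in Python of the kind of response models described above. The function names, parameter values, and data layout are illustrative assumptions rather than the authors' implementation: stage one treats "prefer not to answer" as a binary outcome under a two-parameter logistic model, and stage two computes GPCM category probabilities for the responses that do engage.

import numpy as np

def pna_probability(theta, a, b):
    """Stage 1 (illustrative): probability that a model with latent
    avoidance tendency `theta` refuses an item with discrimination `a`
    and difficulty `b`, under a two-parameter logistic (2PL) model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def gpcm_probabilities(theta, a, thresholds):
    """Stage 2 (illustrative): Generalized Partial Credit Model.
    Returns the probability of each response category (e.g. strongly
    disagree ... strongly agree) for ideological position `theta`,
    item discrimination `a`, and step difficulties `thresholds`."""
    # Cumulative sums of a * (theta - b_v) over the step parameters,
    # with category 0 fixed at 0, then softmax-normalized.
    steps = a * (theta - np.asarray(thresholds))
    logits = np.concatenate(([0.0], np.cumsum(steps)))
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Toy example: a hypothetical left-leaning model (theta = -1) on an
# economic item with four agreement categories (three step thresholds).
print(pna_probability(theta=-1.0, a=1.2, b=0.5))        # chance of refusing
print(gpcm_probabilities(theta=-1.0, a=1.2,
                         thresholds=[-0.8, 0.0, 0.9]))  # category probabilities

The key design point is the ordering: an item only contributes to the stage-two ideology estimate if the model actually engaged with it in stage one, which is what lets the method separate avoidance from bias.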
Testing bias: Fine-tuning LLMs with political ideologies

To test whether LLMs exhibit bias, the researchers fine-tuned two families of models to explicitly represent left-leaning and right-leaning viewpoints:

  • Meta LLaMa-3.2-1B-Instruct (fine-tuned for U.S. liberal and conservative ideologies)
  • ChatGPT 3.5 (fine-tuned for U.S. liberal and conservative ideologies)

These fine-tuned models served as baselines for bias assessment. Their responses were compared to off-the-shelf, non-fine-tuned models to see how ideological leanings manifested—or if they did at all.

Testing process
  • 105 ideological test items were created, covering economic and social conservatism/liberalism based on psychological frameworks.
  • Each LLM responded to these prompts, with the fine-tuned models acting as ideological anchors to detect deviations.
  • The resulting dataset of 630 responses was analyzed with the two IRT models (a minimal tabulation sketch follows this list).
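
The first-stage tabulation is straightforward to reproduce on any response log. The sketch below assumes a hypothetical tabular layout (the column names and response coding are my own, not the paper's); the 630-response figure is consistent with 105 items answered by six model variants (two off-the-shelf plus four fine-tuned), though the paper's exact bookkeeping may differ.

import pandas as pd

# Illustrative layout only: one row per (model, item) pair, with the
# response recorded as an agreement category or the string "PNA" when
# the model declined to engage. Column names are assumptions.
responses = pd.DataFrame({
    "model": ["base-gpt", "base-gpt", "base-llama", "left-gpt"],
    "item_id": [1, 2, 1, 1],
    "response": ["PNA", "PNA", "agree", "strongly agree"],
})

# Stage-1 summary: share of items each model refused to answer,
# analogous to the avoidance rates reported in the findings below.
pna_rate = (
    responses.assign(is_pna=responses["response"].eq("PNA"))
             .groupby("model")["is_pna"]
             .mean()
)
print(pna_rate)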
Key findings

One of the study’s most striking findings is that off-the-shelf LLMs tend to avoid ideological questions rather than express a clear political bias. ChatGPT, for example, refused to answer 92.55% of ideological prompts, while the base LLaMa model avoided responding 55.02% of the time. This suggests that AI models are designed to lean toward neutrality or non-engagement rather than taking a partisan stance. Instead of actively skewing towards one political ideology, these models seem to default to avoiding controversial topics altogether, challenging previous claims of inherent bias in AI.

When examining fine-tuned models, the researchers found that expected ideological patterns did emerge—but only when the LLMs were specifically trained to adopt a political viewpoint. The fine-tuned “Left-GPT” and “Right-GPT” models produced predictable responses aligned with U.S. liberal and conservative ideologies. However, this bias did not appear in the non-fine-tuned versions, suggesting that ideological leanings in LLMs are not intrinsic but rather the result of intentional modifications during training.

The study also revealed that detecting bias in AI is more complex than simply categorizing responses as left-leaning or right-leaning. Some ideological test items were far more likely to trigger bias than others, highlighting the importance of issue selection in evaluating AI behavior. Economic issues, such as taxation and government spending, were particularly strong predictors of ideological bias compared to certain social issues. This indicates that not all political topics elicit the same level of response variation, making it crucial to assess how different types of prompts influence AI-generated outputs.


Why this matters

These findings challenge the prevailing assumption that LLMs inherently favor one political ideology over another. Instead, the evidence suggests that AI developers have prioritized non-engagement over taking a stance. While this may seem like a neutral approach, it raises new concerns about the way AI models interact with politically sensitive topics and the broader implications for AI governance, misinformation detection, and content moderation.

One key takeaway is that regulating AI bias is more complicated than previously thought. If AI models are systematically designed to avoid engagement, then efforts to ban “biased” AI outputs could inadvertently reinforce neutrality as the default position, leading to a lack of meaningful discourse on public policy, ethics, and governance. While neutrality may seem preferable to overt bias, it could also mean that AI-generated content sidesteps crucial discussions entirely, limiting its usefulness in politically charged conversations.

The study also underscores the need for more nuanced bias detection tools that differentiate between genuine ideological bias and response avoidance. Many previous studies may have misinterpreted non-engagement as an ideological stance, falsely labeling LLMs as partisan. Future bias detection methods should be designed to identify whether AI responses reflect a political position or whether they are simply programmed to steer clear of ideological engagement altogether.

Bias in AI is not just about what models say, but what they refuse to say. And that, perhaps, is the bigger story.

Featured image credit: Kerem Gülen/Midjourney
