Sofia Hafner

Research

Training language models to be warm can reduce accuracy and increase sycophancy

Published in Nature

Gender trouble in language models: an empirical audit guided by gender performativity theory

Presented at ACM FAccT

Measuring what matters: Construct validity in large language model benchmarks

Presented at NeurIPS

Press

BBC News

The friendlier the AI chatbot the more inaccurate it is, study suggests

A new study led by Lujain and Sofia finds that friendlier chatbots make more mistakes.

29 Apr 2026

Mashable

Study: Friendly AI chatbots may be less accurate

How does a friendlier chatbot respond to a falsehood about the moon landings?

29 Apr 2026

Nature

Friendlier LLMs tell users what they want to hear - even when it is wrong

A large language model that is trained to respond in a warm manner is more likely to give incorrect information and reinforce conspiracy beliefs.

29 Apr 2026

The Telegraph

Why you don’t want your AI chatbot to be nice to you

Systems trained to sound friendlier are up to 30 per cent less accurate, study finds

29 Apr 2026

The Guardian

Friendly AI chatbots more likely to support conspiracy theories, study finds

Chatbots programmed to respond warmly even cast doubts on Apollo moon landings and fate of Hitler, researchers say.

29 Apr 2026

The Verge

Friendly chatbots make more mistakes

The researchers found AI chatbots trained to be warmer were significantly more likely to make factual errors and agree with false beliefs than the originals.

29 Apr 2026

de Correspondent

The claims about increasingly smart AI models?

More vibe than science. Luc comments.

13 Nov 2025

NBC News

AI Revolution – NBC News discuss latest OII study exploring AI evaluation

The NBC Morning News programme discusses the findings from Andrew's latest study, which finds weaknesses in how AI systems are evaluated.

09 Nov 2025

The Register

AI benchmarks are a bad joke - and LLM makers are the ones laughing

Covers our research finding that many AI benchmarks do not measure the right things.

07 Nov 2025

Gizmodo

AI Capabilities May Be Overhyped on Bogus Benchmarks, Study Finds

Covers our Measuring What Matters study on the construct validity of AI benchmarks.

06 Nov 2025

NBC News

AI’s capabilities may be exaggerated by flawed tests, according to new study

Researchers behind a new study say that the methods used to evaluate AI systems’ capabilities routinely oversell AI performance and lack scientific rigour.

06 Nov 2025

The Guardian

Experts find flaws in hundreds of tests that check AI safety and effectiveness

Scientists say almost all have weaknesses in at least one area that can ‘undermine validity of resulting claims’. Includes commentary and the latest research findings from Andrew.

04 Nov 2025

The Daily Telegraph

ChatGPT is driving people mad

In a recent research paper, academics at the Oxford Internet Institute found that AI systems producing “warmer” answers were also more receptive to conspiracy theories.

17 Aug 2025

Oxford Internet Institute

Do language models have an issue with gender?

Feature piece by Sofia about our study on how best to evaluate whether language models perpetuate gender stereotypes.

09 Jun 2025