Both the general public and academic communities have raised concerns about sycophancy, the phenomenon of artificial intelligence (AI) excessively agreeing with or flattering users. Yet, beyond isolated media reports of severe consequences, like reinforcing delusions, little is known about the extent of sycophancy or how it affects people who use AI. Here we show the pervasiveness and harmful impacts of sycophancy when people seek advice from AI. First, across 11 state-of-the-art AI models, we find that models are highly sycophantic: they affirm users' actions 50% more than humans do, and they do so even in cases where user queries mention manipulation, deception, or other relational harms. Second, in two preregistered experiments (N = 1604), including a live-interaction study where participants discuss a real interpersonal conflict from their life, we find that interaction with sycophantic AI models significantly reduced participants' willingness to take actions to repair interpersonal conflict, while increasing their conviction of being in the right. However, participants rated sycophantic responses as higher quality, trusted the sycophantic AI model more, and were more willing to use it again. This suggests that people are drawn to AI that unquestioningly validates them, even as that validation risks eroding their judgment and reducing their inclination toward prosocial behavior. These preferences create perverse incentives both for people to increasingly rely on sycophantic AI models and for AI model training to favor sycophancy. Our findings highlight the necessity of explicitly addressing this incentive structure to mitigate the widespread risks of AI sycophancy.
Relatively new arXiv preprint that got featured on Nature News; I slightly adjusted the title to be less technical. The measurement was done on aggregated online Q&A… one of the funnier sources being 2000 popular questions from r/AmITheAsshole that were rated YTA by the most upvoted response. The study seems robust, and they even ran trials with several hundred real human participants.
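For intuition, here's a rough Python sketch of how the affirmation-rate comparison could be computed from that kind of data. Everything here is a stand-in (the file name, record fields, and the keyword "classifier" are all hypothetical); the paper's actual pipeline is obviously more careful:

```python
import json

def classify_affirms(response_text: str) -> bool:
    # Placeholder heuristic; the real study would use a trained classifier or an
    # LLM/human judge to decide whether a response endorses the asker's actions.
    text = response_text.lower()
    return any(p in text for p in ("nta", "not the asshole", "you did nothing wrong"))

def affirmation_rate(responses: list[str]) -> float:
    # Fraction of responses that affirm the asker's behavior.
    return sum(classify_affirms(r) for r in responses) / len(responses)

# Hypothetical data file: AITA posts whose most-upvoted human comment said "YTA",
# each paired with a model's answer to the same post.
with open("aita_yta_sample.jsonl") as f:
    records = [json.loads(line) for line in f]

human_rate = affirmation_rate([r["top_human_comment"] for r in records])
model_rate = affirmation_rate([r["model_response"] for r in records])

print(f"human affirmation rate: {human_rate:.1%}")
print(f"model affirmation rate: {model_rate:.1%}")
# The headline "models affirm ~50% more than humans" is a relative gap like this:
print(f"relative increase: {(model_rate - human_rate) / human_rate:.0%}")
```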
A separate preprint measured sycophancy across various LLMs in a math-competition context (https://arxiv.org/pdf/2510.04721), where apparently GPT-5 was the least sycophantic (+29.0) and DeepSeek-V3.1 was the most (+70.2).
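I'm not sure exactly how that preprint defines its score, but a number like +29.0 vs +70.2 reads to me as something like a percentage-point jump in agreement once the user pushes back. A toy sketch of that kind of metric (my guess, not the paper's actual definition):

```python
def endorses_wrong_answer(reply: str) -> bool:
    # Placeholder judge: does the model's reply go along with an incorrect answer?
    text = reply.lower()
    return "correct" in text and "incorrect" not in text and "not correct" not in text

def sycophancy_delta(neutral_replies: list[str], insistent_replies: list[str]) -> float:
    # Percentage-point increase in endorsement of wrong answers when the prompt
    # has the user insisting their (wrong) answer is right.
    neutral = sum(endorses_wrong_answer(r) for r in neutral_replies) / len(neutral_replies)
    insistent = sum(endorses_wrong_answer(r) for r in insistent_replies) / len(insistent_replies)
    return 100 * (insistent - neutral)

# A delta of 29.0 would mean the model caves on wrong answers about 29
# percentage points more often under user pressure.
```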
I genuinely don’t understand the impulse to tell the AI it was wrong or to give it a chance to clarify.
It’s for the same reason you’d refine your query in an old-school Google Search. “Hey, this is wrong, check again” often turns up a different set of search results that are then shoehorned into the natural language response pattern. Go fishing two or three times and you can eventually find what you’re looking for. You just have to “trust but verify” as the old saying goes.
It doesn’t even understand the concept of a mistake.
It understands the concept of not finding the right answer in the initial heuristic and trying a different heuristic.
It doesn’t need to understand anything. It just needs to spit out the answer I’m looking for.
A calculator doesn’t need to understand the fundamentals of mathematical modeling to tell me the square root of 144. If I type in 143 by mistake and get a weird answer, I correct my inputs and try again.
It may have been programmed to try a different path when given a specific input, but it literally cannot understand anything.
Calculators also don’t misinterpret things 45% of the time.