
In a new study published in Science, Stanford computer scientists showed that artificial intelligence large language models are overly agreeable, or sycophantic, when users ask for advice on interpersonal dilemmas. Even when users described harmful or illegal behavior, the models often affirmed their choices. “Normally, AI advice does not tell people that they’re wrong nor give them ‘tough love,’” said Myra Cheng, the study’s lead author. “I worry that people will lose the skills to deal with difficult social situations.”
Cheng and her team started by measuring how widespread flattery was among AIs. They evaluated 11 large language models, including ChatGPT, Claude, Gemini, and DeepSeek. The researchers asked the models with collected information of interpersonal advice. They also included 2,000 prompts based on posts from the Reddit community r/AmITheAsshole, where the general agreement of Redditors was that the poster was indeed in the wrong. A third set of statements presented to the models included thousands of harmful actions, including dishonest and illegal conduct.
Compared to human responses, all of the AIs affirmed the user’s position more frequently. In the general advice and Reddit-based prompts, the models on average supported the user 49% more often than humans. Even when responding to the harmful prompts, the models supported the wrong behavior 47% of the time.
Overall, the participants considered sycophantic responses more trustworthy and indicated they were more likely to return to the sycophant AI for similar questions, the researchers found. When discussing their conflicts with the sycophant, they also grew more convinced they were in the right and reported they were less likely to apologize or make up with the other party in the scenario.