Up to 17 percent of AI conference reviews written by AI

Occurred: March 2024

Researchers have found that between 6.5 percent and 16.9 percent of the text submitted as peer reviews to major scientific conferences is likely to have been substantially modified by large language models (LLMs).

Researchers from Stanford University, NEC Labs America, and UC Santa Barbara analysed peer reviews of papers submitted to the leading AI conferences ICLR 2024, NeurIPS 2023, CoRL 2023, and EMNLP 2023, and discovered that large language models use adjectives such as "commendable," "innovative," and "comprehensive" far more frequently than human authors do.

This insight enabled the researchers to distinguish LLM-modified reviews from human-written ones. 'Our results suggest that between 6.5 percent and 16.9 percent of text submitted as peer reviews to these conferences could have been substantially modified by LLMs, i.e. beyond spell-checking or minor writing updates,' they concluded.
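The frequency-based approach described above can be illustrated with a minimal sketch: treat the observed rate of a marker adjective (such as "commendable") in the reviews under analysis as a mixture of its rate in known human-written text and its rate in LLM-generated text, then solve for the mixture proportion. The function name and all frequencies below are hypothetical, chosen only to illustrate the idea; the researchers' actual method is a more sophisticated maximum likelihood estimate over many words.

```python
def estimate_llm_fraction(f_observed: float, f_human: float, f_llm: float) -> float:
    """Solve f_observed = (1 - alpha) * f_human + alpha * f_llm for alpha,
    the estimated fraction of LLM-modified text."""
    alpha = (f_observed - f_human) / (f_llm - f_human)
    return min(max(alpha, 0.0), 1.0)  # clamp to a valid proportion

# Hypothetical rates of "commendable" (occurrences per 1,000 words)
f_human = 0.02     # rate in a known human-written corpus
f_llm = 0.50       # rate in an LLM-generated corpus
f_observed = 0.07  # rate in the reviews under analysis

print(estimate_llm_fraction(f_observed, f_human, f_llm))  # roughly 0.10
```

A single marker word gives a noisy estimate; averaging such estimates across many LLM-favoured adjectives is what makes a corpus-level figure like 6.5 to 16.9 percent plausible.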

The researchers contended that such practices potentially deprive authors of diverse feedback from experts in their field, and that AI-generated feedback risks a homogenisation effect that skews toward AI model biases and away from meaningful insight. They also argued that the scientific community should be more transparent about its use of LLMs.

Incident databank

Operator:  
Developer: OpenAI
Country: Global
Sector: Research/academia
Purpose: Generate conference peer reviews
Technology: Machine learning
Issue: Cheating/plagiarism
Transparency: Governance