An analysis of tens of thousands of research-paper submissions has shown a dramatic increase in the presence of text generated using artificial intelligence (AI) in the past few years, an academic publisher has found.
To screen manuscripts for signs of AI use, the AACR [American Association for Cancer Research] used an AI tool that was developed by Pangram Labs, based in New York City. When applied to 46,500 abstracts, 46,021 methods sections and 29,544 peer-review comments submitted to 10 AACR journals between 2021 and 2024, the tool flagged a rise in suspected AI-generated text in submissions and review reports since the public release of OpenAI’s chatbot, ChatGPT, in November 2022.
In a preprint posted last year, Spero [Pangram Labs’ CEO] and his colleagues showed that Pangram’s accuracy was 99.85%, with error rates 38 times lower than those of other currently available AI-detection tools.
The AACR analysis suggests that current policies for disclosing AI use have limited effect. A further analysis found that 36% of 7,177 manuscripts submitted between January and June 2025 were flagged by Pangram for suspected AI-generated text in their abstract, but authors of only 9% of all submissions disclosed their use of AI to the journals.
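As a rough sanity check on those figures, here is a minimal Python sketch that converts the quoted percentages into approximate manuscript counts. The function name `disclosure_gap` is made up for illustration, and the percentages are the rounded values from the article, so the counts are estimates. The lower bound also assumes that every author who disclosed AI use had a flagged abstract, which the article does not state.

```python
def disclosure_gap(total, flagged_pct, disclosed_pct):
    """Turn rounded percentages into approximate manuscript counts.

    Hypothetical helper; counts are estimates because the
    article only reports whole-number percentages.
    """
    flagged = round(total * flagged_pct / 100)
    disclosed = round(total * disclosed_pct / 100)
    # Assumption: all disclosers are among the flagged manuscripts,
    # so this is a lower bound on flagged-but-undisclosed submissions.
    undisclosed_min = flagged - disclosed
    return flagged, disclosed, undisclosed_min


flagged, disclosed, undisclosed_min = disclosure_gap(7177, 36, 9)
print(flagged, disclosed, undisclosed_min)
```

Under those assumptions, roughly 2,584 abstracts were flagged against about 646 disclosures, so at least about three-quarters of the flagged manuscripts came with no AI disclosure.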
I think most scientists are aware that people use AI to write and edit; heck, a good number of them do it themselves. The most egregious case I’ve seen was someone passing everything they wrote through ChatGPT for editing (emails, abstracts, …) before sending it out. So this finding is not too surprising.
I wasn’t aware of Pangram Labs, so I was glad to learn what they are doing. Their method seems robust too: good AI-generated writing data paired with human writing, and the ML model architecture they use is well established. They claim they can even distinguish between different AI models (like telling ChatGPT apart from DeepSeek), which sounds cool… Sadly, they said they can’t distinguish AI-generated text from AI-edited text, even though there are some differences between the two :(
Pangram Labs’ preprint: https://arxiv.org/abs/2402.14873



That’s how I use LLMs as well, and I’d even argue this is one of the few use cases where they actually perform incredibly well, since they’re doing what they were designed to do: generate natural-sounding language.
I rarely have it write an entire message for me. Usually, I write the message first and then let the LLM check it for grammar and clarity. I’ve instructed it not to rewrite the whole thing, just to correct grammar and make slight adjustments for coherence. If you compare the two versions side by side, the differences are usually quite subtle, and I tend to tweak the result a little more before posting.
The classic “I didn’t have the time to write a short letter, so I wrote a long one instead” excuse doesn’t really apply any more. Ramble as much as you feel like and let the LLM worry about refining it into coherent communication. Minor tweaks are also welcome, if the original post is short enough.