OpenAI Confirms Reliable Tool for Detecting ChatGPT-Generated Text, Weighs Public Release
OpenAI has developed a tool that can detect content created by ChatGPT, addressing concerns about individuals claiming AI-generated work as their own.
This text watermarking feature could be transformative for educators and employers who suspect misuse of the chatbot. However, OpenAI is still deliberating whether to release it, likely because of the potential impact on its business.
The Wall Street Journal revealed that OpenAI has had this watermarking system ready for about a year but has debated internally whether to make it public. The system works by subtly altering the way ChatGPT selects words and phrases, creating a statistical pattern that human readers cannot perceive but that a companion detection tool can identify, in effect an invisible watermark.
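OpenAI has not disclosed how its method actually works; the report only says word choices are subtly altered. As a purely illustrative sketch, the snippet below shows how a generic "green list" statistical watermark from the academic literature (e.g., Kirchenbauer et al., 2023) can bias token selection and later be detected. All names and parameters here (VOCAB, GREEN_FRACTION, BIAS) are hypothetical and this is not OpenAI's implementation.

```python
# Illustrative sketch only; not OpenAI's method. A "green list" watermark:
# at each step, half the vocabulary is deterministically marked "green"
# based on the previous token, and green tokens get a small logit boost.
# A detector recomputes the green lists and checks whether green tokens
# appear more often than chance.
import hashlib
import math
import random

VOCAB = [f"tok{i}" for i in range(1000)]   # toy vocabulary (hypothetical)
GREEN_FRACTION = 0.5                       # share of vocab marked green per step
BIAS = 2.0                                 # logit boost applied to green tokens

def green_list(prev_token: str) -> set[str]:
    """Deterministically derive this step's green tokens from the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * GREEN_FRACTION)))

def sample_watermarked(prev_token: str, logits: dict[str, float]) -> str:
    """Boost green-token logits before sampling, subtly skewing word choice."""
    greens = green_list(prev_token)
    boosted = {t: l + (BIAS if t in greens else 0.0) for t, l in logits.items()}
    z = max(boosted.values())                      # softmax sampling
    weights = {t: math.exp(l - z) for t, l in boosted.items()}
    r, acc = random.random() * sum(weights.values()), 0.0
    for t, w in weights.items():
        acc += w
        if acc >= r:
            return t
    return t

def detect(tokens: list[str]) -> float:
    """z-score: observed green-token hits versus the rate expected by chance."""
    hits = sum(1 for prev, cur in zip(tokens, tokens[1:])
               if cur in green_list(prev))
    n = len(tokens) - 1
    expected = n * GREEN_FRACTION
    sd = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / sd

# Demo with random stand-in "model" logits: watermarked text scores far
# above chance, while ordinary text would score near zero.
random.seed(0)
text = ["tok0"]
for _ in range(200):
    logits = {t: random.gauss(0, 1) for t in VOCAB}
    text.append(sample_watermarked(text[-1], logits))
print(f"z-score of watermarked text: {detect(text):.1f}")  # far above ~2
```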
OpenAI updated a blog post following the WSJ report, confirming:
"Our teams have developed a text watermarking method that we continue to consider as we research alternatives."
The post notes that while watermarking is highly accurate and effective against localized tampering such as paraphrasing, it struggles with globalized tampering, such as running the text through a translation system or rewording it with another generative model. Users could even bypass the watermark by asking the model to insert a special character between every word and then deleting those characters.
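Reusing the hypothetical sketch above, the special-character bypass can be simulated: the watermark bias is keyed to the token sequence the model actually produced, markers included, so once the user strips the markers the remaining word-to-word transitions were never biased and the detection statistic collapses to chance.

```python
# Continuing the illustrative sketch above (same VOCAB, functions, etc.).
# The "model" is asked to emit a marker ("@") after every word, so the
# watermark is embedded relative to contexts containing the marker.
random.seed(1)
tricked = ["tok0"]
for _ in range(200):
    logits = {t: random.gauss(0, 1) for t in VOCAB}
    tricked.append(sample_watermarked(tricked[-1], logits))
    tricked.append("@")                      # marker inserted between words
stripped = [t for t in tricked if t != "@"]  # user deletes the markers
print(f"z-score after stripping: {detect(stripped):.1f}")  # near 0: chance level
```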
OpenAI also highlights that text watermarking could disproportionately impact certain groups, such as non-native English speakers, by stigmatizing the use of AI writing tools.
This watermarking method is one of several solutions OpenAI is exploring. The company has also investigated classifiers and metadata as part of "extensive research on the area of text provenance."
An OpenAI spokesperson told TechCrunch that while the WSJ report is accurate, the company is taking a "deliberate approach" due to "the complexities involved and its likely impact on the broader ecosystem beyond OpenAI."
Despite OpenAI's stated reasons for hesitating on the watermarking method, a significant factor is likely user preference: 30% of surveyed ChatGPT users indicated they would use the chatbot less frequently if watermarking were implemented.
In March 2023, a study by computer scientists at the University of Maryland concluded that LLM-generated text cannot be reliably detected in practical scenarios. OpenAI appeared to agree: in July 2023, it shut down its AI classifier, a tool designed to estimate the likelihood that a piece of text was AI-generated, citing its low accuracy.
News Source: TechSpot