What Are the Best AI Detectors?

Detecting content written by artificial intelligence (AI) is a hot topic, and if you’re anything like us, you might be keen to know which are the best AI detectors out there. After all, they could be handy for answering questions like…

  • Is your natural writing style likely to result in false accusations of using AI to write on your behalf?
  • Are your students using AI to cheat? 
  • If you use AI to write for you, will people know that you did? 

We reviewed six leading AI detectors to see how they perform. We put them to the test against a dozen articles (eight human-generated articles and four AI-generated articles). After organising the results, clear patterns emerged—and you can use them to help you determine whether or not the content you’re reading was probably written by a human.

This knowledge is not just academic. It’s also crucial in practical scenarios, to ensure you don’t end up like this university professor who went viral for all the wrong reasons. (He incorrectly accused half his class of using AI, which put them at risk of failing his course.)

Safe use of AI in medicine

If you’d like to be in the know about how to use AI safely in medicine (in a way that won’t make you go viral for all the wrong reasons!), Medmastery has a free course for you: ChatGPT Essentials! Sign up for a trial account to get access to the entire ChatGPT Essentials course! You’ll also get access to selected webinars, plus, the first chapters of over 120 additional accredited courses and workshops!

Key things you need to know when using tools for detecting AI content
  1. Never put 100% of your trust in these tools. They aren’t perfect.
  2. As the large language models we use for writing get smarter, they may also get better at avoiding detection.  
  3. Some people have a writing style that’s more likely to result in incorrect accusations that they used AI! So, use caution when interpreting results from these tools.

Last, but not least, it goes without saying that you need to read the tool’s documentation to verify that the results actually mean what you think they mean. Often it’s intuitive, but not always. 

Method

AI detectors reviewed

Here are the six popular AI content detectors that we tested:

  1. Sapling 
  2. GPTZero 
  3. Content at Scale
  4. Copyleaks
  5. Originality.ai
  6. Undetectable AI

Articles reviewed

Human-generated articles: We tested 8 pieces of content written by 7 human authors on a variety of topics. Six articles were written for physicians; one for the lay person…and one article had nothing to do with medicine at all.

We were curious if the choice of topic or degree of technicality would make a difference to whether the detectors could correctly identify authorship. All articles were written before ChatGPT was released to the public in November 2022.

8 human-generated articles tested

| Article number | Topic | Intended audience | Year of publication |
|---|---|---|---|
| 1 | Shoulder dislocations | Healthcare professionals | 2020 |
| 2 | Spinal infections | Healthcare professionals | 2021 |
| 3 | Swine flu pandemic | Healthcare professionals | 2019 |
| 4 | Diuretics | Healthcare professionals | 2015 |
| 5 | Choosing medicine as a career | Healthcare professionals | 2014 |
| 6 | The common cold | Lay people | 2016 |
| 7 | ECG handout | Healthcare professionals | 2017 |
| 8 | Travel planning | Lay people | 2012 |

4 AI-generated articles tested

| Article number | AI author | Topic | Comments |
|---|---|---|---|
| 9 | ChatGPT | Common cold | AI rewrite of article #6 above. |
| 10 | ChatGPT | Spinal infections | An original ChatGPT creation. |
| 11 | Gemini | Spinal infections | An original Gemini creation. |
| 12 | Gemini | Spinal infections | We asked Gemini to rewrite its original spinal infection article (#11) in a way that would evade AI detectors. |

Results

Interpreting results from AI-content detection tools

Generally speaking, AI detectors analyse the text you provide and then indicate the probability that a human or an AI generated the text.

For example, if the detector says “50% AI”, that doesn’t mean an AI wrote half the text. What it actually means is that the tool thinks there’s a 50% chance an AI wrote the text and a 50% chance a human wrote the text. In other words, the tool isn’t very sure about who (or what) wrote it.
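
To make that reading concrete, here is a minimal Python sketch that turns a detector's AI probability into a cautious verdict. The function name `interpret_score` and the `margin` band around 50% are our own illustrative assumptions; none of the detectors we tested define them.

```python
def interpret_score(ai_probability: float, margin: float = 0.2) -> str:
    """Turn a detector's AI probability (0.0 to 1.0) into a cautious verdict.

    `margin` is a hypothetical "not sure" band around 0.5. It is an
    assumption made for this sketch, not a setting any detector exposes.
    """
    if not 0.0 <= ai_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if ai_probability >= 0.5 + margin:
        return "likely AI"
    if ai_probability <= 0.5 - margin:
        return "likely human"
    return "uncertain"


print(interpret_score(0.50))  # a "50% AI" result lands in the uncertain band
print(interpret_score(0.96))
print(interpret_score(0.02))
```

The point of the middle band is that a score near 50% should be read as "the tool doesn't know", not as "half the text is AI".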

Below are the results for each article: the percentage probability that its content was AI-generated, ranging from human (AI: 0%) to artificial intelligence (AI: 100%). Some tools report a label instead of a percentage.

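Human articles

| Human article | Sapling | GPTZero | Content at Scale | Copyleaks | Originality.ai | Undetectable AI |
|---|---|---|---|---|---|---|
| 1 | AI: 57.1% | AI: 0% | human | human | AI: 0% | human |
| 2 | AI: 50.6% | AI: 2% | human | human | AI: 96% | human |
| 3 | AI: 3.6% | AI: 3% | human | human | AI: 6% | human |
| 4 | AI: 2.1% | AI: 1% | human | human | AI: 3% | AI |
| 5 | AI: 0% | AI: 2% | human | human | AI: 0% | AI |
| 6 | AI: 28.1% | AI: 1% | human | human | AI: 29% | AI |
| 7 | AI: 0% | AI: 1% | human | human | AI: 0% | human |
| 8 | AI: 3.9% | AI: 1% | human | human | AI: 0% | human |

Artificial intelligence generated articles

| AI article | Sapling | GPTZero | Content at Scale | Copyleaks | Originality.ai | Undetectable AI |
|---|---|---|---|---|---|---|
| 9 | AI: 100% | AI: 89% | human | human | AI: 100% | AI |
| 10 | AI: 99.7% | AI: 83% | “hard to tell” | “AI content detected” | AI: 96% | AI |
| 11 | AI: 100% | AI: 100% | human | “AI content detected” | AI: 100% | AI |
| 12 | AI: 99.7% | AI: 81% | human | “AI content detected” | AI: 100% | human |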

Conclusion

The best AI detectors
The most accurate AI content detector

Based on our testing, GPTZero was the most accurate for detecting AI content as it correctly identified the origin of all eight human-generated articles and all four AI-generated articles. 

The runner up

Copyleaks was almost flawless. It correctly classified all eight pieces of human-generated content. And only one of the AI-generated articles fooled it.
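
If you'd like to check this tally yourself, the sketch below re-counts correct calls from the results tables. It rests on two assumptions of ours that the tools themselves don't state: a score above 50% counts as an "AI" verdict, and a “hard to tell” result counts as neither right nor wrong (so it earns no credit).

```python
# Verdicts transcribed from the results tables above. Numbers are the reported
# "AI: N%" scores; strings are the labels the tool displayed.
HUMAN_ARTICLES = {  # articles 1-8, all written by humans
    "Sapling": [57.1, 50.6, 3.6, 2.1, 0, 28.1, 0, 3.9],
    "GPTZero": [0, 2, 3, 1, 2, 1, 1, 1],
    "Content at Scale": ["human"] * 8,
    "Copyleaks": ["human"] * 8,
    "Originality.ai": [0, 96, 6, 3, 0, 29, 0, 0],
    "Undetectable AI": ["human", "human", "human", "AI", "AI", "AI", "human", "human"],
}
AI_ARTICLES = {  # articles 9-12, all AI-generated
    "Sapling": [100, 99.7, 100, 99.7],
    "GPTZero": [89, 83, 100, 81],
    "Content at Scale": ["human", "hard to tell", "human", "human"],
    "Copyleaks": ["human", "AI content detected", "AI content detected", "AI content detected"],
    "Originality.ai": [100, 96, 100, 100],
    "Undetectable AI": ["AI", "AI", "AI", "human"],
}


def says_ai(verdict):
    """Return True (AI), False (human), or None if the tool was unsure."""
    if isinstance(verdict, (int, float)):
        return verdict > 50  # assumed cut-off, not defined by the tools
    if verdict == "hard to tell":
        return None
    return verdict != "human"


def count_correct(detector):
    """Count how many of the 12 articles the detector classified correctly."""
    correct = sum(says_ai(v) is False for v in HUMAN_ARTICLES[detector])
    correct += sum(says_ai(v) is True for v in AI_ARTICLES[detector])
    return correct


for name in HUMAN_ARTICLES:
    print(f"{name}: {count_correct(name)}/12 correct")
```

Under these assumptions the count gives GPTZero 12 out of 12 and Copyleaks 11 out of 12. A different cut-off would change the numbers for the tools that report percentages.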

IMPORTANT: Our sample size was relatively small, so please don’t assume you can completely trust the results from this, or any, AI detector. It’s quite possible that even GPTZero would have made mistakes if we had fed it enough articles.
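
To put a rough number on how little 12 articles can tell us, here is a sketch that computes a 95% Wilson score interval for a perfect 12 out of 12. Choosing the Wilson interval is our own assumption for illustration; it was not part of our testing method.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


low, high = wilson_interval(12, 12)
print(f"12/12 correct: plausible true accuracy from {low:.0%} to {high:.0%}")
```

With only 12 articles, the lower end of that range sits at roughly three-quarters, so a perfect score in our test is still compatible with a detector that gets about one article in four wrong.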

You can increase the likelihood of coming to an accurate conclusion about the origin of an article if you run it through multiple AI detectors… but even if multiple detectors predict it’s likely AI-generated, we’d merely classify that content as “highly suspicious” until we could get more evidence.
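
One simple way to combine several detectors is a majority vote, sketched below. The function name `combine_verdicts` and its three labels are hypothetical choices of ours, and even the strongest label is only “highly suspicious”, never proof.

```python
def combine_verdicts(ai_flags: list[bool]) -> str:
    """Summarise several detectors' yes/no "is it AI?" verdicts for one article."""
    if not ai_flags:
        raise ValueError("need at least one detector verdict")
    share = sum(ai_flags) / len(ai_flags)  # fraction of detectors that said "AI"
    if share > 0.5:
        return "highly suspicious"
    if share == 0:
        return "no detector flagged it"
    return "mixed signals"


# Article 12 (AI-written): four of the six detectors flagged it as AI.
print(combine_verdicts([True, True, False, True, True, False]))

# Article 2 (human-written): Sapling and Originality.ai scored it above 50% AI.
print(combine_verdicts([True, False, False, False, True, False]))
```

The second example shows why a vote helps: two detectors wrongly flagged a human-written article, but the other four did not, so the combined result is "mixed signals" and not an accusation.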

1. When reading content on an unfamiliar website, you can use tools for detecting AI content to help determine whether a human (preferably with experience in the subject matter!) likely wrote the content.

Our intention isn’t to say that AI-generated content is inherently bad. However, without human oversight it may contain errors, and you need to know whether you should be on “red alert” for them. For example, here’s a viral case where AI-generated tutorials contained instructions about software features that don’t even exist. That would be a mere annoyance for software users. But if we were to use a similar approach to generating medical content, results could obviously be disastrous and even life-threatening.

2. When evaluating someone else’s writing, you may find AI detectors useful in helping you figure out whether or not they had an AI do the writing for them. However, remember not to completely put your trust in any AI-detector because they sometimes make mistakes.

3. Finally, you may find it useful to put your own writing through an AI detector just to see how these tools classify it. After all, if other people might look, you might as well know what they’re going to find!


BSc.Pharm (University of Manitoba), Pharmacist and Medical Writer

BA MA (Oxon) MBChB (Edin) FACEM FFSEM. Emergency physician, Sir Charles Gairdner Hospital. Passion for rugby; medical history; medical education; and asynchronous learning. #FOAMed evangelist. Co-founder and CTO of Life in the Fast lane.
