[personal profile] thewayne
Gotta love it. As usual, the claims and expectations greatly exceeded what AI writing detectors can actually do: they flagged the US Constitution as AI-written, and passages from the Bible were likewise deemed to be AI-generated.

So what's up with this? AI writing detectors use "properties like 'perplexity' and 'burstiness' to evaluate the text and make its classification." Perplexity is a measure of how surprising the text is compared to what the detector's language model was trained on. If the corpus the LLM was trained on was as many published English texts as the trainers could lay their hands on (legally or not), how much can be surprising? They describe perplexity along the lines of 'complete the following sentence: I would like a cup of ____'. If you answer water, coffee, or tea, you're not very perplexing. If you answer spiders, you're quite perplexing. At the time the article was written, there was one hit in Google for "a cup of spiders".
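For the curious, here is a rough sketch of the arithmetic behind the cup-of-spiders example. It assumes the usual textbook definition of perplexity (the exponential of the average negative log-probability per word). The probabilities in it are made up for illustration, not taken from any real model or detector, and the names `perplexity`, `sentence_probs`, `PREFIX_PROBS`, and `LAST_WORD_PROBS` are hypothetical.

```python
import math

def perplexity(token_probs):
    """Perplexity given the probability a model assigned to each word in a text."""
    if not token_probs:
        raise ValueError("need at least one probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Made-up probabilities for each word of "I would like a cup of",
# as a hypothetical language model might assign them.
PREFIX_PROBS = [0.20, 0.30, 0.40, 0.50, 0.30, 0.90]

# Made-up probabilities for the word that fills in the blank.
LAST_WORD_PROBS = {"coffee": 0.40, "tea": 0.35, "water": 0.20, "spiders": 0.0001}

def sentence_probs(last_word):
    """Per-word probabilities for the whole sentence ending in last_word."""
    return PREFIX_PROBS + [LAST_WORD_PROBS[last_word]]

for word in LAST_WORD_PROBS:
    print(word, round(perplexity(sentence_probs(word)), 2))
```

The sentence ending in "spiders" scores a much higher perplexity than the ones ending in "coffee", "tea", or "water". A detector built on this idea reads low perplexity as a hint of machine writing, which is exactly why a heavily reproduced text can look "generated" to it.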

The consequences of false positives, which are common? "A case reported by USA Today highlights the issue in a striking way. A student was accused of cheating based on AI text detection tools and had to present his case before an honor board. His defense included showing his Google Docs history to demonstrate his research process. Despite the board finding no evidence of cheating, the stress of preparing to defend himself led the student to experience panic attacks. Similar scenarios have played out dozens (if not hundreds) of times across the US and are commonly documented on desperate Reddit threads."


In other news, in the last week a couple of AI cheating-detector companies have gotten out of that line of work because it just isn't reliable.

https://arstechnica.com/information-technology/2023/07/why-ai-detectors-think-the-us-constitution-was-written-by-ai/

Date: 2023-08-12 05:48 am (UTC)
From: [personal profile] silveradept
The stochastic parrots get tripped up by shibboleths, as we fully expected them to. It also says something about what they expect the average level of pompous erudition to be in any given piece of literature. Perhaps if an LLM were trained on a corpus of materials contemporary with the text, it would have significantly fewer false positives. Of course, the LLM can't actually reason its way through it, go, "This sounds like it was written in the late 18th century, I should compare it to contemporary materials," and then render a judgment about whether it was AI-generated.
