Unreliable Narrators: Hallucinations May Be Causing Your Generative AI Tools To Lie to You
by Alex Applegate on Jul 25, 2023 1:17:33 PM
Artificial intelligence is enjoying a bit of what appears to be a golden era recently, but it’s not always a good thing, and sometimes, it willfully lies to you.
Web-based artificial intelligence tools, particularly Large Language Models (LLMs) such as ChatGPT, have experienced explosive growth in the last year. ChatGPT has grown to over one billion users in the last few months, Stable Diffusion has become a household word in some circles, and hundreds (if not thousands) of tools have emerged that can create a digital picture from a description, create entire songs in minutes, or maybe even just do your homework for you. Over the last 30 days alone, DNSFilter has observed a 55% increase in traffic to domains related to “OpenAI” and a 32% increase in traffic to domains related to “ChatGPT”.
Understandably, there has been an equivalent panic. Not only can you make digital art or music with these tools, you can train a model specifically to impersonate the work of another artist. Even if someone isn’t stealing someone else’s intellectual property, is a song, image, or prose generated by a person using an AI tool actually art? If students can ask “the interwebs” to write their essays, how can a teacher evaluate if the material was understood? It’s a short jump to arrive at concerns over deepfakes and their threat to trusting any form of media in the world around us. I even gave an interview last month about the urgency of preventing the threats of a rogue AI.
The Threat of AI
Not to be ignored is the very real threat that malicious actors can leverage these same AI tools and concepts to analyze, attack, and deceive our networks and web resources. The same tools can even be used adversarially against our cybersecurity defenses, whether human, automated, or artificial intelligence-enhanced.
Each of those concerns is very real, and each is the subject of intense debate right now. Some venues are banning any AI-generated art, and unions are taking fierce stands against the use of AI to create movies, music, voices, and images in commercial spaces. There is a thriving market of AI detection engines that use AI to check whether essays were written by an AI. Politicians, policy makers, and heads of industry are conducting urgent meetings to figure out how best to legislate the technology, or at least how best to institute guard rails and ethics guidelines to stay ahead of an emerging industry that’s taken on a life of its own. Everywhere you look, artificial intelligence is both the benevolent savior destined to reshape the world and simultaneously a malevolent and unstoppable force poised to destroy everything it touches, and it’s readily available in every corner of the Internet.
And the concept is truly magical. The model is trained on some massive body of information, say, maybe, huge chunks of the Internet, and it learns everything about everything. If you ask it a question, it’ll give you an answer—a really believable answer, that sounds more and more like a human wrote it every day. And if you don’t like the initial version then you can just keep tuning it and adding parameters until it understands your requirements and it’ll produce something magical, probably better than you could as a human, in next to no time. Life will never be the same ever again.
Except...well, sometimes it misses the bar. Even early on, word began to spread that huge, fully realized studies could be performed entirely by the model, unsupervised. Then someone with more than a surface-level understanding of the topic would actually read the result, and much of the time it sounded much less impressive. Don’t get me wrong, that’s still unbelievably impressive, but maybe it’s not the panacea we thought it was, at least not yet. But it does get worse.
For example, stories have begun to emerge about the models sometimes making typographical errors. How can a machine learning model that learned how to use and spell a word by examining a large number of examples still misspell it? Realistically, it probably shouldn’t, but the inner workings of such a program are extremely complex, and sometimes the decision-making logic follows the wrong branch on the decision tree. Even more strangely, if part of your instructions is to watch out for mistakes, most of the time it will probably catch itself, assuming it doesn’t somehow hit that same snag in logic again. The New Yorker posted a couple of articles earlier this year discussing their analysis of the technical aspects of how this would be possible.
The industry seems to have settled on calling these errors in judgment “hallucinations,” although that may not be much more comforting as a euphemistic anthropomorphism than “glitch.” There are a number of things that could cause such a hallucination. For example, the machinery to run these massive datasets can be very expensive, and companies may choose to cut corners where they can. Or maybe they not only use cheaper machines, but also smaller data sets. But there are also very real efforts to leverage these tools to work smarter, and people are depending on them far too much.
There are too many instances to link, but a search on either side of the education divide will turn up a deluge of complaints from teachers that all of their students are trying to use ChatGPT to coast in their classes. It will turn up just as much of a flood from students complaining that their teachers fed their hard work through an AI detector program and threatened them with failure because the detector scored the work, with certainty, as written by an AI.
In New York, a lawyer is facing sanctions because his court filings cited six cases that were entirely fabricated by ChatGPT and he failed to verify them. A researcher for the Wall Street Journal reported a case where generative AI was asked for a definition of something that didn’t exist, so it made one up and even provided references. The New York Times recently published a piece walking through dozens of prompts that produced inaccurate responses, an effective demonstration of how common they are.
And there are more frightening and intentionally malicious risks as well. There is even research into adversarial uses of generative AI hallucinations to intentionally deliver malware when you stumble across something that doesn’t actually exist. There is also research into adversarial inputs designed to force the AI into a misclassification based on its training model and data.
Generally speaking, however, the good news is that if you are moderately insistent on checking the output and doing basic fact-checking against the results, an AI hallucination can be relatively easy to detect. Misspellings and links that don’t exist can be mitigated with a little bit of effort. The bigger issue presents itself when a user is legitimately depending upon the AI to produce results in a domain where they don’t have enough knowledge to verify or discredit the results, or in scenarios where a large volume of work is produced and it becomes difficult to comb through the entire response.
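As a rough illustration of the “little bit of effort” involved in catching links that don’t exist, here is a minimal Python sketch. It is not a tool described in this article: the function names (`extract_urls`, `url_resolves`, `find_unverified_links`) and the URL pattern are assumptions made up for the example, and the check is only best-effort.

```python
import re
import urllib.error
import urllib.request
from typing import Callable, List

# Hypothetical, deliberately simple pattern: good enough for a sketch,
# not a full URL grammar.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text: str) -> List[str]:
    """Return the unique URLs found in text, in order of first appearance."""
    found: List[str] = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
        if url not in found:
            found.append(url)
    return found


def url_resolves(url: str, timeout: float = 5.0) -> bool:
    """Best-effort check: True if the server answers without an error."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # Covers HTTP errors, DNS failures, timeouts, and malformed URLs.
        return False


def find_unverified_links(
    text: str, checker: Callable[[str], bool] = url_resolves
) -> List[str]:
    """Return the URLs in text that the checker could not confirm."""
    return [url for url in extract_urls(text) if not checker(url)]
```

Calling `find_unverified_links(ai_response)` on a block of generated text returns the links worth a second look. Some servers reject `HEAD` requests, so a flagged link is a prompt to check by hand rather than proof of a hallucination, and a link that does resolve still says nothing about whether the page supports the claim attached to it.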
These tools have the potential to provide novel education and improved foundations for work and for play, but they are merely tools and should always be used as a starting point, not as a destination. There will almost certainly be cases where AI can be used to perform specific classification tasks and repetitive operations better, faster, and more accurately than humans ever could, but the guiding principle always needs to be “trust, but verify.”