What AI Really Is
I see a lot of surprised posts on social media and articles in the news about how wildly incorrect AI-generated text can be: how it constantly gets things wrong, how it mixes up information in search results, or how it just completely fabricates statements as if they were fact. This has never surprised me, given how LLMs work, but it still surprises me that people are surprised by it.
I’d like to think that this is a matter of information and education, and that if people knew how these things actually worked they’d be less surprised by the output and, in turn, less inclined to trust and use it in every application. So I will try to explain what “AI” is, how it works, and why you will never be able to trust its output.
Specifically, in this case I am talking about Large Language Models (LLMs), which are the fashionable thing in AI these days.
How do LLMs work?
At the highest level, a LLM (Large Language Model) is a program which generates the sequence of words most likely to follow a previous sequence of words. The way it knows which word comes after the previous words is by scanning as many texts as possible and recording the connections between words. This is the “Model” in Large Language Model. It has no knowledge of what the words actually mean, nor does it know whether a sequence of words is factual, fictional, sarcastic, or just a plain lie. All it knows is the probability of which words could follow a given sequence of words.
A very simple example of this is presented in the ACCU 2025 talk titled “A Very Small Language Model”, which is about building a very basic language model that can generate realistic-looking text. A LLM is just a slightly more complicated version of this, working with a massively larger set of word connections.
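To make the word-connection idea concrete, here is a minimal sketch in Python of a word-pair (“bigram”) model in the same spirit as that talk. It is only an illustration, not how any real LLM is implemented: real models use neural networks over far longer contexts, but the principle of continuing text with statistically likely words is the same.

```python
import random
from collections import Counter, defaultdict

def build_model(text):
    """Record, for each word, how often each other word directly follows it."""
    words = text.lower().split()
    model = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        model[current_word][next_word] += 1
    return model

def generate(model, start, length=10):
    """Continue from a starting word by repeatedly picking a statistically likely next word."""
    output = [start]
    for _ in range(length):
        followers = model.get(output[-1])
        if not followers:
            break
        candidates = list(followers.keys())
        counts = list(followers.values())
        # Pick the next word in proportion to how often it followed the previous one.
        output.append(random.choices(candidates, weights=counts)[0])
    return " ".join(output)

corpus = "the cat sat on the mat and the dog sat on the rug and the cat saw the dog"
model = build_model(corpus)
print(generate(model, "the"))
```

Run it a few times and you will typically get different, plausible-looking continuations, none of which the program “understands” in any sense.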
The massive Model is built by scanning (legally or otherwise) all the available written texts in existence. Naturally this includes texts which are accurate and factual, but also includes texts which are fictional, politically motivated, discredited, fraudulent, or just plain wrong. More importantly it lumps all of these texts together, meaning the connections between words in factual sentences are treated the same as in sentences containing falsehoods.
Scanning all written works in existence has caused authors and publishers to file copyright-infringement lawsuits against the companies building these models, so far with mixed success. It has also caused online publishers to either close their archives or make licensing deals with these companies, and where that content was user-generated, those sites have faced backlash from their users.
Now, when you ask the LLM a question, what it is actually doing is calculating which word is most likely to follow the words that have been fed into it. That input includes not only your question, but also the previous chat history and any control text added by the company. It generates one word, appends it to the input text, and feeds the whole thing back into itself to generate the next word. You can see this in the way a LLM outputs text one word at a time, with the speed mostly determined by the available processing power.
These calculations are also probabilistic, meaning there is randomness in the result, as evidenced by a LLM generating different outputs for an identical input. This means that a LLM will not always pick the single most likely word each time, but rather a word which is statistically likely to appear, which in turn alters which words are likely to appear after it.
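As a rough sketch of that loop (and not any vendor’s actual implementation), imagine the model exposed as a hypothetical function next_word_probabilities() which, given all of the text so far, returns a probability for every candidate next word. Generation then looks roughly like this:

```python
import random

def generate_reply(prompt, next_word_probabilities, max_words=200):
    """Generate a reply one word at a time, feeding each new word back into the input."""
    text = prompt                    # control text + chat history + your question
    reply = []
    for _ in range(max_words):
        # The model assigns a probability to each candidate next word...
        probabilities = next_word_probabilities(text)   # dict: word -> probability
        words = list(probabilities.keys())
        weights = list(probabilities.values())
        # ...and a word is sampled from that distribution rather than always
        # taking the single most likely one, which is why identical prompts
        # can produce different replies.
        word = random.choices(words, weights=weights)[0]
        if word == "<end>":          # hypothetical stop marker emitted by the model
            break
        reply.append(word)
        text = text + " " + word     # append the new word and go around again
    return " ".join(reply)
```

Everything the chatbot appears to “do” happens inside that loop: there is no separate fact-checking step and no background task, just the next word being picked over and over.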
A more mathematical explanation of how a LLM works is provided in this delightful video, which also contains links to videos explaining more of the theoretical foundations of the technology behind LLMs.
What does this mean?
Given that a LLM is just a program which probabilistically generates a sequence of words that are statistically most likely to follow the input text that it was given, we should realise that it:
- Doesn’t actually know what is fact, truth, reality, fiction, falsehood, fraud, or a lie.
- Doesn’t actually know what any of those concepts even are.
- Doesn’t know what is true or false.
- Has no concept of being right or wrong.
- Has no concept of internal consistency of ideas, subjects, or objects.
- Has no concept of what ideas are factual, fraudulent, or fictional.
- Cannot do arithmetic or perform simple operations (like counting how many letters are in a word).
It just generates plausible looking text based on its input.
Therefore:
- When it answers a question there is no certainty or guarantee that the information is correct or factual.
- When it apologises for getting something wrong, it is not sorry, it is just generating words which are most likely to appear after your message correcting it.
- When it says it is doing something in the background, it is not, it is just generating words which are most likely to appear after you ask it to do something.
- It will never be able to generate new ideas or facts, only recombine words based on the text it has been trained on.
To reiterate, all a LLM does is generate a randomised sequence of statistically probable words based on an input. There is no knowledge, no thought, no creative decision, and no internal consistency behind those words, which means they can easily contain true statements, false statements, or both in some bizarre combination. This makes the output unreliable, and unreliable output cannot be trusted.
Hallucinations
Unfortunately the AI industry has come to refer to these false statements as mere “hallucinations”, and claims that they are just a small problem which can be solved with more resources and time, rather than a fundamental issue with the technology itself.
This view is present in this article, which at least admits that inaccurate and false statements are a fundamental consequence of how LLMs work, but also says that the problem can be somewhat mitigated, and that we should just adapt to getting back unreliable information.
This seems completely counter-intuitive for a computer system, given that we have grown accustomed to computers being deterministic in their behaviour and output.
I really don’t like the use of the word “hallucination” because it sounds benign, as if the LLM were having a temporary memory or mental lapse, rather than calling the output what it is: a “fabrication”, in both senses of the word.
Externalising Costs
Another big problem is that LLMs can generate a lot of plausible-looking text far faster than humans can write it. Given that the text is also unreliable and of questionable accuracy, the cost of reading through and fact-checking it has been externalised from the writer1 of the text to the people reading it. It is quicker and cheaper to generate the text, but it costs everyone else more to process it, even if they discard it.
There are notable examples of this happening in all manner of fields, some of which are very important to life and liberty.
- In the legal field, where a lawyer uses a LLM to generate a court brief which contains completely fictional legal cases and court decisions. In the best case these are discovered to be false and the lawyer is fined for not checking; in the worst case they end up deciding a legal ruling and are included in court documents, further propagating the fabricated falsehoods.
- In the computer security field, where security bug reports are submitted to bug bounty programs even though the supposed security issue is pure fiction fabricated by a LLM. This costs people time investigating the issue and prevents them from working on other aspects of software development.
- In reports for governments authored by big-name management consultancies, where LLMs are used and insert references to fabricated statistics, court rulings, and other studies. This is dangerous because such reports are used to justify government actions which can harm people’s lives.
So instead of a writer taking more time to write something good and succinct that people can read in a short time, LLMs vomit out a lot of text which takes other people’s time to read and process, even if they discard it quickly.
These days it is not just text that generative AI models like LLMs spew out: images, audio, and video are also being generated, making it ever harder to distinguish fact from fiction.
What now?
Just realise that LLMs are just text generation programs, with no actual knowledge inside them and no capability to do actual data processing. With this in mind you can see which tasks a LLM might be useful for (emitting plausible-looking text) and which tasks it is wholly unsuited for (emitting factual text, performing calculations, answering questions).
For some it might (seem to) be a useful tool which helps them do things and write what they need to write, but realise that you always need to check and verify its output against actual non-LLM sources, as you, the writer1, are ultimately liable for what is in the text that the LLM generates.
“A computer can never be held accountable, therefore a computer must never make a management decision”. - IBM Training Manual, 1979
As for me, I see no present value in using LLMs for generating content: we have existing proven technologies for many of its claimed use cases, I have no interest in spending time fact-checking its output2, and I prefer to write code myself. I may experiment with it now and again, and if there comes a time in the future when LLMs have an actually useful use case, I can always try it then.