The Unspoken Context
Today's topic is one of the shortcomings of AI, specifically what I call the Unspoken Context.
A quick look at how LLMs work explains why. LLMs work with text, but they are not simply search engines pulling facts from a database. They are incredibly complex pattern-matching machines: they predict which words should logically come next based on the massive amount of text they have read (sometimes adding fresh context from an internet search), and they assemble these patterns into one long text shaped by your prompts. Thinking models go a few steps further; they prompt themselves iteratively, based on the original prompt and what they find along the way, and finally generate one long text.
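The next-word prediction loop described above can be sketched with a toy model. Everything here is invented for illustration (the vocabulary, the probability table, and the function names are not from any real LLM); the point is only the mechanism: the model repeatedly asks "given the text so far, what word is most likely next?" and appends that word.

```python
# Toy "language model": given the tokens so far, return a probability
# distribution over a tiny hand-written vocabulary. A real LLM does the
# same kind of thing, just with billions of learned parameters instead
# of this made-up table.
def toy_next_token_probs(tokens):
    last = tokens[-1] if tokens else "<start>"
    table = {
        "<start>": {"the": 0.6, "a": 0.4},
        "the": {"cat": 0.5, "dog": 0.5},
        "a": {"cat": 0.5, "dog": 0.5},
        "cat": {"sat": 0.7, "<end>": 0.3},
        "dog": {"sat": 0.7, "<end>": 0.3},
        "sat": {"<end>": 1.0},
    }
    return table[last]

def generate(prompt_tokens, max_len=10):
    """Autoregressive generation: append the most likely next token
    until the model predicts the end of the text."""
    tokens = list(prompt_tokens)
    for _ in range(max_len):
        probs = toy_next_token_probs(tokens)
        # Greedy decoding: always take the single most likely token.
        next_token = max(probs, key=probs.get)
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens

print(generate(["the"]))  # ['the', 'cat', 'sat']
```

Notice what is missing: nothing in this loop checks whether the output is true, or whether an important consideration was left out. If the table (the training data) never mentions something, the loop cannot produce it.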
What this means is simple: LLMs can miss things. If nothing has been written about a certain aspect of a topic, the LLM will gloss over it completely. The big problem is that this happens almost every time you interact with an LLM, because the knowledge base (the whole internet and a pile of books) is not structured to mention every relevant thing about a topic. This is also why text from outside our own domain feels so alien: it relies on nuances we only pick up inside a domain, and on implications that are never directly stated in the text.
The consequence is even more problematic, because what an LLM does is turn this domain-specific text into something the user can read. It does this so successfully that it instantly replaced older, non-academic search methods (and, concerningly, sometimes academic ones too). But although the text reads fluently and is easy to understand, the nuances disappear completely, and the responsibility falls on the user to discover them and ask the LLM about them.
Because of this missing Unspoken Context, it falls on the user to prompt the LLM with questions like "What needs to be considered?" ("Wouldn't we need ...?" / "Wouldn't that mean that ...?"). LLMs cannot make logical connections; they can only connect the dots that are already written down. In most cases this feels like the same thing, which is why the results look good, but in the places where the difference matters, it produces huge mistakes.
You need to be extremely careful especially when you discuss speculative things whose answers are not on the internet, like brainstorming about something 'new'. The LLM may appear to give a coherent, satisfactory answer, but most of the time the information has no grounding whatsoever; it is filled with RANDOM speculations leading to RANDOM conclusions.
We love brainstorming with people, and it is the best way to brainstorm, but replacing people with an LLM actually produces the opposite result: if you fall into the trap of being convinced, the brainstorming session ends with less information and less consideration than if you had done it alone.
You can argue, "But I prompt my LLM in a way that it doesn't confirm my thoughts or give false information; instead it says when it doesn't know something and challenges my ideas when they are wrong." Well, sorry to break it to you, but that is not something you can prompt into existence.
The biggest issue with speculation is that there is nothing to prove or refute (without real logical effort), and since LLMs cannot grasp the Unspoken Context the way a person in the domain does, they cannot draw the logical conclusions needed to determine whether what you thought was right or wrong. On top of that, stating that something is undeterminable is itself a huge logical leap, which, again, they cannot make. So the model finds something relevant on the internet and writes it up in a way that resembles an answer to your prompt, and again, it looks good as long as you do not (mentally) challenge it.
Oh, and by the way, the "challenge it" trick is a huge meme; don't fall into the trap of believing that challenging the model fixes its errors. It just runs one more search with the specific challenge you provided to see whether it can find anything; if it cannot, the next result has a high chance of being wrong too. The false confidence it projects is incredibly dangerous.
An easy exercise to see how much you can trust your LLM is to ask weird, speculative questions about your own area of expertise, an area where you understand the Unspoken Context; it can even be a topic you have already dealt with before. But you need to ask the way an intern would: with little grasp of the Unspoken Context and not enough domain knowledge.
Then look at the LLM's answers and draw your own conclusions about whether you can trust it in areas you do not know. Many people effectively run this exercise daily at work, but in their own words; the LLM is not giving bad answers there, but you need to realize that your prompts are doing a lot of the heavy lifting.
So the point of the exercise is to test what happens when you cannot do the heavy lifting yourself. It is meant to rid you of the false sense of confidence LLMs create, and to make clear that you cannot "simply" learn with an LLM; it still requires a lot of input and effort from you.