Here are some unorganised thoughts about my rudimentary knowledge of LLMs
Let’s first start with what an LLM (ChatGPT, Bard), a Large Language Model, is. Simply put, it is a model generated by feeding a humongous amount of data into a neural network, allowing it to “predict” what the next word should be based on what the previous words were. For example: if you say “He is”, based on older data it has seen, it might assume the next word is “a”: “He is a”… Your prompts are also part of the chain, so if you asked “Who is Elon Musk” it might say “he is a founder”… so on and so forth.
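To make the “predict the next word” idea concrete, here is a toy sketch in Python: a bigram model built from a made-up five-sentence corpus. Everything here, the corpus included, is illustrative; a real LLM uses a neural network over vastly more data and context than just the previous word.

```python
from collections import Counter, defaultdict

# A tiny made-up corpus standing in for the model's training data.
corpus = (
    "he is a founder . he is a engineer . he is a person . "
    "she is a doctor . he is the boss ."
).split()

# Count which word follows each word (a bigram model -- a crude
# stand-in for what an LLM learns at enormous scale).
next_word_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_word_counts[current][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return next_word_counts[word].most_common(1)[0][0]

# Starting from "he", keep appending the most likely continuation.
text = ["he"]
for _ in range(3):
    text.append(predict_next(text[-1]))
print(" ".join(text))  # prints "he is a founder"
```

The model never “decides” to describe a founder; “founder” just happens to be the continuation it saw most often after “he is a”.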
I basically use this to give myself a sense of intuition about what is happening. Once you get it, you understand why an LLM is so good at what it does, and what its apparent issues are.
It obviously is very good because it has trained on so much data that its predictions are really spot on.
Humans are predictable, and so is our language.
When the system has seen so much text, it kind of has enough data to predict very precisely what the next word is
But that also means an LLM is NOT trying to actually give you the right answer; it is simply trying to predict what the next word is.
It has no inherent sense of meaning, no real model of structure or style.
This is also why it goes into “hallucinations”, where it invents facts, even papers, or just random stuff. It is not doing it on purpose; it’s just predicting the next word.
This also means that all the biases that exist in the real world also exist in the model. This is not so different from many other models out there, just much more apparent in LLMs.
Can the LLMs actually become factually accurate?
Depends on who you ask. While a lot of people do think the accuracy problem can be solved via various techniques like proper prompting, cross-referencing, citations, more training, and so on, some prominent AI researchers like Yann LeCun think otherwise.
His argument, as I understand it, in a nutshell is this: assume there are n potential ways to answer correctly. For the model to reach one of those n states, it needs to follow a very narrow, specific set of paths (words strung together); one wrong word here or there takes you off on a whole different tangent, which is what we call “AI hallucinations”.
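As a back-of-the-envelope illustration of this compounding-error idea (my simplification, not LeCun’s actual math): if each generated token has even a small probability e of stepping irrecoverably off every correct path, the chance that a long answer stays fully on track decays exponentially with its length.

```python
def p_on_track(e: float, n: int) -> float:
    """Probability an n-token answer never steps off a correct path,
    assuming each token independently errs with probability e."""
    return (1 - e) ** n

# Even a 1% per-token error rate compounds quickly over long outputs.
for n in (10, 100, 500):
    print(f"{n} tokens: {p_on_track(0.01, n):.3f}")
```

With a 1% per-token error rate, a 100-token answer stays on track only about 37% of the time, and a 500-token answer almost never does.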
If he is correct, it’s not exactly doom and gloom for LLMs; it just means they need to be applied in fields that are more tolerant of occasional factual inaccuracies. For example: LLMs are great for scriptwriting, copywriting, marketing, maybe basic education.
Also, it might be an issue of expectation management. When you search for something, you see multiple websites that try to answer your query, some accurate and some not. Search engines like Google and Bing try to find the most trusted sources and constantly improve based on users’ feedback. But you are mostly aware that you are getting your answer from a “website”.
If there is an inaccuracy, you are more forgiving towards the search engine: you think about how to rephrase your query or go to another search result. In the case of a chat interface, the expectation of accuracy is high because there is just one answer, and LLMs sound very confident. Perhaps slowly people will start to adjust their expectations.
But is it JUST predicting words?
Well, you say that as if it’s a bad thing. A lot is likely going on behind the scenes and under the hood.
For one, just because it is simply “predicting” doesn’t mean it’s a purely probabilistic thing. It’s likely that the latent structures of our language, its grammar and style, are encoded into the model.
Some food for thought: if the structures in a language also give it meaning, does that mean the model gets meaning?
Language itself is not impossible to predict. Take, for example, Zipf’s law. It states that the most used word in a language appears roughly twice as often as the second most used word, three times as often as the third, and so on: a word’s frequency is roughly inversely proportional to its rank.
It’s seen not just in English but across every language, even the ancient ones we cannot read.
This might eventually be a way for us to communicate with aliens. If their communication does not follow Zipf’s law, are they even intelligent? And the trouble is, we don’t exactly know why every language does this.
So what next?
Accuracy remains one of the top concerns. It’s likely our near future will include an army of “fact checkers” who simply verify that the LLM got it right before the content is put to use. Then LLMs become a great tool, a partner to many professions, supercharging productivity. But the field of “content generation” and “creative professions” like writing articles, maybe even books, is definitely disrupted forever. In creative work, hallucinations are not just tolerated but often preferred.