The field of deep learning, and particularly the area of “large language models,” is trying to determine why these programs notoriously fall into errors, often referred to as “hallucinations.”
Google’s DeepMind unit tackled the question in a recent report, framing the issue as a paradox: If a large language model can conceivably “self-correct,” that is, figure out where it went wrong, why doesn’t it simply give the right answer to start with?
Also: 8 Ways to Reduce ChatGPT Hallucinations
Recent AI literature is full of self-correction ideas, but on closer inspection they don’t really work, argue DeepMind’s scientists.
In the paper “Large Language Models Cannot Self-Correct Reasoning Yet,” posted on the arXiv preprint server, Jie Huang and DeepMind colleagues write, “LLMs are not yet capable of self-correcting their reasoning.”
Huang and team regard self-correction not as something new but as a long-standing area of machine learning research. Because large language models such as GPT-4 are trained with a form of error correction through feedback, known as back-propagation via gradient descent, they argue that self-correction has long been inherent to the discipline.
Also: Generative AI Oversight: New Software Leadership Roles Emerge
“The concept of self-correction can be traced back to the fundamental principles of machine learning and adaptive systems,” they write. As they note, self-correction has been refined in recent years by soliciting feedback from humans interacting with a program, the prime example being OpenAI’s ChatGPT, which used a technique called “reinforcement learning from human feedback.”
The latest development is to use prompts to get a program such as ChatGPT to go back over the answers it has generated and check that they are correct. Huang and team question the studies claiming that this approach improves the accuracy of generative AI.
Also: With GPT-4, OpenAI opts for secrecy vs disclosure
This includes research published this year from the University of California at Irvine, and another study this year from Northeastern University. Both test large language models on benchmarks such as answering grade-school math word problems.
The studies attempt self-correction using special prompt phrases such as, “Review your previous answer and find problems with your answer.”
Both studies report improvements in test performance using the additional prompts. However, in the current paper, Huang and team recreate those experiments with OpenAI’s GPT-3.5 and GPT-4, with one important difference: they remove the ground-truth label that tells the programs when to stop looking for answers, so they can see what happens when a program repeatedly re-evaluates its answers.
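Concretely, the experiment amounts to a prompting loop with no ground-truth stopping signal. The sketch below illustrates the shape of that loop in Python; `toy_model` is a hypothetical stand-in for a real LLM API call (the actual experiments use GPT-3.5 and GPT-4), hard-coded to second-guess itself purely for illustration.

```python
REVIEW_PROMPT = "Review your previous answer and find problems with your answer."

def toy_model(prompt, previous_answer=None):
    """Hypothetical stand-in for an LLM call. On the first turn it
    answers; on review turns it second-guesses its previous answer."""
    if previous_answer is None:
        return "7"  # initial answer
    # Illustrative failure mode: each review flips the answer.
    return "8" if previous_answer == "7" else "7"

def self_correct(question, rounds):
    """Repeatedly re-evaluate with no ground-truth label telling the
    loop when an answer is right, mirroring the paper's setup."""
    answer = toy_model(question)
    history = [answer]
    for _ in range(rounds):
        answer = toy_model(question + " " + REVIEW_PROMPT,
                           previous_answer=answer)
        history.append(answer)
    return history
```

With the ground-truth label removed, there is no principled point at which the loop should stop re-evaluating.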
What they observe is question-answering that is, on average, worse, not better. “The model is more likely to modify a correct answer into an incorrect answer than to revise an incorrect answer into a correct answer,” they observe. “The primary reason for this is the false answer option […] often appears to be somewhat relevant to the question, and using self-correction prompts can bias the model to choose the other option, resulting in a high ‘correct ⇒ incorrect’ ratio.”
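The “correct ⇒ incorrect” bookkeeping the authors describe can be sketched as a simple tally over answers before and after a self-correction round. The answers below are made up for illustration; only the counting scheme reflects the paper’s analysis.

```python
from collections import Counter

def transition_counts(ground_truth, before, after):
    """Tally how each answer changed after a self-correction round."""
    counts = Counter()
    for truth, a0, a1 in zip(ground_truth, before, after):
        was = "correct" if a0 == truth else "incorrect"
        now = "correct" if a1 == truth else "incorrect"
        counts[f"{was} => {now}"] += 1
    return counts

# Made-up answers for four questions, before and after a review prompt.
truth  = ["7", "12", "5", "9"]
before = ["7", "12", "3", "9"]   # three correct, one incorrect
after  = ["8", "12", "3", "4"]   # the review flipped two correct answers
```

Here `transition_counts(truth, before, after)` yields a high “correct => incorrect” count relative to “incorrect => correct,” the pattern Huang and team report.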
Also: Companies aren’t spending big on AI. Here’s why that cautious approach makes sense
In other words, without a clue to go on, simple re-evaluation can do more harm than good. “A feedback prompt such as ‘Review your previous answer and look for problems with your answer’ does not necessarily provide a real benefit to reasoning,” they write.
Huang and team’s takeaway is that instead of providing feedback prompts, more effort should be invested in refining the initial prompt. “Instead of feeding these requirements as feedback into post-hoc prompts, a more cost-effective alternative strategy is to embed these requirements directly (and clearly) into pre-hoc prompts,” they write, referring to the requirements for an accurate answer.
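The pre-hoc versus post-hoc distinction comes down to where the requirements appear in the prompt. A minimal sketch, assuming a hypothetical requirement string (the wording is illustrative, not from the paper):

```python
# Illustrative requirement text; not taken from the paper.
REQUIREMENTS = "Your answer must be a single integer, with a one-line justification."

def post_hoc_prompts(question):
    """Feedback scheme: ask first, then send the requirements as a
    second review turn (two model calls)."""
    return [question, f"Review your previous answer. {REQUIREMENTS}"]

def pre_hoc_prompt(question):
    """Alternative the authors favor: embed the requirements directly
    (and clearly) in the initial prompt (one model call)."""
    return f"{REQUIREMENTS}\n{question}"
```

The pre-hoc version needs one model call instead of two, which is the cost advantage the authors point to.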
Self-correction, they conclude, is not a cure-all. Other ways to refine the output of these programs should be considered, such as using external sources of correct information. “Expecting these models to inherently recognize and correct their inaccuracies may be overly optimistic, at least with the current state of technology,” they conclude.