Capgemini SE


From nature to AI: preventing model collapse with evolutionary diversity


Jonathan Aston

Oct 17, 2024

Exploring the future of Generative AI training

The most well-known form of generative AI is the large language model (LLM), and here we look to Andrew Ng for his description of how LLMs work. They use large amounts of text data in a supervised learning process to repeatedly predict the next word in a sequence. Therefore, when one prompts ChatGPT or another LLM with some input, the response is a series of predictions of the best next word given the information used to train the model.
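
To make the idea concrete, here is a deliberately tiny sketch of "repeatedly predicting the best next word". It uses a toy word-count "model" rather than a neural network, so the names and corpus are invented for illustration, but the prediction loop is the same basic idea:

```python
from collections import Counter, defaultdict

# Toy illustration of next-word prediction: count which word follows which in a
# tiny "training corpus", then generate text by repeatedly choosing the most
# likely next word. Real LLMs learn these probabilities with neural networks
# over enormous corpora, but the prediction loop is the same basic idea.
corpus = (
    "the cat sat on the mat . the dog sat on the rug . the cat chased the dog ."
).split()

next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def generate(prompt_word, length=8):
    words = [prompt_word]
    for _ in range(length):
        candidates = next_word_counts.get(words[-1])
        if not candidates:
            break
        # "Best next word" = the most frequent continuation seen in training
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

print(generate("the"))  # stitches together patterns seen in the training text
```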

LLMs, therefore, are based on a concept of mass consensus whereby the output given is the most common answer found in the training material, predicted word by word. This means any answer is only as good as the consensus of the training material. This is why many people believe these LLMs are not capable of real creativity and novelty in their answers and have referred to them as "stochastic parrots". One could argue that this is only partially true, as ChatGPT has a reinforcement learning from human feedback (RLHF) process in which humans rank multiple possible responses by which ones sound most human-like, but all of these responses are still an amalgamation of what was in the training data.

With this in mind, let's focus on the future training material for future models and how it differs from the training material today.

Image: https://sitn.hms.harvard.edu/flash/2023/the-making-of-chatgpt-from-data-to-dialogue/

What is the problem with future training material?

Currently, almost all training material is generated by humans with no input from AI, and this is what the latest ChatGPT was trained on. However, now that LLMs like ChatGPT are widely available, more of the material humans create will be co-created or entirely created by LLMs, and this will start to dilute and pollute the pool of training material for future LLMs.

The problem with future training material is that it will be made with the help of LLMs, which means it will resemble the previous training material. We therefore lose variability and variety in the training data. This effect snowballs, and we quickly end up in a scenario where the training material is heavily influenced by LLM-generated content and its variability decreases to the point that it all begins to look the same. One possible solution is to mark all LLM-generated content so that people and future AIs know which training material was generated by LLMs; however, this is not widespread practice today.
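
To illustrate the snowball effect, here is a deliberately simplified simulation. It is not an LLM: the "model" is just a normal distribution fitted to its training data, and the bias toward typical outputs stands in for an LLM favoring its most probable continuations. The point is only to show how the spread of the data shrinks generation after generation:

```python
import random
import statistics

# Deliberately simplified model-collapse simulation. The "model" is just a
# normal distribution fitted to its training data. Each generation is trained
# on data produced by the previous generation and, like an LLM favoring its
# most probable continuations, it mostly reproduces typical examples rather
# than rare ones.
random.seed(0)

data = [random.gauss(0.0, 1.0) for _ in range(5000)]  # the human-written "corpus"

for generation in range(6):
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    print(f"generation {generation}: mean = {mu:+.2f}, spread = {sigma:.2f}")

    # Next generation's corpus: sampled from the fitted model, but biased
    # toward high-probability (near-average) outputs - the "consensus" effect.
    samples = [random.gauss(mu, sigma) for _ in range(10000)]
    samples.sort(key=lambda x: abs(x - mu))
    data = samples[:5000]  # keep only the most typical half
```

Run it and the printed spread shrinks every generation, a statistical shadow of the loss of variety described above.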

We are already seeing the implications of this: Forbes has reported on one generative AI model spreading misleading information because it was trained on misleading information generated by LLMs. More training material will not add any more variability to the output or improve it. This can lead to models becoming very narrow in their answers, which weakens performance and ultimately calls their utility into question. This concept is called model collapse.

So, how do we avoid model collapse?

As someone who studied biology before working with AI, it is probably no surprise that, as with most good problems, I looked to the natural world for inspiration. Darwin, in his theory of evolution by natural selection, describes how organisms randomly mutate. If a mutation gives the organism a fitness benefit, it will proliferate throughout the population, and the population will become fitter (an increase in the ability to survive to reproductive age and reproduce). The abundance and randomness of these mutations lead to high levels of diversity in the population. Therefore, if this process is repeated, it is easy to see how a population of organisms will change over time to become better suited to its environment while maintaining a high level of diversity, leading to resilience to environmental change.

The interesting piece of information in this theory is that, in nature, mutations occur randomly but are selected non-randomly. Until it is tested in the living organism, a mutation could turn out to be either beneficial or detrimental.

Image: https://evolution.berkeley.edu/misconceptions-about-natural-selection-and-adaptation/but-its-not-random-either/
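
For readers who prefer code to prose, here is a toy sketch of that loop: mutations are generated at random, and a simple "environment" (a fitness function invented purely for illustration) selects among them non-randomly:

```python
import random

# A toy version of Darwin's loop: an "organism" is a single number, the
# environment rewards traits close to an optimum the organism knows nothing
# about, and diversity is simply the spread of the surviving population.
random.seed(1)
OPTIMUM = 10.0

def fitness(trait):
    return -abs(trait - OPTIMUM)  # the environment judges the mutation

population = [random.uniform(0.0, 1.0) for _ in range(50)]

for generation in range(30):
    # Random mutation: the direction and size of each change are not chosen
    # with any benefit in mind.
    offspring = [t + random.gauss(0.0, 1.0) for t in population for _ in range(2)]
    # Non-random selection: the environment keeps the fitter half.
    offspring.sort(key=fitness, reverse=True)
    population = offspring[:50]

best = max(population, key=fitness)
print(f"best trait after selection: {best:.2f} (environmental optimum: {OPTIMUM})")
```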

Is the main difference then that we are not introducing mutation into our models?

Let's apply this idea to LLMs. Should we introduce random mutation into LLMs? This could be difficult and could lower their value, so perhaps we just need to keep some element of diversity in their responses.

Let's address the 'problem of hallucination.' Hallucination is when an LLM predicts the next best word but the sentence it produces is fundamentally false or misleading. Andrew Ng likens this phenomenon to humans making mistakes by learning a pattern or association that is false. However, hallucinations are not simply a problem but a feature: they represent precisely some of the diversity seen in nature. To elaborate, if we ask the same question twice, we will get two slightly different answers, and if we change how we ask the question, the answers differ even more. This may be some of the diversity we seek.
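
Part of this diversity comes from the way models sample their next word rather than always taking the single most likely one. The sketch below uses made-up probabilities, not a real model, to show how a temperature parameter lets more or less of that diversity through:

```python
import math
import random

# Why the same prompt can give different answers: models sample from a
# probability distribution over next words instead of always taking the single
# most likely one. The temperature parameter controls how much of that
# diversity gets through. (The scores below are invented for illustration.)
random.seed(42)

next_word_scores = {"blue": 4.0, "grey": 3.2, "overcast": 2.5, "falling": 1.0}

def sample_next_word(scores, temperature=1.0):
    # Softmax with temperature: higher temperatures flatten the distribution,
    # giving rarer (more "creative") words a better chance of being chosen.
    weights = [math.exp(s / temperature) for s in scores.values()]
    return random.choices(list(scores), weights=weights, k=1)[0]

for temperature in (0.2, 1.0, 2.0):
    samples = [sample_next_word(next_word_scores, temperature) for _ in range(8)]
    print(f"temperature {temperature}: 'The sky is' -> {samples}")
```

At low temperature the model parrots its consensus answer; at higher temperatures the rarer continuations start to appear.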

If this is the case, we could look at the natural world again to see how nature deals with the evaluation of mutation. It is often the environmental constraints that determine whether a mutation is beneficial, that is, whether it gives a fitness benefit to the organism. Giraffes, for instance, mutated to have longer necks, which was a great advantage in accessing new food sources, but sometimes mutations occur that fundamentally make the organism unable to survive in its environment.

Nature has also developed proofreading mechanisms. There are ways to spot errors during DNA replication and fix them, limiting mutation so that it does not happen too often while ensuring it still occurs. The mutations that are not picked up are normally those that are not obviously wrong. Proofreading is one way we can police the hallucinations of LLMs: have experts review the content and say whether it is correct, while keeping the hallucinations that are not obviously wrong. This is similar to the RLHF done for ChatGPT that we referred to above, and a possible solution would be to automate something similar so that it is done for all LLM outputs.
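
What might such automated proofreading look like? The sketch below is only an outline of the idea, with hypothetical generate_candidates() and is_obviously_wrong() helpers standing in for an LLM sampled several times and a fact checker or expert reviewer:

```python
# A rough sketch of automated "proofreading" for LLM outputs, in the spirit of
# the DNA-repair analogy: generate several candidate answers, discard the ones
# that are obviously wrong, and keep the rest - including unusual but plausible
# ones. generate_candidates() and is_obviously_wrong() are hypothetical
# placeholders; in practice they might be an LLM sampled at some temperature
# and a retrieval-based fact checker or a human expert.

KNOWN_FACTS = {"boiling point of water at sea level": "100 C"}

def generate_candidates(question: str) -> list[str]:
    # Placeholder for sampling several answers from an LLM
    return [
        "Water boils at 100 C at sea level.",
        "Water boils at around 100 C, and at a lower temperature at altitude.",  # novel but sensible
        "Water boils at 150 C at sea level.",  # obvious hallucination
    ]

def is_obviously_wrong(answer: str) -> bool:
    # Placeholder check against a small trusted reference
    expected = KNOWN_FACTS["boiling point of water at sea level"]
    return expected.split()[0] not in answer

def proofread(question: str) -> list[str]:
    return [a for a in generate_candidates(question) if not is_obviously_wrong(a)]

print(proofread("At what temperature does water boil at sea level?"))
```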

These hallucinations can therefore create novel but still sensible answers, and by evaluating them through proofreading, or letting the environment assess them, we can identify the useful hallucinations and remove the less useful ones.

Coming back to the theme of creativity, I leave you with a final thought: What is the difference between creativity and the identification of useful hallucinations?

About Generative AI Lab

We are the Generative AI Lab, expert partners that help you confidently visualize and pursue a better, sustainable, and trusted AI-enabled future. We do this by understanding, pre-empting, and harnessing emerging trends and technologies. Ultimately, we make possible trustworthy and reliable AI that triggers your imagination, enhances your productivity, and increases your efficiency. We will support you with the business challenges you know about and the emerging ones you will need to know about to succeed in the future.

We have three key focus areas: multi-agent systems, small language models (SLMs), and hybrid AI. We create blogs like this one, Points of View (POVs), and demos around these focus areas to start a conversation about how AI will impact us in the future. For more information on the AI Lab and more of the work we have done, visit this page: AI Lab.

Meet the author

Jonathan Aston

Data Scientist, AI Lab, Capgemini's Insights & Data

Jonathan Aston specialized in behavioral ecology before transitioning to a career in data science. He has been actively engaged in the fields of data science and artificial intelligence (AI) since the mid-2010s. Jonathan possesses extensive experience in both the public and private sectors, where he has successfully delivered solutions to address critical business challenges. His expertise encompasses a range of well-known and custom statistical, AI, and machine learning techniques.
