
Even AI Isn’t Immune To Internet ‘Brain Rot’
In a case of 'Yeah, we kinda figured that would be the case, but now we have proof', it turns out that social media and a large portion of Internet content can be just as bad for AI as they are for us.
Researchers at three universities, Texas A&M, University of Texas at Austin, and Purdue, collaborated on a study to investigate the impact of what they termed 'junk web text' on large language models, or LLMs. And now, the results have been made available.
Now, I obviously don't know much about AI or how it's trained, but I'll do my best to make sense of the paper the researchers published.
Here goes.
The researchers started with multiple blank LLMs and fed each one a different diet of information. They classified the data by two main criteria: the engagement, or popularity, of the content, and the quality of the content. Each LLM got a different mix.
After the LLMs were 'trained' on the data, they went through a battery of tests, and the ones trained on data labeled 'junk' (think Twitter posts and clickbait articles) did worse than the ones that weren't.
The team also documented the various failures in the models' reasoning, which included not planning things out, answering without thinking, skipping steps, using poor logic, and making factual errors. Scarily enough, all things humans do on occasion too.
Next, the researchers tried to re-train the LLMs to see if the impact of the junk data could be lessened. Sadly, it appears that the bad habits from the initial training persisted.
So in AI as in food, you are what you eat.

