In other words, without “fresh real data” (translation: original human work, not stuff spit out by AI) to feed the beast, we can expect its outputs to degrade drastically. Train a model repeatedly on synthetic content, the researchers say, and the outlying, less-represented information at the edges of its training data starts to disappear. The model then draws from an ever-narrower, less varied pool, and before long it collapses in on itself.
https://futurism.com/ai-trained-ai-generated-data
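For the statistically curious, the collapse the researchers describe can be sketched in a few lines. This is my own toy illustration, not the study's experiment: fit a simple "model" (just a mean and a spread) to some data, generate purely synthetic samples from the fit, refit on those samples, and repeat with no fresh real data ever entering the loop.

```python
import random
import statistics

# Toy sketch (an illustration, not the study's method): each "generation"
# fits a normal distribution to its data, then the next generation trains
# only on synthetic samples drawn from that fit. With no real data in the
# loop, the fitted spread tends to shrink, generation after generation --
# the "outskirts" of the distribution quietly disappear.

random.seed(42)
SAMPLES_PER_GENERATION = 30
GENERATIONS = 300

# The original "real" data: standard normal samples.
data = [random.gauss(0.0, 1.0) for _ in range(SAMPLES_PER_GENERATION)]

spreads = []
for _ in range(GENERATIONS):
    mu = statistics.fmean(data)      # the "model" is just (mean, spread)
    sigma = statistics.pstdev(data)
    spreads.append(sigma)
    # Next generation sees only the current model's synthetic output.
    data = [random.gauss(mu, sigma) for _ in range(SAMPLES_PER_GENERATION)]

print(f"spread at generation 1:   {spreads[0]:.3f}")
print(f"spread at generation {GENERATIONS}: {spreads[-1]:.3f}")
```

Each refit slightly underestimates the tails, and samples from the narrowed fit never regenerate them, so the spread drifts toward zero. That's the statistical skeleton of the “autophagous loop” the authors are warning about.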
So, as more and more lazy people ask AI to “write” for them, the programs get less and less accurate…
Or, as the authors of the study conclude, “…without enough fresh real data in each generation of an autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease.”
Put plainly, using AI-generated content to train AI doesn’t work. And since the internet is already awash in AI-generated garbage, it’s nearly impossible for the AI-makers to sort the real from the synthetic when they “scrape” data from the web.
So…
See, machines can’t replace us entirely — their brains will melt!
But then again, that might not be so hopeful after all. When AI takes over the world, maybe it won’t kill humans; perhaps it’ll just corral us into content farms…
At least we won’t wind up as batteries.
Yet.
PS. I find it both hysterically amusing and disturbing that my blog program offers an “experimental AI assistant.” Granted, the program does warn that the accuracy of AI-generated content isn’t guaranteed, but why on earth would I want AI for a personal blog? The whole point of a blog is to WRITE. AI-generated text is not writing. It is intellectual property theft.