Reddit says that it has caught AI companies scraping its data from the Internet Archive’s Wayback Machine, so it’s going to start blocking the Internet Archive from indexing the vast majority of Reddit. The Wayback Machine will no longer be able to crawl post detail pages, comments, or profiles; instead, it will only be able to index the Reddit.com homepage, which effectively means Internet Archive will only be able to archive insights into which news headlines and posts were most popular on a given day.
— Jay Peters for The Verge (via Simon Willison)
Next it will be Google, and most of the post-2025 forum knowledge on the web will be lost. Imagine a web browser that cannot access Stack Overflow or Reddit. How useful would it be? LLMs will need new data to stay relevant, and a new data monetization strategy will change the internet forever.