Blog Archives

AI Ouroboros, Reddit Edition

Last year, if you recall, there was a mod-led protest at Reddit over some ham-fisted changes from the admins. Specifically, the admins implemented significant costs/throttles on API calls such that no 3rd-party Reddit app would have been capable of surviving. Even back then it was known that the admins were snuffing out competition ahead of an eventual Reddit IPO.

Well, that time is nigh. If you want a piece of an 18-year-old social media company that has never posted a profit ($804m in revenue against a $90m net loss last year), you can (eventually) purchase $RDDT.

But that’s not the interesting thing. What’s interesting is that Google just purchased a license to harvest AI training material from Reddit, to the tune of $60 million/year. And who is Reddit’s 3rd-largest shareholder currently? Sam Altman, of OpenAI (aka ChatGPT) fame. It’s not immediately clear whether OpenAI has or even needs a similar license, but Altman owns twice as many shares as the current CEO of Reddit, so it probably doesn’t matter. In any case, that’s two of the largest AI players feeding off Reddit.

In many ways, leveraging Reddit was inevitable. It’s been an open secret for years that Google search results have been in decline, even before Google started plastering advertisements six layers deep. Who knew that when you allowed people to get certified in Search Engine Optimization, search results would eventually turn to shit? Yeah, basically everyone. One of the few ways around that, though, was to seed your search with +Reddit, which returned Reddit posts on the topic at hand. Were these intrinsically better results? Actually… yes. A site with weaponized SEO wins the moment it gets your click. But even though there are bots and karma whores and reposts and all manner of other nonsense on Reddit, fundamentally posts must receive upvotes to rise to the top, which is an added hurdle that SEO alone cannot clear. Real human input from people who otherwise have no monetary incentive to contribute is much more likely to float to the top and be noticed.

Of course, anyone who actually spends any amount of time on Reddit will understand the downsides of using it for AI training purposes. One of the most upvoted comments on the Reddit post about this:

starstarstar42 3237 points 1 day ago* 

Good luck with that, because vinyl siding eats winter squid and obsequious ladyhawk construction twice; first on truck conditioners and then with presidential urology.

Edit: I people my found have

That’s all a bit of cheeky fun, which will undoubtedly be filtered away by the training program. Probably.

What may not be filtered away as easily are the many hundreds/thousands of posts made by bot accounts that already repost the same comment from other people in the same thread. I’m not sure how or why it works, but the reposted content sometimes becomes higher rated than the original; perhaps there is some algorithm to detect a trending comment, which then gets copied and boosted with upvotes from other bot accounts? In any case, karma farming in this automated way allows the account to be sold later to others who need such (disposable) accounts to post in more specialized subreddits that otherwise require accounts to meet certain thresholds before posting anything (e.g. account has to be 6+ months old and/or have 200+ karma, etc). Posts from these “mature” accounts are less obviously from bots.

While that may not seem like a big deal at first, the endgame is the same as with SEO: gaming the system. The current bots try to hijack human posts to farm karma. The future bots will be posting human-like responses generated by AI to farm karma. Hell, the reinforcement mechanism is already there, i.e. upvotes! Meanwhile, Google and OpenAI will be consuming Reddit content that will itself consist more and more of their own AI output. The mythological Ouroboros was supposed to represent a cycle of death and rebirth, but the AI version is more akin to a dog eating its own shit.

I suppose sometime in the future it’s possible for the tech-bro handlers, or perhaps the AI itself, to recognize (via reinforcement) that they need to roll back one iteration due to consuming too much self-content. Perhaps long-buried AOL chatroom logs and similar backups would become the new low-background steel, worth its weight in gold (er, Bitcoin).

Then again, it may soon be an open question of how much non-AI content even exists on the internet anymore, by volume. This article mentions experts expect 90% of the internet to be “synthetically generated” by 2026. As in, like, 2 years from now. Or maybe it’s already happened, aka Dead Internet.

[Fake Edit] So… I wrote almost exactly this same post a year ago. I guess the update is: it’s happening.

Full Circle

I saved almost $400 this Black Friday! By… not buying anything, thanks to bots.

Truth be told, it might not actually be due to bots, but I have my doubts. Specifically, both the GameStop $199 PS4 + $50 voucher deal and the Kohl’s $199 PS4 + $60 voucher deal were sold out by the time I got up on Thanksgiving morning. I am sure there are still technically $199 PS4s floating around (Edit: Looks like a no), but considering those vouchers were almost the equivalent of all three of the PS4 games that I would have played, I’d rather take my wallet and go stay home.

Then again, maybe it was all normal people pulling annoying arbitrage bullshit like WoW AH goblins. Out of curiosity, I went to eBay to look at the current listings of things.

[Image: eBay PS4 listings]

Seriously?!

WHO ARE THESE PEOPLE?! Both the clueless idiots still capable of navigating eBay but un-savvy enough to not look for deals with a simple Google search, and the swindlers preying on them. This is all a prime counter-example to show whenever someone tries to win an economics argument with the assumption of rational consumers. We are all irrational as hell.

Alas. Perhaps this entire episode is doing me a favor by not enabling me to buy three $20 PS4 games at technically $86 apiece (the $199 console plus $60 in games, split three ways). Even if I had gotten the voucher, they still would have been the equivalent of $66 each, since the console alone works out to that much per game. In almost all other cases, I would prefer to play games on my PC, which already has a Blu-ray player. Now, I may have eked out a bit more value from the free games from PS+ each month, but considering that my PS3 has gotten zero use in the last year, that’s still debatable.

I also passed on the Honor 6X for now. It actually went on a flash sale for $145, but in the process of looking at it closer, I realized that I bought my Honor 5X back in June 2016. Seems a bit silly to buy a new phone 1.5 years later when my current one is still functioning at 100%. People spend way more money on new iPhones every year, but those people are irrational.

What I did end up picking up were Far Cry 4 and No Man’s Sky, for about $13 and $20 after discounts, respectively. I’m still on the fence about Destiny 2 at the moment, but I might take the plunge with some of my $90 in Blizzard credit from having sold WoW gold a year ago; I should still have enough for the next WoW expansion since Destiny 2 is on sale. That should be enough, right?

…right.