AI and the New Age of the TOS Rights Grab
“I have read the terms of service” is widely considered to be the biggest and most common lie on the internet. When presented with dozens of pages of legalese, most people’s eyes glaze over and they click “accept” with barely a skim. Even lawyers are guilty of this.
However, a pair of recent stories may have many creators thinking twice about this practice.
The first is news that Reddit has reached a deal with an unnamed artificial intelligence company that will allow the company to train their systems on Reddit’s data, which includes nearly 20 years of user history.
The deal is valued at $60 million per year and was announced to potential investors ahead of an anticipated initial public offering. Reddit indicated in April 2023 that it would do this as it clamped down on its API systems in a bid to block access to third-party apps that it claimed were freeloading on its servers.
Those hoping to opt out or object to this use are going to find it difficult, as Reddit’s terms of service grant the company a wide spectrum of rights, and those rights survive termination.
The second story, which comes from Victoria Strauss at Writer Beware, looks at recent changes to the terms of use at Findaway Voices, an audiobook creation service that is a competitor to Audible’s ACX.
In February 2023, the service caused outrage for a terms of service provision that allowed it to give user content to Apple for machine learning. The company quickly backed off from that, at least publicly, saying that it had halted any such use.
However, a year later, the company updated its terms of service and, this time, made a significant rights grab. They made the license irrevocable and, much like the Reddit license, barred the enforcement of moral rights and granted themselves the ability to sublicense and create derivative works.
Spotify, the owner of Findaway Voices, walked back the change following significant user blowback. It removed much of the strongest language, including elements about the license being irrevocable and royalty-free. However, between the two incidents, many no longer trust the service, and some have already begun pulling their books.
What is clear is that both service providers and users have taken a renewed interest in what terms of service contain, and there is one force to blame for this interest: AI.
A Long, Painful History
To be clear, clashes over terms of service are nothing new. In 2011, TwitPic, an image hosting platform for Twitter (now known as X), created a controversy with an expansive terms of service change that followed a new deal with a media distribution company. The next year, Craigslist generated its own controversy by claiming ownership of ads posted on it, mostly so it could combat scrapers.
Likewise, in October 2020, the miniature creation service Hero Forge had their own controversy over an aggressive terms of service change. While that ended up not being as significant, since all miniatures were made with Hero Forge-owned elements, it upset users and stoked significant backlash against the site.
The issue in all the cases was that companies that host user-generated content altered their terms of service in ways that were seen as claiming excessive rights on that content. Though companies need certain rights to operate their services, users have been known to object when the rights being claimed are seen as more than necessary.
That said, companies do have a long history of claiming a broad number of rights to cover them for future uses, even if they do not intend to exploit those rights at any point.
This can bite companies as it did DeviantArt. In 2014, the company was accused of using its expansive terms of service to sell user artwork to be printed on shirts sold at Hot Topic. That ended up not being the case, but that story was still fresh in many people’s minds when, in November 2022, the company paired with Stable Diffusion to launch its new AI art creation tool.
The confusing message, combined with Stable Diffusion’s reputation for being trained on “unethically sourced data,” tarred the site with a reputation for selling user work to AI companies, even as it was trying to establish opt-out tools to allow artists to decline such use.
Given that history, it is no surprise that AI was the flash point for DeviantArt back then. Since then, AI has only grown in importance and, as we have seen in the most recent disputes, is central to why users are mistrusting the services that they rely upon.
The AI Issue
If the new stories feel different, it is because of AI.
Previously, when companies were accused of these TOS rights grabs, there was only a vague sense of what they wanted to do with the work. The DeviantArt/Hot Topic misunderstanding aside, there was no obvious way for such companies to exploit the work and no clear buyer to sell it to. Simply put, the libraries of content weren’t seen as that valuable.
AI has changed that, and the Reddit announcement has made that abundantly clear: If you have a large library of content that you can legally license, AI companies will pay handsomely for it so that they can train their systems.
The reasons for this are obvious. Money has been flowing into AI companies, but the entire industry has been hit with a series of lawsuits that challenge the practice of training AI models on unlicensed content. The outcome of those lawsuits is still pending, but companies are not waiting.
They are racing to license whatever large libraries of content they can, including those of Getty Images, the Associated Press and Axel Springer, to name a few. With millions of dollars on the table, companies like Reddit see a new way to monetize their service, user wishes be damned.
And many of those users are not happy, viewing AI as an existential threat to human creativity and the output of AI systems as derivative of the works they are based upon. In short, many creators do not want their work to be used to train AI systems but are watching as the sites they trusted open the floodgates to it.
To be clear, the evidence is that Reddit content has already been used to train AI systems. However, these deals may give AI companies a path to continue to exploit that data, even if the courts rule that unlicensed training is an infringement.
That, in turn, is why users are pushing back against these TOS changes. The threat of TOS overreach is no longer nebulous or vague. It has a face, and that face is AI.
Bottom Line
Though it is wishful thinking, the ideal solution here would be for companies to make clear their intentions in this space. They should spell out what their policies are when it comes to AI (both hosting AI content and training AI systems), make it clear how users can opt out of such uses (even if that means leaving the service) and share any AI revenue with those whose work it is based upon.
However, that is not going to happen. As the Reddit case made obvious, the companies have no interest in such transparency and no financial incentive to provide it. It has also shown that they can get away with it.
As such, even if authors and artists win their lawsuits against AI companies, it may well end up being but a speed bump for many of the companies. In a heavily siloed internet, it is trivial for them to find and license large libraries of content. Whether those terms of service will hold up in court for this use is a separate question, but one not being addressed in the current wave of AI lawsuits.
Simply put, AI companies have a great deal of money and are looking to protect their future. Meanwhile, user-generated content websites have rights to a large library of content that, quite suddenly, became unbelievably valuable. It is a natural pairing but a pairing that users are increasingly becoming worried about.
What happens next may well determine the fate of both AI and many of the largest sites on the internet.
Hat Tip: Special Thanks to F.I. Goldhaber for the Heads Up on the Findaway Story and the Ars Technica link.