Facebook’s Plagiarism Problems Are Deeper Than You Realize

Back in September, I reported on Facebook’s Widely Viewed Content Report and how Casey Newton, a reporter at The Verge, noticed that nearly all the top posts on Facebook for the quarter were plagiarized.

This month, Newton is at it again and recently published an updated article that looks at the latest quarterly Widely Viewed Content Report. The findings, to put it mildly, are not shocking.

Once again, Newton found a similar pattern of plagiarism with nearly all the most-viewed content coming from outside sources such as Reddit, Twitter and TikTok.

However, Newton’s report comes as the Wall Street Journal is also examining Facebook’s efforts to block plagiarized and pirated content. As part of its series The Facebook Files, which is an examination of leaked internal documents from the company, the paper published a scathing review of the company’s practices surrounding copied content.

According to those documents, Facebook Pages that shared stolen content accounted for 40 percent of the net traffic to those public profiles. The researchers who prepared the documents further noted that the easiest way to build a successful page on Facebook was to simply copy content that was successful elsewhere.

According to the Wall Street Journal, though several researchers at Facebook proposed ways to decrease the reach of such content, those proposals were “deprioritized” by Mark Zuckerberg. This was in part because the company feared running afoul of the Digital Millennium Copyright Act (DMCA).

However, in my experience and the experience of others, the issue with duplicate content goes much deeper than that. That’s because many people who reported copyright-infringing content to Facebook were turned away by a DMCA takedown system that threw up nothing but obstacles.

A Long-Running Problem

One of the more trying elements of working with Facebook on copyright issues is that, for much of the past two years, Facebook has made it very difficult for smaller rightsholders to file takedown notices.

The issue was straightforward. When a DMCA notice was filed, even if it was done using Facebook’s own DMCA form, the company would request additional information. This would be a request for how the submitter was authorized to file the report (even if the submitter was the creator and rightsholder) and/or an explanation of how the content was infringing. These requests would be spread over multiple emails, turning a process that usually requires only one letter into an email chain of four or five.

This would happen even if all the information was provided in the original notice and, to make matters worse, Facebook would often still refuse to honor the notice after the hurdles had been cleared.

In my personal experience, this was a big problem for text-based works, but it was also an issue for images, including non-consensual pornography images that Facebook would simply refuse to remove on any grounds (despite clearly violating Facebook’s terms of service). 

I’ve sat down twice to write an article about this specific issue. The first was in August 2020, shortly after this problem began. This was confirmed by other filers, and seems to have been a mounting frustration for DMCA filers. 

The second time was in July of this year. However, both times I postponed the article after Facebook said that they were working to address the issue. Last year, nothing changed, but after the July conversations, the policy did seem to change, with others and me experiencing far fewer cases of obstruction.

However, that still leaves over a year when Facebook was routinely obstructing and rejecting legitimate DMCA notices. While that can be blamed on company policy that, hopefully, has been rectified, it casts doubt on their argument that they can’t remove the plagiarized content because of the DMCA.

After all, it’s hard to say that you fear the DMCA when you’re actively obstructing one of the core tenets of it.

If Not the DMCA, Then Why?

To be clear, the DMCA does not prohibit Facebook from trying to block or restrict plagiarized content from becoming popular. Other large companies such as YouTube have taken strong proactive steps without endangering their protection under the law.

One of those companies is actually Facebook itself, which, back in September 2020, launched an update to its rights management platform to help proactively block some infringing images. The update covered both Instagram and Facebook.

Facebook also implemented filters for video content following the “freebooting” controversies of 2015 and 2017. This is why it launched the Rights Manager tool in the first place.

Facebook’s own actions belie its excuse of being afraid of the DMCA. They weren’t scared to block videos and images years ago, and they felt comfortable rejecting and obstructing filed DMCA notices. So why did they not take similar action against other kinds of plagiarized and pirated content?

The first reason may simply be that they couldn’t. The technical challenges of tracking content across the broader internet, determining which pages and posts trade in plagiarized and/or copyright-infringing material and then stopping it may have been too great.

That certainly would be understandable. The challenge of comparing content uploaded to your service against the broader internet is much different from comparing it against a controlled library of specific types of content.

The other possible reason is that Facebook simply didn’t want to. The plagiarized content has, historically, done very well for them and proved to be popular. Given how heavily that content is viewed and how dependent the company is on ad revenue, going to war against it could significantly harm their bottom line.

While we likely won’t know for certain, the answer is probably a combination of both. The tech challenges would be great, but Facebook also has little motivation to make the attempt. As long as copyright notices are relatively rare and the plagiarized content remains popular, there’s not much reason for them to try.

However, if they want both a reason and a possible approach, I think they might be able to find those in the same place.

The Google Solution

While Google is hardly a role model when it comes to copyright issues, it’s also an interesting case study in finding more subtle, nuanced ways of dealing with widespread plagiarized and/or infringing content.

Back in 2011, Google had a serious problem. Its search results were routinely being clogged by low-content sites that took the form of either content farms, which generated large amounts of low-quality but original content, or scraping sites, which simply grabbed content from wherever they could online and published it, both with and without modification.

So, Google came up with a solution: An algorithm change.

Dubbed the Panda (or Farmer) update, it went live on February 24, 2011 and had a serious impact on both content farms and scraping sites. It didn’t remove such sites from the web. It didn’t even remove them from Google. However, it decreased their rankings to the point that content farming and scraping were no longer viable spamming techniques.

This is something that Facebook could easily do and is likely what the engineers were suggesting. They may not be able to detect every piece of plagiarized content on their site, but they can detect low-quality pages and profiles that trade in such content and reduce how often they are recommended or shown to others.

They can’t eliminate the content, but they can keep it from playing such a large role in the average user’s feed.

There is a selfish reason for Facebook to do this. As Newton pointed out in his latest post, Facebook has a serious age problem. It has the oldest demographics of the major social media networks and much of that is because so little of the content on it is original or fresh.

The latest videos are on TikTok and YouTube, the latest images are on Instagram (which is owned by Facebook) and the best livestreams are on Twitch. Reddit, Discord, Medium and a myriad of other sites and forums are where you go for written content.

There’s no field or space where Facebook is seen as being on the cutting edge and that’s largely because of the issue of duplicated content.

In short, while tolerating plagiarized content may be good for their short-term bottom line, it comes at the expense of their long-term reputation.

Bottom Line

To put it simply, Facebook has a problem. It has an addiction to copied content. It is an addiction that has been around for years and has manifested itself in a variety of ways, including obstructing DMCA notices, promoting pages with copied content and even copying features of competing services.

If Facebook wants to thrive in the next ten years, it needs to become known as a place for original content. It needs to be a place where creators create and share, not where people share after they’ve been somewhere else.

To achieve that, Facebook needs to clamp down on plagiarized and copyright-infringing content, build features that enable such creativity and reward the creators who choose it first.

Otherwise, Facebook will continue to get older and more and more secondary in the social media wars. Though its huge audience will mean it has a role for a long, long time to come, that role will be more out of obligation than excitement.

To be clear, this is just one of Facebook’s many problems that have been highlighted by the Wall Street Journal. However, this could be the one, more than any other, that keeps the site falling farther and farther away from relevance.