Frustration and Fighting Plagiarism

Jonathan Bailey, April 30, 2008


We live in a strange time in copyright history. It is an age of rapid change in both technology and law, a time when people’s views about copyright are shifting and artists are trying to convert “free” into a business model. It is the age of Creative Commons, free culture and mashups, an era of free thought and free exchange.

But, even in the face of that, artists have many reasons to want to exert reasonable control over their work. Even those who are comfortable giving away certain rights still don’t want their work to be plagiarized, scraped in its entirety by spammers or otherwise leeched off of.

Even in the era of free culture, there is still a place for copyright enforcement and it still has a role in protecting authors as they post their works online. However, enforcing copyright, for your average author, is no small challenge.

Simply put, the technology is not there, the law is not accessible and, in the three years I have worked in this field, the improvements have been slim.

If individual and smaller copyright holders are going to be able to protect their works at all, there are going to have to be improvements. The bad guys are not growing any fewer and the technology they use is developing rapidly as they compete with search engines to get their work ranked well.

In short, we’re caught in the cross-fire of a different war and we lack the means to protect ourselves. That is something that has to change soon.

The Issues We Face

When a new artist or Webmaster seeks to pursue an infringement of their work, the learning curve is steep and the stakes are very high. Many refuse to start simply because it is too intimidating and I am hard-pressed to fault them.

This learning curve is exacerbated by a collection of inadequacies and problems that Webmasters have to endure and work around. Those problems include the following:

Copyright Law is Confusing: Even if one discounts the slew of copyright myths, copyright law is confusing and vague. It is so bad that many lawyers call it “unreadable” and few touch it if they can avoid it. Laypeople have almost no hope of navigating it successfully, especially when it comes to issues such as fair use.
Lack of Detection Tools: It is a simple fact that there are no tools currently available to the average consumer that were designed, from the ground up, to track content copying. There are ways to use existing tools for that purpose, such as with Google Alerts, but these are hacks designed to turn a generic search engine into a content detection tool. Even services such as Copyscape and Bitscan are search engine hacks in and of themselves, just significantly more user-friendly and elegant in nature. Sadly, despite improvements, this simplicity comes at the expense of accuracy.
Networking is Hard: Assuming one can navigate copyright law and locate an infringement they are prepared to act upon, finding out whom to write to can be difficult. Determining the host of the site is no easy task in many cases and, though new tools have come forward to help with that, their accuracy can leave something to be desired.
Hosts are Uncooperative: Even if you can find the host and determine who to contact, there is no guarantee that they will be helpful. Many hosts are uncooperative in these matters and refuse to remove infringing works, even if the law is very clear. Even those who are cooperative can throw up unnecessary roadblocks and create huge delays.
The Process Doesn’t Scale: Though the process gets easier every time you do it, keeping track of your information gets harder as you grow. I’ve used databases, spreadsheets and even Word files to keep track of the cases I’ve handled. However, every solution has been both inefficient and incomplete.

With these problems in mind, it is very easy to see why so many either avoid pursuing infringements of their work or stop soon after they start. It is an intimidating process and one that can grow into a tremendous time-sink very easily.

Fixing this problem is going to require more than just further changes to the current system. It is going to require a full solution that addresses these issues by creating a completely new set of tools, ones built from the ground up.

Fixing What’s Broken

If we assume that the current system is broken, or at least not functioning very well, we then have to look at what we can do to fix it.

Unfortunately, some things cannot be repaired, at least not anytime soon. Copyright law is not going to magically become intelligible, nor are we going to be able to consistently make locating a host easy. Though we can ease these problems with knowledge and technology, some element of them will remain.

Fortunately, other elements can be addressed.

A Centralized Solution: Currently detection and cessation are two different functions requiring very different tools. Unifying these elements would not only make the process faster, but easier to track. This would make it much less intimidating for newcomers and streamline the entire process for veterans.
Better Searching: The problem with relying on Google, or any other search engine, is that they routinely ban and remove sites that appear to be spam-like in nature. Sadly, these are the exact sites we are sometimes trying to find. Furthermore, any site that relies on Google for searching will make trade-offs between simplicity and accuracy, either missing matches or requiring more human filtering of results.
Real Image Searching: Image searching right now is a failure. There is no way for a visual artist, especially a smaller one, to effectively search for copies of their work on the Web. Fingerprinting technology exists, but is only available to those with deep pockets.
Better Host Accountability: Hosts often ignore DMCA notices and spam reports because they realize that there is very little chance of them being sued. Though much of this is due to the nature of copyright law in the United States, one would hope that there would be methods of holding hosts accountable other than lawsuits. Though I made an attempt at that with my Host Report, my status on the Web combined with limited sample size and lack of time hamstrung the effort. Such a project would have to be larger than just this site.
Integration: Though it is a cliche, we are in Web 2.0. However, the tools are decidedly from the old Web. RSS feeds, widgets and APIs open up a new world of possibilities and ways to simplify the process. New tools should both be aware of these elements and take advantage of them.
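
To make the "Better Searching" point a little more concrete, here is a rough sketch of how a purpose-built tool might compare an original text against a suspected copy directly, rather than going through a search engine. It is only an illustration: the function names (shingles, containment) and the choice of five-word sequences are my own assumptions and do not describe how Copyscape, Attributor or any other service mentioned here actually works.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") in text.

    Hypothetical helper: the size of 5 words is an arbitrary assumption.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def containment(original, candidate, size=5):
    """Fraction of the original's shingles that also appear in candidate.

    1.0 means every word sequence of the original shows up in the candidate,
    even if the candidate wraps it in extra text; 0.0 means no overlap.
    """
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(candidate, size)) / len(source)
```

A post scraped in its entirety and padded with spam links would score at or near 1.0, while an unrelated page would score at or near 0.0. Where to draw the line between quotation and copying is a judgment call that a number like this cannot make for you.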

Though this sounds like a mammoth task, and it is, much of the work has already been done. Large copyright holders already have tools capable of many of these things. They have just yet to be made available to the public at large.

Fortunately, there may be help on the horizon.

A New Interest

Though the tools and technology have not changed much in the past few years, what has shifted is that there is now a growing interest in creating them. At least two companies, Attributor and Blogwerx, seek to bring many of the features above to the table.

Attributor, for its part, seems to be farther along, already doing beta testing and having signed up both the AP and Reuters, among others. They’ve also announced plans for a “self serve” version of the service aimed at bloggers.

Hopefully, we will see fruits from that later this year.

Also, there are rumors of other companies looking to set up and provide similar services, including at least one company that may be producing a stand-alone software application that will provide much of that functionality.

However, no matter what company it is that makes the big leap, this is an area that is ripe for a great deal of innovation in the months and years to come.

Conclusions

In the nearly three years that I have been running this site, precious little has changed for small copyright holders. Even as new tools have helped deep-pocketed corporations protect their works, often to the point of waging an unreasonable war, those same tools have passed over the rest of us.

However, the powers that be have started to see the potential for making the technology available to the rest of us. This has the potential to not only help us protect our content, but further the copyright dialog by bringing about a greater understanding about the frequency and ways that content is being reused.

Right now, for the most part, we are all just feeling around in the dark, hoping to get an idea of what is going on outside of our site. Even those who are against copyright enforcement have to agree that more knowledge is never a bad thing.

Hopefully, that time is coming soon.
