People generally first discover plagiarism of their work in one of three ways: curiosity, a Good Samaritan or an accident.
Of the three, curiosity seems to be the least common, at least in my own experience. Although there's little doubt that more and more people are wondering who is reusing their content, something helped by services like Copyscape, it's still rare for people to take an interest in protecting their copyright without first discovering a problem. Discovering plagiarism through a friend or a stranger, the way I discovered my first plagiarist, is also very unusual. Sadly, few people, for whatever reason, detect and report plagiarism involving others' works.
Strangely enough, "happy" accidents seem to be the leading means by which Webmasters and bloggers first discover that their work is being ripped off. The unfortunate side of this is that it can greatly increase the shock and emotion in what is often an already intense situation.
However, on the positive side, it can provide new clues on ways to actively search for and detect plagiarism. If an "accident" can be successfully repeated, it may be of great use to others.
A Rose By Any Other Name…
Like many, Sayesha found her first plagiarist through a search. However, she wasn't looking for the title of her post, a keyphrase or anything else to do with her writing. Instead, she was searching for her name.
Searching for one's name is not only a way to potentially detect plagiarism, it is also a way to learn what others are saying about you and to find out what potential employers might see if they do the same. Many bloggers already search for their own name regularly, and some even have Technorati watchlists for the purpose.
However, the usefulness of name searches for detecting plagiarism is limited severely by both how unique your name or pen name is and whether or not you put your name anywhere in your actual content. People who have unusual names and write autobiographical stories (or have the habit of referring to themselves in the third person) will likely get more out of this approach.
Despite that, the technique has its uses. Names are resistant to synonymizing, so they are likely to remain intact even if other parts have been automatically changed. Also, since many traditional copy/paste plagiarists will scoop up bylines by accident, it can aid in detecting those as well.
So, while it won't replace existing plagiarism detection techniques, it can make an excellent additional method for those with unique names. Setting up a Google Alert to check for new instances of your name could provide some interesting insight and, perhaps, a few cases of plagiarism along the way.
Reading What You Love
If you're passionate about something and run a Web site regarding it, odds are that you keep up to date on it. Google Alerts, blog searches and news sites are great ways to keep up to date on current events in your field.
However, they often have the strange side effect of exposing a plagiarist or two along the way.
For example, to help with my writing for this site, I have a variety of search terms that I follow regularly including "plagiarism", "content theft", "copyright infringement" and more. Most of the time, my own posts show up in my searches, as one might expect, but a few times they've appeared twice, once on my blog and once under a different person's name.
Ignoring the fact that only a very daft person would try to steal from a site entitled "Plagiarism Today", it has happened and I've usually discovered it through a combination of Feedburner's Uncommon Uses feature and my regular Technorati searches.
Simply put, plagiarized copies of your content are very likely to show up in the same places that your content does. If you follow the searches that mean the most to you, you'll likely find unauthorized copies of your work at some point in time.
The downside to this method is that it offers the most help to bloggers and Webmasters who focus on a smaller niche. Those dealing with a broad range of topics will likely have moved on to something else long before their old works get picked up.
Otherwise though, this is just another way that it pays to stay on top of your subject. You really never know what you may find.
How Did You Get Here?
Finally, Webmasters who have access to referral logs generally check them from time to time. It's considered good practice to know both where your traffic is coming from and where your server's resources are going.
However, referral logs can also, sometimes, point you to a plagiarist. This was the case for Kasia who discovered a forum user was plagiarizing a post of hers, which, in turn, was borrowed, with attribution, from Robert Cringely.
Referral logs are most useful for visual artists who might have their images not just plagiarized, but hotlinked. This results not only in a theft of intellectual property, but also of server resources. Those cases are increasingly common as more people unfamiliar with the dynamics of the Internet begin to publish their own blogs and their own sites.
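For those who can get at their raw server logs, a hotlink check can be roughed out in a few lines. The sketch below is only an illustration under some assumptions: it assumes the log uses the common "combined" format (where the referring page is the quoted field after the response size), and the function name, the list of image extensions and the example domains are all made up for the example.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed "combined" log format:
# ip - - [date] "GET /path HTTP/1.1" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")  # illustrative list


def find_hotlinkers(log_lines, own_hosts):
    """Count image requests whose referring page is on someone else's site."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # not a line in the assumed format
        path = urlparse(match.group("path")).path.lower()
        if not path.endswith(IMAGE_EXTENSIONS):
            continue  # only image requests are of interest here
        host = urlparse(match.group("referer")).hostname
        # A blank referer ("-") has no hostname; skip it and our own pages.
        if host is None or host in own_hosts:
            continue
        hits[host] += 1
    return hits


# Made-up log lines, purely for illustration.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2006:13:55:36 -0700] "GET /images/rose.jpg HTTP/1.1" '
    '200 2326 "http://copycat.example/post.html" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2006:13:56:01 -0700] "GET /images/rose.jpg HTTP/1.1" '
    '200 2326 "http://www.myblog.example/gallery.html" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2006:13:57:12 -0700] "GET /index.html HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0"',
]

for host, count in find_hotlinkers(SAMPLE_LOG, {"www.myblog.example"}).items():
    print(host, count)
```

Any host this turns up is only a lead. Image searches, feed readers and sites with permission to use the image will appear too, so each one still needs a look by hand.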
However, theft of textual work can be detected by referral logs as well. Many times blogs link to older articles in order to reference them. This prompts visitors on the plagiarist's site to click the links and visit yours, causing the duplicate article to show up as a referral.
The downside to this method is that it simply will not work if the image is hosted on the plagiarist's server or if the text is copied without the links intact. Also, since many plagiarists do not get a lot of traffic at their sites, the odds of one showing up very high in the logs are slim. In short, it would almost certainly take a very thorough check of the referral logs to catch any but the biggest and boldest of plagiarists.
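Since a plagiarist is unlikely to rank among the top referrers, one way to make that thorough check less tedious is to list every outside page that sent a visitor, no matter how few, and set aside the hosts you already recognize. What follows is a minimal sketch, again assuming the "combined" log format; the function name, the `known_hosts` parameter and the sample domains are hypothetical.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Assumed "combined" log format; the referer is the quoted field after the size.
REFERER = re.compile(r'"[A-Z]+ \S+ [^"]*" \d{3} \S+ "(?P<referer>[^"]*)"')


def unfamiliar_referrers(log_lines, known_hosts):
    """Map each unrecognized referring host to the pages that linked in."""
    pages = defaultdict(set)
    for line in log_lines:
        match = REFERER.search(line)
        if not match:
            continue  # not a line in the assumed format
        referer = match.group("referer")
        host = urlparse(referer).hostname
        # Skip blank referers ("-") and hosts already known to be harmless.
        if host is None or host in known_hosts:
            continue
        pages[host].add(referer)
    # Sorted lists are easier to read through by eye than sets.
    return {host: sorted(urls) for host, urls in pages.items()}


# Made-up log lines, purely for illustration.
SAMPLE_LOG = [
    '198.51.100.4 - - [12/Oct/2006:09:14:02 -0700] "GET /old-article.html HTTP/1.1" '
    '200 8120 "http://copycat.example/stolen-post.html" "Mozilla/5.0"',
    '198.51.100.8 - - [12/Oct/2006:09:20:45 -0700] "GET /old-article.html HTTP/1.1" '
    '200 8120 "http://search.example/results?q=plagiarism" "Mozilla/5.0"',
    '198.51.100.9 - - [12/Oct/2006:09:31:10 -0700] "GET /index.html HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0"',
]

known = {"www.myblog.example", "search.example"}
for host, urls in unfamiliar_referrers(SAMPLE_LOG, known).items():
    print(host, urls)
```

A single hit is kept on purpose here, since a low-traffic copy may only ever send one visitor. The list of known hosts would grow over time as you rule out search engines and friendly sites.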
Finally, most free sites and blog services do not offer access to referral logs, and though a free counter can offer many of the same benefits, it will do little to stop image plagiarism.
Despite that, referral logs can and do work. Plagiarists are regularly caught using them and it makes perfect sense to check your site's statistics regularly for other reasons, not just stopping plagiarism.
In the end, the accidents that lead to the discovery of plagiarism are rarely accidents at all. They're the natural outcome of good Webmasters doing the things that good Webmasters do. They might not expect to find plagiarism of their work while doing it, but an added perk of the behavior is that, sometimes at least, it can help you find the people who are stealing your content.
Sadly though, it's not enough to rely on "happy accidents" to detect plagiarism. Of all of the cases I have handled, only a small percentage were discovered by surprise. Rather, the vast majority have come in through either my Google Alerts, special plagiarism searches or good-natured readers who, when made aware of the problem, began to actively help me pursue copycats.
Despite that, it's important to keep one's eyes open and realize that the things Webmasters do every day can yield surprising results.
Simply put, the fact that there are more thorough and more effective ways to hunt plagiarists should not discourage us from using the information we already access regularly to aid us. It takes no time to open our eyes and pay attention.
In fact, it's probably the easiest step any of us can take when protecting our content.