A pair of recent articles, one by Louis Gray and another by possible248 (who co-authors the blog with, among others, Voyagerfan5761; both are regulars here), showcased public interest in relevant search terms, namely company names and Linux distributions respectively, using Google Trends.
This, in turn, inspired me to do my own keyword analysis to gauge if and how public interest in topics relevant to this site has changed over the years.
What I found was surprising and seemed to run counter to what I was seeing with my own traffic but was interesting nonetheless.
Perhaps the most obvious keyword, and definitely the most common one that leads visitors to this site, “plagiarism” has seen surprisingly little change over the past few years.
Overall, the graph for it is flat with a few “ticks” upward when news stories, such as the Obama controversy and the Kaavya Viswanathan scandal, broke. There are also seasonal downward ticks at the end of every year, likely due to the holidays.
In general, it appears that the overall interest in plagiarism, both academically and artistically, has remained consistent and unchanged.
Probably the most unusual graph belongs to content theft, which spiked as a search term in mid-2005, around the time this site was founded, and then leveled off, only to become a regular search term again in recent months.
It is unclear to me what has caused these specific spikes, but the latest one seems to be holding and showing some sustainable interest in the topic, something that could indicate greater public interest in the issue and in the term itself.
Copyright, on the other hand, has seen a marked decrease over the past few years, at least as a search term.
While this seems counter-intuitive, considering that stories about copyright, especially as it pertains to the RIAA/MPAA, seem to dominate social news sites, people are clearly not searching for copyright information as much as they used to.
This is reflected even more strongly in the related graphs for the RIAA and the DMCA, where the downward slope is even more pronounced and, in the case of the RIAA, search interest seems to almost disappear completely.
Though it doesn’t appear that people have lost interest in copyright issues, it is clear that they are not searching for them as much as they once were.
One of the greater concerns people have about plagiarism is the issue of duplicate content. As we can see on the graph above, the term rocketed onto the chart in early 2007, stabilized and seems to be slowly marching upward.
Duplicate content, of course, covers not just plagiarism and scraping but a wide variety of SEO concerns. It is clear that this is a topic being talked about more and more, but it is unclear in what capacity the term is being searched for.
Plagiarism Detection Tools
The chart for Copyscape (shown above) shows a steady increase in the number of searches over the past year and a half. This seems to mesh with my own experience, which has shown a great increase in content protection over the past 18 months.
Other plagiarism detection tools, such as Bitscan and Attributor, did not have enough information for Google Trends to draw any conclusions. Academic plagiarism detection tools, such as Turnitin, have shown a steady increase with seasonal dips as school lets out.
Long Tail Keywords
Unfortunately, a lot of the keywords most specific to this site, such as “spam blogs”, “splogs” and “RSS scraping”, did not have enough data to produce results. Many of these terms are fairly new, created since I started Plagiarism Today, and are not widely used.
It will be interesting to see in a year or two if these keywords start to register then.
In doing this “study” I realize that Google Trends is a limited and largely invalid source of data. Not only is the data proprietary, meaning it cannot be vetted, but the information is relative and contains little hard data.
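To illustrate why relative data is a problem, here is a minimal Python sketch. It assumes a simple "scale everything to the peak" index, which is my own simplification and not necessarily how Google builds its charts, and the search counts are invented purely for illustration.

```python
def to_relative_index(volumes):
    """Scale raw counts so the highest value becomes 100.

    This is an assumed, simplified normalization, not Google's
    documented method. The absolute size of the counts is lost.
    """
    peak = max(volumes)
    return [round(100 * v / peak) for v in volumes]


# Hypothetical monthly search counts (made up for this example).
niche_term = [50, 100, 75, 100]
popular_term = [50_000, 100_000, 75_000, 100_000]

niche_index = to_relative_index(niche_term)
popular_index = to_relative_index(popular_term)

print(niche_index)
print(popular_index)
# The two charts are identical even though one term is searched
# a thousand times more often than the other.
print(niche_index == popular_index)
```

In other words, a chart like this can show whether interest in a term is rising or falling, but not how many people are actually searching for it.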
Also, many of the keywords looked at are not keywords that typical searchers would use and instead would only be searched for by bloggers. Others, however, were likely searched by both. This means that we may not have an accurate picture of how content creators specifically feel about these issues.
The goal of this check was just to get a quick idea of what was going on and what the potential attitudes were.
When I personally look at these charts, I draw three conclusions.
First, I see that there is a sharp decrease in searchers’ interest in the legal aspects of copyright. This could be due to greater understanding of copyright, and thus less need to search for it, or simply that users have moved on from the early copyright controversies of the late nineties.
Second, there is a clear, if slow, increase in interest in tracking one’s own content and the non-legal penalties that come from infringing or being infringed. This could be a sign that creators are not thinking about these issues in the light of a legal paradigm, but rather, in a more practical framework.
Finally, it is clear that the interest in plagiarism, both academically and artistically, remains fairly steady and that it remains an issue of interest even after the scandals fade from the headlines.
Personally, this site has seen explosive growth over the past year, both doubling in traffic and enabling me to leave my day job to work full-time as a consultant. Clearly, things are changing in this area.
I look forward to following these changes closely over the coming years.
Note: All of the graphs in this post are used with permission from Google.