If you are looking for your content online or trying to find plagiarism in content you’ve been handed, Google can be both your best friend and your worst enemy. It’s frustrating when a spammer or plagiarist outranks you in Google for your own work, and Google’s removal policies have, at least in the past, left a lot to be desired, but Google is also one of the best tools around for discovering copied content online.
Google also has many “under the hood” tweaks and features that make such searches much easier, yet it seems that few people know about them.
With that in mind, here are five simple and fast Google hacks to help you find your content on the Web, understand how it is being used and, when appropriate, put a stop to its abuse.
1. Set Result Date Range
If you regularly search for your content, for example static marketing copy that you routinely check on, you’ll probably find that, after a while, the search results become less and less valuable as they fill up with older, lawful uses of your work, false positives or older cases you didn’t or can’t deal with.
However, Google has a feature that lets you see only the most recent results. Simply by clicking “Show Search Tools” in the left-hand column, you can choose from a variety of date ranges, including the past day, week, month or year, or enter a custom range.
This makes it easy to see only the results that have appeared since the last time you searched. As an alternative, if you stay logged in to the same Google account, you can also use the “Not Yet Visited” link to see only results you haven’t been to, making it even easier.
2. Digital Fingerprints
Tracking dynamic content, such as blog content, is difficult because it is constantly changing and the shelf life of most of it is so short that searching for all of it is impractical, if not impossible. However, using a digital fingerprint, a trick I’ve talked about before, makes it easy to locate versions that were scraped from your RSS feed.
Basically, you just edit your RSS feed to include a semi-random string of letters and numbers, something that is unique to you and your site, and routinely search for it. Wherever it appears, you know someone, most likely, scraped your RSS feed and you’ll want to follow up.
You can even pair this up with Google Alerts and be notified via email when your fingerprint appears online.
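As a rough illustration of the fingerprint idea, here is a minimal Python sketch. It assumes you want a 16-character string of lowercase letters and digits; the function names (make_fingerprint, quoted_query, contains_fingerprint) are hypothetical helpers made up for this example and are not part of any Google or RSS tool. How you add the string to your feed depends on your blogging software and is not shown.

```python
import secrets
import string

# Assumption: lowercase letters and digits are enough to make the string unique.
ALPHABET = string.ascii_lowercase + string.digits


def make_fingerprint(length: int = 16) -> str:
    """Return a semi-random string of letters and digits to embed in a feed."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def quoted_query(fingerprint: str) -> str:
    """Wrap the fingerprint in quotes so a search engine looks for it exactly."""
    return f'"{fingerprint}"'


def contains_fingerprint(page_text: str, fingerprint: str) -> bool:
    """Check whether a copy of a page you saved contains the fingerprint."""
    return fingerprint.lower() in page_text.lower()


if __name__ == "__main__":
    fp = make_fingerprint()
    print("Add this to your feed footer:", fp)
    print("Search for:", quoted_query(fp))
```

Generate the string once, keep it, and reuse the same quoted query each time you search or when you set up an alert.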
3. Google Similar Image Search
As great as Google is for finding text, it is sometimes rather lackluster for finding images. The reason is that the text contents of a webpage are easily machine-readable whereas the contents of an image are not, which is why CAPTCHAs work to keep bots out.
However, Google does offer one neat trick for detecting image plagiarism and copyright infringement. If you visit Google Image Search and hover over an image, you can select the “Similar” link and Google will take you to results that look, to Google at least, to be the same.
It works best with images that are broadly used on the Web, but even for images with a limited amount of reuse you may still find it to be a practical alternative to dedicated image search engines, such as TinEye.
4. Detect Translated Plagiarism (kind of)
Google has a new, somewhat experimental technique for detecting content that has been translated from one language to another. You can use it by performing a traditional search for a statistically improbable phrase (using quotes) and then, as with the date range search, clicking “Show Search Tools”. Then, at the very bottom of the list, click “Translated Foreign Pages”.
To be honest, the translation is a bit crude and, since it is automated, is unlikely to catch many plagiarisms, especially if you choose phrases with difficult-to-translate words. However, the feature does check multiple languages at once, including letting you choose which languages to add, and shows a great deal of promise for the future.
Right now, however, it isn’t much more than a toy that might get lucky once in a while.
5. Wildcard Game
If you are having a difficult time finding a good, unique phrase in the content you are searching for, you may be able to find a workable solution by using Google’s wildcard function.
Basically, if you use an “*” in the middle of a phrase, Google will treat it as a wildcard and plug in any word. So, for example, if you have a phrase that is relatively unique but sometimes people like to substitute a word or two, you can easily wildcard those words out and get all the results.
This is especially useful when searching for academic plagiarism. All one has to do is find a suspect phrase, put quotes around it and asterisk out the word or words that seem out of place and then let Google do the searching.
Remember, however, the wildcard only works for whole words, not for partial words.
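To make the wildcard trick concrete, here is a small Python sketch that builds such a query from a phrase and a list of words you suspect were swapped. The function name wildcard_query and the sample sentence are made up for illustration; the only thing taken from the tip above is that the phrase goes in quotes and each suspect whole word becomes an asterisk.

```python
from typing import Iterable


def wildcard_query(phrase: str, suspect_words: Iterable[str]) -> str:
    """Quote the phrase and replace each suspect word with a whole-word asterisk."""
    suspects = {word.lower() for word in suspect_words}
    tokens = []
    for word in phrase.split():
        # Compare without surrounding punctuation so "lazy," still matches "lazy".
        core = word.strip(".,;:!?\"'()").lower()
        # The wildcard stands in for a whole word, never part of one.
        tokens.append("*" if core in suspects else word)
    return '"' + " ".join(tokens) + '"'


if __name__ == "__main__":
    query = wildcard_query(
        "the quick brown fox jumps over the lazy dog",
        ["brown", "lazy"],
    )
    print(query)  # "the quick * fox jumps over the * dog"
```

Paste the resulting string into the search box as-is. If too many words are replaced, the phrase stops being distinctive, so it usually makes sense to wildcard only the one or two words that look out of place.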
Hopefully these quick Google hacks can help you be a better Google Ninja when looking for your content or checking other work for plagiarism. They by no means make Google a one-stop shop for plagiarism fighting, but they do make it a bit more powerful and help give you an edge when you’re looking for copied content.
In the end, plagiarism detection is still very much a game for humans and Google is just a tool to help with it. It will always take the judgement of human eyes and human minds to make the determination of what is right and wrong, both legally and ethically.
Still, it is nice to get a little help along the way.