Google Similar Images: Poor Copy Detection


Earlier this week, Google announced a new Google Labs tool called “Similar Images.” The idea is to improve on the accuracy of Google Image Search for ambiguous terms, such as “Jaguar”, which can refer to either a kind of animal or a kind of car.

If you get a wide range of images for a search term, you can use the “Similar Images” link to narrow the results to images that look like the one above the link. In the case of a search for “Jaguar”, that means limiting the results to either the car or the cat.

However, photographers and artists instantly became interested in whether it could be used to help detect misuse of their images on the Web. Though visual searches such as Tineye do an excellent job finding copies, including modified ones, based upon a source image, they do not have the breadth of Google Image Search.

I decided to do some experimenting with Similar Images to see if it could work, using some regularly-copied images from good friends of the site, Sandi and JW Baker at WolfSongStudio.

What I found was very interesting.

Making it Work

The first problem that I had with Similar Images was that there was no means to use a photo or image as a reference point, as with Tineye. Instead, I could only perform a standard text search for an image.

sandi-image-search-fail-1

I decided to search for one of their most popular images, “Dance The Moon” by using the title as a search term. I got a wide range of search results back, including two of the image itself, as seen to the right. However, even though there were multiple copies of the image up (both of the ones in this image are legitimate), there was no option to use the Similar Image Search function.

All I could do was scroll through the results, the same as with any Google Image Search, and see if anyone had happened to copy the image and associate it with its title. After a few pages of searching, it appeared none had.

So I tried a second image, this one entitled “Grandfather Bear”, with much the same results. The image appeared once on the first page of results, but without the “Similar Images” link.

I then looked for another image of Sandi’s that I know to be well-plagiarized, one entitled “Hawkwoman”, but only found images of the comic book character. I narrowed the search by adding Sandi’s name, which in turn located several copies of the image, all on legitimate sites, but still no “Similar Images” link.

At this point, I decided to look for something more generic. I searched for “George Washington” and then clicked the Similar Images link below one of the paintings, only to find that it didn’t work well at all for finding images similar to it.

gw-image-fail-1

I then ran the George Washington image through Tineye and the results were staggering: pages and pages of images that were nearly identical to the source.

tineye-win

It appears that, despite its limitations, Tineye is still the clear winner in detecting image plagiarism.

The Problem with Google Similar Images

The simple truth is that Similar Images was never designed to detect image copying or plagiarism and, without drastic changes, likely won’t be able to fulfill the function.

There are two reasons for this:

  1. Still Tied to Keywords: Similar Images only finds similar images that register for the same keyword. If an image is uploaded with a different title or no keywords at all, then it won’t show up as a similar image in Google.
  2. Weak Matching Algorithm: The matching algorithm, from what I can see, has very broad guidelines as to what is or is not similar. It seems to look mostly at the color of the images, not the actual content. This creates a problem when trying to detect image copying. Though it can distinguish a cat from a car somewhat well, it can’t tell a painting of George Washington from a building.
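
To illustrate why matching mostly on color is a poor fit for copy detection, here is a minimal Python sketch. It is not meant to represent the actual algorithm behind Google Similar Images or Tineye; the function names and the tiny synthetic "images" are my own illustrative assumptions. It compares a color histogram, which ignores where colors appear, with a simple average hash, which records the layout of light and dark areas.

```python
from collections import Counter

def color_histogram(pixels, levels=4):
    """Normalized histogram of quantized RGB colors; pixel positions are ignored."""
    step = 256 // levels
    counts = Counter(
        (r // step, g // step, b // step) for row in pixels for (r, g, b) in row
    )
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}

def histogram_similarity(a, b):
    """Histogram intersection: 1.0 means identical color distributions."""
    ha, hb = color_histogram(a), color_histogram(b)
    return sum(min(ha.get(k, 0.0), hb.get(k, 0.0)) for k in set(ha) | set(hb))

def average_hash(pixels):
    """One bit per pixel: is it brighter than the image's mean brightness?"""
    gray = [sum(p) / 3 for row in pixels for p in row]
    mean = sum(gray) / len(gray)
    return [1 if g > mean else 0 for g in gray]

def hash_distance(a, b):
    """Number of differing bits between the two average hashes."""
    return sum(x != y for x, y in zip(average_hash(a), average_hash(b)))

# Hypothetical 4x4 test images (illustrative data only).
RED, WHITE = (200, 30, 30), (250, 250, 250)
original = [[RED, RED, WHITE, WHITE] for _ in range(4)]        # red left, white right
different = [[WHITE] * 4, [WHITE] * 4, [RED] * 4, [RED] * 4]   # same colors, new layout
# A slightly recolored copy of the original, as a re-saved image might be.
copy = [[(210, 35, 35), (210, 35, 35), (240, 240, 240), (240, 240, 240)]
        for _ in range(4)]

print(histogram_similarity(original, different))  # color only: looks identical
print(hash_distance(original, different))         # layout: half the bits differ
print(hash_distance(original, copy))              # layout: no bits differ
```

In this toy case the histogram scores the unrelated image as a perfect match because it uses the same colors, while the layout-based hash separates it from the original and still matches the recolored copy. A real copy-detection system would need to be far more robust than this, but the contrast shows why a color-weighted notion of "similar" works against finding copies.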

Basically, Similar Images was designed to refine Google’s existing image search product, not make it usable for another function. Though it is probably better off with it than without it, it doesn’t make Google Image Search the plagiarism detection system many would have hoped for.

Bottom Line

In the end, this is an offering that is still in beta (within Google Labs) so it will likely change and improve. It could become, with time and work, a useful tool for detecting image copying and misuse. In the meantime though, most artists are better off with Tineye or, if they want a professional solution, PicScout.

Google Similar Images shows a great deal of promise in this area, but doesn’t deliver on it as is.
