Google Similar Images: Poor Copy Detection

By Jonathan Bailey • Apr 24th, 2009 • Category: Articles, Products


Earlier this week, Google announced a new Google Labs tool called “Similar Images.” The idea is to improve on the accuracy of Google Image Search for ambiguous terms, such as “Jaguar”, which can refer to either a kind of animal or a kind of car.

If you get a wide range of images for a search term, you can use the “Similar Images” link below a result to narrow the search to images that look like it. In the case of the search for Jaguar, that means seeing only the car or only the cat.

However, photographers and artists instantly became interested in whether it could be used to help detect misuse of their images on the Web. Though visual searches such as Tineye do an excellent job finding copies, including modified ones, based upon a source image, they do not have the breadth of Google Image Search.

I decided to do some experimenting with Similar Images to see if it could work, using some regularly copied images from good friends of the site, Sandi and JW Baker at WolfSongStudio.

What I found was very interesting.

Making it Work

The first problem that I had with Similar Images was that there was no means to use a photo or image as a reference point, as with Tineye. Instead, I could only perform a standard text search for an image.

sandi-image-search-fail-1

I decided to search for one of their most popular images, “Dance The Moon,” using the title as a search term. I got a wide range of search results back, including two copies of the image itself, as seen to the right. However, even though there were multiple copies of the image up (both of the ones in this image are legitimate), there was no option to use the Similar Images function.

All I could do was scroll through the results, the same as with any Google Image Search, and see if anyone had happened to copy the image and associate it with its title. After a few pages of searching, it appeared no one had.

So I tried a second image, this one entitled “Grandfather Bear,” with much the same results. The image appeared once on the first page of results, but the result did not offer the “Similar Images” function.

I then looked for another image of Sandi’s that I know to be widely plagiarized, one entitled “Hawkwoman,” but only found images of the comic book character. I narrowed the search by adding Sandi’s name, which in turn located several copies of the image, all on legitimate sites, but still no “Similar Images” function.

At this point, I decided to look for something more generic. I searched for “George Washington” and then clicked the Similar Images link below one of the paintings, only to find that it didn’t work well at all for finding images similar to it.

gw-image-fail-1

I then ran the George Washington image through Tineye and the results were staggering: pages and pages of images that were nearly identical to the source.

tineye-win

It appears that, despite its limitations, Tineye is still the clear winner in detecting image plagiarism.

The Problem with Google Similar Images

The simple truth is that Similar Images was never designed to detect image copying or plagiarism and, without drastic changes, likely won’t be able to fulfill the function.

There are two reasons for this:

  1. Still Tied to Keywords: Similar Image Search only finds similar images that register for the same keyword. If an image is uploaded with a different title or no keywords at all, then it won’t show up as a similar image in Google.
  2. Weak Matching Algorithm: The matching algorithm, from what I can see, has very broad guidelines as to what is or is not similar. It seems to look mostly at the color of the images, not the actual content. This creates a problem when trying to detect image copying. Though it can distinguish a cat from a car somewhat well, it can’t tell a painting of George Washington from a building.
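
To illustrate why color-focused matching makes a poor copy detector, here is a minimal sketch in Python. It is not Google's or Tineye's actual algorithm, neither of which is public in this post; the tiny 4x4 "images", the function names, and the choice of a coarse color histogram versus a simple average hash are all assumptions made for the example.

```python
from collections import Counter

def color_histogram(pixels, bins=4):
    """Normalized coarse RGB histogram of a flat list of (r, g, b) pixels."""
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}

def histogram_similarity(a, b):
    """Histogram intersection: 1.0 means identical color distributions."""
    return sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in set(a) | set(b))

def average_hash(pixels):
    """One bit per pixel: is it brighter than the image's mean brightness?"""
    gray = [(r + g + b) / 3 for r, g, b in pixels]
    mean = sum(gray) / len(gray)
    return [1 if value > mean else 0 for value in gray]

def hamming_distance(h1, h2):
    """Number of positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(h1, h2))

# Hypothetical 4x4 images, flattened row by row.
DARK, LIGHT = (20, 20, 20), (230, 230, 230)
split_image = [DARK, DARK, LIGHT, LIGHT] * 4              # dark left, light right
checkerboard = ([DARK, LIGHT] * 2 + [LIGHT, DARK] * 2) * 2  # alternating squares
exact_copy = list(split_image)

# Color only: the two different pictures look identical.
color_score = histogram_similarity(
    color_histogram(split_image), color_histogram(checkerboard)
)  # 1.0

# Layout-aware hash: the different picture is far away, the copy is not.
layout_gap = hamming_distance(average_hash(split_image), average_hash(checkerboard))  # 8
copy_gap = hamming_distance(average_hash(split_image), average_hash(exact_copy))      # 0
```

The two pictures share exactly the same colors, so a color-only comparison scores them as a perfect match even though their content differs, while a hash that records where the light and dark areas sit separates them and still recognizes the true copy. That is the gap between "similar looking" and "the same image" described above.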

Basically, Similar Images was designed to refine Google’s existing image search product, not make it usable for another function. Though it is probably better off with it than without it, it doesn’t make Google Image Search the plagiarism detection system many would have hoped for.

Bottom Line

In the end, this is an offering that is still in beta (within Google Labs), so it will likely change and improve. It could become, with time and work, a useful tool for detecting image copying and misuse. In the meantime, though, most artists are better off with Tineye or, if they want a professional solution, PicScout.

Google Similar Images shows a great deal of promise in this area, but doesn’t deliver on it as is.


Jonathan Bailey is The Webmaster and author of Plagiarism Today, which he founded in 2005 as a way to help Webmasters going through content theft problems get accurate information and stay up to date on the rapidly-changing field. He is also a consultant to Webmasters and companies to help them devise practical content protection strategies and develop good copyright policies.

  • Suzanne Matick
    As you covered in another column, if Google had PicScout ImageExchange all those similar images would be distinguished by those with rights information easily accessible. This could make a big difference to image buyers when they make their actual selection of an image to use. http://tinyurl.com/yzs238c
  • Biplab Das
    how can i find my image if it is misused by others. I am scared that if anybody misused my picture which I have been sent them through mail. In recent days I found that one of my mail friend is fake who us using fake picture. we are mailing each other for couple of months. But now I found that it is fake. Thats why I want to be sure that my picture is not misused by others. Is there any way to find out my picture, if it is misused by others or In which site I can search the web (if anyone used my picture on other site) through my picture. plz. help.
  • Nice Google-oriented article by Mr Bailey: http://www.plagiarismtoday.com/2009/04/24/googl...
  • goog image search - read the "bottom line" section - http://www.plagiarismtoday.com/2009/04/24/googl...
  • Great feature. This will help the image authors especially the artists that published their artworks and images online. But tied to keywords will only limited the detection. I agree with you.

    Giving credits to the original image authors is one of the way to appreciate their effort and copyright.
  • Hopefully, as detection gets easier, giving credit for artwork will become a more standard practice...
  • Debora Weber-Wulff
    A program from a colleague of mine, called ImageSorter, now can search Yahoo and Flickr (but not Google) for similar images.

    http://mmk.f4.fhtw-berlin.de/Projekte/ImageSorter/

    The algorithms used are fairly good at finding similarities based on color. You can start off with a keyword search, then select some pictures and narrow it down.
  • Thanks! I just grabbed a copy of the software and am going to try it out in just a bit, this sounds very exciting. I find it interesting though that it can use Yahoo! but not Google. It's not a problem though since the databases will be comparable...
  • Meredith
    I started looking at it and did indeed find some infringement after a bit, but I don't think that was quite due to their technology as me looking up my specific image titles. It seems it works better for coming up with the same general subject and colors.
  • Precisely. It's got a good use, just not one for detecting image matches...