One of the stories making the rounds lately involves allegations that the second-largest search engine, Bing, has been copying results from the largest, Google.
The result has been a pretty ugly back and forth between the two companies, one that has ended with each side accusing the other of being unethical and/or dishonest, and with a lot of negative press hurled at Microsoft.
So what is really going on? It’s not a simple question to answer, partly because we only have a brief glimpse of what either side is doing or has done behind the scenes. The most we can try to do is make heads or tails of it and see what it means both legally and ethically for the two companies.
What Happened: Does Bing Copy Google?
According to Google, after some recent relevancy updates to Bing, its engineers began to notice a startling similarity between Google’s results and Bing’s, especially on extremely unusual queries such as “torsoraphy”, a misspelling of “tarsorrhaphy”, a rare surgery on the eyelids. Google says Bing didn’t correct the spelling but still directed people to the same first result as Google.
This, in turn, prompted Google to run a “sting” operation in which it manipulated the results for one hundred random, nonsensical search terms, such as “hiybbprqag”, to add random pages to the top of the results. Google then sent 20 of its engineers home with new Windows laptops and had them perform test queries on Google from home in Internet Explorer 8, with both the Bing Toolbar and Suggested Sites (a feature in IE8 that tracks user browsing to recommend related websites) enabled.
The result was that, in about eight of the searches, Bing’s results were changed to match Google’s, even though the pages chosen by Google didn’t make sense for the query.
This was enough to convince Google that something was afoot and the story started making the rounds.
Bing later responded saying that they don’t copy from Google directly but that they do use the anonymous click and surfing data from their users as one of over 1,000 points of data to determine results. They went on to say that Google’s experiment was a “click fraud” attack similar to what spammers do and was trying to manipulate Bing’s results on ultra-long tail keywords where it was most vulnerable.
But all of this brings us back to our original question: Is Bing plagiarizing from Google?
No Easy Answers
As I read through the various points/counterpoints while researching this article, there were three facts that seemed to be largely overlooked:
- Google’s Honeypot Worked Less Than 10% of the Time: Google attempted the honeypot with some 100 keywords but only 7-9 actually worked. That means that, in over 90 cases, it didn’t work.
- The Keywords Involved Were Extreme Long Tail: The keywords in the honeypot were ones that showed no results or no relevant results. The keywords that aroused suspicion were primarily typos of strange, rarely-used words. Major searches, it seems, are unaffected by this as there is a lot of variance.
- Engineers Had to Take Active Action: The engineers in the test didn’t simply alter Google’s results and wait for Bing to scrape them. They loaded up laptops, enabled tracking, performed the searches and clicked the desired result.
Clearly, this isn’t a case of Bing scraping Google’s results (which is what Scroogle does by design and with attribution). Instead, it’s a case of Bing’s underlying technology giving weight to actions by Microsoft IE users who visit Google. In short, what most seem to agree happened is:
- Users submit surfing data via IE and the Bing toolbar.
- Those users choose to search on Google instead of Bing, and Bing tracks their clicks there.
- On search terms where Bing has nothing or very little, those clicks sometimes get a lot of weight.
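To see why a minor signal could end up deciding the result, here is a minimal sketch in Python. It is emphatically not Bing’s actual algorithm, which isn’t public. The function `rank`, the `click_weight` parameter, the scoring formula and the example URLs are all hypothetical, made up purely to illustrate the three steps above.

```python
from collections import Counter


def rank(query, content_scores, click_log, click_weight=0.1):
    """Toy ranking: content relevance plus a small click-based boost.

    Hypothetical illustration only, not any real search engine's method.
    content_scores: dict of url -> relevance score in [0, 1] for this query
    click_log: list of (query, clicked_url) pairs from opted-in users
    """
    # Count the clicks recorded for this exact query.
    clicks = Counter(url for q, url in click_log if q == query)
    total_clicks = sum(clicks.values())

    scored = {}
    for url in set(content_scores) | set(clicks):
        click_share = clicks[url] / total_clicks if total_clicks else 0.0
        scored[url] = content_scores.get(url, 0.0) + click_weight * click_share

    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda u: (-scored[u], u))


# A common query: strong content signals outweigh the click boost.
common = rank(
    "laptops",
    {"a.example": 0.9, "b.example": 0.7},
    [("laptops", "b.example")] * 5,
)

# A nonsense query: no content signals, so the clicks alone decide.
nonsense = rank(
    "hiybbprqag",
    {},
    [("hiybbprqag", "unrelated.example")] * 3,
)
```

In this toy model, the clicked page for the common query still ranks second because the content scores dominate. For the nonsense query, the only page with any score at all is the one users clicked, so it lands on top no matter how small `click_weight` is.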
So, this brings us to the question I’ve been dreading: Is this plagiarism or otherwise unethical or illegal?
Legally, it seems dubious that Google’s search results could be considered copyrightable. Considering that wire frame models based on cars were recently deemed to lack sufficient creativity for copyrightability, it seems likely that results generated solely by an algorithm, with no human involvement, would be too. However, I wasn’t able to find a ruling directly aimed at this issue so, if anyone has one, please send it my way.
Furthermore, Bing isn’t actually copying directly from Google, but looking at user data and drawing its own conclusions so, in the eyes of the law, it’s unlikely that there would be much in the way of a claim Google could make. It seems, largely, to be a matter between Microsoft and the users of its products.
That being said, the ethics are a much more complicated question.
Bing, when it set about introducing click tracking as a factor in its search results, had to know that many of the users it tracked would use Google. It seems logical, then, that Bing knew it would be getting information about Google’s results, and it did nothing to prevent that. However, I’m not completely sure it should have.
For one, you can learn from your competitor’s results and product without copying them. By tracking clicks, Bing might be able to see that some sites that rank well in Google aren’t worth ranking well in its own engine. This seems to be mostly what Bing does, as it was only non-competitive search terms that appear to have been copied.
That being said, Bing, as Google’s main competitor, should be trying to create its own unique search experience, not merely trying to recreate Google’s with slightly better results. Though it may only be one factor that Bing considers, considering Google’s results for your own makes it look like you’re trying to build on Google’s back.
So is it unethical? I would consider it a gray area. At least as far as the relationship between Google and Bing goes, I don’t feel completely right giving Bing’s actions the OK, but I can’t outright condemn them either. Bing is walking a thin line here, and which side of it they fall on depends largely on information we don’t have, such as exactly what information is collected and how it is used in Bing’s algorithm.
Sadly though, there are bigger questions to look at and, with those, even fewer good answers.
As important as the ethical and legal considerations are, there are other questions to ask, including:
- How Aware Are Microsoft’s Users of the Tracking and Its Use? Many seem to be surprised by what Microsoft is doing. How well was this use of private info disclosed and how clear was it made?
- How Easy is it to Game Bing? Considering that the false Google results were irrelevant to their searches, it seems like “click fraud” as Microsoft calls it might be an easy way to game Bing, especially for long tail terms.
- How, Exactly, is This Info Used? Though Microsoft makes it clear that it is just one of a thousand factors, it appears that click monitoring really has a sharp impact on the results. Exactly how much weight is this info given?
There aren’t any easy answers to these questions right now but I suspect we’ll hear more about the first one as this story spreads.
The ethics of what Bing is doing (at least with regard to Google) are debatable and even I don’t have any solid answers, largely because this concept is still very new and the ethics haven’t been hashed out fully. However, there are bigger, probably more important questions being raised about Bing thanks to this revelation.
Though I don’t believe Bing is plagiarizing Google, at least not in the traditional sense of the word, Google may have exposed even greater mistakes and misdeeds on Bing’s part by releasing the results of this test. In short, they make Bing look lazy, sloppy and easy to game, something that for a search engine may be even worse than being a plagiarist.
Clearly, this is a PR disaster for Bing and Microsoft, but I think the worst might be yet to come for them.