What Yahoo!’s Downfall Might Mean for Plagiarism Detection
Times are clearly tough at Yahoo!. With its CEO recently fired, its market share slipping and rumors of a pending sale circulating, Yahoo! has certainly seen better days.
With a search engine market share of less than 10%, Yahoo! is already largely seen as irrelevant when it comes to general search, especially since it began outsourcing its search results to Bing.
However, there is at least one area where Yahoo! has remained a critical player: Plagiarism detection.
Simply put, many of the most popular plagiarism detection services take advantage of Yahoo!’s Search BOSS API (Application Programming Interface), which has made creating a plagiarism detection service both affordable and relatively simple.
So, as Yahoo!’s future hangs in the balance, so do the future and capabilities of many of the Web’s best-known plagiarism detection services, including Copyscape, Plagium, PlagScan and PlagAware, all of which use Yahoo! either exclusively or in part to find their results.
To be clear, there’s no immediate threat to Yahoo! BOSS and its closure has not even been mentioned. This is purely an academic exercise.
However, with such uncertain times ahead for Yahoo!, the question arises: What would a Yahoo!-less plagiarism detection landscape look like? The answer isn’t very clear.
Why Yahoo! is Important
Without using a search API of some sort, a plagiarism detection service would have to crawl websites and create its own index, a time-consuming and expensive process that would make the services prohibitively expensive. Fortunately, most major search engines offer APIs that enable plagiarism checkers, as well as other services, to tap into their indexes relatively easily and cheaply.
Many plagiarism detection services began using Yahoo! BOSS over competing offerings for a simple reason: Cost.
Historically, Yahoo! BOSS was a free service. But, even after Yahoo! began to charge for the service (shortly after it began to use Bing for search results) the cost of using Yahoo! BOSS was still many times cheaper than using Google’s Search API.
This is why many of the best-known plagiarism detection services are built either in whole or in part on Yahoo! BOSS.
This cost is important because performing a single plagiarism check usually requires multiple API queries (the exact number depends on how the service handles queries, the length of the work involved and other factors). As such, these API costs often become a major expense for these services.
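To make the arithmetic concrete, here is a rough sketch in Python of how one check can turn into many queries. It is not how any of the services named here actually work. The sampling scheme (an 8-word quoted phrase every 40 words), the function names and the price passed in are all assumptions made up for illustration.

```python
def split_into_queries(text: str, words_per_query: int = 8, stride: int = 40) -> list[str]:
    """Sample one short quoted phrase every `stride` words.

    Both numbers are hypothetical defaults, not values used by any real service.
    """
    words = text.split()
    queries = []
    for start in range(0, len(words), stride):
        phrase = words[start:start + words_per_query]
        # Skip a trailing fragment that is too short to be a useful query.
        if len(phrase) == words_per_query:
            queries.append('"' + " ".join(phrase) + '"')
    return queries


def estimate_cost(text: str, price_per_1000_queries: float) -> float:
    """Estimate the API cost of one check, given a hypothetical price."""
    return len(split_into_queries(text)) * price_per_1000_queries / 1000
```

Under these assumptions the number of queries, and therefore the cost, grows with the length of the work: a 2,000-word article would need 50 queries for one check, so any difference in per-query pricing between APIs is multiplied across every document checked.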
But more than just a cost issue, the presence of a competing API to Google also offers a different perspective. Being able to tap multiple indexes of the Web rather than just one has the potential to ensure the maximum number of results are returned, especially since the different indexes often catch different content.
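As a hedged illustration of why a second index helps, the sketch below merges the URLs returned by two sources and records which source found each one. The source labels, the function names and the URL normalization rules are assumptions for this example only. A real service would fetch these lists from the search APIs themselves.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to host + path so trivial variants compare equal.

    This deliberately ignores the scheme, a leading "www.", a trailing slash
    and the query string, which is a simplification that may merge too much.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_results(*result_lists: tuple[str, list[str]]) -> dict[str, set[str]]:
    """Combine (source_name, urls) pairs into {normalized_url: {sources}}."""
    merged: dict[str, set[str]] = {}
    for source, urls in result_lists:
        for url in urls:
            merged.setdefault(normalize(url), set()).add(source)
    return merged
```

Any URL that appears under only one source label is a match the service would have missed with a single index, which is the extra coverage at stake here.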
In short, without the Yahoo! BOSS API, we are likely looking at a much more expensive and more limited future for plagiarism detection.
What Does a Yahoo!-less Future Look Like?
If Yahoo! BOSS were to go away, the future would definitely be a difficult one for many plagiarism detection services.
Some, such as Copyscape and Plagium, already mix results from multiple sources (Google/Yahoo! and Yahoo!/Bing respectively) and would likely just lose some fidelity in their results. Copyscape would, arguably, be in a better position than most as it began life using the Google API.
Others, such as PlagAware and PlagScan, both of which use (or seem to use) Yahoo! exclusively, would be forced to write completely new backends for their services. This could have a drastic impact, for better or worse, on how they detect duplicate content and how effective they are.
Higher-end services, like Attributor, which use their own index of the Web, would be unaffected by any change or closure of Yahoo! BOSS and may even have their position strengthened.
All in all though, there would be a major shuffle ahead for plagiarism detection services as they looked to fill the void left by Yahoo! BOSS.
Where Would the Refugees Go?
Those who depend on Yahoo! BOSS, if they wanted to stay open, would have a tough choice ahead of them as there are only two (major) providers who would remain.
- Google: Google’s API is definitely robust, as is Google’s index, but it is also much more expensive than Yahoo! BOSS.
- Bing: Bing’s API is much less established and not as well regarded as Google’s, but it is free for unlimited queries, just as Yahoo! BOSS was. However, the API’s terms of use may pose challenges in some cases, specifically related to advertising requirements.
In short, for developers it’s a choice between an established and robust API that is more expensive and a newer one that comes with limitations on how the results can be used.
Most would likely go with Bing as it is the most natural replacement (especially since Yahoo! results come from Bing), but it remains to be seen if Bing’s results can compare with Yahoo!’s or Google’s for this purpose.
That would be something very interesting to test in the future.
To reiterate the good news, there’s no immediate threat to Yahoo! BOSS at this time so all of the above is merely hypothetical. There has been no talk of closing Yahoo! BOSS and, given that it currently is a revenue generator for Yahoo!, it isn’t likely to be first on the chopping block.
That being said, the turmoil at Yahoo! should give cause for concern to those who rely on Yahoo! BOSS, and now may well be the time to start looking at alternatives.
After all, if the end of Yahoo! BOSS does come, it will likely be sudden and it may be difficult for companies that rely on it to quickly reconfigure their products.
Even if it seems unnecessary at this time, preparing for the possibility may be the best move these services can make.