Article Updated – See Below
Discussion search engine Omgili has been drawing some attention from forum owners. The owners and administrators are concerned that a new “preview” feature may be nothing more than an attempt at search engine spam at the expense of their conversations.
The complaints of forum owners eerily echo the complaints bloggers had about Spanish blog search engine Bitacle, which drew widespread attention for displaying full content from blogs on their own site without permission.
However, are the comparisons valid? Is Omgili to forum owners what Bitacle is to bloggers? The answer isn’t simple but there are many key similarities as well as a few important differences.
Omgili, which stands for “Oh My God, I Love It!”, is owned and operated by Ran Geva. It is a discussion search engine that parses forums for information and makes it searchable.
The idea is that, if someone is looking for advice on obtaining the best laptop or a new car, they can search Omgili and receive a list of discussions on the topic to look through. In some situations, this might be more valuable than regular Google results as it is focused on conversations with multiple viewpoints, instead of just one or two.
The search engine also offers a “Thumbs” feature that tracks some of the more popular conversations on the Web. It also offers several tools to forum owners to help them integrate Omgili features into their forum, including finding related conversations.
On the surface, Omgili seems to be an interesting idea for a service targeting a market largely ignored by the new crowd of Web startups. However, much like with Bitacle, the problems lie underneath.
Complaints and Comparisons
Where Omgili seems to run into trouble is with its “preview” feature. Displayed alongside search results, the preview link takes visitors to a page containing the text content of the forum discussion, wrapped in Omgili’s template and served with ads.
Though Geva claims the previews are intended to assist when the forum is down, Omgili’s cache stands in sharp contrast to those of Google, The Internet Archive or Coral Cache, all of which display the page in its entirety, with full attribution and without advertisements on cached pages.
On the surface, this appears to be very similar to Bitacle’s controversial “Aggregates” feature. The similarities include the following:
- Text-Only Scraping: The entire site is not cached, instead the text out of the forum is scraped and pasted into a page that is surrounded by Omgili’s template and ads.
- Lack of Attribution: Individual posts, which are the copyright of the poster, are not attributed (unless the poster places their name in the post) and the forum is not attributed on the preview page itself.
- Advertisements: All preview pages contain advertisements from both Targetpoint and Kontera.
- Search Engine Accessibility: The preview pages, despite containing nothing but duplicate content, are readily indexable by search engines.
- Lack of Opt Out: At the moment, there is no way to opt out of Omgili via the Web site. You can use robots.txt, but that option is not available to all forum owners.
- Deceptive Linking: Rather than listing the preview page as being a cache or a preview, results pages link to the page with simply the word “More” leading readers to believe that it is a continuation of the summary, not a scraped copy of the page.
Most forum administrators (as well as many forum members) are likely to be upset about this and some have already expressed their concern. None seem to have complained yet directly to Omgili or to its host or advertisers, but it may only be a matter of time.
Though there is little doubt that Omgili is entering a gray area when it comes to copyright law, there are several differences to separate it, at least somewhat, from Bitacle and spam bloggers.
- Obeys Robots.txt: Unlike Bitacle and sploggers, Omgili does obey robots.txt. According to Geva, the spider will also follow HTML META tag rules in the coming weeks. No word yet on an opt out form for the site.
- Responsive: Where Bitacle has maintained relative silence throughout their controversy, save a few insults, Geva is responsive to emails sent to him regarding the matter. Any forum admin with concerns about the service is free to contact him.
- No Attempt to Replace Original: Where Bitacle and other scrapers accept comments and try to make their sites substitutes for the original (although very poorly), Omgili does not accept comments or replies (save on the thumbs feature that does not host scraped content) and offers nothing but the content itself to visitors. In its current format, it is almost useless as it lacks critical information and is very difficult to read.
These might seem like trivial differences, but they show a willingness to work with the community. It is important for forum owners and members to engage in a dialog and try to reach an amicable solution. Most I’ve talked to agree that the idea behind Omgili is at least interesting, but the execution of the previews leaves a great deal to be desired.
In Geva’s defense, he did say that this is the first he has heard of concerns about the feature and that his reason for running ads is that, as a startup, money is tight and operating costs are high. He said he plans to remove the ads once he is able to make enough from the results pages.
I am very uncomfortable with the way Omgili handles its cache. It seems to show a great deal of disrespect to forum owners and their posters. It also raises serious copyright issues, including in Israel, Geva’s home, and is most likely a violation of the law.
But unlike most who are accused of scraping and spamming, Omgili seems to be willing to work with content providers. I am going to continue my dialog with Geva and I encourage others who are concerned about the service to do the same.
Whenever I run across a new service that raises concerns like this, I always attempt to extend an olive branch and get the party to talk. Omgili is one of the first to extend the same branch back. If they fail to address these concerns through sensible dialog, then further action is available and may be necessary down the road.
In the meantime, my hope is that cooler heads can prevail and the situation can be resolved. It’s important for forum admins and members to be aware of this site and what it’s doing, but not for them to overreact.
There is still a very good chance at a peaceful and amicable solution.
Update 1/12/2006: Ran Geva just wrote me to let me know that he has created a new form to opt out of the preview section. You can find that form here if you are interested. Also, Geva has informed me that he has removed all ads from the preview pages for the time being. That seems to take care of two of the major issues. More updates to follow.
Update 07/12/2008: The above form does not work and the link has been removed. Omgili now requires sites to use robots.txt to remove their sites from the service, something many forum admins may not be able to do.
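For admins who can edit their robots.txt, here is a rough sketch of what an opt-out rule might look like and a quick way to check that it behaves as intended. Note the assumptions: the crawler name "omgilibot" is my assumption and is not confirmed anywhere in this article, the forum URL is a placeholder, and the helper `is_allowed` is a hypothetical name. The check uses Python's standard `urllib.robotparser` module, so it only tells you how a well-behaved parser reads the rules, not what any particular spider actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. "omgilibot" is an ASSUMED crawler name;
# confirm the real user-agent string with the service before relying on it.
ROBOTS_TXT = """\
User-agent: omgilibot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() expects an iterable of lines, not one big string.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    thread = "https://forum.example.com/thread/1"  # placeholder URL
    print("omgilibot allowed:", is_allowed(ROBOTS_TXT, "omgilibot", thread))
    print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", thread))
```

The first block shuts out only the named crawler, while the empty `Disallow:` under `User-agent: *` leaves the forum open to everyone else. Admins on hosted forum services who cannot edit robots.txt are, unfortunately, still out of luck with this approach.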
Special thanks go to Patrick of the iFroggy Network for tipping me off to this site.