Update: Bitacle’s Insults

A brief update to the recent story about Bitacle’s alleged comment on Sparklepanda’s site.

Sparklepanda got back in touch with me and shared the information that her Sitemeter collected on the individual who left the comment.

Though the information is far from conclusive, I have to agree with Sparklepanda that it is almost certainly from Bitacle or someone closely involved with the site. The commenter arrived at Sparklepanda’s blog from a page on Bitacle’s site, and the commenter’s IP address traces to the same city that is listed in Bitacle’s whois information.

According to Sparklepanda, it was most likely a comment she left on one of the entries that Bitacle scraped from her that got their attention. That seems to match the referring link that the anonymous commenter followed to get to her site.

Finally, in related news, Sparklepanda has followed in the footsteps of many others and decided to run with shortened RSS feeds to prevent Bitacle’s scraping. It is certainly an understandable decision and, in her case, it’s one that her readers seem to be tolerant of.

Given the history, it is very easy to see why.

In the end, it appears as if Bitacle has indeed stooped to petty name calling. It’s not a shock. It’s not even disheartening.

However, it is a sign that Bitacle has been getting very frustrated by what has been going on. Maybe, just maybe, that means good news is coming in the future.

10 COMMENTS

  1. I read through the original complaints about Bitacle, checked out their website, and I'm not sure I agree with your opinions on the matter.

    Google, Yahoo, and Technorati scrape other people's content every day (I believe there was a US court case over this that Google won). They also display advertisements. Just like Bitacle. The guys over at Bitacle are not claiming they wrote the content; they are acting as a search engine and content caching center, and funding it through advertisements, which I feel is perfectly okay and not plagiarism.

    If you want your content off of Bitacle, it also needs to be taken off of every other search engine that is caching it.

  3. Ricardo,

    Allow me, quickly, to go over the differences between Bitacle and every other legitimate search engine on the planet.

    First, Bitacle is ignoring all robots.txt files, meta tags and removal requests. Want Google to stop indexing your site? Use robots.txt. Want them to stop caching? Use meta tags. Want them to remove everything they have on you? Write them. You can do none of those things with Bitacle.

    Second, search engines may store content in its entirety, but only display small portions. Bitacle displays the posts in their entirety, provided you have a full RSS feed. Sure, some search engines have caches, but those display the work in its original context, instead of just taking the post and putting it on their site, and can be switched off with meta tags.

    Third, the main goal of a search engine is to direct users to the original site. Caching is used only as a backup for down or altered sites. Bitacle wholeheartedly intends to keep visitors on their site, even having their own comment feature. They even open all links within posts in a new window.

    Fourth, until recently, Bitacle was displaying ads next to a site's full content, something no legitimate search engine, or even Web RSS reader, does.

    Fifth, also until recently, Bitacle was claiming copyright ownership over everything it scraped and placing that content under a Creative Commons license that was, oftentimes, incompatible with the original license.

    Finally, it's become clear to most that the purpose of the "Aggregates" feature is nothing more than search engine spam. When search engines use tags to ensure that cached pages don't wind up in other search engines, Bitacle works hard to ensure their scraped pages do. This dilutes the market for the original content and forces sites to compete with Bitacle for their own keywords.

    These are just some of the major differences between Bitacle and a legitimate search engine. This is why we are upset with Bitacle and not Google.

    The difference really isn't that hard to see.
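
As a rough illustration of the robots.txt opt-out mentioned in the comment above, here is a minimal sketch of the check a well-behaved crawler might run before fetching a page. It uses Python's standard `urllib.robotparser`. The robots.txt contents, the bot names and the URLs are all hypothetical and are not taken from any real site or crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example blog; both bot names are made up.
ROBOTS_TXT = """\
User-agent: ExampleScraperBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if the rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from memory instead of downloading them.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleScraperBot", "https://blog.example/feed/"),
        ("SomeSearchBot", "https://blog.example/feed/"),
        ("SomeSearchBot", "https://blog.example/private/draft"),
    ]:
        print(agent, url, may_fetch(agent, url))
```

The file only expresses the site owner's wishes. It works when the crawler chooses to run a check like this and respect the answer, which is the distinction the comment draws.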

  5. “First, Bitacle is ignoring all robots.txt files, meta tags and removal requests. Want Google to stop indexing your site? Use robots.txt. Want them to stop caching? Use meta tags. Want them to remove everything they have on you? Write them. You can do none of those things with Bitacle”

    As of yet, there are no laws against this. You may not like it, but you can’t really stop it from happening.

    “Second, search engines may store content in its entirety, but only display small portions. Bitacle displays the posts in their entirety, provided you have a full RSS feed. Sure, some search engines have caches, but those display the work in its original context, instead of just taking the post and putting it on their site, and can be switched off with meta tags”

    RSS feeds are re-posted on websites and on RSS clients. If you don’t want someone to “scrape” your RSS feeds, do not provide them.

    “Fourth, until recently, Bitacle was displaying ads next to site’s full content, something no legitimate search engine, or even Web RSS reader, does.”

    Take a look at many of the legitimate sites. They all contain full ads. The placement of the ads should have no bearing on its legitimacy.

    “Fifth, also until recently, Bitacle was claiming copyright ownership over everything it scraped and placing that content under a Creative Commons license that was, often times, incompatible with the original license”

    This is the only thing I see wrong with what Bitacle has done and they stopped.

    “Finally, it’s become clear to most that the purpose of the “Aggregates” feature is nothing more than search engine spam. When search engines use tags to ensure that cached pages don’t wind up in other search engines, Bitacle works hard to ensure their scraped pages do. This dilutes the market for the original content and forces sites to compete with Bitacle for their own keywords.”

    Wow, you finally realized the power of the Internet. If you want people to stop taking your RSS feeds, don’t place them on your website. When I have an RSS feed of anything, I know full well that it will be scraped by websites, clients, and blogs (this is the point).

    “The difference really isn’t that hard to see.”

    It’s hard to see why they are currently breaking the law. It sounds to me like a bunch of bitter people whining to make things seem more important than they actually are. The problem is that you can’t pick and choose who you let the law apply to. If the law says you are allowed to cache content, then people should be able to cache content. If not, then stop everyone from caching content (including Google, Yahoo, etc.).

  7. “As of yet, there are no laws against this. You may not like it, but you can’t really stop it from happening.”

    You’re flat out wrong there. One of the critical points that helped determine search engines were not infringing was that they offered a means of opt-out. Even a mere search engine, which doesn’t display full content, is infringing if it doesn’t offer a clear opt-out procedure. The robots.txt system was designed to provide that and all legit search engines offer a special opt-out page.

    Yes, it is illegal. If Google stopped listening to robots.txt, they would be sued.

    “RSS feeds are re-posted on websites and on RSS clients. If you don’t want someone to “scrape” your RSS feeds, do not provide them.”

    The implied license to RSS feeds extends to individual private use. That’s what most have said. Look up the previous articles on my site on scraping and you’ll see at least three other reasons why scraping is on legally dangerous turf.

    You have no response to my third point, so I’ll assume you concede it.

    “Take a look at many of the legitimate sites. They all contain full ads. The placement of the ads should have no bearing on its legitimacy.”

    According to the DMCA, a host cannot knowingly profit directly from the infringement. The location of the ads is extremely important. Legally speaking, if they use the text as ad bait, knowing that the content belongs to others, and the person decides it is infringing, the host can be sued directly.

    Location does matter in the eyes of the law.

    “Wow, you finally realized the power of the Internet. If you want people to stop taking your RSS feeds, don’t place them on your website. When I have an RSS feed of anything, I know full well that it will be scraped by websites, clients, and blogs (this is the point).”

    Your basic case in all of this is that, if you don’t want them to scrape your RSS feeds, don’t provide them.

    First, I find it repulsive that I need to cut off my legitimate readers to stop one site that takes the implied license too far. That is unfair to everyone involved. The content in your RSS feed is just as protected as the content on your Web page. You have the same rights; the only difference is the way it is meant to be viewed.

    However, what do you say to the millions who don’t know what an RSS feed is but provide one because it’s an automatic part of their blog or site? How many LiveJournal, Myspace or Xanga users fully understand RSS and the risks? I’d wager almost none.

    I did a previous article about why RSS scraping isn’t acceptable. You may want to read it as it deals with many of the myths you present.

    Oh, and though I’m dealing with American law, remember that the EU is even worse. That’s where Google News was successfully sued, where the notice and takedown requirements are nonexistent and the laws favor hosts and search engines even less.

    Sad, but true.
