Why Finding the Host of a Site is Getting Much Harder

Here’s a pop quiz: Who is the host of Plagiarism Today?

If you read the sidebar of the site, you’ll notice that I proudly state that I am hosted by Servint (referral link), a VPS host that has hosted PT for the past year or so.

But look what happens when you check out Domain Tools or WhoIsHostingThis (Disclosure: I am a paid blogger and a business partner for WhoIsHostingThis). Both say that the site is hosted by CloudFlare.

So who is telling the truth? They both are.

The reason is that I, like many other webmasters, use CloudFlare as an easy means to speed up and secure my site. It’s a great service, but it’s a whole new type of hosting arrangement, one that will make it much more difficult to track down where content is hosted and effectively shut down sites.
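To see why lookup tools report CloudFlare, it can help to check whether the addresses a domain resolves to fall inside CloudFlare's network. The following is a minimal sketch, not a definitive tool: the address blocks are a partial sample of ranges CloudFlare has published and may be out of date, and the function names are made up for illustration.

```python
import ipaddress
import socket

# A partial sample of address blocks CloudFlare has published.
# Assumption: this list is incomplete and may be out of date.
CLOUDFLARE_RANGES = [
    "173.245.48.0/20",
    "103.21.244.0/22",
    "141.101.64.0/18",
    "108.162.192.0/18",
    "104.16.0.0/13",
    "172.64.0.0/13",
]


def is_proxy_address(ip, ranges=CLOUDFLARE_RANGES):
    """Return True if ip falls inside any of the given CIDR blocks."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(block) for block in ranges)


def resolve(domain):
    """Return the sorted IPv4 addresses the domain currently resolves to."""
    infos = socket.getaddrinfo(
        domain, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def describe(domain):
    """Label each resolved address as 'proxy' or 'possible origin'."""
    return {
        ip: "proxy" if is_proxy_address(ip) else "possible origin"
        for ip in resolve(domain)
    }
```

Calling `describe()` on a domain needs a working network connection. If every address comes back as "proxy", the host the lookup tools show is the intermediary and not the server that actually holds the content.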

And CloudFlare only represents the tip of the change. There is a huge shift taking place in the way sites are hosted, and it’s one that will pose new challenges for copyright holders, especially those dealing with image, text and other small-work misuse.

How CloudFlare Works

The basic idea of CloudFlare is that it works as an intermediary between one’s server and the rest of the Web. Visitors, instead of going straight to your server, first visit CloudFlare. CloudFlare then does three things:

  1. Filters Out Known Threats: Prevents malicious users and visitors from accessing the site.
  2. Serves Static Content: Any content that doesn’t change, such as images, CSS files, JavaScript files, etc., is served from CloudFlare’s servers, which are spread out all over the world to improve speed.
  3. Requests Dynamic Content: Finally, anything it hasn’t cached and can’t serve directly, it pulls from your server.

This has the benefits of, at least theoretically, improving site speed, boosting security and reducing load on the server. CloudFlare can even help if your site goes down by serving a cached copy of any pages it has seen recently, letting your visitors at least see the content.
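The three steps, plus the cached-copy fallback, can be sketched as a toy model. This is only an illustration of the general idea of an intermediary cache, not CloudFlare's actual implementation; the `EdgeCache` class, its file-suffix rule and its status codes are all assumptions made for the example.

```python
class EdgeCache:
    """Toy model of an intermediary that sits in front of an origin server."""

    # Assumption: treat these file types as static content.
    STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif")

    def __init__(self, fetch_from_origin, blocked_ips=()):
        self.fetch_from_origin = fetch_from_origin  # callable: path -> body
        self.blocked_ips = set(blocked_ips)
        self.static_cache = {}  # static files, served without asking the origin
        self.last_seen = {}     # most recent copy of each page, for outages

    def handle(self, client_ip, path):
        # 1. Filter out known threats.
        if client_ip in self.blocked_ips:
            return 403, "blocked"
        # 2. Serve static content straight from the cache.
        if path in self.static_cache:
            return 200, self.static_cache[path]
        # 3. Request everything else from the origin server.
        try:
            body = self.fetch_from_origin(path)
        except ConnectionError:
            # Origin is down: fall back to the last copy seen, if any.
            if path in self.last_seen:
                return 200, self.last_seen[path]
            return 502, "origin unreachable"
        self.last_seen[path] = body
        if path.endswith(self.STATIC_SUFFIXES):
            self.static_cache[path] = body
        return 200, body
```

Note that the visitor only ever talks to `handle()`. The origin server is reachable only through the `fetch_from_origin` callable, which is why outside tools see the intermediary and not the real host.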

But as powerful a tool as it is, CloudFlare also creates a problem for content creators who have their works infringed through it. Since CloudFlare is nothing but a temporary cache, none of the actual content is hosted there. However, all the networking tools will point straight to it.

This makes filing a DMCA notice, or any other kind of abuse complaint, more difficult in these cases. Fortunately, CloudFlare is making the process as easy as possible on its end and is working to be a good neighbor.

How to File a DMCA With CloudFlare

CloudFlare has a DMCA process (see section 15) that is somewhat unique. Rather than requesting the removal of the work, one would be requesting the IP address of the original server. You would then research that address, find its host and file a proper DMCA notice there.

Caching services, such as CloudFlare, are protected under the DMCA in much the same way hosts are, meaning they are not liable so long as they meet the criteria and work to remove infringing material. However, the nature of CloudFlare’s system, namely that it is only a temporary cache, makes it easier to remove the content from the original server, which also ensures its removal from CloudFlare.

So, in short, one files a DMCA notice with CloudFlare, gets the IP for the original server and files a similar notice with that host. It’s an extra step and an annoyance, but it likely is going to be a growing reality on the Web.
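Once CloudFlare provides the IP address of the original server, the remaining research step is usually a WHOIS lookup on that address. Here is a rough sketch of that step, under several assumptions: it queries ARIN's WHOIS server by default, which mainly covers North American address space, the field names it looks for vary between registries, and `whois_query` and `summarize_whois` are hypothetical helper names.

```python
import socket


def whois_query(ip, server="whois.arin.net", timeout=10):
    """Send a plain WHOIS query (TCP port 43) and return the raw text reply."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall((ip + "\r\n").encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Field names differ between registries; these are common ones, not a full list.
HOST_FIELDS = ("orgname", "org-name")
ABUSE_FIELDS = ("orgabuseemail", "abuse-mailbox")


def summarize_whois(text):
    """Pull the first organization name and abuse address out of a WHOIS reply."""
    summary = {"host": None, "abuse": None}
    for line in text.splitlines():
        # Skip registry comments and lines that are not "Key: value" pairs.
        if line.startswith(("#", "%")) or ":" not in line:
            continue
        key, value = line.split(":", 1)
        key, value = key.strip().lower(), value.strip()
        if key in HOST_FIELDS and summary["host"] is None:
            summary["host"] = value
        elif key in ABUSE_FIELDS and summary["abuse"] is None:
            summary["abuse"] = value
    return summary
```

Running `summarize_whois(whois_query(ip))` on the address CloudFlare hands over should, in many cases, give the name of the real host and an abuse contact to send the second notice to. Always read the raw reply too, since some registries only return a referral to another WHOIS server.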

In fact, it’s safe to say that CloudFlare is really just the tip of the “cloud” iceberg. Hosting is undergoing a drastic change, and those who file DMCA notices or other abuse complaints would be wise to take notice.

The Shifting Landscape

There is a major change taking place with the way sites are hosted. Previously, it was relatively simple to find out who hosted a site because sites, with only a few exceptions, existed at a single physical location. However, there’s been a growing trend of new hosting arrangements that are focused on spreading the content of one site around to multiple locations.

Many hosts are offering “cloud” hosting systems that spread a single site across many servers in the same location to improve performance, adaptability and reliability. However, from an abuse perspective, these accounts are still largely the same as traditional hosting, since one company hosts all of the content and networking tools generally point to the correct host.

Services like CloudFlare, which sit between the real host and the end user, present a much different challenge, as the infringer can effectively hide behind them. However, even these cases can be dealt with, albeit with an extra step, as long as the cache provider is operating in good faith.

But CloudFlare is really just the tip of the iceberg. Already, content delivery networks (CDNs) such as Amazon CloudFront and Rackspace Cloud make it possible to spread content across multiple servers around the world, serving the content from the servers closest to the viewer. Those cases are still fairly easy to address, as the content is still hosted by one company.

However, it is also, at least theoretically, possible to mirror content across multiple companies and multiple networks. This would make it so that the host you are seeing is different from the host someone across the globe is seeing. The tools used to determine where content is located would be of limited use.

Though I’m not currently aware of any services that do this, other than someone simply uploading the content to multiple services themselves, it is clear that the future of the Web involves content being more distributed, more redundant and harder to track down.

While this will certainly affect copyright holders, especially those who create content with a small file size that can be more easily distributed in this manner, it raises questions for all types of abuse. Those are questions that, at this time, don’t have good answers.

Bottom Line

The main thing to remember is that the steps I list for finding the host of a site are still largely effective. However, as the Web begins to change and shift in structure, that will be less and less true.

CloudFlare is a great service and one that I am very happy with for Plagiarism Today (even though they seem to be having some trouble today). However, it is an indication of how the Web is changing, and it’s a change that those who need to know the host of a site, an image, a video, etc. need to be aware of.

None of this is meant to be a knock on CloudFlare (or any other service). Rather, it is a warning that there is change in the winds. Laws written 12 years ago, such as the DMCA, may have a tough time applying to the way the Web is changing and, just as bad, the tools we currently use to track down and stop various issues may become less effective, calling for new tools and new techniques.

Obviously, I’ll update my guides and my information as new tools become available. However, it will likely be a tough curve to stay ahead of.
