Using CSS to Thwart Content Theft

Photo: CSS Vinyl. Creative Commons License photo credit: tiagonicastro

Mikey, a contributor at Rusty Lime, recently posted a very interesting idea for deterring content theft or at least frustrating those who would lift your articles.

The basic premise is to use CSS trickery to ensure that would-be plagiarists pick up an extra image or block of text, most likely one that denounces the theft or links back to the original site, even though readers of the original site never see it at all.

It’s a simple idea that could deter or mitigate content theft and may help some Webmasters add an extra layer of protection against such misuse.

However, as simple as the system is, it does have a critical flaw that greatly limits its usefulness, especially for bloggers.

The Premise

The actual idea behind the technique is strikingly simple.

Cascading Style Sheets (CSS) can instruct Web browsers how to display a given item on a page. Whether it is an image, a block of text or something else altogether, CSS can be used to control its position, size and other aspects of its presentation.

However, CSS can also be used to completely hide an element, simply by adding the following line to your site’s CSS file and changing the class name to whatever you like.

.hiddenclass { display:none; }

From then on, to hide anything from your visitors, you simply add the class name to the appropriate tag. For example, to hide an image, you might use this code.

<img class="hiddenclass" src="http://www.yoursite.com/hiddenimage.jpg" alt="" />

This keeps visitors on your site from seeing the content but, should anyone scrape the HTML code, they will not have the matching CSS on their site, causing the image or text to appear.

Theoretically, this can be used to provide attribution to your own site only on pages that misuse the content. It is a potentially great way to punish spammers without putting any burden on legitimate users.
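For example, a hidden attribution notice for text could be added to each post with the same class. This is just a sketch; the wording and URL below are placeholders, not code from the original article.

<p class="hiddenclass">This article was originally published at <a href="http://www.yoursite.com/original-article/">YourSite.com</a>. If you are reading it anywhere else, it has been republished without permission.</p>

Readers on your site never see the paragraph, but a scraped copy, lacking your stylesheet, displays it in full, complete with the link back to the source.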

However, there is an issue with the technique that could make it impractical for many sites.

Fly in the Ointment

As exciting as the idea is, the problem lies with publishing via RSS. The issue is that RSS readers do not apply your site’s CSS and, as such, anyone viewing the content through the feed will see the “hidden” content as well.

One could remove the hidden content from the RSS feed, but that would make the technique useless against RSS scraping, which is the most common form of unwanted republication taking place.

This means that anything you place in the hidden content needs to be something that can comfortably be displayed in the RSS feed as well, since feed subscribers will continue to see whatever you are trying to hide.

This prohibits you from using strong content theft warnings and other devices that might be tempting to use.

Illustrating the Problem

To help illustrate this point, I’ve added a special class to my site’s CSS file that will hide certain images. Below, I’m going to display the Plagiarism Today logo twice, first without and then with the CSS class.

Begin Visible

End Visible

Begin “Hidden”

End “Hidden”

If you are viewing this article on the site itself, the second image will be hidden and nothing will appear between the two lines. However, if you are reading it in an RSS feed, you should see the image twice.
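For those curious, the markup behind the test above is roughly the following; the file name is illustrative, not the actual path used on this site.

<img src="http://www.plagiarismtoday.com/images/pt-logo.jpg" alt="Plagiarism Today logo" />
<img class="hiddenclass" src="http://www.plagiarismtoday.com/images/pt-logo.jpg" alt="Plagiarism Today logo" />

The two tags are identical except for the class, which is all it takes to hide the second copy on the site while leaving it visible in the feed.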

This is a recurring problem for me on this site, as I use CSS to position the inline images but have to keep adding other code to ensure that they display correctly when viewed in RSS readers.
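In practice, that means pairing the class with inline styling so that feed readers, which ignore the stylesheet, still get sensible positioning. A rough sketch, not the site’s actual code:

<img class="alignright" style="float: right; margin: 0 0 10px 10px;" src="http://www.yoursite.com/images/photo.jpg" alt="" />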

Either way, please leave a comment on your experience with this test, what RSS reader you are using and what the outcome was.

Conclusions

There are potential uses for this system. It could be especially useful in environments where you can edit CSS files but not add plugins or otherwise manipulate the RSS feed. It may also help with situations where HTML scraping is a bigger concern than RSS scraping.

For most, this technique will not be very useful but it is still a clever idea that might help some Webmasters better protect their content.

Even though it won’t do anything to actually stop a plagiarist or other rip-off artist from using the work, it can mitigate the damage they do and add a little bit of frustration to their lives.

Of course, until RSS readers get better support for CSS, this solution will always be an incomplete one. However, it is still a trick worth keeping in mind, if nothing else in case it becomes useful some day down the road.

Comments
genealogy

Well, I want to guard my content, but it seems nearly impossible to prevent someone from stealing it.

Trevor

There are other issues as well. Some content gets copied for innocent reasons, like emailing to a friend or for research. Tynt’s Tracer reveals what content is being copied from your website and automatically adds an attribution link back to your original content if it is lifted from your site and pasted into an email, blog or website. This way you get the traffic and the credit.

Visit www.tynt.com to sign up now.

Do you know what is being copied from your site?

Jonathan Bailey

Sorry to hear that it didn't work for you, but I'm not wholly surprised; I knew it was a very limited approach to the issue. If you need any help, send me an email and I'll see what I can suggest.

Eszter

Well, it definitely doesn't work for me; I tried it with four different posts.

Jonathan Bailey

I am sorry to hear about your problems in this area; if I can help in any way, please let me know.

You're very welcome for the help, and please let me know if there is more that I can do!

Susan

Dear Jonathan, I hesitated to put my website back up because people were stealing the content from my uncle's book (used with permission). The copyright of the book was updated again during the 1990s. It would be wonderful for so many people to have the information concerning genealogy, with fun journal stories and pioneer courage. I hope I can keep studying this page and figure this out. Unfortunately, when I was uploading my pages, I avoided Java and CSS, thinking it was just too much to learn. Now, I need to learn to get the pages in the correct resolution and need to learn Java and CSS anyway. Thank you so much for your help! It will keep my family's hard work safe and not used to make someone else money by selling it as their own. I was just sharing out of love of my family and gave complete credit to those that worked so hard for our family. (((hugs))) Susan Lazenby, Santa Paula, California

Jonathan Bailey

Mike,

I agree that the CSS solution has serious flaws but, then again, so does any DRM technique. The solution you present, for example, won't work for those who turn off referrals (which includes many of my privacy-buff friends) and will not work in all environments, as many bloggers don't have access to their server config files.

That being said, I generally think technology solutions are a waste for many of the reasons you list; still, I discuss them for those who are interested. If you think that a technology-based approach is best, you need to decide what works for you in your situation with your needs.

You definitely present a good idea for some and I agree it is superior in many ways, but I think every method will have its limitations.

That's fair to say...

Mike Sharp

The problem with a CSS-based approach is that it affects accessibility, and it isn't very reliable. It also requires you to implement this on every page, and it doesn't protect images and non-HTML property.

A better approach is to handle this server-side with an HTTP module (or the equivalent for your platform). This works by checking the HTTP referer header in the request, which tells your server where the user was when they made the request. If the referer header isn't your domain, then they got the image (or other stolen content) from somewhere else. (By the way, "referer" is misspelled, but that's the way the HTTP spec shows it.)

One way to handle hotlinked images is to replace the requested image with one that informs the viewer that they are seeing a stolen image.

For example, here's how to do it on Apache with mod_rewrite:

http://www.jibble.org/myspace-hotlinking/

Other platforms have similar approaches, and there are commercial products that can help with this as well.

Hotlinking images is a real problem, since (copyright issues aside) it costs the victim money in terms of bandwidth. Thomas Scott wrote about this back in 2004 on A List Apart:

http://www.alistapart.com/articles/hotlinking/

Regards,
Mike Sharp

Mikey

Hi Jonathan. I'm glad you found my idea interesting. I might even implement it soon.

Regards,

Mikey
www.rustylime.com

Jonathan Bailey

@jardel -
I like the idea of using htaccess, but it would become a second job trying to keep up with where the readers were grabbing the feed. Some would be obvious, such as Google Reader, but every new news reader would have to be added.

In the end, I think your idea at the bottom is best: just add the copyright notice and make it as "yours" as possible. It's not perfect, but it's something...

jardel

It's a nice idea!

Another option would be this: use the hidden class even for RSS readers, build a list from your feed subscribers of where they read the blog, then go to htaccess and disable hotlinking for those readers. I don't know, but it might work; I also don't know whether a broken image would appear or whether it would not appear at all.

Then you could use text like "this site scraped our content bla bla bla, if you are in a feed reader please go to http://url and ask for removal from your RSS client," etc.

Maybe desktop readers would suffer from seeing the image too.

The best technique I've seen so far is the one where you put "blog by author (C) year - year" at the top, all with links, and a related-articles list at the bottom. If someone scrapes via RSS, there will be a link to the blog, the author, the copyright disclaimer and three to five more links at the bottom for other articles on the same blog.

Trackbacks

  1. [...] August 20, 2008 Three articles of interest Posted by sharonb under Tips, Typography, Webdesign | Tags: copyright, CSS, hosting, plagerism, Typography |   Plagiarism Today has published an interesting CSS technique to counter or at least frustrate anyone who is scraping your site. Check out the article Using CSS to Thwart Content Theft. [...]

  2. [...] The idea for this comes from Plagiarism Today. [...]
