“The ‘New’ Plagiarism”

Updated: See end of article for update

The Investor Relations Web Report calls it "the new plagiarism". (Note: The original blog post is down.) Dan Zarella from Puritan City calls those who engage in it "the best plagiarists". Others simply call them bloggers or, as Zarella also put it, "Human Aggregators".

They’re a new breed of content users who walk the gray area between what is clearly fair use and what is obviously content theft. Their blogs are marked by large swaths of block quotes and heavy content reuse, but also by proper attribution and at least some original content.

These sites, as they’ve grown in number, have created a great deal of controversy among bloggers, who are left to wonder if such sites are nothing more than content thieves in disguise.

Block quotes by the Dozen

These sites, which for this article I’ll simply call "gray", are generally identified by a large number of very short posts, with much of each post in block quotes or otherwise directly lifted content. Though they meticulously credit their sources, bowing to the more traditional rules for blog attribution, and work to add at least some original content, usually over half of their material comes from other sources.

This has caused many bloggers to worry that these gray blogs might be trying to get away with content theft under the guise of legitimate attribution. The idea is that they can create a much larger volume of content if they only have to write a small portion of it. Users will simply visit the gray blogs, since they are able to provide so much more information, and, due to the liberal quoting, will then have no reason to visit the original source. After all, they already have most of the critical information.

While gray blogs certainly don’t pose the same threat or raise the same concerns as spam blogs and other content scrapers, the cause for concern is clear. Even though blogging is about sharing and reusing information, excessive sharing threatens the authors penning the original content. The tale of the goose that laid the golden eggs springs to mind as, quite simply, greed can be the blogging world’s biggest enemy.

A Separation of Degrees

What makes this issue so difficult to address, and so difficult to write about, is that it’s not so much about gray blogs as about various shades of gray. The difference between someone simply quoting blogs and someone trying to tweak the system is not a clear-cut matter, but a separation of degrees.

Quoting, even liberal quoting, is expected of blogs. It’s a part of researching a story and covering ongoing stories, as well as sharing information. If done properly, it can not only be used to create a new work, but can also drive valuable traffic to the original site. In the blogging world, being the source is often a badge of honor.

However, basing your entire site, or even a large percentage of it, on quoted content is viewed differently. Being a source in a larger article is one thing, but having your content make up the majority of an article on another site is another. Where the line between the two falls is unclear at best. There are no math formulas or systems for determining what is right or what is too much.

More confusing still, everyone has a different idea of what constitutes content theft. With Creative Commons Licenses being very common, it’s obvious some feel that copying an entire work is acceptable so long as attribution is affixed. Others would place the boundary well within what is usually considered fair use.

The challenge is to strike a balance and set some kind of guideline that is compatible with copyright law, acceptable under the current code of blogging ethics, and able to address the concerns many bloggers share over gray sites.

A Proposed Solution

When I first looked at the problem, I was tempted to set guidelines by which a blogger should not get more than X percent of their overall content from other sites or use more than Y lines from another entry. All ideas along those lines, however, quickly fell through.

First, some sites, like Engadget, get a majority of their information from other sources and, correctly, have never been accused of content theft. (Correction: Engadget does write their own copy but reuses many photographs. I apologize for the misunderstanding.) Second, given the varied lengths of posts and methods of reuse available, almost any guideline system would quickly run afoul of fair use in some cases and, in others, would permit reuse that would almost certainly be questionable. Any attempt to work around these factors would complicate a rule that, supposedly, had the sole benefit of being simple.

In lieu of a hard and fast rule, we can instead seek out a framework, much like the fair use provision itself, for determining whether a reuse is ethical. This framework would contain the following elements, many of which are found in the standard fair use provision:

  1. The amount of reused content compared to the amount of original content.
  2. The amount of reused content in relation to the original work.
  3. The frequency with which large blocks of text are used.
  4. What is gained by the original author.
  5. Whether permission was granted in advance, either through a CC license or direct permission.
  6. Whether attribution was provided or not.
  7. Other indications as to the intent of the one reusing the work, including excessive advertisements, links to one’s own sites and other forms of profiteering or over the top promotion.

(Note: As with everything I do like this, these elements are a draft and are open to both comment and revision.)

Such a system, while not perfect or easy, would provide guidelines both for pursuing content theft and for reusing others’ works. Though it might be subjective in many respects, it does give people pause to think about what they are doing beforehand and at least some standard of conduct to follow.

Conclusion

With file sharing, blogging and content trading more popular than ever, copyright has become something of a dirty word. Many people are obsessed not with how to best disperse information and participate in this sharing revolution, but with how much they can get away with legally and ethically.

In a parallel to the famous John F. Kennedy quote, we need to stop asking what others can do for us, and ask what we can do for them. Rather than simply wondering what we can get away with or how we can get the most for the least amount of work, we need to figure out how we can best participate in this world-wide discussion.

If the ethics of the blogging world are constantly abused to promote the gain of others, high quality writers will have little motivation to post their works on-line and, as the well slowly dries up, there will be less and less work available for either reuse or for simply reading.

It’s not enough to share; we have to support and reward good content creators. It’s the only way to keep the revolution alive.

*****UPDATE*****

Since this article made its appearance on Slashdot, many people have criticized me for allegedly mixing up the terms plagiarism and copyright infringement. This is coming from confusion in dealing with both the title and the first paragraph of this piece, which were both intended to be hat tips to the articles that inspired me to write about this issue.

The quote is attributed in the very first sentence of the piece. I chose to put quotes around the word "New" instead of the entire title because this kind of content reuse has been going on for some time. There really is little "new" about it. I have modified the title to make it more clear.

Throughout the work I use the terms copyright infringement, reuse and content theft, but never the word plagiarism after the first paragraph. I understand the difference between the terms well and need no lectures.

My hope is that this piece and the attention drawn to it will spark real discussion on a very complicated and intricate issue. Instead, I fear that confusion and misinterpretation may prevent a much-needed debate.

I hope that bloggers, in their haste to chop down the work, will look past the poorly-worded intro and into the issue behind the work, the reason it was pushed in the first place.


69 comments
Zach van Draden

Is it really that much to ask when we say, "in your own words?"

nortypig

I agree all authors are derivers as well...

Who wrote the first 10 Ways to Improve SEO for example - I've read quite a number. Did those people perhaps absorb it from the atmosphere? Yet none of them to my memory said they learned that on another blog. It's a valid example.

I have to say I would find it very hard ever to say exactly where what I've heard becomes my own opinion. Mind you I was a writer before being a blogger so am well aware of how parts of things soup away in a bubbling brew and come out as something original. Or originalish.

Yes I like Joseph's point and have made it myself to a lesser extent in the past. Quite a few articles on professional blogging sites seem to be a circuitous rehash of the same information which may have begun on Micro Persuasion or somewhere else.

It's kind of icky when you think of it? Human aggregators or programmatic ones is probably irrelevant.

I'm for a free blogosphere where conversation and the sharing of knowledge are more important than click through revenue... on blogs at least.

Is either side of the debate right though? Mmm I probably think the truth and the path are in that grey area you mentioned.

Joseph Pietro Riolo

Plagiarism is built on the myth that the authors are
the creators of new works and therefore, the expectation
that their works must be treated with high reverence,
for these authors are on a higher plane than other
people (i.e. non-authors). This leads to the senseless
rule that anyone who copies their ideas or words (or
expressions) must pay the reverence to them by
providing the attribution. Anyone who does not follow
the rule is considered to be dishonest and therefore
is guilty of breaking the rule. Thus, the concept of
plagiarism is born.

In reality, all authors are derivers where their works
are derived from many sources of knowledge. There
is nothing new under the sun. The only difference
between their works and the sources is the different
combination of the pieces of knowledge, in the same
way as the difference between two kids' construction
of the same number of wood blocks. Because the
universe can't hold all possible combinations of pieces
of knowledge in the same way as it can't hold all
different combinations of atoms, a very large percentage
of the combinations is left for people to discover
(discover means to uncover something that already
exists).

So, what should we do with plagiarism? Follow the
steps in how to deal with it.

1. Recognize that the authors are the derivers, not
the creators of the new things. If you continue to
believe in the myth that the authors make something
new, stop here and continue believing in this myth.
Otherwise, go to the next step.

2. If copying text without attribution does not
violate any law (i.e. it is public domain or
uncopyrightable or it falls under fair use) or enforceable
agreement (i.e. license, contract), go to the next step.
Otherwise, stop here and ask the original writer for
permission to copy without attribution.

3. If copying text without attribution will not result
in undesirable consequence, go to the next step. Otherwise,
stop here and you are forced to provide attribution when
copying text. An example of this is college where you are
forced to provide attribution with the penalty of being given
a failing grade or expelled from the college.

4. Exercise the freedom of communication by copying text
or ideas without giving attribution.

5. This step is optional. It is your decision to provide
attribution. We know that we all crave attention. If
providing attribution will not lead you into trouble or
undesirable consequence, feel free to provide attribution.
Otherwise, don't give attribution.

By following the steps, we can prevent the rule against
the plagiarism from becoming the 11th commandment that
would hamper the flexibility of communication as seen in
the blog world.

Joseph Pietro Riolo
josephpietrojeungriolo@gmail.com
riolo@voicenet.com

Public domain notice: I put all of my expressions in this
post in the public domain

Andrew Lih

Jonathan, re: your comment on my blog, point taken.

Though if you have a post called "The New Plagiarism" on a site called "PlagiarismToday" and the first graf says, "...Dan Zarella from Puritan City calls those who engage in it 'the best plagiarists'. Others simply call them bloggers or, as Zarella also put it, 'Human Aggregators'..." then you can't really fault readers for thinking your angle is to at least insinuate that plagiarism is part of the issue. But let's put that aside.

There are problems with the vague term you use: "content theft." It is not something defined legally (see the Dan Brown case in the UK) nor is it something widely used in academia (while the term plagiarism is).

Some consider "fair use" a form of "content theft" even though it is well established in the US sense. (It is less so in the Commonwealth's concept of fair dealing.) So I wonder if that is the issue you have, with content block-quoted and attributed, but somehow the copyright owner being able to exercise more restrictions over the use of their content? Is it a fundamental objection to the whole idea of fair use (or fair dealing) that you are contending?

XMLicious

A distinction that hasn't been made above, I don't think... plagiarism is an ethical issue whereas copyright infringement is a legal issue.

Plagiarism is a form of dishonesty. Profits from clickthroughs and other material benefits, as well as other interests of the author such as community repute and status, are irrelevant to the issue of plagiarism. It's still wrong to plagiarize even if the author is unable or unwilling to profit from his or her creation (if the author is dead, for example.)

Copyright infringement is an element in the mechanism of maintenance and enforcement of the legal concept of intellectual property, which is itself an implementation of the public policy that creative people and creative legal entities like corporations must be directly materially compensated for their creations. Other civilizations might choose to compensate creative people in a different manner, or might decide that they don't need to be compensated at all, but the ethical issue of plagiarism would still exist.

El Plagarismo

I have block quoted this entire article on my site. LMAO :)

JB

Isaac,

I have no trouble with it in Firefox 1.5 on Windows XP or Linux. Is anyone else having trouble with the comment form in Firefox?

Isaac

This is all slashdot is: a huge collection of directly quoted blocks of text with a comment area for each. This is how a site that is relatively obscure can get attention. I would never have visited this site if not for Slashdot's item on it, though in this case the text presented in the /. item is not a direct quote from your article.

Also, your comment form is crap in Firefox.

domelhor.net

Plagiarism and copying abuse are plagues of blogs...

Copying text from other sites is a plague that besets the blogosphere - there is a poorly defined area between theft of content (clearly illegal) and proper use of information from other blogs (defined in law). And many times the authors have...

Life Done Right

Consider this: I never would have found your website unless Slashdot had "excerpted" aka "blockquoted" some portion of the material on your website. You now have a new reader thanks to this practice.

You can lead a reader to content, but if you don't give 'em a sip to get there, they'll find sustenance elsewhere, my friend.

Flajann

While Plagiarism is plainly bad, as long as there is attribution to the quoted article, that should be enough. In this ephemeral world of web content that comes and goes, many times I have seen sites disappear that I had merely direct links to.
Nowadays, I will copy the entire article and include an attribution and a link to the original, just in case the original website disappears someday.

In the old days before blogging, in news groups and UseNet, it was not only customary, but typical to quote the entire content of someone else's post and add your own commentary to it. There were no issues raised about copyright and plagiarism, or few if ever.

Today, I think it's more an issue of the blogger wanting more traffic to go to *his* site so that ads can get greater exposure. If that's the case, the blog is really being "monetized", and that's a different issue altogether.

I would say, don't place anything on the Internet if you don't want it massively copied everywhere, and still around 20 years from now. We can bitch and bemoan all we want about what others should do, but the reality is that the Internet is a medium of copy and replication and archive. Don't like it? Go back to dead-tree publications and commentary.

nortypig

Ha sorry not a quote by me I meant the quote on the commenter Id.Ology - which is a comment anyway... but you get what I mean I guess so you might want to scrub that line if you've got the chance lol...

anyway i think you're right that these things do need to be discussed in an open forum. unfortunately i don't think it will go very far in the long run bar raising awareness.

its like separating the email marketer from the spammer - my politicians seem to not consider their emails spam but to me they are... very grey too.

glad to see you survived the slashdotting too. Congrats.
