<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Synonymized Plagiarism: A New Threat</title>
	<atom:link href="http://www.plagiarismtoday.com/2005/12/05/synonymized-plagiarism-a-new-threat/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.plagiarismtoday.com/2005/12/05/synonymized-plagiarism-a-new-threat/</link>
	<description>Content Theft, Plagiarism, Copyright Infringement</description>
	<lastBuildDate>Wed, 10 Mar 2010 06:55:07 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=abc</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Spinning Spammers Steal Our Blog Content &#171; Lorelle on WordPress</title>
		<link>http://www.plagiarismtoday.com/2005/12/05/synonymized-plagiarism-a-new-threat/comment-page-1/#comment-67659</link>
		<dc:creator>Spinning Spammers Steal Our Blog Content &#171; Lorelle on WordPress</dc:creator>
		<pubDate>Thu, 15 Nov 2007 07:04:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=137#comment-67659</guid>
		<description>[...] content through your blog&#8217;s feed and inserting or replacing synonyms in the content, typically keywords the splogger needs to get the page ranking and search terms to attract [...]</description>
		<content:encoded><![CDATA[<p>[...] content through your blog&#8217;s feed and inserting or replacing synonyms in the content, typically keywords the splogger needs to get the page ranking and search terms to attract [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Protecting Your Content From the Spinning Spammers : The Blog Herald</title>
		<link>http://www.plagiarismtoday.com/2005/12/05/synonymized-plagiarism-a-new-threat/comment-page-1/#comment-67465</link>
		<dc:creator>Protecting Your Content From the Spinning Spammers : The Blog Herald</dc:creator>
		<pubDate>Mon, 12 Nov 2007 17:31:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=137#comment-67465</guid>
		<description>[...] type of scraping is not as uncommon as we might wish and the technology to do it has been around for several years. Worse still, this type of scraping is growing much more popular as search engines clamp down on [...]</description>
		<content:encoded><![CDATA[<p>[...] type of scraping is not as uncommon as we might wish and the technology to do it has been around for several years. Worse still, this type of scraping is growing much more popular as search engines clamp down on [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: PlagiarismToday &#187; Modified Scraping on the Rise</title>
		<link>http://www.plagiarismtoday.com/2005/12/05/synonymized-plagiarism-a-new-threat/comment-page-1/#comment-67143</link>
		<dc:creator>PlagiarismToday &#187; Modified Scraping on the Rise</dc:creator>
		<pubDate>Thu, 08 Nov 2007 22:34:21 +0000</pubDate>
		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=137#comment-67143</guid>
		<description>[...] technology behind modified scraping has been around for several years. I first wrote about it in December of 2005. Back then the problem was fairly rare and the concept was still somewhat new. [...]</description>
		<content:encoded><![CDATA[<p>[...] technology behind modified scraping has been around for several years. I first wrote about it in December of 2005. Back then the problem was fairly rare and the concept was still somewhat new. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jack of All Blogs &#187; Blog Archive &#187; Three Strikes &#38; You&#8217;re A Splog!</title>
		<link>http://www.plagiarismtoday.com/2005/12/05/synonymized-plagiarism-a-new-threat/comment-page-1/#comment-9707</link>
		<dc:creator>Jack of All Blogs &#187; Blog Archive &#187; Three Strikes &#38; You&#8217;re A Splog!</dc:creator>
		<pubDate>Mon, 28 Aug 2006 17:49:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=137#comment-9707</guid>
		<description>[...] Since Sentinel, when parsing RSS feeds, ignores all punctuation and most extremely short words, it can easily see through most simple text manipulations such as restructuring sentences and introducing false paragraph breaks. However, Blogwerx took things a step further and built in a thesaurus to Sentinel&#8217;s algorithm, making it capable of detecting copies that have been rewritten in minor ways and, potentially, even articles that have been &#8220;spun&#8221; by synonymizing software. [...]</description>
		<content:encoded><![CDATA[<p>[...] Since Sentinel, when parsing RSS feeds, ignores all punctuation and most extremely short words, it can easily see through most simple text manipulations such as restructuring sentences and introducing false paragraph breaks. However, Blogwerx took things a step further and built in a thesaurus to Sentinel&#8217;s algorithm, making it capable of detecting copies that have been rewritten in minor ways and, potentially, even articles that have been &#8220;spun&#8221; by synonymizing software. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: PlagiarismToday &#187; Product Preview: Blogwerx Sentinel</title>
		<link>http://www.plagiarismtoday.com/2005/12/05/synonymized-plagiarism-a-new-threat/comment-page-1/#comment-3904</link>
		<dc:creator>PlagiarismToday &#187; Product Preview: Blogwerx Sentinel</dc:creator>
		<pubDate>Mon, 05 Jun 2006 20:53:09 +0000</pubDate>
		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=137#comment-3904</guid>
		<description>[...] Since Sentinel, when parsing RSS feeds, ignores all punctuation and most extremely short words, it can easily see through most simple text manipulations such as restructuring sentences and introducing false paragraph breaks. However, Blogwerx took things a step further and built in a thesaurus to Sentinel&#8217;s algorithm, making it capable of detecting copies that have been rewritten in minor ways and, potentially, even articles that have been &quot;spun&quot; by synonymizing software. [...]</description>
		<content:encoded><![CDATA[<p>[...] Since Sentinel, when parsing RSS feeds, ignores all punctuation and most extremely short words, it can easily see through most simple text manipulations such as restructuring sentences and introducing false paragraph breaks. However, Blogwerx took things a step further and built in a thesaurus to Sentinel&#8217;s algorithm, making it capable of detecting copies that have been rewritten in minor ways and, potentially, even articles that have been &quot;spun&quot; by synonymizing software. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: JoeChongq</title>
		<link>http://www.plagiarismtoday.com/2005/12/05/synonymized-plagiarism-a-new-threat/comment-page-1/#comment-234</link>
		<dc:creator>JoeChongq</dc:creator>
		<pubDate>Sat, 10 Dec 2005 01:33:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.plagiarismtoday.com/?p=137#comment-234</guid>
		<description></description>
		<content:encoded><![CDATA[<p>I like your idea of representing an article as a series of numbers.  That is something that could be done pretty easily.  But synonymizing programs could defeat it with a bit more work.  Certain phrases could be flipped automatically within a sentence and still keep most of the meaning.</p>
<p>&#8220;To exist, or not to exist, that is the query,&#8221; could be written as &#8220;The query is that to exist, or not to exist&#8221; without removing any word.  That is worse English than the original synonym translation, but it is still readable.  It would be even better if &#8220;that&#8221; were dropped.</p>
<p>I discovered another example while trying to figure out how to spell plagiarist.  I got close and spell checked it in WordPerfect, then looked at its thesaurus entries for the word.  It listed literary pirate as a synonym.  Multi-word replacements are going to be even harder to detect with any automated method.</p>
<p>But it is a start and would catch a lot of the spam, since it would deal with identical plagiarism as well as synonymized plagiarism.  Search engines need to do something, not only to fight plagiarism, but to keep search results at least mostly spam-free.  The problem, though, is how to tell plagiarism spam from legitimate syndication.</p>
<p>Even the best random text generators can&#8217;t create a meaningful paragraph except possibly by accident.  Even though these synonymized articles clearly aren&#8217;t well written, they are good enough to trick humans who don&#8217;t know about these spammer/plagiarist tricks.  And many good blogs are written by people who are not native English speakers, so strange sentence structure and odd word connotations are not that unusual.</p>
<p>Another interesting method would be to find an article in another language and translate it to English.  Some free translation services are pretty good at converting certain languages to English.  The opposite could be done too.  No reason to limit plagiarism to English content.  This could be nearly impossible to detect, since the sentence structure and almost all of the words would change.  And even if the original author found the derivative work, he might not recognize it.</p>
<p>I think the ping idea you had has some merit, but remember that the vast majority of blog plagiarism victims will never find out, and I suspect most who do will not have the resources or inclination to go any further than contacting the offender&#8217;s host, if even that much.</p>
<p>It is interesting that you mentioned in this post you are a writer.  Further up in this post I was just thinking, I bet this guy is a writer.  Your posts are always interesting, well written, and clearly thought out.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

