The Sunlight Foundation, a non-profit organization that works to promote government transparency, has announced the launch of a new service that it hopes will help identify passages in news articles that were recycled from Wikipedia and press releases.
The service, named Churnalism US, builds on earlier work by the Media Standards Trust (MST). The MST built the first Churnalism tool, which launched earlier in the UK. That tool, in turn, was built upon an engine that MST open sourced, making it available for The Sunlight Foundation to use.
The big idea behind both versions of Churnalism is not so much to detect all kinds of plagiarism or spot instances of copyright infringement. Rather, the goal is to find instances where journalists recycle content from sources that readers might not deem trustworthy or, in other cases, may take quotes out of context.
But does the tool work, and how effective is it? I decided to give it a spin and see what I could find out.
How it Works
The way Churnalism US works is fairly straightforward, and anyone who has used a plagiarism checker before, such as Copyscape or PlagSpotter, will be very comfortable using it.
If you visit their site, you’re prompted to either paste the text that you want to check or simply enter the URL and let the site pull down the copy for you.
If Churnalism US finds any text matching its known sources, it will present you with a readout similar to the one below that highlights the duplicated text. You can also click a passage to see where its pair is. (Note: I selected an article with known matching material for the screenshot.)
The result, if there is a decent amount of copied text, is a simple side-by-side comparison of the work that’s easy for anyone to follow.
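To give a rough sense of what this kind of matching involves, here is a minimal sketch in Python. It is not the engine behind Churnalism US. It assumes a simple word-level comparison using the standard library’s `difflib`, and the function name `shared_passages` and the `min_words` threshold are hypothetical choices made for illustration.

```python
import re
from difflib import SequenceMatcher


def shared_passages(article: str, source: str, min_words: int = 5) -> list[str]:
    """Return runs of at least `min_words` consecutive words found in both texts.

    Illustrative only: a hypothetical helper, not how Churnalism US is implemented.
    """
    # Normalize to lowercase word tokens so punctuation and case don't block a match.
    article_words = re.findall(r"\w+", article.lower())
    source_words = re.findall(r"\w+", source.lower())

    # autojunk=False keeps very common words from being discarded on long inputs.
    matcher = SequenceMatcher(None, article_words, source_words, autojunk=False)

    passages = []
    for block in matcher.get_matching_blocks():
        # Skip short overlaps; a handful of shared words is rarely meaningful.
        if block.size >= min_words:
            start = block.a
            passages.append(" ".join(article_words[start:start + block.size]))
    return passages


if __name__ == "__main__":
    article = "The mayor said the new bridge will open to traffic next spring after years of delay."
    release = "Officials confirmed the new bridge will open to traffic next spring, pending a final inspection."
    for passage in shared_passages(article, release):
        print(passage)
```

A real service would need an index over a large collection of press releases and Wikipedia text rather than a one-to-one comparison, but the side-by-side highlighting comes down to the same idea: finding the runs of words that two documents share.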
If you install one of the plugins into your browser, then whenever you visit a news site (or any site that you opt to check), Churnalism US will automatically scan the contents of the article and, if it detects any copied text, let you know with a popup alert.
That makes using Churnalism through the plugin a “set and forget” operation. In fact, if you go a while without seeing a news article with a great deal of recycled content, you might forget it’s there until you get your next popup.
However, the plugin is initially set up to check only a relatively small number of popular news sites, as well as a longer list of local affiliates. Though most popular news sites are on there, including many tech news sites, some slipped through the cracks including, in my case, Today.com (which was linked to in an article on NBC News).
Fortunately though, if a site isn’t on the list, it’s trivial to add it. You can also have Churnalism US automatically check for the words “news” and “article” in the URL, though that feature is turned off by default.
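The plugin’s decision about which pages to scan can be pictured roughly like the sketch below. This is an assumption about the logic, not the plugin’s actual code: the function `should_scan`, its parameters, and the sample site list are all hypothetical, and it treats the keyword option as matching either “news” or “article” in the URL.

```python
from urllib.parse import urlparse


def should_scan(url: str, sites: set[str], keyword_check: bool = False,
                keywords: tuple[str, ...] = ("news", "article")) -> bool:
    """Decide whether a page should be checked (hypothetical logic)."""
    host = (urlparse(url).hostname or "").lower()

    # Match the listed domain itself or any subdomain of it (e.g. www.).
    if any(host == site or host.endswith("." + site) for site in sites):
        return True

    # Optional fallback, off by default: look for telltale words in the URL.
    if keyword_check:
        lowered = url.lower()
        return any(word in lowered for word in keywords)

    return False


if __name__ == "__main__":
    sites = {"nbcnews.com"}  # stand-in for the built-in list
    print(should_scan("https://www.nbcnews.com/story", sites))
    print(should_scan("https://www.today.com/health/story", sites))

    sites.add("today.com")  # adding a missing site by hand
    print(should_scan("https://www.today.com/health/story", sites))
```

Under these assumptions, a site that is missing from the list is skipped until you either add its domain or turn on the keyword option.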
Thoughts on Churnalism US
Though some news articles have referred to Churnalism US as a plagiarism checker, in reality it isn’t meant for that purpose. Instead, it’s about detecting a limited type of questionable behavior that some journalists engage in that sometimes overlaps with plagiarism.
The ways in which journalists can approach and use content found in press releases and Wikipedia are the subject of a great deal of debate right now. For example, when Jonah Lehrer’s misdeeds were investigated, he was found to have copied text without attribution from press releases in five of the seventeen articles checked, though those misdeeds were overshadowed by his traditional plagiarism and fabrication.
This issue is particularly serious in cases, as with Lehrer, where the plagiarism is used to imply that the journalist interviewed a person when, in truth, they just pulled the quotes from the release.
However, much of the use of press release content goes undetected by readers, who assume that every word was written by an impartial source. Churnalism US aims to change that.
Though it isn’t a plagiarism checker and there are many examples of copy/pasting that it will not catch, it does what it sets out to do very well and the plugins, in particular, have the potential to be very eye-opening. Though I’m not sure how many people will use the site itself, the plugins make it easy to participate without taking any action, encouraging people to be aware of what is going into the journalism they’re reading.
If The Sunlight Foundation can reach out to news readers and get enough of them to install the plugin, it could be very interesting to see the impact it has on journalism and, in particular, on how journalists work with press releases.
In the end, I can’t really test Churnalism US like I can other plagiarism checkers. It’s not meant to be compared to Copyscape or PlagSpotter. It’s a different product with a different goal.
Due to the limited number of sources in its database, Churnalism US is going to appeal less to people who are interested in plagiarism and more to people who are interested in broader issues of journalistic integrity.
If you want to know more about the thorny relationship between public relations and journalism, this is your tool. If you want to track every time an article appears on the Web, this is clearly not for you.
In my brief testing, Churnalism US worked reasonably well. Though I didn’t see many major examples of such misuse “in the wild,” I’ve only had the plugin installed for a few hours and only tested about a dozen articles on the site.
I’ll be interested to see what the plugin finds over the weeks and months to come and will report anything exceptional or interesting that I find.
It should be an interesting experiment in journalistic integrity.