Linkrot Killing Blogger Citation?

When Liz from I Speak of Dreams posted a comment to my Citation Culture Clash article, she made a very interesting point that I had overlooked.

As the entry pointed out, hyperlinks have become the standard for citation in the online world, especially for bloggers. This format of citation has become so entrenched that the Creative Commons Organization even integrated a Uniform Resource Identifier (URI) requirement into the legal text of its licenses, codifying the practice.

Traditionalists, on the other hand, have often balked at the simple hyperlink as a means of attribution, looking instead to requirements put down by groups such as the MLA, despite the obvious advantages of linking cited pages.

But even though hyperlinks are much easier for both author and user, they do present a difficult challenge: Linkrot.

For, while nothing is ever truly deleted from the Internet, it doesn’t necessarily stay in the same place. A link that provides perfect attribution today might, without anyone knowing, point to nothing at all tomorrow.

The Problem of Linkrot

The term "permalink" is something of a misnomer. While it’s true that the link will most likely remain valid long after the entry or item has slid off of the main page, it’s a far cry from permanent.

Sites close down, move to new locations, change their directory structure and remove content every day. While a permalink will likely be around for weeks or months, whether or not it will still be there years later remains debatable. While that might be fine for articles and entries only likely to be relevant for a few months, research papers and static sites that could be around for years need to take note.

The simple fact is that the older a piece gets, the fewer working links it will have. This severely reduces the effectiveness of hyperlink citation over time.

In short, authors get no attribution, save what is in the article itself, and reap no benefit from having their work reused. Furthermore, users are frustrated and authors using cited works lose their supporting evidence.

It’s a losing situation all around.

Beating Linkrot

The most obvious solution for beating linkrot is for Webmasters to simply not let links go bad. If Webmasters never closed their sites, forwarded traffic when they changed URLs and generally never let links die, there would be no linkrot problem.

On the other hand, if authors citing works meticulously maintained their links and followed up swiftly on bad ones, the problem would be minimal at worst.

Of course, neither solution is practical. Sites will always go down and there will always be too many links for anyone to patrol effectively. Even with link checking software to automate detection, updating and maintaining hundreds or thousands of links can be prohibitively time-consuming.
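To be fair, the detection half of that job is the easy part. Here is a minimal sketch of a link checker in Python, using only the standard library. The function names (http_status, find_dead_links) are made up for illustration, and treating any status of 400 or above as "dead" is an assumption rather than a rule. Some servers also reject HEAD requests, so a real tool would probably want to retry with GET.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Optional


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if it cannot be reached."""
    # HEAD asks for headers only, so we do not download the whole page.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 410.
        return error.code
    except (OSError, ValueError):
        # DNS failure, refused connection, timeout or a malformed URL.
        return None


def find_dead_links(
    urls: Iterable[str],
    fetch: Callable[[str], Optional[int]] = http_status,
) -> Dict[str, Optional[int]]:
    """Map each apparently dead URL to its status (None means unreachable)."""
    dead: Dict[str, Optional[int]] = {}
    for url in urls:
        status = fetch(url)
        # Assumption: anything unreachable or 4xx/5xx counts as linkrot.
        if status is None or status >= 400:
            dead[url] = status
    return dead
```

Passing a list of cited URLs to find_dead_links returns only the ones that look broken. Finding them is quick. Deciding what to replace each one with is the part that still takes a human.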

Sadly, there are no 100% effective ways to prevent linkrot, at least not right now. However, there are at least two ways to reduce the problem and, potentially, ease the frustrations.

Two Potential Solutions

The first and most obvious way to deal with the problem of linkrot is to use the Internet Archive (or, to a lesser degree, the Google cache) to help you maintain a cache of your cited pages.

Since the Internet Archive uses a standard format for its links, the process could be done automatically. For example, if I wanted to link to my own "Citation Culture Clash" article, I could do it in the following format: Citation Culture Clash (a)

In the example above, the link remains as it would normally with the addition of an "(a)" to the side for the archived link.
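Because the format is predictable, generating that pair of links can be scripted. The sketch below assumes the public Wayback Machine pattern of "https://web.archive.org/web/" followed by an optional timestamp and the original address. The helper names and the example.com address are placeholders of my own, not part of any Internet Archive tool.

```python
from html import escape
from typing import Optional

# Assumed public URL prefix for the Internet Archive's Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web/"


def wayback_url(original_url: str, timestamp: Optional[str] = None) -> str:
    """Build an archive link for original_url.

    timestamp is a string of digits such as "20060115" (YYYYMMDD). When it
    is omitted, the archive is left to pick whichever capture it serves by
    default for that address.
    """
    if timestamp:
        return f"{WAYBACK_PREFIX}{timestamp}/{original_url}"
    return f"{WAYBACK_PREFIX}{original_url}"


def citation_html(title: str, original_url: str,
                  timestamp: Optional[str] = None) -> str:
    """Render the normal link followed by an "(a)" link to the archived copy."""
    archived = wayback_url(original_url, timestamp)
    return (
        f'<a href="{escape(original_url, quote=True)}">{escape(title)}</a> '
        f'(<a href="{escape(archived, quote=True)}">a</a>)'
    )


# Placeholder address, used only to show the output format.
print(citation_html("Citation Culture Clash", "https://example.com/citation-culture-clash/"))
```

The main link still points at the original author, so the traffic goes where it should. The "(a)" is only a fallback.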

However, the example above also illustrates a critical flaw in using the Internet Archive for this purpose: it doesn’t grab everything. That is especially true for new content and some dynamic content. In fact, as of right now, nothing from Plagiarism Today is in the Internet Archive at all (though all of my other sites have been indexed fine).

A more refined solution would be to use WebCite to create cached versions of pages that you wish to reference. In that case, the link would look something like this: Citation Culture Clash (a)

In that example, the archive link works fine, pointing to a custom-made cached version of the original work. The frame around the archived version even contains critical information including a link pointing to the original, the date the cache was created and information about WebCite itself.

Unfortunately, WebCite requires the author to manually input each link he or she wishes to cache. This can be a very time-consuming process and, since the returned links are mostly gibberish, can also be an organizational nightmare. Though their bookmarklet removes much of the burden, automation, such as through a WordPress plugin, would be needed to make the process efficient enough for bloggers to seriously consider.

Of course, neither of these solutions addresses what happens if either the Internet Archive or WebCite closes down. Though both have been very stable long-term establishments with no signs of going anywhere, the danger is always there.

But while neither of these methods can eliminate linkrot, they can certainly reduce it and its impact on readers, authors and researchers alike. A link format like the one described above would continue to drive almost all of the traffic to the original author, but also provide users with an alternative link in the event that the original one is down for some reason.

It’s not a perfect solution, but it’s at least a start.

Conclusions

While linkrot certainly is a major concern when citing sources via hyperlink, it should not be the end of the practice. There are already very simple ways to reduce and nearly eliminate the problem. While new software and new conventions may need to be drafted in order to encourage widespread use, the solutions and formats outlined earlier provide at least a foundation to start the dialog.

But even if we aren’t able to solve the problem of linkrot, there’s no reason to believe that switching to a more traditional style would offer any improvement. As almost any teacher knows, even with a complete research library and proper MLA citations in hand, there’s no guarantee that a work cited will be found. Books, magazines and journals get misfiled, removed and destroyed the same as links.

Sadly, linkrot isn’t so much a sign of a lack of permanence on the Web as a sign of a lack of permanence in information itself.

No matter what format you use, there’s a decent chance that, some day, your work will outlive some or all of the work cited in it. It’s a sad possibility every author should be ready for and at least try to prevent.

