Wednesday, October 13, 2010

A National Digital Library for the US

In the current issue of the New York Review of Books there's an article by Robert Darnton, Director of Harvard University Library, from a recent talk that opened a conference on the possibility of creating a National Digital Library for the US. His ideas on the library in the 'new age' are fleshed out rather more extensively in an earlier article from 2008, featured in the same publication.

In his writings, Darnton gives a laudable defense of the traditional book and library institution. This perhaps appeals more to a minority (even a small minority) in the current research climate, but it's an important argument that often gets drowned out. While I agree entirely with the ideas of access and traditional preservation that he describes, it's a little concerning to see what has been left out of this discussion surrounding a National Digital Library.

To begin, it isn't just the 'modern' and 'postmodern' student who performs most of their research digitally - all the signs show that, within the sciences in particular, we are being inundated with born-digital material. We will find that even the most bookish scholar - should we decide to value his output sufficiently to archive it - will at least have left behind an email correspondence. Indeed, the first hits for 'born-digital data' via Google find an explanation of why the Crafts Study Centre at the Surrey Institute of Art & Design chose born-digital storage for the 'reusability of the resource', and an article in the New York Times praising Emory University's acquisition of Salman Rushdie's digital files. It should go without saying that, meanwhile, the scientific community have long since entered the age of the petabyte.

While many books have indeed lasted many hundreds of years, they, like digital data, also get lost and destroyed - any advantage they have displayed in longevity doesn't seem to compensate for their limitations in time and space as research tools. With a focus on the printed book, dismissing born-digital as an 'endangered species', we are throwing out the majority of modern scholarship. It therefore seems that this approach will create exactly what Darnton claims to want to avoid: the library as museum. It's a museum of past research at the expense of the future, dictating the centrality of the traditional library when in fact the modern researcher expects resources to come to them, and not vice-versa.

Just digitising books is really only a part of the digital puzzle when it comes to libraries and, for the reasons mentioned above, doesn't reflect the current and future trends in scholarship. Nor is it a progressive response to the question of a National Digital Library: the first digital library started in 1971 with Project Gutenberg; the first ISBN was issued to an e-book in 1998; Google Books was launched in 2004. The push for digitisation presented here sticks to a rigid hierarchy surrounding the supremacy of the book and simply doesn't accommodate born-digital (or even archival) content. Copying every book in existence is not going to address the most pressing concerns for a National Digital Library and will never further the scope of scholarship.

In a recent survey of 275 US institutions (with a 70% return), the OCLC identified that special collections in the US were primarily concerned with issues of space, followed by born-digital content, and then digitisation. Only 50% of institutions had assigned responsibility for born-digital collections. Ignoring born-digital collections and focusing on books does not take care of the problem, and while we'll probably have our Folger First Folios to consult for years to come, much of modern research will be left uncollected and unpreserved, and the real potential for new avenues in digital scholarship lost. It may well be that the scale of the problem does necessitate the creation of a new, exclusively digital institution, but the realities of digital scholarship are far more dynamic than they're given credit for here.

  1. Two interesting projects covering issues of archiving born-digital materials are PARADIGM and CAIRO. So the issues are being addressed in the digital library community but not by Darnton's article.