Wednesday, September 29, 2010

Data about data...

Our first real class conversation was nicely framed by a comment that Simon Tanner had made the day before during his introduction to the course: in digital systems it's not so much about what you do as why you do it - digital systems are receptive to logic, but we don't do things because they're logical, we do things because we're human.

Welcome, then, to the metadata 'universe'. For a visual representation to prove that 'universe' is really no exaggeration, Jenn Riley from the Indiana University Digital Library Program has created this wonder here. The information professionals commenting on metadata issues online are probably the first I've known to plump for statements that amount to 'people are terrible at communication', but perhaps they have good reason.

Since metadata is data describing data, and people do the describing, we have the human ingredient. It becomes apparent early on that a discussion of metadata - something of practical use - can descend all too easily into a discussion of semantics, which, while interesting from a philosophical perspective, turns the existing challenge of getting to grips with metadata into something Sisyphean.

This concern, however, is balanced by the genuine value of an intellectual (i.e. human) hierarchy applied to a collection with digital metadata by someone familiar with the collection and, hopefully, its users. Perhaps inevitably, Google also entered the conversation - doesn't Google actually take care of all our searching needs anyway?

This question does help to highlight the fact that the intellectual work of various information communities on the myriad metadata out there isn't necessarily so confounding after all, from a collection-by-collection perspective. Perhaps Google is more intuitive, perhaps it tackles more data - it's a great tool with a lot of potential, but it also has limits on what it can do.

UC Berkeley's Geoffrey Nunberg provides a rather good deconstruction of some of the recent problems with Google's metadata (specifically in the case of Google Books, where one would think metadata might be a priority) here. A quick check of the problems that he describes shows that a number still remain. It's at this point that you might feel that the complexity and compartmentalisation of the metadata universe aren't such a bad thing after all.

