So, this past Saturday was the first class in Digital Archiving with Open Source Software. It was a good opportunity to meet the other students I will be working with (Hi everyone!) as well as start to see what we will be working with. This is my first time taking a course this time-compressed and intensive. Needless to say, by the end of the 8 hours, my brain was starting to feel a little mushy and it was getting harder to concentrate. This coming Saturday we’ll be venturing out of the computer lab (which is very much a fluorescent-lit cave) and heading to the VOICES of September 11th office in New Canaan, CT. There we will most likely meet some of the people responsible for the VOICES of September 11th project as well as see their operation. We may also be involved in scanning documents, so we’ll be able to see the whole work process, from the original materials given to the organization all the way to adding them to our collection and exhibits.
The Project
VOICES of September 11th serves a very noble cause: memorializing all the people involved in and affected by the tragedies that struck the World Trade Center (the project also includes materials from the 1993 bombing), as well as the attack on the Pentagon and the crash of United 93. We got to see the materials being collected by VOICES and it’s all very touching. I’ve seen family images, wedding photos, baby photos, news articles, and ephemera from memorial services. All of it shows how much these tragedies affected the people involved and how important it is to remember that the people who died are not just a number or statistic to be used for political capital.
That being said, I feel it is clear that VOICES is a project that grew from a small operation into something larger, and the growing pains show. For example, the files my partner and I worked with this past Saturday were only nominally ordered by date in their directory. There were multiple copies of files (some of which were most likely edited or Photoshopped), but their file names did not note those changes. There have definitely been attempts to organize the files, but the organization seems based on an ad hoc system that developed out of necessity. Enough criticizing, though. The materials collected, as I mentioned before, are quite remarkable and can work well towards memorializing the people they represent.
Metadata
One of the issues we came across while working on Saturday was some confusion about the terminology of the Dublin Core metadata elements we were working with and the information that was expected to be put into each field. It got me thinking about the concept of metadata itself (at least how it is used in a digital archiving sense) and I realized that it has an identity crisis.
In traditional cataloging, there is usually only one item that is being represented in the cataloging notation. The cataloger is often working with the physical item. This same sort of idea applies when dealing with digitally created items that only exist in the virtual space. A good example of such an item is this blog post. I’ve created it digitally by typing it into my blog editing software, and unless someone prints it out, there is not going to be a physical representation. This can be represented through its metadata in much the same way that a physical item can be represented through its catalog record. One set of information for one item.
The crisis arises when you’re dealing with digitized versions of physical objects, like scanned photographs. Now the cataloger or digital archivist has to account not only for the representation of the physical object, but for the digital one as well. Some areas, like subject coverage and title, can cover both objects. Other areas are specific to one object or the other, though, which is where we faced confusion. The two elements that caused a lot of discussion were “Type” and “Format”. Looking at the documentation provided, I came to the conclusion that “Type” is meant to apply to the digital object (as in file type), whereas “Format” deals with the original object, but the terminology is still unclear. One of the problems is that the file type is also commonly known as the file format (such as PDF format or JPEG format). The professor has decided to rename the “Format” element in Omeka to “Original Format” to make it clearer what it refers to.
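To make the two-objects problem concrete, here is a minimal Python sketch of how a single record for a scanned photograph might keep the shared fields apart from the object-specific ones. It follows the reading described above (“Type” for the digital file, “Original Format” for the physical item), which is our class’s working interpretation and not an official rule. The function name, the field values, and the example photograph are all made up for illustration, and none of this is Omeka’s actual API. For what it’s worth, the Dublin Core documentation itself defines Type as the nature or genre of the resource and Format as the file format, physical medium, or dimensions, which is part of why the two labels are so easy to mix up.

```python
# Illustrative sketch only: build_record and every value below are hypothetical,
# not part of Omeka or any Dublin Core library.

def build_record(title, subject, file_type, original_format):
    """Return one metadata record describing both the scan and its source."""
    return {
        # Fields that describe the physical item and the digital copy alike
        "Title": title,
        "Subject": subject,
        # Describes the digital file (our class's reading of "Type")
        "Type": file_type,
        # Describes the physical original (the renamed "Format" field)
        "Original Format": original_format,
    }


record = build_record(
    title="Family photograph (made-up example)",
    subject="Memorial collection",
    file_type="JPEG image",
    original_format="Photographic print",
)

# Print the record one field per line, in the order the fields were defined
for field, value in record.items():
    print(f"{field}: {value}")
```

Keeping the two object-specific fields under different labels means nobody has to guess whether “format” refers to the JPEG or to the print it was scanned from.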
This little incident does show one of the niggling issues with Dublin Core and metadata standards in general. The standards do not come with a firm, clear set of rules, the way American cataloging has its specific formats and rules in the Anglo-American Cataloguing Rules. Different archivists and digital catalogers use different elements and standards based on their preference or what information they have available. We saw this at the beginning of class when we examined several digital collections and compiled the elements from them into a spreadsheet to see the differences. We are currently at a precipice in information trends, as more information is being created and stored digitally. It is important that metadata standards organize and mature to be able to handle the substantial rise in materials, or we will be facing an organizational crisis.
Omeka
Finally, I wanted to talk a bit about the software that we are using in this class. Omeka is an open source initiative that allows the easy publishing of digital exhibits and collections. From the little bit that I have used it so far, it is a nice piece of software, though still a little rough around the edges, as most open source software is. It reminds me a lot of what I learned when I was researching Drupal for another class. Both of them are open source content management systems, although Omeka is more single-minded in its purpose. This is good because it lets the plugins and tools developed for it focus on the unique needs of the archival community. Drupal, though, is much more flexible in its overall capabilities. In both cases, I wish I still had an old desktop computer lying around (I used to have 3 in my basement before recycling them) that would be easy to resuscitate into a personal server to play with installation and customization on my own time. I do like the focused nature of Omeka though.
As evidence of the meteoric rise of open source software, some surprisingly big institutions are using Omeka. For example, there is Journey Stories, a site created by the Smithsonian Institution that chronicles the stories of US immigrants. What I particularly like (and what, from what I understand, will become part of the exhibit we’re working on creating) is the capability for visitors to add their own stories. It offers the richness of a museum exhibit, but gives it a Web 2.0 twist and allows people to contribute and add parts of themselves to it. In addition, the site looks professional, and from what I’ve seen in Omeka that look is not that difficult to achieve. Something which would have taken a team of designers a good deal of time to do 10-12 years ago can now be done by a couple of people and much more inexpensively.
Another example that shows the range of what Omeka can do is Ars Synthetica. Instead of being an archival site, it works to display and collect research in synthetic biology. This site really shows the variety of materials that can be organized and displayed through an Omeka collection. It includes articles, animations, videos, and more. I can see something like this developing into a next-generation scientific journal, with the ability to include experiment videos with articles and a system for collaborators to connect and work together.
Overall, I’m still thrilled to be taking part in this class. It is giving me valuable experience working with a new software system as well as the chance to “play” around in digital archiving, which ties together my interests in technology and history. In addition, we’re doing this all for the great cause of the VOICES of September 11th project, so we get good karma points too. Well, that’s the end. Until next time, folks.