
A Bird’s Eye View of Archival Collections

Mitchell Whitelaw is a Senior Lecturer in the Faculty of Design and Creative Practice at the University of Canberra and the 2008 winner of the National Archives of Australia's Ian Maclean Award. According to the NAA's site, the Ian Maclean Award commemorates archivist Ian Maclean, and is awarded

to individuals interested in conducting research that will benefit the archival and historical profession in Australia and promote the important contribution that archives make to society.
Dr. Whitelaw has been keeping the world up to date on his work using his blog, The Visible Archive. His work fits well with my colleague Jeanne Kramer-Smyth's archival data visualization project, ArchivesZ, as well as the multidimensional visualization projects underway at the Humanities Advanced Technology & Information Institute at the University of Glasgow. However, his project fascinates me for a few specific reasons.

First of all, the scale of the datasets he's working with is astronomically larger than anything any other archival visualization project has tried to tackle so far. His visualizations include analysis at the Commonwealth Record Series level, where a single series can be as large as about 10,000 linear meters of material. In addition, he'll be working directly with Series A1, which holds 20,000 items in approximately 450 linear meters.

Secondly, he's been using Processing, an open source programming language for visualization and interaction design, to do a lot of the heavy lifting in creating interactive visualizations like the one depicted in the screenshot below. Processing works well for this because of its extensive third-party libraries, such as proXML, which allowed Dr. Whitelaw to parse the descriptive data he received from the NAA (note: it's not EAD, thank heavens).

A visualisation of 57000 series in the collection of the National Archives of Australia. The area of each square is proportional to the number of shelf metres that series occupies, while the size of the grey void in each square is related to the number of described items in the series. So a square with a large void (thin "walls") has relatively fewer items than one with a small void (thick "walls") - or no void at all. There's a minimum wall thickness of one unit, which is why the smallest squares have no voids.
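The geometry in that caption can be sketched in a few lines. This is only my own rough reading of the caption, not Dr. Whitelaw's Processing code: the function name, the scale factors, and the rule that wall thickness grows linearly with the item count are all assumptions made for illustration. The only parts taken from the caption are that area tracks shelf metres, that more items means thicker walls, and that walls are never thinner than one unit.

```python
import math


def square_geometry(shelf_metres, described_items,
                    units_per_metre=1.0, items_per_wall_unit=1000.0,
                    min_wall=1.0):
    """Return (side, void_side) in drawing units for one series.

    units_per_metre and items_per_wall_unit are hypothetical scale
    factors chosen for this sketch; the caption does not give them.
    """
    # Area is proportional to shelf metres, so the side is its square root.
    side = math.sqrt(shelf_metres * units_per_metre)
    # Assumed mapping: walls thicken as the number of described items grows,
    # but are never thinner than the minimum wall thickness.
    wall = max(min_wall, described_items / items_per_wall_unit)
    # The void is whatever is left inside the walls; clamp at zero so that
    # small or heavily described series are drawn as solid squares.
    void_side = max(0.0, side - 2 * wall)
    return side, void_side


if __name__ == "__main__":
    # A 100-metre series with no described items: thin walls, large void.
    print(square_geometry(100, 0))
    # A 4-metre series: the minimum wall leaves no room for a void.
    print(square_geometry(4, 0))
```

Under these assumptions, the clamp on the void is what reproduces the caption's remark that the smallest squares have no voids.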
Finally, I also have to give Dr. Whitelaw a lot of credit for sharing the source code, as well – this will really jumpstart my efforts to start playing around with Processing!

Seriously, Follow Our Lead

OCLC's Lorcan Dempsey makes a great point as usual in his post "Making tracks":

In recent presentations, I have been suggesting that libraries will need to adopt more archival skills as they manage digital collections and think about provenance, evidential integrity, and context, and that they will also need to adopt more museum perspectives as they think about how their digital collections work as educational resources, and consider exhibitions and interpretive environments.
I doubt that any archivist would disagree with this. Even better, I think this offers a great opportunity to reach out and help those in allied fields really understand how and why we've done things slightly differently for so long. I'm glad to see that my new employer has picked up on this holistic approach with platforms like the NYPL Blogs.

Movin’ and shakin’ in the archives world

ArchivesNext recently discussed Library Journal's annual list of "Movers and Shakers," pondering what a comparable list in the archival profession would look like. For those who don't know, the list recognizes "library advocates, community builders, 2.0 gurus, innovators, marketers, mentors, and problem solvers transforming libraries." After some rumination, ArchivesNext is now calling for nominations to generate a similar list. Do your civic duty and nominate either a project, an individual, or even a situation worthy of this recognition!

DataPortability.org and the Dream of a Web 2.0 Backup System

I just discovered DataPortability.org through Peter Van Garderen's blog post about it. I was entirely surprised that I'd heard nary a peep about it. Some basic examination (running a WHOIS query on the domain) shows that it's still a fairly new project. I have to say, though, that I'm entirely impressed. Those involved have given a whole lot of thought to how they're going to be doing things, as evidenced by those who have signed up to be involved and the DataPortability Charter. To wit, the Charter's principles tend to speak for themselves:

  1. We want sovereignty over the profiles, relationships, content and media we create and maintain.
  2. We want open formats, protocols and policies for identity discovery, data import, export and sync.
  3. We want to protect user rights and privacy.
And, of course, the thing that made me squeal with delight like a pig in mud:
4. DataPortability will not be inventing any new standards.
I mean, that's probably the best news that someone like me could get. They have a graphic on their home page that sums it all up perfectly:

Description of DataPortability Project

Now, naturally they didn't have preservation in mind at first, but as Peter's post notes, it's ripe for that sort of use. This also got me thinking about Dan Chudnov's old brainstorm about blog mirroring using BitTorrent and Atom. In particular, note this comment of his:

It's a pretty simple idea: you extend an aggregator system to "archive" entries posted each day into bittorrent files, and then build a secondary system to turn the data distributed over bittorrents back into browseable "blog" mirrors if/when you need to. The best part is that you don't really need any new technology to do it.
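To make the first half of that idea concrete, here is a minimal sketch of the "archive entries posted each day" step. It is my own illustration, not Dan's design: the function name and the entry layout (a dict with an ISO 8601 `published` timestamp) are assumptions, and it stops at producing one JSON bundle per day. Turning each bundle into a torrent and rebuilding a browseable mirror from it are left out.

```python
import json
from collections import defaultdict
from datetime import datetime


def bundle_entries_by_day(entries):
    """Group feed entries into one JSON bundle per posting day.

    Each entry is assumed to be a dict with a 'published' key holding an
    ISO 8601 timestamp such as '2008-01-15T09:30:00'. This layout is
    hypothetical, not taken from any particular aggregator.
    """
    by_day = defaultdict(list)
    for entry in entries:
        # Use the calendar date of the timestamp as the bundle key.
        day = datetime.fromisoformat(entry["published"]).date().isoformat()
        by_day[day].append(entry)
    # Serialize each day's entries; these bundles are what a later step
    # would package for distribution.
    return {day: json.dumps(items, sort_keys=True)
            for day, items in sorted(by_day.items())}


if __name__ == "__main__":
    sample = [
        {"title": "First post", "published": "2008-01-15T09:30:00"},
        {"title": "Second post", "published": "2008-01-15T17:45:00"},
        {"title": "Third post", "published": "2008-01-16T08:00:00"},
    ]
    for day, bundle in bundle_entries_by_day(sample).items():
        print(day, len(json.loads(bundle)), "entries")
```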

I feel like things are coming full circle. I also feel like I could really have fun and find new ways to extend ArchivesBlogs, at least when I finish the other countless little projects that litter my mind. Anybody got some free time they want to contribute?

Web 2.0, Disaster, and Archives

Many of Web 2.0's detractors argue about its real value, but given the wildfires in Southern California, I was happy to see it really put to good use. KPBS, a San Diego radio station, has been using Flickr and, even more shocking (at least for some), Twitter as ways to disseminate information and news quickly. The use of Twitter is particularly interesting as it can send out SMS messages. You might recall a few years ago when protesters in the Philippines used SMS to organize political rallies and warn of police retaliation. The California State Library Blog has also provided information from the California State Archivist about archives affected by the fires. In addition, information about disaster recovery for libraries and archives is available both on a regional level from the San Diego/Imperial County Libraries Disaster Response Network (SILDRN) and on the state level from the California Preservation Program (CPP). Please hold those affected by the fires in your thoughts, and if you can, contact SILDRN or the CPP to help.

Archives Camp: Talking About Archives 2.0

ArchivesNext recently discussed the possibility of having some "Archives 2.0"-themed events this summer, and I think it's a great idea. Now, we may not be able to throw something together in time for SAA, but it seems like the idea of at least meeting up informally is percolating. There's a wealth of opportunities available for archives and archivists to improve access to their holdings through social software and the like. My vision, as I said in a comment on the post, would be to end up with an unconference along the lines of a Library Camp (or more generally, a BarCamp), maybe with lightning talks if enough of us have something to show off or talk about. As at Library Camp, I'd like to see a "bridging the gap" session where we learn and share ways to talk to IT staff and other stakeholders essential to our ideas taking off. I facilitated such a session at Library Camp East, and although trying at times, it was really instructive. But really, it's not just about what I want - what do you all want to see at an Archives Camp?

NARA Frees Their Data, Somewhat

I'm a bit surprised that this hasn't come across anyone's radar, because it seems awfully damn significant to me. According to this post on the A&A listserv by Michael Ravnitzky, the National Archives and Records Administration released an exhaustive database of box holdings of all the Federal Records Centers. He doesn't really say how he obtained this database, but my guess is he just asked based upon his background and interest in public access to government information - I've come across his name on material relating to FOIA before. The file he received from NARA is a 155 MB Microsoft Access database, and soon after he posted about it to the listserv, Jordan Hayes and Phil Lapsley took the opportunity to host the database, converted it to MySQL, and wrote a few simple query forms for the database in PHP. Hayes also provided some basic documentation on how to use the forms since MySQL query syntax is probably not familiar to most people on the A&A list.

While I'm glad to see that this database has been made available, it seems a little strange to me that this appears to be pretty insignificant for NARA. While I don't expect that they'll be posting links to Hayes & Lapsley's query forms, I have yet to find any reference to this database in any form (downloadable or not) on Archives.gov. I'd think any sort of mention would bode well for NARA, so it's a bit puzzling to have it be so overlooked. Maybe we can do more with the data. My first thought was to throw something together and then maybe sacrifice a goat to the fine folks at ibiblio (where ArchivesBlogs is hosted) in exchange for making the database available at a reliable site that works in the public interest. I then wised up and realized that I don't currently have the time to do that alone. If someone's interested, though, I know where you can get goats wholesale.

The files in question:

Sticking My Neck Out

It's been some time since I've had a substantive post, and I don't really intend to write one now. I figured I should mention, however, that I've been featured lately in print and in the blogosphere. Jessamyn West of librarian.net interviewed me for an article ("Saving Digital History") in Library Journal netConnect. In addition, I was tapped by the wonderful folks at Booktruck for the latest installment in their "Ask a Male Librarian" series. I swear someday soon I'll write something much more interesting and less self-promotional.

Protection From Human Pests

A few months ago (while I was at NACO training) I got a reader's card at the Library of Congress. For a while I pretty actively went and requested books on Saturday afternoons. In particular, I was interested in archival manuals from outside the United States. One of the most interesting books I found was S. M. Jaffar's Problems of an Archivist, a manual written in Pakistan in 1948. I was struck by the following passage ("Protection From Human Pests"), taken from pp. 28-29:

"Human pests" and "White Huns" are the common epithets applied to human species acting as enemies of archives. History has recorded many such instances of vandalism as the wholesale destruction of priceless treasures of art and literature, the burning of big and beautiful libraries, the transport of camel-loads of books to distant countries and the sale of valuable manuscripts at ridiculously low prices. The transfer of artistic and literary treasures of subjugated countries by the conquerors to their homelands to adorn their own museums and libraries has depleted those countries of that wealth. In their anti-archival activities insects are impartial, but the human pest or "Homo Sapiens", as he is significantly called, goes for the most precious papers. Being rational, his ravages are more thorough and fatal than those of insect pests. As such he is capable of maximum harm in minimum time, especially when he is selfish, callous, misguided and bent upon mischief. The removal of entire pages ond [sic] pictures from valuable volumes and of seals and signatures of historical persons from rare records is the common experience of archivists and librarians.

An archivist (or a librarian) may rightly with-hold certain classes of documents as being fragile, but he cannot legitimately stand between the research scholar and the primary sources of his information -- the raw materials of history. Damages due to carelessness are also common: By resting his elbows on a batch of brittle papers or by spilling ink on important records, a research student may do incalculable harm. Old but important manuscripts have often been destroyed for want of space or thrown into wells and rivers out of pure piety to prevent their pollution by falling into unclean or impious hands. These are important problems which must be tackled by the authorities and the archivist. In order to ensure that the literary wealth of a country remains within it, it must be free and strong, capable of warding off external danger and guarding its cultural heritage. To prevent the export of valuable records, prohibitory legislation is absolutely essential.* To check damage being done through carelessness, a set of restrictive rules may be framed and enforced in archive offices and libraries. Documents scattered here and there and subject to premature decay and deterioration may be surveyed and salvaged by the state, if not by private enterprise. The archivist can frame and enforce rules within his sphere. Beyond that he is powerless. But he can draw the attention of the authorities to the problems beyond his control and suggest solution [sic].

* That which pertains to a country's past cannot form the exclusive property of a private individual. In its ultimate analysis, it belongs to the whole nation -- the State. The principle applies with equal force to the collections of private owners. Within the country they may form the property of private persons, but their export cannot be tolerated.
Much of this still rings true today, of course, but the context is particularly interesting considering it was written shortly after India and Pakistan achieved independence from Britain. I find books like this one to be the most telling about how my profession has both changed and remained the same over the years.

Two Work-Safe Tidbits about Archives and Erotica

First, via my associates at booktruck.org, I came across a review of the comic book Demonslayer v. 2.2, by a certain Marat Mychaels, et al. at Comics Should Be Good. While the fact that the reviewers pan the comic book seems only marginally of interest to those of us wading in archivy, I should draw your attention to a specific part of this issue. Apparently one of the characters goes to visit the Director of Archives at the New York Museum of Natural History, who has chosen to decorate his office in the style of some seemingly life-sized works by (fellow Peruvian) Boris Vallejo.

Demonslayer clip

Secondly, everyone knows how much of a pain digital preservation is, particularly in terms of born-digital cultural materials. So, who should archivists and curators look to for guidance? Kurt Bollacker, digital research manager at the Long Now Foundation (and formerly of the Internet Archive), holds up the pornography industry as a potential leader of the pack. He states that he guarantees "that a wealth of pornography from the late 20th century will survive in digital distributed form (because) it's a social model that's working extremely well." If you read the rest of the article it's not clear if he's talking about just the producers trailblazing these distribution paths or the "consumers" as well (e.g. using peer-to-peer file sharing). Either way, though, I think this idea is a lot like the LOCKSS model for distributed preservation and Dan Chudnov's idea for preserving blogs using METS (or Atom) and BitTorrent. I've intended for a few weeks now to dedicate an entire post to Dan's idea, but after mentioning it in this one I feel like I've covered that sufficiently.