Hillel Arnold is the author of the second post in the Drupal for Archivists series on thesecretmirror.com. In addition to being a recent graduate of the Archives and Public History program at New York University and a relatively recent graduate from the Palmer School of Library & Information Science at Long Island University, Hillel also worked for me as an intern with the Digital Experience Group at the New York Public Library. He was previously employed as an archivist at the Woody Guthrie Archives, the Digital Projects Manager at the Foundation for Landscape Studies, and currently works at NYU’s Tamiment Library/Robert F. Wagner Labor Archive where he coordinates EAD production and implementation of the Archivist’s Toolkit.
Over the course of the last academic year, I have been part of a team working on a survey project aimed at identifying and describing archival collections relating to the Asian and Pacific American community in the New York City metropolitan area. The results of the fifty-plus collections we surveyed have been posted on our Drupal-powered website, which has been an excellent fit for the needs of this project and has also enabled us to engage many of the challenges the project has presented.
By way of introduction, this survey project seeks to address the underrepresentation of East Coast Asian/Pacific Americans in historical scholarship and archival repositories by working with community-based organizations and individuals to survey their records and raise awareness within the community about the importance of documenting and preserving their histories. Funded by a Documentary Heritage Project grant from METRO: Metropolitan New York Library Council, the project is a collaborative effort between the Asian/Pacific/American Institute and the Tamiment Library/Robert F. Wagner Labor Archive at NYU. Three graduate students – I-Ting Emily Chu, Nancy Ng Tam and I – were hired to do the survey work.
For close to nine months, we dug through the basements, closets and storage facilities of artists, activists, scholars and collectors. We visited the offices of arts organizations, theatre companies and social service agencies. We looked at paper files, stage props, moving image materials, digital photographs and emails. Despite the diversity of institutions, people and materials we encountered, a common theme began to emerge.
Due to the nature of many of the organizations we worked with and the cost of space in New York City, many of the collections were spread across several different locations, including private apartments and other publicly inaccessible spaces. This problem is even more acute on a larger level; there is no significant archival repository in the NYC area dedicated to collecting documentation of the Asian/Pacific American community.
The website began primarily as a way to publicize the project and fulfill the grant requirements. However, as we began thinking about the site's structure, content and audience, we realized that we had the potential to do something far more interesting: to build a research center for scholars and members of the Asian/Pacific American community, and to bring together these physically dispersed collections via standardized descriptions. I was introduced, through Mark’s timely prodding, to the wonders of Drupal at DrupalCamp NYC and quickly realized that this project would be a perfect application for Drupal, since we were dealing largely with structured data and wanted the ability to present that data in a variety of ways.
With Mark’s good advice and the assistance of Brian Hoffman of NYU’s Digital Library Technology Services, I was able to get a site up and running in a few weeks. The majority of the site’s content is based on three content types, built with the Content Construction Kit module. The main content type, Archival Resource, contains collection-level information including dates, extent, language, arrangement, an abstract and a scope and content note. The Archival Resource content type also links to an Entity content type via a node reference field. This Entity content type describes the person or corporate body responsible for creating the records, including dates of existence, authorized form of name, and a historical/biographical note. A Location content type, with repository-specific information such as address, hours and contact person, is also tied to the Archival Resource content type via a node reference field.
Taken together, the three content types amount to the front matter for a finding aid. Separating the content into three different types avoided repeated entry of the same data, which in turn prevented wasted effort and data inconsistency. Drupal also allows for field-level data validation and formatting, which dramatically reduces the chances of human error in data entry. This was especially important because a number of people were responsible for creating content. The display of the content is controlled through the Views module, which gives us the ability to programmatically create displays ranging from a collection list with brief abstracts to a complete view of the survey data, all from the same data.
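To make the idea of three linked content types a little more concrete, here is a minimal sketch in Python rather than Drupal itself. It is not how the site is built; it only illustrates why describing a creator or a location once, and referring to it from each collection record, avoids repeated entry. The class names mirror the content types described above, but the field names, the two display functions, and all of the sample data are assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Person or corporate body responsible for creating the records."""
    name: str                # authorized form of name
    dates_of_existence: str
    note: str                # historical/biographical note


@dataclass
class Location:
    """Repository-specific information."""
    name: str
    address: str
    hours: str
    contact: str


@dataclass
class ArchivalResource:
    """Collection-level description that points at shared records."""
    title: str
    dates: str
    extent: str
    abstract: str
    creator: Entity          # stands in for a node reference field
    location: Location       # stands in for a node reference field


def brief_listing(resources):
    """One display: titles with abstracts, sorted by title."""
    ordered = sorted(resources, key=lambda r: r.title)
    return [f"{r.title}: {r.abstract}" for r in ordered]


def full_record(resource):
    """Another display of the same data: a fuller survey record."""
    return {
        "title": resource.title,
        "dates": resource.dates,
        "extent": resource.extent,
        "abstract": resource.abstract,
        "creator": resource.creator.name,
        "repository": resource.location.name,
        "hours": resource.location.hours,
    }


# Hypothetical sample data, invented for illustration only.
office = Location("Example Community Office", "123 Example St.",
                  "By appointment", "A. Archivist")
troupe = Entity("Example Theatre Company", "1975-", "Placeholder note.")
records = ArchivalResource("Example Theatre Company records", "1975-2005",
                           "20 linear feet", "Scripts and production files.",
                           troupe, office)
props = ArchivalResource("Example Theatre Company stage props", "1980-1999",
                         "15 items", "Props from selected productions.",
                         troupe, office)

# Correct the shared Location once; both resources reflect the change.
office.hours = "Mon-Fri, 10-4"
```

Calling `brief_listing([props, records])` and `full_record(records)` produces two different presentations from the same underlying records, which is roughly the role the Views module plays on the site.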
We also created four very simple taxonomies - ethnic context, geographic coverage, organization type and person type - and applied these to the collection level description in the Archival Resource content type. These taxonomies allow users to browse through the collections via facets, a critical function on a site that aims to expose hidden collections.
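For readers who have not worked with facets, the following Python sketch shows the general idea of browsing by taxonomy term. It is an illustration only, not the site's implementation: the four vocabulary names come from the description above, while the terms, collection titles, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical collection descriptions tagged with terms from the four
# vocabularies; the terms and titles are invented for illustration.
collections = [
    {"title": "Example Theatre Company records",
     "ethnic_context": ["Chinese American"],
     "geographic_coverage": ["Manhattan"],
     "organization_type": ["Arts organization"],
     "person_type": []},
    {"title": "Example Activist papers",
     "ethnic_context": ["Filipino American"],
     "geographic_coverage": ["Queens"],
     "organization_type": [],
     "person_type": ["Activist"]},
    {"title": "Example Social Service Agency records",
     "ethnic_context": ["Chinese American"],
     "geographic_coverage": ["Queens"],
     "organization_type": ["Social service agency"],
     "person_type": []},
]


def build_facet(records, vocabulary):
    """Map each term in one vocabulary to the titles tagged with it."""
    index = defaultdict(list)
    for record in records:
        for term in record.get(vocabulary, []):
            index[term].append(record["title"])
    return dict(index)


def filter_by(records, **selected):
    """Return records matching every selected vocabulary=term pair."""
    return [
        r for r in records
        if all(term in r.get(vocab, []) for vocab, term in selected.items())
    ]
```

A visitor narrowing by two facets at once corresponds to something like `filter_by(collections, ethnic_context="Chinese American", geographic_coverage="Queens")`.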
In terms of this project, the real strength of Drupal has been its ability to handle structured data in flexible and powerful ways via customizable content types. Having developed a number of static HTML sites in the hazy past, I’ve also been grateful for the way Drupal separates the development of infrastructure and function from the generation of content. This has allowed others a significant hand in creating the site’s content, and has prevented me from having the dubious responsibility of being the only person who can update the site.
The site is still very new, and we’re looking for ways to publicize it, generate more content, and create a user community. The survey project will continue for another year, thanks to a funding extension, and additional descriptions will be added during this time.
Jim Gerencser is the author of the first post to the Drupal for Archivists series on this blog. Jim is the College Archivist at Dickinson College in Carlisle, Pennsylvania, and has chosen to write about the Drupal-based archives reference blog that he and his colleagues put together.
When Mark asked me to write about our use of Drupal at the Dickinson College Archives and Special Collections, the first thing I thought about was when our Archives Reference Blog was initially launched in April 2007. I couldn’t believe that it has been two years already. I am pleased to report that my colleagues at Dickinson and I are enormously happy with the results of those two years. I hope others may find this brief explanation of how and why we are using Drupal as a reference management tool to be helpful and instructive.
The concept for our implementation of Drupal was a simple one. I was thinking about the fact that we help researchers every day to locate information that they want, but that what they discover among our collections or learn from them seldom gets shared, except by those who write for publication. So, what if we shared via the web, through a simple blog format, the basic questions posed by our researchers along with a simple summary of the results? Wouldn’t that provide an additional access point by which a future user might stumble across our resources, being led to materials that might otherwise have been overlooked? It certainly seemed worth a try, so we talked with our IT colleagues about starting an archives reference blog.
But if all that was needed was a simple blog, then why use Drupal instead of WordPress or some other blogging software? The answer for us was the added flexibility in the kinds of information we could record, the ease with which we could change the types of information we were recording, and the level of granularity with which we could manage different types of information. Here’s how we changed our reference workflow and inserted Drupal into the mix, creating a reference management tool instead of a mere blog of questions and answers.
In the past, when we received a reference request via email, we recorded contact information for the researcher on a sheet of paper, took notes about the resources we consulted, and wrote down the final results of the search before emailing the researcher (or mailing photocopies, or whatever). What Drupal allowed us to do was to recreate that paper request form in an electronic format. More importantly, Drupal allowed us to keep the contact information hidden from the world, accessible only to authorized administrators, while the information about the subject of the research and the results are publicly available – all within a single node. Having this public side combined with the private side, without needing a relational database, permitted us to manage our reference requests in a whole new way.
On the public side, our Drupal implementation appears much like any other blog. We have a title field, a date field, a narrative text field, and tags. The date simply records when the request was filled. The title, text, and tags are the fields that we fill with proper nouns – the names, places, and events that are relevant to the question that was asked. These proper nouns, along with the text about the transaction, then serve as the bait that will draw researchers to our resources via their preferred search engines – simple enough, but very effective.
On the private side, we have data fields to record name, address, phone, and email. (If you would like a glimpse “behind the curtain,” you can view slides from a recent presentation at Educause Mid-Atlantic.) We have fields to indicate what products we may have provided – photocopies, photographs, and scans – along with any charged fees. We also have a text block where we can record what resources we used to answer the request, and who within the Archives was responsible for the research. Finally, we have some fields that were designed to tell us something about our researchers. Is the person researching family history or doing work for a school project? Is the person a professional scholar, or a student in a K-12 environment? Did the person contact us by phone or by email? Did the person locate our materials through a web search or through a printed source? Which of our existing online resources did the person already discover or explore before contacting us?
Returning to why we chose Drupal, I mentioned flexibility. Each of the fields can be readily customized depending on the type of data being recorded; some fields utilize free text, some use drop down menus, some are formatted for date, and some will auto-fill as you begin to enter text. Regarding ease of change, I can go into the field manager and add a new term to a drop-down menu list in a matter of minutes, and adding that new term will not disrupt any of the existing data that was previously recorded in that field. I could also change the format defaults just as quickly. And regarding granularity, I have the ability to assign different access permissions to individual fields, which is what allows us to maintain some fields on a public side (access permissions to all) as well as some fields on a private side (password-protected access for authorized users only).
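The public/private split within a single node can be pictured with a small Python sketch. This is only a way of thinking about field-level permissions, not Drupal's actual permission system; the field names, the `visible_fields` function, and the sample request are all assumptions made up for the example.

```python
# Fields anyone may see; everything else is treated as staff-only.
# The field names are hypothetical stand-ins for those described above.
PUBLIC_FIELDS = {"title", "date_filled", "narrative", "tags"}

# One reference request stored as a single record (sample data invented).
request = {
    "title": "Example question about a campus building",
    "date_filled": "2009-04-01",
    "narrative": "Placeholder summary of the question and the results.",
    "tags": ["Example Hall"],
    "name": "Pat Researcher",
    "email": "pat@example.org",
    "photocopies": 4,
    "resources_consulted": "Placeholder list of sources checked.",
}


def visible_fields(node, is_authorized):
    """Return the fields a viewer may see, based on per-field access."""
    if is_authorized:
        return dict(node)
    return {name: value for name, value in node.items()
            if name in PUBLIC_FIELDS}
```

An anonymous visitor would get `visible_fields(request, False)`, which leaves out the contact details, while a logged-in staff member would get the whole record.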
So how has the Archives Reference Blog improved our workflow? For one thing, when a researcher calls or emails with follow-up questions months or even years later, we no longer need to sift through paper files to discern exactly what services we provided in the past. And just as important as learning what services were provided is finding where any items that we copied or scanned are housed. Even further, when a question arises that is similar to one we handled previously for a different patron, we can now use the blog as an additional research and reference tool.
Besides the obvious implications for reference activity, the blog also makes the task of gathering statistics far easier. A mere click aggregates results from select fields to retrieve the number of requests filled, the number of family history and genealogy requests, the number of patrons from foreign countries, the number of requests from K-12 students, the number of photocopies provided, and the amount of revenue generated for copying services. Results can be obtained by month or by year. Since our previous method of tracking reference requests was on paper, you can quickly see the amount of time that is being saved with this particular convenience.
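As a rough picture of what that one-click aggregation is doing, here is a short Python sketch that totals a few fields by year or by month. It is an assumption-laden illustration rather than our actual reports: the field names, the `summarize` function, and the three sample requests are invented, and fees are kept in cents simply to avoid rounding issues.

```python
from collections import defaultdict
from datetime import date

# Hypothetical reference requests; all values are invented.
requests = [
    {"date_filled": date(2008, 3, 4), "purpose": "family history",
     "patron": "general public", "photocopies": 12, "fee_cents": 300},
    {"date_filled": date(2008, 3, 18), "purpose": "school project",
     "patron": "K-12 student", "photocopies": 0, "fee_cents": 0},
    {"date_filled": date(2009, 1, 9), "purpose": "family history",
     "patron": "professional scholar", "photocopies": 40, "fee_cents": 1000},
]


def summarize(records, period="year"):
    """Total selected fields per year (default) or per month."""
    fmt = "%Y" if period == "year" else "%Y-%m"
    totals = defaultdict(lambda: defaultdict(int))
    for r in records:
        bucket = totals[r["date_filled"].strftime(fmt)]
        bucket["requests"] += 1
        # A comparison is True/False, which adds as 1/0.
        bucket["family_history"] += r["purpose"] == "family history"
        bucket["k12"] += r["patron"] == "K-12 student"
        bucket["photocopies"] += r["photocopies"]
        bucket["fee_cents"] += r["fee_cents"]
    return {key: dict(values) for key, values in totals.items()}
```

`summarize(requests)` groups by year, and `summarize(requests, period="month")` groups by month.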
We can also use our tags, which essentially reveal the people and subjects being researched, to discern what resources are being used most heavily. This usage data can help us to prioritize what materials we should process and/or digitize. This usage data may also help us make the argument to a granting agency about what collections warrant processing or digitization. We will actually be able to point to direct usage data, and maybe even solicit letters of support from previous researchers.
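Counting tags is simple enough to sketch in a few lines of Python. Again, this is an illustration under stated assumptions, not a description of our setup: the entries and the `tag_usage` function are hypothetical, and the tag counts are invented rather than real usage figures.

```python
from collections import Counter

# Hypothetical blog entries with their tags; counts are invented.
entries = [
    {"title": "Request A", "tags": ["Three Mile Island", "Hyman Rickover"]},
    {"title": "Request B", "tags": ["Three Mile Island"]},
    {"title": "Request C", "tags": ["Example Family Papers"]},
    {"title": "Request D", "tags": ["Three Mile Island", "Example Family Papers"]},
]


def tag_usage(entries):
    """Count how many requests mention each tag."""
    return Counter(tag for entry in entries for tag in entry["tags"])
```

`tag_usage(entries).most_common(5)` would then list the most heavily requested subjects first, which is the kind of evidence one might cite when prioritizing processing or digitization.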
Finally, and most importantly, we now attract new researchers who, if not for the blog, would probably never have known that we had resources of use to them. While I could share numerous stories to illustrate this point, one solid example should suffice.
Due to our proximity to Three Mile Island (TMI), several collections have been donated to Dickinson College over the years that document the 1979 accident and the aftermath. The collections amount to more than 150 linear feet, so even with the folder-level inventories that we have for these materials, some useful resources remain hidden within those folders. One researcher, knowing of our extensive TMI holdings, asked if we had a report produced by Admiral Hyman Rickover. The name Rickover did not appear in any of the finding aids, nor did the full title of his report. With some hunting, we were able to locate the report in a folder among other related papers. A copy of the report was made for the researcher, and the reference transaction was recorded in the blog. A few months later, another researcher emailed us in search of the Rickover report. This second researcher had been led to our blog entry by Google. If not for the blog entry, this second researcher probably would not have contacted us for assistance and may not have succeeded in locating a copy of the report. (Incidentally, we have since filled three additional requests for that same report, two of those within another three months and one, coincidentally, just as I was writing this brief article.)
So, where do we go from here? Well, one small thing that we have done is to create a new blog as a way to quickly and easily share information about resources pertaining to the history of women at Dickinson College. The college will celebrate the 125th anniversary of going co-ed during the next academic year. In preparation, we hired three undergraduate interns and are enjoying the additional effort of a group of volunteers who are all researching the women’s experience. The blog provides a simple way for this group of researchers to post a little information about the resources they are uncovering, and in time we hope that alumnae will begin to use the blog to share their own stories and comment on the posts that are already there. This blog will thus serve as the focal point for next year’s celebration of coeducation and will provide some background from which we will develop exhibits, arrange lectures, and collaborate with colleagues throughout the college community.
A project that potentially looms even larger is a plan for an implementation of Drupal that will allow us to create a kind of interactive digital repository. (We have submitted an NEH Digital Humanities Start-Up Grant application to seek support for this initiative.) So, building from the Archives Reference Blog, what if we had a place online to post the original content that a researcher had requested? (For instance, what if that report of Hyman Rickover could have been posted online when the first request was received? Then subsequent researchers could have helped themselves and no archives mediation would have been required.) Sure, we can do that much now with CONTENTdm or any number of other digital asset management tools. But what I would like to do is post that letter, or report, or photograph, or whatever, in an easily accessible format, with minimal metadata at the time of upload (since metadata creation can be rather labor intensive and thus inefficient to do on an ad hoc basis), and with the possibility that later visitors can add value through comments, through transcription, through tagging, or through correcting earlier errors in the descriptive information. (If this sounds familiar to anyone, it’s essentially what Max Evans was suggesting in his article “Archives of the People, by the People, for the People” in the American Archivist, vol. 70, #2.)
What we plan to do is to digitize content on demand for the person that requests it, invite that person to comment on or otherwise enhance the description of the content, and then leave both the original content and that person’s comments out in the public eye, encouraging a dialogue. We also plan to digitize content that we select – material from our collections that we believe would be of value to researchers if they knew it existed – in the hopes that this content, too, will generate a dialogue. As more unique material enters our repository, more material will be available for scholars, as well as the general public, to examine and add value to through comments, criticisms, and discussions. What I am suggesting here is a model whereby we not only share our unique content in a totally open environment, to be used and repurposed by our users in various ways, but also facilitate active commentary and discourse on that content by professional scholars and amateurs alike. In the twenty-first century, archives should be viewed not merely as the repositories from which single individuals request materials for personal use; they should be thought of as open spaces where the ideas contained in and generated by those materials can be discussed. Web technologies allow us to explore this vision of archives and special collections repositories being facilitators of a dialogue instead of being mere purveyors of informational goods.
Hopefully the reviewers of this NEH proposal will agree that such a project would be valuable, and in a year’s time I will be able to announce the launch of a prototype Drupal module. And if any of you out there might also like to participate in such a project, or might have thoughts about how to proceed (or why not to), I would certainly welcome your comments and criticisms.