Mél Hogan is a digital archivist doing a two-year research fellowship in digital curation for her postdoc at CU Boulder. She contacted Gnip about having a class she teaches come in to learn about what we do, but we were equally interested in what a digital archivist does, so we asked if she’d interview for a Data Story. Last year we signed a partnership to provide all of the public Tweets to the Library of Congress, so we were curious about this emerging field.
1. What are the big concerns among digital archivists?
Generally I think digital archivists who focus on the web are concerned with the quantity of information continuously created and shared across the globe, tracking this proliferation, its speed, and the various networks through which data travels. The big concerns are around the interplay of these things; for traditional librarians and archivists this has meant a shift to a hybrid role, as custodians and as mediators. Offline: media format management, migration, and interoperability continue to pose serious problems to preservation, as it is traditionally understood in those institutions.
I think a major concern for digital archivists is also managing what is no longer there. Salvaging efforts by Jason Scott’s team of rogue archivists to rescue early web histories – such as the abandoned GeoCities and Friendster networks – highlight this point. They also raise critical issues around data ownership, access, and preservation. Rick Prelinger (Prelinger Archives) and Brewster Kahle (Internet Archive/Wayback Machine) have challenged these ideas through important archival interventions. As did Kenneth Goldsmith, with UbuWeb. And, as an example of a local effort, Lori Emerson’s Media Archaeology Lab (at the University of Colorado – Boulder) is host to a number of old computers that allow researchers to access games and electronic literature in their original contexts. This is also an important facet of digital preservation, down to the machines and the hardware. I like to think of these examples as “archivism” because there is an inherent politic to the archive, ignited by a kind of activism that is often aware, if not defiant, of the newness of ‘new’ media and the blind hyperconsumption it invites. These initiatives tend to be aligned with a kind of self-reflexive media archaeology, perhaps more than digital archiving.
Other types of concerns about the digital archive are more aptly addressed through artistic and scholarly interventions that both contextualise media historically, and revisit its underlying concepts and theories, such as authenticity and originality. In particular, work that addresses archives in exile, everyday people’s oral histories, database logics, queer temporalities, imaginary collections, representation and memory, performing the archive, etc., contributes greatly to our changing understanding of the archive, its power and potential, as well as its failures and limitations. Here, I’m thinking of Wendy Hui Kyong Chun, Ramesh Srinivasan, Tess Takahashi, Jane Anderson, Ann Cvetkovich, Sameer Farooq & Mirjam Linschooten, Anjali Arondekar, Amma Y. Ghartey-Tagoe Kootin, among many others invested in the reconceptualisation of the archive as model and practice.
All that being said, I don’t know that anyone officially qualifies as a “digital archivist” (yet) because the issues are so complex: the ideal digital archivist of today is either a team of researchers, creatives, industry, engineers, artists and programmers; or, each individual, empowered more than ever to document, share and (hopefully) preserve and pass on media collections, for (and from within) a range of communities who determine for themselves the pertinence of the archive.
2. Digital archiving is probably like choosing what to save in a fire. How do you prioritize?
The threat of loss has long inspired archival discourse (and action) – everything from the death drive (Derrida), amputation (McLuhan), accident (Virilio), to the secret (Arondekar) – but I do think the notion of loss gets reconfigured by the media we assume preserve memory in the first place. For this reason, I think digital archiving is quite unlike choosing what to save in a fire but the question of how to prioritize is a good one.
Many would argue that nothing needs to be prioritized because storage is fast, cheap, and easy. We can save tremendous amounts of data on very small devices. We have duplicates of some data in social media sites, on our personal hard drives, in our email, etc., which means that most of our digital belongings exist simultaneously in more than one space. For these reasons, it’s easier to imagine recovering digital assets than material objects, lost in a fire, for which only one exists and the damage is irreversible. Depending on the context, I think the question of how to prioritize is going to be part of a much larger conversation about what the priorities have traditionally been, and how the media itself strongly informs how we conceive of preservation. I’d like to think that we’ll move away from a scarcity model to an inhabited one, where archives are occupied spaces, where past and future interact in ways we can’t yet imagine, through the people whose stories are being recorded.
3. One interesting upcoming Interactive SXSW session is about the important stories that digital maps can tell about the lives we lead. What other aspects of our digital lives do you think tell compelling stories?
I agree, maps have become an important storytelling tool, especially when used in conjunction with oral histories and localized participatory initiatives. But often these projects rely on Google Maps, which offers another (often invisible) layer of storytelling, as these companies compile user-generated data for other ends. In this way I think our digital lives are always telling more than one story.
I think a similar thing is happening with social media. On the surface, we are witnessing the incredible power of social media for things like activist mobilizations, real-time reporting, e-campaigning, and so on, but below the surface the data is collected and analysed to track networks, key nodes, and important flows, and who knows for what purposes. We know this is happening because social media is ‘free’ to users but is worth millions as a package.
Because of this wariness, I have a particular interest in small data, local happenings, and process-oriented experiments that question the media they use, and even rely on heavily.
4. You’re doing a two-year research fellowship in digital curation. What exactly is digital curation?
I wouldn’t be able to give you a definition of digital curation that everyone would agree to, but in the context of this postdoc, and my own research, digital curation is understood as an archival process. It means paying attention to the lifecycle of data, from creation to access to storage to remix (the DCC has a good definition).
I’m taking this postdoc as an opportunity to think more carefully about the why (of archiving) and let the how emerge from what is uncovered in that process. With all these technologies at our disposal it is very tempting to jump to the how too quickly, without considering the impact, pertinence, audience, contributors, and strategies for any particular project. Curation is very much about making sense of a collection and finding solutions for display, dissemination, and preservation, that stem from the materials and communities themselves. I think where I most diverge from the field is that I don’t think there’s one approach to digital curation or that we should be looking to technological means or algorithms to make sense of the world we live in.
5. Tell us something about your profession that you wish everyone knew.
I wish everyone knew that they were empowered to make decisions about the technologies they use, even if they’ve come to feel absorbed by and dependent on them.
Past Data Stories:
- Liv Buli of Next Big Sound, the world’s first music data journalist
- Hilary Mason, Chief Data Scientist of bitly
- Rumi Chunara of HealthMap, using social data to identify epidemics
- Sherry Emery of UIC, studying social data and smoking cessation
- Lada Adamic of Michigan on information networks
- Annicka Campbell of SapientNitro on the Digital Love Project
- Gabriel Banos of ZauberLabs on predicting the election with social data
- Seth Grimes of the Sentiment Symposium on sentiment analysis