Robin Camille Davis

The Final Death(s) of Digital Scholarship—An Ongoing Case Study of DH2005 Projects

Talk presented March 1, 2019, at the Digital Afterlives Symposium at Bard Graduate Center, New York, NY.

My talk is titled “The Final Death(s) of Digital Scholarship—An Ongoing Case Study of DH2005 Projects.” What do I mean by digital scholarship? I mean a scholarly endeavor that uses digital tools or produces a digital product. In this presentation, I’ll be focusing on web projects that were created in a scholarly environment. For example:

These examples and the case studies I’ll talk about today are all web-based projects. They’re websites with some kind of interactive feature. The main thrust of my talk is applicable to most digital scholarship projects.

The two deaths of digital scholarship

You may be thinking, “The Final Death(s) of Digital Scholarship,” this is a pretty melodramatic title. Or you may be wondering, that sounds kind of familiar, where did she get that from?

I began thinking about digital scholarship in the middle of watching a very emotional scene in the Disney movie Coco. The character Chicharrón has died once in our world, the world of the living. But after a long time in the spirit world, he finally dissolves into nothingness.

Our protagonist, Miguel, asks, “Wait, what happened?”

His friend Héctor replies, “He’s been forgotten. When there’s no one left in the living world who remembers you, you disappear from this world. We call it the Final Death.”

I will admit I’m on the verge of tears just thinking about Coco, so I will hurry myself into my transition: when Héctor talks about the final death, it reminds me so much of the lifespan of a digital project on the web. A project (like an image database or an oral history collection), after a period of activity and creation, usually gets wrapped up after the grant is finished or the creator moves on. The project may have “died” in the sense that it’s no longer active. But it’s still there, just static. It enjoys an afterlife of usefulness and citation, or maybe just the occasional visitor.

But it’s almost inevitable that this afterlife ends. The project is deleted when its creator forgets to renew a domain name or moves on to another university and declines to transfer the web project. It dies a second and final death — it’s disappeared, gone forever.

Three stages: work in progress, afterlife, disappeared.

Digital scholarly projects will disappear

The disappearance of scholarly work on the web means that there are holes opening up in the scholarly record. If you’re a scholar who works in the digital realm, the sources you use and reuse and cite will disappear. What does that mean for your claims? What does that mean for the reproducibility of your results? If you produce digital projects yourself, what does that mean for the record of your own research?

Digital work is especially prone to this final death. The digital decays fast and it decays completely.

Or as Neal Beagrie puts it, “In the right conditions papyrus or paper can survive by accident or through benign neglect for centuries, or in the case of the Dead Sea Scrolls, for thousands of years … In contrast, digital information will not survive and remain accessible by accident: it requires ongoing active management from as early in the life-cycle as possible.” Beagrie, Neal. “Digital Curation for Science, Digital Libraries, and Individuals.” International Journal of Digital Curation 1.1 (2006): 3-16.

As someone whose work is almost all web-based, I care a lot about preserving digital scholarship. The web hasn’t been around long enough for us to tell exactly where we’re headed and how this new technology (new in the grand scheme of things) will pan out for us, but we have some early indications.

We know that a lot of the early web is lost to us. As Megan Sapnar Ankerson puts it: “It is far easier to find an example of a film from 1924 than a website from 1994.” [Ankerson, Megan Sapnar. “Writing Web Histories with an Eye on the Analog Past.” New Media & Society (2011) 14.3: 384-400.]

It’s also easier to find a 100-year-old academic paper, or a 300-year-old book, thanks to libraries, than a 25-year-old website. Now, I’m a librarian, and I’ll be the first to tell you that not everything must be saved. Weeding our collections is the only way to keep a library useful and spacious. And digital preservation is an extremely costly and labor-intensive endeavor.

My argument is that digital, web-based scholarship is part of the scholarly record, so it should be preserved. But it can only be preserved with human care and labor. By changing some scholarly practices, we can bless our digital projects with long and fruitful afterlives.

What’s the “death rate” of digital scholarly projects?

Those of us who do digital work might look for indications of how best to accomplish this. How long have past scholarly projects lasted? What can their longevity tell us about the fate of our own work?

Starting in 2015, I examined academic web projects that were 10 years old. I looked at a specific set of these projects, those that were presented at the Digital Humanities conference in 2005, held at the University of Victoria. The field of digital humanities, aka humanities computing, was well-established by then. (Scene-setting: this is back when social media was just starting out, and Facebook didn’t have a news feed, and Gmail was invite-only.)

I’m looking only at projects presented at DH 2005 that had a web component, something that was publicly available on the web and had some kind of interactive feature. There were 48 of these projects. There may have been more such projects at DH 2005 than the ones I’ve included in my audit, but I’ve only included projects that I could definitely tell had a web component as indicated in the abstract.

Online in 2015? Of the 48 DH2005 projects that had a web component, 14% were no longer online, 17% were partially online, and 69% were fully online

By 2015, 7 of these web projects (14%) were no longer online at all.

Online in 2019? Of the 48 DH2005 projects that had a web component, 21% were no longer online, 15% were partially online, and 64% were fully online

By 2019, that has increased to 10 (21%). Not a shocking leap there, but indicative of the larger trend.

That number will never go down. It will only increase over time. Some of these projects were fully available 4 years ago and are now just partially available, as a result of digital decay. Some have disappeared. What happened to the projects that had this status change? Here are two case studies that trace unexpected afterlives of DH projects.

Case study: Forced Migration Online

Let’s take an in-depth look at one of the projects that disappeared, Forced Migration Online, which was based out of Oxford University. It was a resource repository and information hub about human displacement: refugees, IDPs, and people displaced by disasters. It launched in 2002. In 2005, one of the project team members, Dr. Deegan, presented the essentials of this project at the DH Conference. When I first looked at their site (forcedmigration.org) in 2015, it was still fully online, but today you get a page not found error. There’s no obvious place that it moved to that I could find with my best searching skills. What happened?

Screenshot of ForcedMigration.org in 2014. Page highlights the Journal of Refugee Studies

Using the Internet Archive’s Wayback Machine, I saw that in 2014, this message was posted. The project team said, “We recognize that Forced Migration Online is a valuable resource … and we are looking for funding opportunities to enable us to continue its development.”

Screenshot of ForcedMigration.org in 2018. Page has a large alert: Please note: the content on this website is no longer being updated.

But by 2018, the site’s alert just said that it was no longer being updated, so we can guess that a funding source was not found.

The domain name is still owned by Oxford and expires in 2020. I would guess that without funding, the site couldn’t be maintained, so there was a decision to simply take it offline as late in the game as possible, even if the domain name was still owned for longer.

Interestingly, after more digging, I stumbled across a subdomain, repository.forcedmigration.org, which is still online as of spring 2019. The page styling is almost all gone, since everything under the main forcedmigration.org domain was deleted from the web. But the repository still works!

Forced Migration Online's Digital Library, listing titles like 'Immigration to Italy, data and policies'

You can search and download any of their 5,000 resources, which date from the 1950s through 2011. I checked some of the older resources, and FMO appears to be the only place you can find copies of some of them online. In my dataset, I cataloged this as a “partially available” resource, since the rest of the FMO site is down.

One item in the Digital Library, presented with info about authors, date, subject, publisher, etc.

Look at this metadata! The project team put a lot of work into compiling and adding to this information, and making the fair-use case to make these thousands of documents available to the public. I hope that this repository stays online, especially as we are still in the midst of a global refugee crisis.

So here’s the breakdown of this project’s afterlife:

In total, the Forced Migration Online project was fully available on the open web for 16 or 17 years. After 4 years of funding issues, it was taken offline, except for the repository, which is a valuable collection of documents that remains online… For now.

Case study: Clotel Electronic Edition, Documents Compass

Let’s take a look at another project, the Clotel Electronic Edition, which has an unexpected afterlife.

Clotel landing page

Clotel is said to be the first African-American novel. There were four versions with different endings published. The Electronic Edition project made these available online, with lots of other critical resources as well. In 2015, I cataloged this project as still available online. The full scholarly edition itself was paywalled, but the project had a thorough website with many resources. The project was part of Documents Compass, a grant-funded project out of the Virginia Foundation for the Humanities.

Documents Compass landing page, with subtitle, serving the documentary editor

This is what it looked like as late as 2014. The text says that they “provide non-profit assistance to those who are engaged in or planning documentary editing projects…”

This year, the URL for the Clotel scholarly edition brought me to a “Page not found” landing page on the same domain, documentscompass.org. I took a look at the rest of the site, thinking maybe it was moved without a redirect.

Documents Compass, newer-looking website

On the site’s homepage, much of the wording is the same — “We provide non-profit assistance to those who are engaged in or planning documentary editing projects,” they mention XML, data tagging, and other things you might expect. But something felt “off” to me. The stock images looked too anodyne.

Documents Compass newer website with images of people typing

Scrolling down on the page, the site mentions that Documents Compass is a program of the Virginia Foundation for the Humanities, same as the previous site. But what was bugging me, then?

Blog posts with the plural of videos spelled with an apostrophe, in misspelled as is, and other typos

I scrolled down some more and looked at their blog posts, which were riddled with typos! Lots of spare apostrophes, random capitalization, verb tense errors. And suddenly documentary editing was equated with filming documentaries?

That’s when I realized that this was a fraudulent website. It was not operated by the Documents Compass team at all. They used the same wording as the old site, but it was in no way the same project. Honestly, I’m baffled by this — there’s a fake phone number and fake mailing address listed, along with a web form and a working email address. I did email them but haven’t heard back. I’m not sure what their goal is — to defraud people looking to create scholarly editions who were following a tip about the old Documents Compass project? It’s unclear! Other academic projects have been hijacked in sort of the same way, by fake publishers looking to squeeze publication money out of unsuspecting authors, but this isn’t quite the same thing, as far as I can tell.

Registrar: DropCatch.com

The WHOIS lookup says that a site called dropcatch.com registered the URL, documentscompass.org. This is a company that squats on domain names that have expired. The domain name documentscompass.org is bland enough that it can be useful for someone, so it was worth squatting on.

Screenshot with the small title, Buy this domain

Using the Wayback Machine again, and going back in time, I saw that there was an empty domain squatting page in 2018…

Blank screen with the title, The site has been archived or suspended

A page that said that the site has been “archived or suspended” in 2017…

Blog post titled, Documents Compass, 2008-2016

And finally, as late as March 2017, there was a news post on the Documents Compass website detailing the end of the project. The Mellon grant lasted from 2008 to 2016. This very nice post is a project history, and is, to a researcher like me, a thorough documentation of how a digital humanities project comes to a close. After funding ended, the site was online and static for about a year before it disappeared and was then, confusingly, hijacked.

Authentication Required popup for UVA Press

I should note that the Clotel scholarly edition website is, presumably, still online, but it’s behind a UVA login. Previously, the project site was publicly available, but it’s not anymore. Thus this is a site that has an online status change in my dataset.

Brief summary of this digital scholarly project:

To summarize, the Clotel electronic edition website was available for 12 years on the open web. It is still available at the University of Virginia.

Abandoned, finished, and ongoing digital scholarship

Let’s get back to looking at the subset of DH 2005 projects as a whole again. There’s a good amount of research about what’s termed “abandoned” digital scholarship.

Other researchers have examined abandonment specifically within the field of DH. Ten years ago, Bethany Nowviskie and Dot Porter designed a survey project called “Graceful Degradation: Managing Digital Projects in Times of Transition and Decline.” The survey was sent out in 2009, with over 100 responses. It asked about digital projects in or related to the humanities, and the authors analyzed the findings to see how digital projects fared when facing difficult times (like funding troubles) and periods of transition (like colleagues leaving the institution).

Their findings:

The survey results indicated that project scopes tended to change, and that reliable funding was sometimes an issue.

Another, more recent addition to the scholarship around abandoned DH projects comes from Luis Meneses and Richard Furuta’s recent article in Digital Scholarship in the Humanities. Meneses, Luis, and Richard Furuta. “Shelf Life: Identifying the Abandonment of Online Digital Humanities Projects.” Digital Scholarship in the Humanities, web only, 2019.

They looked at every DH Conference abstract from 2006 to 2016, extracted the URLs, and ran a test to see how many returned a 200 response code (which is to say, a site was live on the other end) versus other responses like a redirect or a 404 error. They called this URL decay, and based on their findings, they ascertained that a DH web project has a shelf life of about 5 years.
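A status-code check of this general kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Meneses and Furuta's actual code: the names `fetch_status`, `classify_status`, and `decay_rate`, and the category labels, are made up for this example, and counting only a 200 as “live” simply follows the description above.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Hypothetical helper: keep urllib from silently following redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(url, timeout=10):
    """Return the raw HTTP status code for url, or None if nothing answered."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 3xx, 4xx, and 5xx responses arrive here
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout


def classify_status(code):
    """Map a status code (or None) to a coarse, assumed category label."""
    if code is None:
        return "unreachable"
    if code == 200:
        return "live"
    if 300 <= code < 400:
        return "redirect"
    if code == 404:
        return "not found"
    return "other error"


def decay_rate(codes):
    """Share of checked URLs that did not come back as a plain 200."""
    if not codes:
        return 0.0
    dead = sum(1 for code in codes if classify_status(code) != "live")
    return dead / len(codes)
```

Calling `fetch_status` on each extracted URL and passing the results to `decay_rate` would give a rough decay figure for a set of abstracts. A 200 only says that something answered at that address, not that the original project is what answered.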

Similar research on URL decay has been done in other fields, too, including law and library science, to identify the “half-life” of a URL. The prognosis isn’t great for URL stability. As we’ve seen in the previous two examples of Forced Migration Online and Documents Compass, though, documenting URL changes is a useful but limited way of finding the true status of the project. I focused on just one year’s worth of DH projects so I could examine each project individually to identify its status.

Spreadsheet of DH 2005 session names, presenters, project names, abstract URLs, project URLs given and actual, change in 2019 since 2015

This is what my data looks like, by the way. I’m logging project info, URLs as they were given, URLs as they were in 2015, URLs as they are in 2019, and —

Spreadsheet continued, including information about project status, accessibility, notes

Any changes about the project’s status, along with those interesting notes about what’s still available and what isn’t.
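For anyone who wants to keep a similar audit in code rather than a spreadsheet, one row of this data could be modeled roughly as follows. This is a sketch, assuming field names that paraphrase the column headings described above; the class name `ProjectRecord`, the helper `tally`, and the example status labels in the comments are hypothetical, not the actual spreadsheet schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ProjectRecord:
    """One audited project; fields paraphrase the spreadsheet columns."""

    session_name: str
    presenters: str
    project_name: str
    abstract_url: str
    url_given: str      # project URL as printed in the abstract
    url_2015: str       # URL where the project was found in 2015 ("" if gone)
    url_2019: str       # URL where the project was found in 2019 ("" if gone)
    status: str         # assumed labels, e.g. "ongoing", "finished", "abandoned"
    availability: str   # assumed labels, e.g. "fully online", "offline"
    notes: str = ""

    def url_changed(self):
        """True if the project's address differs between the two audits."""
        return self.url_2015 != self.url_2019


def tally(records, field):
    """Count how many records share each value of the named field."""
    return Counter(getattr(record, field) for record in records)
```

With records loaded, `tally(records, "availability")` gives the counts behind a pie chart, and filtering on `url_changed()` picks out the projects whose address moved between audits.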

Pie chart of DH 2005 projects with web component, URL change 2015 to 2019

Of the 40 projects that were still available online in 2015, here’s how their availability changed in the past 4 years:

What was intriguing to me was that last point: a quarter of the projects in this subset had a new URL. For the four projects that did not have a redirect, I found the new URLs through a web search or by combing the project team’s CVs.

Thinking about abandonment in particular, I also cataloged the projects’ status:

Pie chart of project status for DH 2005 projects with web component

As of 2019:

Here’s an example of a project that I am assuming is abandoned:

Chunk of text with highlighted passages, noted below

I’ll leave the project unnamed, but here’s a snippet from its description page. It’s still available online in 2019, but the final paragraph of the description has been frozen in the future tense since 2005. “This rich image archive will enable us to…” It “will enable users to display the text…” There’s no rich image archive on the site, sadly. I dug a little deeper. The PI retired in 2005. I looked at the CV of one member of the project team, and they list working on this project through 2005 but no later. So with this information, I felt comfortable calling this project abandoned.

What does a clearly finished project look like?

Screenshot of the OCVE with the text, The Online Chopin Variorum Edition has been funded by the Andrew W. Mellon Foundation in three successive phases from 2003 to 2015...

The Online Chopin Variorum Edition was presented at DH 2005. Today, it has a very nice landing page that clearly describes the history of the project, which ended in 2015. At the bottom of the page, it notes who is maintaining the site: “This site is maintained under a Service Level Agreement by King’s Digital Lab.” Clearly, care has been taken to keep this project online, even past the end of its funding.

And what does an ongoing project look like?

Tapor 3: Discover research tools for studying texts

The TaPoR project was presented at DH 2005, and 14 years later, it’s going strong! It’s an ongoing database of tools for text analysis. The About page does note that the first two versions of the site are no longer available, but they can provide access on request.

What can we learn from this audit?

So, having taken a close look at 48 projects that had a web component from DH 2005, what are some of the conclusions that I have drawn?

Web-based scholarly projects are disappearing

Sometimes this happens quickly, sometimes slowly: it’s difficult to predict. An enormous amount of funding does not guarantee a long afterlife! Some of the projects that are no longer available were expensive, labor-intensive efforts — one of these had $150,000 in NEH and Mellon funding.

Preservation is unevenly distributed and is the result of many factors. This includes hosting issues: as we saw previously, in just 4 years, a quarter of the projects I looked at had changed their URLs. Hosting changes are likely to happen.

So are funding issues. Other research has identified funding fluctuations as a major factor in digital preservation, since keeping even a plain website online and functioning takes active and ongoing maintenance, which is to say, human care and labor.

For those projects that have interactive components, it might not be realistic to expect ongoing maintenance into the future. A complex project may rely on a content management system that needs ongoing security patches, PHP updates, server updates, and so on, and it’s likely that there will be an incompatibility problem eventually. Everything digital dies.

Additionally, project teams change: people retire, people die, people move on to other institutions. This is normal in academia, and it’s not always something you can plan for when you’re writing out your project roadmap.

Some projects have unexpected afterlives

In the course of this research, I found that some projects have unexpected afterlives. It’s not just that some projects survive and some die. Some, like the Documents Compass project, end up being hijacked or otherwise fraudulently presented. It’s hard to hold on to a domain name, especially in an institutional context, so digital projects do run this risk more than you’d think.

And due to digital decay, some projects end up being only partially available, like the Forced Migration Online repository that can still be used. Again, this is hard to predict. We have some indicators about what will break on a website, things like JavaScript libraries and CMSes that need updates, but even so, we can’t see the future.

Documentation is critical for future researchers

I used the Internet Archive’s Wayback Machine extensively for this project, and I can tell you that while their preservation is broad, it’s not always deep. A project’s homepage might be captured, but not the second layer, not the “meat,” so to speak. Over time, their capture software has improved, but if you’re a researcher tracking down an old digital project, you might be stymied by a site that’s entirely in Flash or that has a missing JavaScript file. (Side note, if you’re designing a digital project right now, you should know that if your site is accessible it’s likely to be preservable, too.)
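If you want to script this kind of lookup, the Internet Archive offers a public availability endpoint that returns the capture closest to a given date. The sketch below is an illustration, not the workflow used for this audit, which was done by hand in the browser. The function names are made up for this example, and the response shape (`archived_snapshots` containing a `closest` entry) is written as I understand it, so check it against the Internet Archive's own documentation.

```python
import json
import urllib.parse
import urllib.request

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(url, timestamp=None):
    """Build the lookup URL; timestamp is a string like '2014' or '20140301'."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(payload):
    """Pull (timestamp, archived URL) out of a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def lookup(url, timestamp=None, timeout=10):
    """Ask the endpoint for the capture nearest to timestamp (needs network)."""
    query = build_query(url, timestamp)
    with urllib.request.urlopen(query, timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

For example, `lookup("forcedmigration.org", "2014")` would ask for the capture nearest to 2014. A result only confirms that one page was captured; it says nothing about whether the deeper pages behind it were captured too.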

In the course of my research, I found that documentation by project teams is critical for understanding the context and impact of digital projects in the scholarly record. We saw several sites that had project page updates. The best of these had a clearly written project history that detailed funding sources, team members, and important dates and milestones.

For completed or abandoned projects, including the “end of life” notification — or as I call it, the “goodbye cruel world” note — helps future researchers understand what happened to your project and why it’s inactive or gone. This is important, even if your site eventually goes offline, because a future researcher might check the Internet Archive for information and find this note.

*(Note, July 2019: the Digital Documentation Process project’s Archiving Dossier Narrative provides a wonderful standardized template for this “end of life” note. Hat-tip to Micah Vandegrift for passing this on to me.)

Traditional academic outputs, like articles and white papers, were also critical for understanding the role and features of digital projects. For interactive projects, screenshots and videos were key to understanding how the project functioned, even after it broke or disappeared.

Additionally, I was surprised at how much I could rely on people’s CVs to understand the afterlife of a project. Sometimes they told me that funding ran out, or that team members moved on. Often they gave me a glimpse into how digital research can morph into several different, successive projects — a different kind of afterlife.

Cultivate a preservation mindset

Finally, I would not be a responsible librarian if I did not mention the importance of having a preservation mindset. Preservation work begins at the moment of creation. Having a preservation mindset means that your project roadmap includes a section about archiving and its afterlife, who could host it and how it will be funded. It also means making choices with long-term, low-risk thinking. Should your project be based on an older but widely-used CMS, or a shiny new one with limited support? A preservation mindset would indicate that the less-exciting but more stable system is the better choice.

The responsibility of preservation belongs wholly to the project team. If you know that your digital project has a short projected lifespan, that should change how you design your project, how you share it, and how you preserve some part of it for yourself. Libraries, archives, and other repositories of our scholarly heritage are addressing digital preservation problems. But scholars who do digital work must be invested in preserving their work, too — or we run the risk of losing a whole domain of scholarship. It’s up to us to give our digital projects a long afterlife and stave off that final death for as long as possible.

This work builds off of a talk I gave at MLA 2015.

Thank you to Jesse Merandy and Bard Graduate Center for organizing this symposium.