Preserving today's digital universe for the future: an important call

In the years to come, how can universities, libraries, and museums best preserve the digital present for the future?

This has been a vital question for decades, but it has taken on a new dimension of late as automation and algorithms have started altering the landscape. How do we preserve algorithms and the systems they structure? Cliff Lynch, leader of the Coalition for Networked Information (CNI) and one of the most brilliant thinkers on this planet about digital matters, period, offers an important analysis and path forward.

There are so many good pieces in the long article, like this fun thought experiment:

Imagine that Facebook CEO Mark Zuckerberg suddenly recognizes and totally embraces the idea that stewardship of some version of at least the public portion of Facebook (whatever that may be) is profoundly important, and that the company needs to support and enable it. Facebook then offers a comprehensive record to one or more stewardship institutions — perhaps the Library of Congress, Harvard University, … — and perhaps even some funding (one can always dream). How many petabytes, and how many square miles of data centers are necessary to support and provide any form of meaningful access to this data?

Or the idea of “robotic witnesses”, or trying to remix population segmentation.

And this powerful, condensed summary of what algo-preservation could be:

Actually documenting the “Age of Algorithms” ideally involves capturing and preserving the answers to two questions. The first is to record, given each specific set of inputs (which may include identity, history and context) the actual outputs of the algorithm at a given point in time. The second, and even more difficult but much more comprehensive question is to be able to capture the answer to the subjunctive form of the first question: given a hypothetical set of inputs, what would the algorithm’s outputs have been at a given point in time? [emphases in original]

Not to mention this very challenging assessment: “this new world is strange and inhospitable to most traditional archival practice.”

Cliff also offers a very rich summary and analysis of the AI/big data world. The article is recommended for this alone.

