Walker Sampson


“Revealing our melting past: Rescuing historical snow and ice data”

For the last year I have served as Co-PI for a fantastic project, supported by CLIR’s Digitizing Hidden Special Collections and Archives grant program, which centers on the metadata gathering and digitization of the National Snow and Ice Data Center’s (NSIDC) expansive collection of glacier and polar exploration prints within the Roger G. Barry Archives here in Boulder. We have a stellar project archivist leading the work, and we expect to begin posting images on our own site over the course of the year. Stay tuned for that.

The linked article here, posted in the last (ever, actually) issue of GeoResJ, is a good summary of the project scope and value from everyone on the team, including our initial PI, now at the University of Denver. We’re really excited to be contributing along with NSIDC to glaciology and earth history through this collection, and are planning on further promotion as processing continues.

Revealing our melting past: Rescuing historical snow and ice data (ScienceDirect)


“Aggregating Temporal Forensic Data Across Archival Digital Media”

Last year I attended the Digital Heritage 2015 conference and presented a paper on digital forensics in the archive. The paper centers on collecting file timestamps across floppy disks into a single timeline to increase intellectual control over the material and to explore the utility of such a timeline for a researcher using the collection.

As I state in the paper, temporal forensic data likely constitutes the majority of forensic information acquired in archival settings, and in most cases this information is gathered inherently through the generation of a disk image. While we may expect further use of this data as disk images make their way to researchers as archival objects (and the community’s software, institutional policies and user expectations grow to support it), it is not too soon to explore how temporal forensic data can be used to support discovery and description, particularly in the case of collections with a significant number of digital media.
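As a rough illustration of the aggregation idea (a minimal sketch, not the code or tooling used in the paper), per-disk timestamp records can be flattened into one chronologically sorted list. The disk identifiers, file names and timestamps below are invented for the example, and the input layout is an assumption; in practice the records would come from whatever report your imaging tools produce.

```python
from datetime import datetime
from typing import NamedTuple


class FileEvent(NamedTuple):
    """One timestamp observed on one disk (field names are illustrative)."""
    when: datetime
    disk_id: str
    path: str
    kind: str  # e.g. "modified", "created", "accessed"


def build_timeline(disks):
    """Flatten {disk_id: [(path, kind, iso_timestamp), ...]} into one sorted list.

    Events sort by timestamp first, because `when` is the first tuple field.
    """
    events = []
    for disk_id, records in disks.items():
        for path, kind, stamp in records:
            events.append(FileEvent(datetime.fromisoformat(stamp), disk_id, path, kind))
    return sorted(events)


# Hypothetical sample data standing in for two imaged floppies.
sample_disks = {
    "disk_001": [
        ("LETTER.DOC", "modified", "1994-03-02T10:15:00"),
        ("NOTES.TXT", "modified", "1993-11-20T08:00:00"),
    ],
    "disk_002": [
        ("DRAFT.WPD", "modified", "1994-01-05T16:30:00"),
    ],
}

for event in build_timeline(sample_disks):
    print(event.when.isoformat(), event.disk_id, event.path, event.kind)
```

One caveat for a sketch like this: timestamps on FAT-formatted floppies are recorded in local time without a time zone, so the merged order is only as trustworthy as the clocks on the machines that wrote the disks.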

Many thanks to the organizers of Digital Heritage 2015 for the support and feedback; it was a wonderful and very wide-reaching conference.

Aggregating Temporal Forensic Data Across Archival Digital Media (IEEEXplore) (CU Scholar)

KryoFlux Webinar Up

In February, I took part in the first Advanced Topics webinar for the BitCurator Consortium, centered on using the KryoFlux in an archival workflow. My co-participants, Farrell at Duke University and Dorothy Waugh at Emory University, both contributed wonderful insights into the how and why of using the floppy disk controller for investigation, capture and processing. Many thanks to Cal Lee and Kam Woods for their contributions, and Sam Meister for his help in getting this all together.

If you are interested in using the KryoFlux (or do so already) I recommend checking the webinar out, if only to see how other folks are using the board and the software.

An addendum to the webinar for setting up in Linux

If you are trying to set up KryoFlux in a Linux installation (e.g. BitCurator), take a close look at the instructions found in the README.linux text file located in the top directory of the package downloaded from the KryoFlux site. It contains instructions on the dependencies needed and the process for allowing access to floppy devices through KryoFlux for a non-root user (such as bcadmin). This setup will avoid many permissions problems down the line, as you will not be forced to use the device as root, and I have found it critical to correctly setting up the software in Linux.
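Once you have followed README.linux, a quick sanity check along these lines can confirm that your non-root user really can reach the board. This is only a sketch under assumptions: it is not taken from the KryoFlux package, the helper name is my own, and the device node path is a placeholder you would replace with the one your system assigns (USB devices generally appear under /dev/bus/usb, and lsusb will show the bus and device numbers).

```python
import os


def can_use_device(node_path):
    """Return True if the current user can both read and write node_path."""
    return os.path.exists(node_path) and os.access(node_path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    # Placeholder path: substitute the bus/device numbers reported by lsusb.
    node = "/dev/bus/usb/001/004"
    if can_use_device(node):
        print("OK: this user can open", node)
    else:
        print("No access to", node, "- recheck the README.linux permission steps")
```

Run it as the ordinary user (bcadmin, say) rather than with sudo, since root will pass the check regardless of how the permissions were set up.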

On Doors in Games

It’s been a while since I’ve written about games here. This is a draft post from a few years ago; on rereading I think it’s worthwhile.

Doors in Penumbra: Overture

Some time ago, I started playing Penumbra: Overture, the debut first-person horror-adventure title from Frictional Games. I never finished the game or even played very far in, but I wanted to comment on a satisfying mechanic seen early in the game: manual manipulation of objects, and particularly of hinged, swinging doors.

Yes, I’m going to talk about doors in a video game, but I’m certainly not the first.

A little background on the game. Penumbra: Overture shares the perspective and control scheme of first-person shooters (WASD keys and a mouse), and although one combats enemies, this is not a first-person shooter. Your avatar, one Philip stuck in an abandoned mine, has panic attacks when he confronts enemies: breathing contracts, vision is marred, and Philip may stand and reveal himself in a nervous outbreak. Combat is similarly frantic and amateurish. Outside of doors, a player can manually slide, slam, stack, etc., various objects in the game by clicking and holding them, and then moving them onscreen with the mouse. This leads to some physics-based puzzles and object manipulation, and provides a strong sense of analog mechanics at work in the world.

Doors in First-Person Shooters

Let’s return to the operation of doors in the game. It makes design sense that so few (if any) first-person shooters contain hinged doors opened by degrees with a mouse. If the central activity is gunplay or melee, operating a door is a distraction.

Apart from gameplay, it is computationally much more expedient to implement a simple sliding or pocket door. This door will use the same animation every time, and if the game runs a genuine 3D engine, it may remove the need for on-the-fly number crunching. In a pseudo-3D system such as seen in the original Wolfenstein 3D, implementing a hinged door would be absurdly difficult and perhaps technically impossible. Thus the player is confronted with the same sliding door for the entirety of the game and its sequel, from the opening jailbreak setting to officers’ quarters, secret labs and bunkers. Whatever the fiction of the setting, the door never changes.

Going back to gameplay, however, a sliding or pocket door features only two states for the player: open or closed. Instead of the player encountering a panel on hinges which he or she must manipulate by degrees to pass through, the player encounters a flat, essentially 2D barrier (even in the case of more modern titles like Halo) that only needs a single button press (if that) to open. The player passes through such doors completely or not at all. Even when it is opening and visually between open and closed, it is functionally closed; when it is seen to be closing it is still absolutely closed and impassable. Many games that feature hinged doors use a preset animation, or use the door as a loading screen (Resident Evil). In all cases, doors are a binary gate.

And this is acceptable. It is of course the point of doors, to let something in or out, or keep it in or out. We didn’t make doors for them to be halfway open, although as it happens many are.

Doors of this type are found in numerous older first-person shooters: Blake Stone, Doom, Duke Nukem 3D, Rise of the Triad. Even a modern title like the aforementioned Halo, which could easily implement a hinged door, has no need to do so. Esoteric or exotic settings have their clear attractions as game worlds for players, but they arguably have equally, if not more, compelling attractions for programmers, developers and designers. Automatic, two-state doors are infinitely easier to implement than a board that swings, by degrees, on hinges, and it’s reasonable to expect to find such doors in science fiction or otherwise futuristic settings. By that time superior door-tech would have eliminated any instances of halfway open doors.

(This of course goes into the long tradition of technical constraints informing a game’s fiction. I have a growing mental catalog, from Mario’s design to the preponderance of bald space marines.)

Immersion and a Hinged Door

So Penumbra: Overture’s doors can be neither entirely open nor entirely closed. Why is that so interesting?

I’m sure some of my interest has to do with how such creaky, realistically old doors contribute to the atmosphere of the game. But it’s also jarring to see something in a middle state, a state that’s not precisely describable. I could say the door is open just a crack, or mostly open, or even halfway open, but these are not absolutes like open and closed. I can have any number of different views into the next room depending on these highly adjustable degrees, as I try to imagine what is still obscured by the door panel.

It is the mundanity of operating the door, and of noticing the space it occupies, which is so compelling. Taking the time to operate the door feels slightly unreal – I’ve never had to think about opening a door in a game, and I mean really think about moving the hinged board over. Each door is opened differently, depending on my angle to it and the position of my mouse. Having to concentrate on this, however briefly, immerses one into the game unexpectedly, and reifies the world suggested onscreen.

Considering again the automatic sliding and pocket doors of first-person shooters: their prime function is to demarcate rooms and divide the player’s challenges appropriately. They reinforce the gameness of the world in that way, in their ability to be completely in the way or completely out of the way. Penumbra: Overture’s developers do make use of their doors in gameplay. In some cases one needs to block the door from a pursuer. The tension here is that the door returns to its original binary function: it will either be open for the pursuer or closed, and one can watch as the door, by degrees, is forced open. Frequently, however, the doors are simply wonderfully described objects that exist in the game, outside of your immediate concerns.

This leads me to wonder what other games have used mundane details or provided interactions utterly unrelated to the central gameplay to enhance the realness of the world (and I do not count side quests and such as are found in RPGs: they almost all have very applicable benefits for the player)?

In any case, that lack of a definite state is refreshing, as it happens so little in games. When opponents are felled in a shooter, they are almost always absolutely felled: you do not often maim an opponent with an errant shot and have to deal with his or her suffering. Levels are completely finished, quests are definitely open or closed, achievements are either unlocked or locked, etc. How often can one badly hurt an opponent, and then move on to the next area? If you came back, would it still be there? Would you have to hit the opponent a few more times to finish the job? This is the sort of unsavory midway point games have done so well at discarding; one could argue games have to discard this aspect of our lives to be games at all. In any case, so many games belong in the province of fantasy that dealing in totalities makes thematic (along with the aforementioned technical) sense. Still, it is awfully nice to engage with a world where a few mechanics, at least, are not figured in such absolutes.

Repercussions of Amassed Data

I had the pleasure of meeting Mél Hogan while she was doing her postdoctoral work at CU Boulder. I think her research area is vital, though it’s difficult to summarize. But that won’t stop me, so here goes: investigating how one can “account for the ways in which the perceived immateriality and weightlessness of our data is in fact with immense humanistic, environmental, political, and ethical repercussions” (The Archive as Dumpster).

“Data flows and water woes: The Utah Data Center” is a good entry point for this line of inquiry. The article explores the above quoted concerns (humanistic, environmental, political, and ethical) at the NSA’s Utah Data Center, near Bluffdale. The facility has suffered outages and other operational setbacks since construction. These initial failures are themselves illuminating, but even assuming such disruptions are minimized in the future, the following excerpt clarifies a few of the material constraints of the effort:

Once restored, the expected yearly maintenance bill, including water, is to be $20 million (Berkes, 2013). According to The Salt Lake Tribune, Bluffdale struck a deal with the NSA, which remains in effect until 2021; the city sold water at rates below the state average in exchange for the promise of economic growth that the new waterlines paid for by the NSA would purportedly bring to the area (Carlisle, 2014; McMillan, 2014). The volume of water required to propel the surveillance machine also invariably points to the center’s infrastructural precarity. Not only is this kind of water consumption unsustainable, but the NSA’s dependence on it renders its facilities vulnerable at a juncture at which the digital, ephemeral, and cloud-like qualities are literally brought back down to earth. Because the Utah Data Center plans to draw on water provided by the Jordan Valley River Conservancy District, activists hope that a state law can be passed banning this partnership (Wolverton, 2014), thus disabling the center’s activities.

As hinted at in a previous post on Lanier, I often encounter a sort of breathlessness in descriptions of cloud-based reserves of data and computational prowess. Reflecting on the material conditions of these operations, as well as their inevitable failures and inefficiencies (e.g. the apparently beleaguered Twitter archive at the Library of Congress, though I would be more interested in learning about the constraints and stratagems of private operations), is a wise counterbalance that can help refocus discussions on the humanistic repercussions of such operations. And to be sure, I would not exclude archives from that scrutiny.

Report on American Psychological Association and CIA

NYT reports today:

The American Psychological Association secretly collaborated with the administration of President George W. Bush to bolster a legal and ethical justification for the torture of prisoners swept up in the post-Sept. 11 war on terror, according to a new report by a group of dissident health professionals and human rights activists.

NYT has helpfully provided the referenced report on their site.

The Archives at CU Boulder has been collecting information on the APA Psychological Ethics and National Security (PENS) debate since 2010. See the call for materials, as well as the report NYT has written up today, at the collection site.

Who Owns the Future?

Excerpts from Who Owns the Future?, by Jaron Lanier.

Lanier defines “Siren Servers” as

an elite computer, or coordinated collection of computers, on a network. It is characterized by narcissism, hyperamplified risk aversion, and extreme information asymmetry. It is the winner of an all-or-nothing contest, and it inflicts smaller all-or-nothing contests on those who interact with it.

Hm, I think I can count a few companies running such servers. On the formation of these servers:

Every attempt to create a pure bottom-up, emergent network to coordinate human affairs also facilitates some new hub that inevitably becomes a center of power, even if that was not the intent…. These days, if everything is open, anonymous, and copyable, then a search/analysis company with a bigger computer than normal people have access to will come along to measure and model everything that takes place, and then sell the resulting ability to influence events to third parties. The whole supposedly open system will contort itself to that Siren Server, creating a new form of centralized power. Mere openness doesn’t work.


In what sense is becoming dependent on private spy agencies crossed with ad agencies, which are licensed by us to spy on all of us all the time in order to accumulate billions of dollars by manipulating what’s put in front of us over supposedly open and public networks, a way of defeating elites? And yet that is precisely what the “free” model has meant.

The start of his premise:

To restate the premise of this project, it’s ultimately better to have paid information in order to create a middle class.

I’ve excerpted some of the author’s more forceful passages, but I found Lanier’s take on the future of an information economy — and his alternative model to it — very smart, and very humane.

Disk Imaging Workflow at BitCurator.net

Early in January I attended the first-ever BitCurator Users Forum in Chapel Hill. This was a fantastic day with a group of folks interested in the BitCurator project and digital forensics in an archive setting — definitely one of the most information-packed and directly applicable conferences or forums I’ve attended. I’m very much looking forward to next year’s.

I have a post on the BitCurator site on the disk imaging workflow I’m using with students presently, and there’s a great wrap-up of the day as well.

“Preserving the Voices of Revolution”

I have a paper out this month in the American Archivist with my friend and former UT Austin colleague Tim Arnold. The paper centers on best practices for collecting and preserving a collection of tweets, and looks specifically at a collection culled during the protests in Tahrir Square in early 2011. We dig into the difficulties of scoping search terms and users (in the context of the Egyptian Revolution of 2011 and more generally), the constraints of the Twitter API, and how to contextualize the harvesting of thousands of tweets through that API.

Many thanks to the original researchers for collecting the data and to the American Archivist for their interest in the paper.

Preserving the Voices of Revolution: Examining the Creation and Preservation of a Subject-Centered Collection of Tweets from the Eighteen Days in Egypt (SAA) (CU Scholar)

Checksumming till the cows come home

Jon Ippolito, from an interview with Trevor Owens at The Signal:

Two files with different passages of 1s and 0s automatically have different checksums but may still offer the same experience; for example, two copies of a digitized film may differ by a few frames but look identical to the human eye. The point of digitizing a Stanley Kubrick film isn’t to create a new mathematical artifact with its own unchanging properties, but to capture for future generations the experience us old timers had of watching his cinematic genius in celluloid. As a custodian of culture, my job isn’t to ensure my DVD of A Clockwork Orange is faithful to some technician’s choices when digitizing the film; it’s to ensure it’s faithful to Kubrick’s choices as a filmmaker.


As in nearly all storage-based solutions, fixity does little to help capture context.  We can run checksums on the Riverside “King Lear” till the cows come home, and it still won’t tell us that boys played women’s parts, or that Elizabethan actors spoke with rounded vowels that sound more like a contemporary American accent than the King’s English, or how each generation of performers has drawn on the previous for inspiration. Even on a manuscript level, a checksum will only validate one of many variations of a text that was in reality constantly mutating and evolving.

In my own preoccupation with disk imaging, generating checksums and storing them on servers, I forget that at best this is the very beginning of preservation, not an incontestable “ground truth” of the artifact.
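To make the point about fixity concrete, here is a small sketch of my own (not from the interview) using Python’s standard hashlib module. The byte strings are invented stand-ins for two copies of a digitized work that differ by a single byte: the checksum reports that the bits differ, and nothing more.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Two invented "copies" that differ only in their final byte.
copy_a = b"frame-data " * 1000
copy_b = copy_a[:-1] + b"!"

print(sha256_hex(copy_a) == sha256_hex(copy_b))         # False: the bits differ
print(sha256_hex(copy_a) == sha256_hex(bytes(copy_a)))  # True: identical bits
```

Whether those two copies would look the same to a viewer, or what either one meant to its audience, is exactly what the digest cannot say.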