Most of us think nothing of putting our lives in the cloud: photos on Flickr, videos on YouTube, most everything on Facebook. But what about when those services abruptly go away, taking all of our collective contributions with them? Well, Jason Scott operates on the assumption that everything online will one day disappear. He explains to Bob why he and the Archive Team are dedicated to saving user-generated content for posterity.
BROOKE GLADSTONE: This is On the Media. I’m Brooke Gladstone.
BOB GARFIELD: And I’m Bob Garfield. Once we put photos in a scrapbook. Today we put them on Flickr. Once we chronicled our days in a diary. Now we update our Facebook page. Once we kept Super 8 movies of our kids. These days we post videos on YouTube. Once upon a time, we also put things on GeoCities and Friendster and Google Video. But now – they’re long gone.
Well, Jason Scott operates on the premise that every repository of user-generated content online will one day die but that the content we put there is worth saving. He leads an ad hoc group of archivists called the Archive Team, who swoop in to salvage material when a site is closing. Still, he wishes the users would render his service obsolete. And so, he urges everyone who begins to post to prepare for the end.
JASON SCOTT: Anytime you want to join up with anything, any kind of service that lets you do things for free, the first question is, where is your export function, where can I grab a copy from your site of the material? If they say, we’re working on it, then they’re lying to you. It should be as easy for them to do that as anything else. So if they do have an export function, use it. People put their lives online and then one day wake up and realize it’s not there anymore. They are keeping their memories on spinning magnetic pieces of metal.
BOB GARFIELD: That somebody else owns.
JASON SCOTT: Yes.
BOB GARFIELD: Set the scene for me. You get the notice of some service that is on its way out, what do you do?
JASON SCOTT: It’s helpful to understand that there’s a whole bunch of services out there, where you might have millions of accounts – things like GeoCities, Friendster, you know, even places like Foursquare and Flickr, where people have been encouraged to, for free, upload things they made or are doing, and then at some point someone moves a check mark from column A to column B, and they decide, eh, after this next financial quarter I think we’ll be taking this down. And the amount of time they give you is – basically random.
I’ve seen everything from six months to 48 hours. And all these people who may not have even thought about this site for – years suddenly are having it taken away. They might not be alive, they may not know how to get to their old account. They may not be checking that email.
And so, what we did was come up with this idea of the Archive Team, a collection of archivists, developers, and we would do our best to take one snapshot of the place, put it into an archive and give people the option of getting some of their data back.
BOB GARFIELD: Give me some examples. What sites have you rushed in to salvage what is stored there?
JASON SCOTT: There were a couple of sites that did podcasts – Podango, MyPodcast. And what would happen is, is they would literally give you four or five days to get off – thousands of shows, thousands of episodes. So we go in and we’ve pulled down hundreds and hundreds of shows and thousands of episodes.
Poetry.com, that was a company where people were basically making their poems available, and it had about 14 million written poems. And the company basically announced, we’re shutting down, we’re going to give you about a month, hope you enjoyed your time [LAUGHS] with your poetry. So we went in and we started downloading it, and what we discovered, to our great surprise, was they started blocking us from downloading the poetry.
BOB GARFIELD: What was the relationship between you and the authors at the time? Did they express frustration that they couldn’t get at their stuff?
JASON SCOTT: One of the things that always breaks our heart is that one of these companies will announce they’re shutting down, and they’ll put it into a blog post – “Goodbye, it’s been great,” and then all the comments will be, “Please help me, how do I save this? I can’t find my husband’s password, he died two years ago.” You know, we get compared to firemen. You’d go in and you try to grab what you can.
So we grabbed the most popular poems, based on their viewer counts, and then we tried to sequentially go through and get as many poems as we could.
BOB GARFIELD: Well, there’s a little vigilantism you’re describing here. Tell me about the legality?
JASON SCOTT: Oh man, you know, the thing is we all know that this country is a little psychotic about copyright, right? I mean, just a little bit. We’re not selling what we’re putting here. We’re not putting ads on it and putting it back up again. We’re definitely not giving it to other businesses and selling it to them, you know?
Some of these things have no commercial value whatsoever, some of them might have commercial value, but the fact is, is that we are literally being that guy that hopefully in 20 years, 50 years someone goes, “Oh, thank goodness they were here at that point.”
Sites that block us are extremely rare because we find these companies have actually given up not just watching them but even caring about them.
BOB GARFIELD: Now, much of what anyone posts is trivial, and if it gets lost, who cares. Is most of what you bring back just kind of, I don’t know, junk?
JASON SCOTT: You know, the example that I give is a Civil War letter to a wife from her husband who was on the front lines. It might be the most trivial thing just saying, hope the cows are okay, hope you’re fine, but there’s so much other information coded in there.
There could be a watermark showing that a company that said it never worked for that side did, in fact, sell paper to that side. It could be a certain kind of ink. It could be that that one front guy became a general, and this is one of the few cases of him signing his own name.
I know it’s a stretch but there are people right now taking some of the things we download and doing cultural analysis: “This is what happened when life went online, this is what happened when people reached a larger audience than their genetic line had ever reached. What did they do, given that power?”
And so, even though we might objectively say this is trivial, I wouldn’t want these read out to me one by one forever, everything historical that we see is because a whole line of people said, “Let’s not throw out that box, let’s not delete that tape, let’s not get rid of those pictures.” And I don’t want to be the guy who decided, okay, this is good, this is bad and then a hundred years later be hated.