The possibility of losing our personal digital archive looms over all of us. Dr. Nick Goldman, of the European Bioinformatics Institute, has co-developed a solution: storing computer files as physical DNA, whose code can be read and converted back into a file's original information. Brooke speaks with Dr. Goldman about using DNA in the digital world, and how this storage method could last tens of thousands of years.
************ THIS IS A RUSHED, UNEDITED TRANSCRIPT *************
BROOKE: And I’m Brooke Gladstone. If by now you’ve become a little anxious about the vulnerability of your own digital histories…just be glad you’re not working at the European Bioinformatics Institute in England. The institute offers a free storage service to genomic researchers around the world. They have a 60 petabyte storage capacity, roughly equivalent to 90 million CDs of data. Between equipment maintenance and the explosive growth of genomic data to be stored… the service was getting too expensive to continue. So Dr. Nick Goldman, a mathematician and genome scientist at the Institute, and his colleague Ewan Birney came up with a solution. He tells it like this:
GOLDMAN: Yep, two scientists walk into a pub and say: isn’t DNA - which is the root of all our problems - itself a digital storage medium for information? And we said, well, what's stopping us from taking anything digital and putting it into a form where we can represent it as though it was a bit of DNA, then actually make that DNA, read it back just the way we read the genomes of living organisms now, and put the raw information back together again?
BROOKE: That sounds like one uproarious pub night. [laughs]
GOLDMAN: Amongst scientists like us it was pretty good. [both laugh]
BROOKE: So this is organic storage, right? How do you do it?
GOLDMAN: Well, the information it's storing is held by a code that knows how to interpret the sequence of bases. The bases are conveniently represented by the letters A, C, G and T, so one very simple code using A, C, G and T would allow us to write "Cat" and not very much else. But as long as we make up some more complicated rules, we can pretty much store any information. Just like if we only use 0s and 1s, it lets us just store some very simple numbers, but if we make a complicated code, it can store music, and we call that complicated code MP3. And what Ewan and myself invented was a code that can take any digital file on a computer and represent it as the letters A, C, G and T. Then we get that made as actual physical DNA.
BROOKE: We're not talking about live DNA, right? I mean this isn't a vial of blood containing the Magna Carta: this is more like dust.
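A minimal sketch of the idea in its simplest form, assuming a fixed mapping of two bits per base (00 = A, 01 = C, 10 = G, 11 = T). This is an illustrative simplification and not the code Goldman and Birney devised; the mapping and the function names `encode` and `decode` are hypothetical.

```python
# Hypothetical mapping for illustration: each pair of bits becomes one base.
BASES = "ACGT"  # index 0 -> A, 1 -> C, 2 -> G, 3 -> T


def encode(data: bytes) -> str:
    """Represent bytes as a string of A, C, G and T (four bases per byte)."""
    letters = []
    for byte in data:
        # Walk the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            letters.append(BASES[(byte >> shift) & 0b11])
    return "".join(letters)


def decode(dna: str) -> bytes:
    """Reverse encode(): turn every four bases back into one byte."""
    if len(dna) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            # BASES.index raises ValueError for any letter other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    sequence = encode(b"Cat")
    print(sequence)
    print(decode(sequence))
```

Because any file is ultimately a sequence of bytes, the same round trip would apply to a photo or an MP3. A practical scheme would also need to cope with errors in making and reading the DNA, which this sketch ignores.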
GOLDMAN: It looks like dust, but in fact it's a chemical molecule that is a chain of smaller molecules and we're using it in a very similar way to the way living organisms use it, but we're not involving living organisms. There would be enormous ethics issues about trying to do that, and also it's a very bad way to store information, to put it into a living organism that's constantly copying that information and making occasional mistakes. Much better to just put the information in a form where it doesn't mutate.
BROOKE: Not to mention the fact that living organisms eventually die of old age, and this stuff can retain its information for tens of thousands of years, potentially.
GOLDMAN: Particularly if you keep it cold and dry and in the dark. Scientists were able to extract DNA and put back together large parts of the genomes of 750,000-year-old ancient horses. And that wasn't even a carefully prepared sample, right? That was just a dead horse.
BROOKE: Okay give me an image of how much space it takes to store digital information in DNA form.
GOLDMAN: Take an estimate of all the digital information in the whole world, everything that's connected to the internet. If you stored that much information purely in DNA using the system we devised, you could fit all of it in the back of a minivan. The big bottleneck for us is the cost of creating an information archive in DNA. In the last 15 years, since the human genome was first sequenced, there's been a million-fold improvement in the price and the speed of doing that. If the next 15 years sees another million-fold improvement, and if we also get that improvement in the creation of DNA, then we have a product that private individuals will use to store information they want kept really safe, whether it's wedding photographs or the presidential archives or records on nuclear waste dumps. Those will be more than affordable in 10-15 years, and we think in three or four years for some of the customers who would pay the most for the most valuable information.
BROOKE: Now, we're talking to you as part of a show we're doing on the potential digital dark age. Picture one of those ancient tomb raiding films.
BROOKE: It is now 10,000 years into the future, and man has apparently wiped out its store of knowledge and started all over again, and then they stumble upon one of your storage facilities. What clues would you give them about how they could access once again the world of human culture?
GOLDMAN: We have thought about this. You know, the Voyager spacecraft was sent with messages for aliens we haven't quite imagined, to try to tell them something about people on Earth. And what you need to do is to send someone a signal that says "there is important information hidden here." We build this storage facility and we put up big pictures that show double helix diagrams and, you know, some arrows saying information is in here. You know, if they're technologically advanced, they're going to be into reading DNA, because we will rediscover the knowledge that that's where the information about life is stored.
BROOKE: Let us concede it took us quite a while.
GOLDMAN: It took us a while, but you know, we got there. It took us a few thousand years. Maybe we're a bit quicker next time. [Brooke laughs] You wanted me to play the game -
BROOKE: Yes! Play the game!
GOLDMAN: I admit it's taken a long time. And then we'll have to make sure they can understand the code that's being used. And there we'd use the same kind of strategy as Voyager: you pick the bits of information to store first, and you make them the easiest things to decode, so they just let people know they're on the right track. You encode something that's information about the structure of a hydrogen atom, because everyone will have discovered hydrogen. And then you give a series of successively more complicated messages, the last one of which explains in full your coding and decoding system. And then that gives access to the last room in this storage facility, which has got the entire archive of human knowledge encoded on DNA. And you just let people go.
BROOKE: Where do you think this hypothetical storage facility would be?
GOLDMAN: A mountainside in Norway, or the Antarctic might be a good one, because there are a number of treaties already there about protecting it, so maybe we could piggyback on those and get some international cooperation going about having an information store.
BROOKE: Dr. Goldman, thank you very much.
GOLDMAN: It's my pleasure, it's been great to talk to you.
BROOKE: Dr. Nick Goldman is a mathematician and genome scientist at the European Bioinformatics Institute in Cambridge, England. This month, Dr. Goldman sprinkled the public with some of his DNA pixie dust. He and artist Charlotte Jarvis collaborated with the Kreutzer Quartet for a series of performances in London called “Music of the Spheres.” The composition is made up of two verses... and a refrain that was coded into DNA dust -- sudsed up, and then released, bathing the audience with DNA music bubbles.
Right now, you’re hearing Fibers AND Coils, composed by Mihailo Trandafilovski.
MEREDITH: My name is Meredith and I’m calling from Phoenix. In high school I did a lot of music composition, and when I went to Oberlin College I decided to be a composition major. At that time, everything I composed, I would have audiotapes of. I kept the audiotapes around somewhere, but I have nothing to play them on. And I regret that. I have one piece in particular that goes through my head a lot that I wish I could listen to.