2016-07-12

Keyword tags:

Data Backup

Data Protection

Storage Strategy

Many moons ago (well, about a month ago), I discussed DNA as a future alternative for archiving and long-term storage.

Microsoft recently announced that it has “reached an early but important milestone in DNA storage by storing a record 200 megabytes of data on the molecular strands.” The release included a picture of a pink smudge, smaller than the tip of a pencil, at the bottom of a test tube. That smudge is the DNA.

According to their blog post, “The Microsoft-UW [University of Washington] team stored digital versions of works of art (including a high-definition video by the band OK Go!), the Universal Declaration of Human Rights in more than 100 languages, the top 100 books of Project Gutenberg and the nonprofit Crop Trust’s seed database on DNA strands.”

As far as technology goes, this is one of the tech projects that will hang around, even after certain forms of storage drive become obsolete. Remember floppy disks, anyone? As many scientists have noted, as long as there are DNA-based lifeforms, we will still have a way of reading ACTG sequences.
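To make the idea of data living in ACTG sequences a bit more concrete, here is a minimal sketch in Python. It assumes the simplest possible mapping of two bits per base, and the names `BASES`, `encode` and `decode` are my own for illustration. This is not the scheme the Microsoft-UW team uses; real DNA storage adds error correction and other safeguards that this toy version ignores.

```python
# Toy illustration only: pack 2 bits into each base.
# The bit-pair-to-base assignment below is an arbitrary assumption.
BASES = "ACGT"  # 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Reverse of encode: every four bases become one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


strand = encode(b"Hi")
print(strand)  # CAGACGGC
assert decode(strand) == b"Hi"
```

Under this naive assumption every byte costs four bases, so 200 megabytes would need on the order of 800 million bases before any redundancy is added.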

So far, this is restricted to lab work and to encoding material for archival use, which is great news for the preservation of historically significant information. But what if someone could invent a decoder that could process this data at the speed of flash? Maybe DNA could be a storage medium that links to, and works in conjunction with, the future equivalent of flash drives.

This goes beyond pure data storage. It could also mean a more secure way of storing and accessing data, on the assumption that a fully developed system would need no connection to any external database or server, because the data could theoretically be housed exclusively in the hardware itself.

I’m of course talking about theoretical uses in machine learning and AI. Many companies are scrambling to make the next big step in machine learning and achieve a true AI. One of the issues has been fitting petabytes of storage onto a single piece of hardware or device. As of right now, much of that data lives in server farms or cloud infrastructure. So for a piece of hardware to work, it has to connect to those servers to retrieve and process data.

Where there’s a connection, there’s a way to hack it externally. That’s also why many financial businesses within the APAC region are reluctant to move away from legacy systems: they worry about how secure the new infrastructure would be.

In terms of security, given the speed at which DNA can be duplicated (provided scientists can achieve error-free duplication), millions of copies of the exact same data could serve as a failsafe, and the same data could be copied onto multiple devices.

Companies are already looking at programmable bots to improve their platforms and services. Virtual assistants like Siri, Google Now and Cortana have been pitted against each other, forcing their parents (read: developers) to improve their coding and expand the parameters. Automotive companies are working on self-driving cars. If this trend is any indication, the next step, commercially available AI for the majority of devices and services, isn’t that much of a stretch.

If we are to achieve Sonny-like assistants, we will be looking at a lot of data. Right now one of the hurdles for AI, on top of building a true general-purpose algorithm, is fitting all the data needed into a single device the size of a mobile phone, without connecting to a cloud. (Think a more intelligent Siri or Google Now, except you don’t have to be connected to a network.) Keep in mind that AI is RAM-intensive, which limits the types of storage that could be used. Unless there’s a way of fitting all this data into a reasonably sized piece of hardware, highly intelligent robots or devices might not be next on the agenda.

For both DNA and AI to reach their full potential, DNA needs to be faster and AI needs to expand its learning. Recent years have seen an increase in collaborative or hybrid technology, which combines the best qualities of certain storage systems into a hybrid infrastructure using networking and automatic dynamic partitioning.

With DNA storage noted to move beyond Moore’s law, things look promising. Microsoft states they “already [have] increased storage capacity a thousand times in the last year. And they believe they can make big advances in speed by applying computer science principles like error correction to the process.”

While our current technology doesn’t support this fantasy yet, and Microsoft acknowledges there is still a long way to go, I think humans have innovative minds that will find a way to make it a reality.

More on Microsoft's breakthrough here.

Microsoft's DNA storage project here. 
