The DNA of Big Data

When James Watson and Francis Crick shared the 1962 Nobel Prize in Physiology or Medicine for discovering the structure of DNA nine years earlier, scientists of the time would have been hard pushed to conceive of the uses to which such knowledge would be put 50 years later.

Fast forward half a century and the 3 billion base pairs of the human genome have been sequenced. Though we still don’t really know what 97% of the genome is for, the era of genetic medicine has begun, and its key lies in analysing vast quantities of data.

How does this relate to the communications industry, though? There are several points to consider. The first is that data and its analysis will become ever more important across all industries, and the mobile industry will be at the heart of the collection, transfer and analysis of data coming from multiple sources.

The second point is perhaps more important. While we may have a vague idea of the value of a concept or discovery, it is impossible to understand the full future impact of a discovery from the standpoint of the present.

According to the computer giant IBM, 80% of the world’s data is unstructured, and most businesses do not even attempt to use this data to their advantage. While software frameworks such as Hadoop have been developed to store such data across clusters of machines and process it in parallel, mapping it into pieces that can be analysed and then combined, we are only at the beginning of an era where the value of bulk data can be realised through coherent analysis of millions, or even billions, of data points.
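
To make the "mapping" idea a little more concrete, here is a minimal sketch in plain Python of the map-and-reduce pattern that Hadoop popularised. It does not use Hadoop itself and runs on a single machine; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample records are invented purely for illustration.

```python
from collections import defaultdict


def map_phase(records):
    """Turn each free-text record into (word, 1) pairs."""
    for record in records:
        for word in record.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group the emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each group of values into a single total per key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records):
    """Run the three steps in sequence over an iterable of text records."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    # Hypothetical unstructured records, made up for this example.
    sample = [
        "call dropped near station",
        "call completed",
        "text sent near station",
    ]
    print(word_count(sample))
```

In a real cluster the map and reduce steps would run on many machines at once, each working on its own slice of the data; the sketch only shows the shape of the computation.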

AT&T Labs, for example, has been looking into this for a few years now and has already shown that a clear picture of individuals’ habits and patterns of behaviour can be built up by “mining” simple data from mobile subscribers. In this way AT&T Labs is able to tell us with a high degree of certainty which US cities have the longest commutes, or where people travel from to attend festivals.

The next step is to combine this with, say, healthcare data gathered through large-scale remote patient monitoring, to build a more accurate picture of the individual through knowledge of that individual’s peers. It is perhaps no exaggeration to say that we are on the edge of a new era in which decisions about an individual’s circumstances are routinely informed by precise analysis of a “cloud” of data.

The third point concerns timescales. It is over fifty years since Watson and Crick’s discovery, and only now are we really beginning to understand its implications. That should tell us not to expect too much too soon. The benefits of Big Data will come, but they may take some time...