Thursday, December 23, 2010

Some data on data.

"There was 5 exabytes of information created between the dawn of civilization through 2003 . . . but that much information is now created every 2 days, and the pace is increasing."
- Eric Schmidt, CEO, Google (1 exabyte = 1 billion gigabytes)

"So plot that curve and now you see why it's so painful to operate in these information markets. The information explosion is so profoundly larger than anyone ever thought—I certainly, it's larger than anything I ever thought—but that's what this opportunity creates."
- Eric Schmidt, CEO, Google

"In recent years Oracle, IBM, Microsoft and SAP between them have spent more than $15 billion on buying software firms specialising in data management and analytics. This industry is estimated to be worth more than $100 billion and growing at almost 10% a year, roughly twice as fast as the software business as a whole."
- Economist, February 2010

Pulling from Eric Schmidt's recent talks, one could say that in recent years, humanity's creation of data has gone from a light dusting to a veritable avalanche. And the Economist's special report from February, "The Data Deluge," offers some stunning facts:
  • "The amount of digital information increases tenfold every five years."
  • "By 2013 the amount of traffic flowing over the internet annually will reach 667 exabytes . . . . [T]he quantity of data continues to grow faster than the ability of the network to carry it all."
  • "When the Sloan Digital Sky Survey started work in 2000, its telescope in New Mexico collected more data in its first few weeks than had been amassed in the entire history of astronomy. Now, a decade later, its archive contains a whopping 140 terabytes of information. A successor, the Large Synoptic Survey Telescope, due to come on stream in Chile in 2016, will acquire that quantity of data every five days."
  • The amount of data in the world is growing at a compound annual growth rate of 60%.
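
As a quick sanity check on the numbers above, here is a back-of-the-envelope sketch of how they relate to one another. The inputs come straight from the quotes; the function and variable names are my own, and I'm assuming a 365-day year and steady rates, which the sources do not promise.

```python
# Back-of-the-envelope arithmetic on the quoted figures.
# Assumptions (mine, not the sources'): 365-day year, constant rates.

def annual_rate_from_multiple(multiple: float, years: float) -> float:
    """Compound annual growth rate implied by growing `multiple`-fold over `years`."""
    return multiple ** (1.0 / years) - 1.0


def multiple_from_annual_rate(rate: float, years: float) -> float:
    """Total growth multiple after compounding `rate` per year for `years`."""
    return (1.0 + rate) ** years


# "Tenfold every five years" implies about 58.5% growth per year.
tenfold_cagr = annual_rate_from_multiple(10, 5)

# A 60% compound annual growth rate implies roughly 10.5x over five years.
five_year_multiple = multiple_from_annual_rate(0.60, 5)

# Schmidt's "5 exabytes every 2 days", extended over a year.
exabytes_per_year = 5 / 2 * 365

# The LSST's "140 terabytes every five days", extended over a year.
lsst_tb_per_year = 140 / 5 * 365

print(f"Tenfold in 5 years -> {tenfold_cagr:.1%} per year")
print(f"60% per year -> {five_year_multiple:.2f}x in 5 years")
print(f"Schmidt's pace -> {exabytes_per_year:.1f} exabytes per year")
print(f"LSST's pace -> {lsst_tb_per_year:.0f} terabytes per year")
```

The takeaway is that the Economist's two growth figures (tenfold every five years, and 60% a year) are roughly the same claim stated two ways.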
The foregoing facts show why last week, when I was discussing what I would invest in were I a VC right now, I said I would be looking for strong plays in data management, data demand, and data mining/processing. Revealing a thought I didn't really know I had until it came out of my mouth, I continued, "because the future is literally in their hands."

So why did I want to write about this? Primarily, the rapid expansion of data demand and data generation poses many large, interesting problems. It also creates vast opportunities and has led to the founding of some fascinating companies. And there will be much more potential for starting companies and creating value.

And as if this were not already clear, it appears I'm not the only one thinking about data, how it is changing, and how that will affect our behavior. In fact, soon after I began writing this, I found a recent post by Mark Suster on the same topic.

So, by way of a survey: if I were looking for investments in this area today, what kinds of companies would I be looking for?

Data Demand/Management -- I'd definitely be looking for companies with data management solutions to help meet the coming decade's massive data demand. These companies would be addressing critical problems in data compression and transmission.

Just look at current trends--cell phone providers are cracking down on unlimited data plans while ISPs are spending millions to fight net neutrality. To me, these demonstrate a broader pain point--meeting demand is incredibly complicated and expensive, and that means there is room for disruptive technologies to make waves in this field.

Among other sources of data demand, communication and mobile applications are only going to make these opportunities larger. The iPhone has crippled AT&T's networks in NY and SF, and the smartphone market is growing incredibly fast. Consumers want to be able to communicate from anywhere, download large files on the go, and stream content on a constant basis. Providers are feeling the pain. One solution may come with newly freed portions of the communications spectrum, which will allow a wifi-type signal to carry over kilometers, not meters. Even so, it's clear that data demand will keep growing and will continue to create opportunities in this space.

Data Storage -- By Schmidt's estimate, 5 exabytes of data are being created every two days, and the pace of creation is only accelerating. All of it has to live somewhere. Enough said.

Data Mining (i.e., making sense of it all) -- As a line often attributed to Einstein goes, "Information is not knowledge." One of the largest problems with the mountain of data being generated on a constant basis is that it's hard to know where to tap into it to make it meaningful. We have a surplus of data and a scarcity of the time and resources necessary to make the most of it. Whoever closes that gap will find massive opportunities here. Case in point--Google's mission statement: "Google's mission is to organize the world's information and make it universally accessible and useful."

And some of the start-ups in this field are absolutely amazing. One of my long-time favorites is Palantir Technologies, which has created a powerful interface for sorting massive amounts of data and making logical connections within it. Their demo videos show applications ranging from piecing together sleeper cells of terrorists to unwinding incredibly complicated tranches of mortgage-backed securities. Actually, if you've never watched their demos and you're a geek like me, just go watch some.

Another company I'm excited about in this space is Recorded Future, which is building a "temporal analytics engine" and has taken funding from both Google Ventures and the CIA (through its investment arm, In-Q-Tel). Its founder served in the Swedish special forces. The company claims that, with the right data mining algorithms, it can predict the future. Obviously this would be massive for both counterterrorism and investment applications.

But these are just two interesting companies. I believe this space has incredible potential to generate innovative winners. Many industries are using the massive amounts of data they collect to build increasingly complicated models for predicting behavior, and this has huge potential for controlling risk, with obvious applications to pooled risk (e.g., insurance policies) and investment schemes. Only time will tell which companies become the success stories, but I am certain that great ones will rise to fill this void.

Data Security -- As we rely on data more and more--and as more is generated and transmitted in massive volumes--security will only become more complex and more important. In the next ten years, it is likely that all of our medical and financial records will be digital. As WikiLeaks has recently demonstrated, sensitive government documents are already digital and vulnerable to the whims of one or two low-ranking personnel with clearances. And data always implicates personal privacy concerns. Generational notions of privacy may be shifting, but even younger generations care about some baseline level of privacy, and data security goes hand in hand with those expectations. Beyond privacy, the vast amount of personal and financial data being generated creates large potential for new types of crime, as less-than-reputable persons gain access to huge databases of credit card and social security numbers. Data security will only become more critical in the coming decade, and if I were investing today, I would be looking for teams with expertise and vision in this arena.

Conclusion -- As the Economist stated in its report on data:
"[T]he world contains an unimaginably vast amount of digital information which is getting ever vaster ever more rapidly. This makes it possible to do many things that previously could not be done: spot business trends, prevent diseases, combat crime and so on. Managed well, the data can be used to unlock new sources of economic value, provide fresh insights into science and hold governments to account."
That last sentence should get you excited. With data as with anything else in technology, challenges should be seen as a sign that there is underlying potential for transformative, disruptive opportunities. The "deluge of data" will not slow. It will only accelerate. At this point, there is no looking back. In that way, I'm bullish on data-driven opportunities for the same reason I'm bullish on cleantech: the underlying problems will only become more important, meaning the opportunities will only become greater. And a new generation of exciting companies is going to rise to seize them.

Which companies are you watching? Which spaces within data excite you the most?
