Over the last 15 years, there have been two useful heuristics for figuring out where computing is going.
When I want to look at how applications are going to be built, I look at games. After all, games are at the forefront of creating new kinds of digital experiences, and the need to push the boundaries of how we entertain ourselves is crucial to creating new revenue and sales opportunities.
When I want to look at how infrastructure is going to change, I look at what people want to do in the supercomputer space.
Two nights ago, I had the marvelous opportunity to hear a talk on physics and Big Data. As a software infrastructure guy (at the end of the day, I like to think about how to build systems that enable applications), I have been wondering whether Big Data was going through a bigger-faster-stronger phase or whether there were new intrinsic problems.
And the answer is yes to both.
Clearly we need systems that can do more analysis faster, store more data at lower cost, etc.
What was not as obvious was that the exponential increase in transistors, coupled with the disruptive trend of 3D printing, was going to enable:
- A proliferation of very sensitive distributed sensors that need to be calibrated and whose data needs to be collected.
- The ability to find even weaker signals in the data.
In effect, we are going to be able to collect more data faster, and because of that we will be able to find things that we could not find before. Solving those two problems on their own is very interesting work that could keep me busy for the next 10 years…
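To make the weaker-signals point concrete, here is a toy sketch (my own illustration, not anything from the talk; the signal amplitude, noise model, and sensor counts are all invented). Averaging N independent noisy readings cuts the noise variance by a factor of N, so a signal invisible to one sensor emerges from ten thousand:

```python
import numpy as np

rng = np.random.default_rng(42)

# A weak 5 Hz tone with amplitude 0.05 buried in unit-variance noise:
# hopeless to detect with a single sensor.
t = np.linspace(0, 1, 1_000)
signal = 0.05 * np.sin(2 * np.pi * 5 * t)

def snr_db(readings: np.ndarray) -> float:
    """Crude SNR estimate: signal power over residual noise power."""
    noise = readings - signal
    return 10 * np.log10(signal.var() / noise.var())

# Averaging n independent sensors shrinks the noise variance by 1/n,
# improving SNR by roughly 10*log10(n) dB.
for n in (1, 100, 10_000):
    readings = signal + rng.normal(0.0, 1.0, size=(n, t.size))
    print(f"{n:>6} sensors: SNR ≈ {snr_db(readings.mean(axis=0)):6.1f} dB")
```

That is the "find things we could not find before" effect in miniature: nothing about the signal changed, only how much data we could collect.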
However, there are some new problems that come out of that:
- We will need new ways to explore data, and ways to track our exploration through it.
- We will need ways to combine the datasets we create. After all, as more and more sensors get built, and sensors get cheaper, the ability to combine datasets will become crucial. And as the scale of the datasets grows, a full extract-transform-load (ETL) pass becomes less realistic (see the sketch after this list).
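One small illustration of combining datasets without a full ETL pass (a hypothetical example of mine; the streams, columns, and tolerance are all made up): two sensor streams with mismatched clocks and rates, aligned at read time with an as-of join rather than first being rewritten into a common store.

```python
import pandas as pd

# Two independently collected sensor streams with different clocks
# and sampling rates.
temps = pd.DataFrame({
    "ts": pd.to_datetime(["2013-04-01 12:00:00.1",
                          "2013-04-01 12:00:01.2",
                          "2013-04-01 12:00:02.9"]),
    "temp_c": [21.3, 21.4, 21.6],
})
pressure = pd.DataFrame({
    "ts": pd.to_datetime(["2013-04-01 12:00:00.0",
                          "2013-04-01 12:00:01.0",
                          "2013-04-01 12:00:03.0"]),
    "hpa": [1013.2, 1013.1, 1012.8],
})

# Both inputs must be sorted by the join key. Each temperature reading
# is matched to the nearest pressure reading within 500 ms, at query
# time, without materializing a merged dataset up front.
combined = pd.merge_asof(
    temps.sort_values("ts"), pressure.sort_values("ts"),
    on="ts", direction="nearest",
    tolerance=pd.Timedelta("500ms"),
)
print(combined)
```

The design point is that the combination happens at read time; as datasets grow, deferring the join beats copying everything through an ETL pipeline first.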
And then, to make this all more interesting, there is some thought that the way we collect data may itself create signals, and that meta-analysis of the data will be required. How you do that is an interesting problem in itself. And how do you build systems that can correct for it…
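On the idea that collection itself creates signals, here is a minimal sketch of one correction technique, inverse-probability weighting (the reporting policy and all numbers are entirely invented): if you can model how likely each reading was to be collected, you can reweight the readings you did get to undo the bias the collection process introduced.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical collection artifact: an event-triggered sensor reports
# high readings more often than low ones, so the raw sample mean is
# biased upward even though the underlying process is unbiased.
true_values = rng.normal(10.0, 2.0, size=100_000)

# Made-up reporting policy: probability of a reading being reported
# rises with its value.
p_report = np.clip(0.1 + 0.08 * (true_values - true_values.min()), 0.05, 1.0)
reported = rng.random(true_values.size) < p_report

naive_mean = true_values[reported].mean()

# If we know (or can model) the reporting policy, weighting each
# reported value by 1/p_report undoes the collection bias.
w = 1.0 / p_report[reported]
corrected_mean = np.average(true_values[reported], weights=w)

print(f"true mean     ≈ {true_values.mean():.2f}")
print(f"naive mean    ≈ {naive_mean:.2f}  (biased by collection)")
print(f"IPW-corrected ≈ {corrected_mean:.2f}")
```

The hard part, of course, is that in practice the reporting policy is rarely known; inferring it is exactly the kind of meta-analysis the talk hinted at.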
My head has sufficiently exploded. Turns out that just making things go faster isn’t the only problem worth solving…