Most Popular Posts

Nov 26, 2011

My Bets on Scalable Databases

For those who don’t follow the database industry closely, here is the problem: relational database servers have clearly dominated the market for the last 15-20 years, but today they are proving inadequate for Big Data. There are sectors of the industry where relational databases have never been able to perform well, such as web indexing and bio research. MapReduce has emerged to serve those sectors. While the MapReduce pattern is no match for SQL in terms of functionality, the distributed storage architecture on which MapReduce is based has great potential for scalable processing.

Now the question is: Which approach will produce a rich, yet scalable, data processing engine first? 

  • a) Enabling relational databases to operate over distributed storage, or
  • b) Expanding the processing functionality over distributed storage?

I bet on the latter approach:

Scalable databases will emerge from distributed storage.

It may seem that I am betting against the odds, because relational database vendors already have both the code and the people for the job. However, those codebases are about 20 years old and thus barely modifiable. The assumption that data is available locally and exclusively is baked in throughout the code, so I don’t believe those codebases will be of much help. Even if database vendors abandon their existing codebases and rely solely on their people to build a new data processing technology from scratch, the ecosystems that have grown around those legacy database engines will keep disrupting the development process and holding back progress. Those developers will therefore be much less efficient than developers who don’t have to deal with such legacy baggage.

If the above bet comes true, it will have the following consequence:

New companies will play significant roles in the database industry.

Nov 23, 2011

Rest in Peace, PFX

This is a milestone in my career at Microsoft that marks the end of “About Me @ Microsoft (Part II)”, but it doesn’t yet mark the beginning of “Part III”. What Part III will be is unclear at this moment (I hope there will be such a part). I should know more in a week or two.

Briefly, without disclosing any details:

The PFX team (Parallel .NET Framework Extensions) no longer exists.

By “the PFX team” I mean the people, not the products. The codebase still exists. There is another team that is chartered to maintain it, but no PFX developer is on that team. I don’t know what the future of those products will be, and I doubt there is anybody who knows that at this point. So please don’t ask me. I can only hope innovation in .NET parallelism will continue.

Looking back, I feel lucky that I’ve had the rare chance to be part of the PFX team. The knowledge I’ve gained during these two years is priceless. I came with a few ideas I wanted to explore, and now I’m leaving with a lot more.

While I’m sad about how abruptly things ended, I’m looking at it positively: it’s the trigger I needed to take the next step in my career. I am looking ahead, empowered for new adventures.