On data center operating systems  

After strangely being blocked from commenting on the InfoWorld article below, presumably by its authors, I thought I would post my thoughts here on my blog.


This is actually an interesting and informative article written by some very forward-thinking and operationally experienced venture capitalists!

However, let us give credit where credit is wholly due: the notion of treating an entire datacenter as a computer and/or operating system that abstracts low-level primitives like individual containers, VMs, machines, racks and rows was invented out of necessity by Google. Why necessity, you may ask? Because Google is one of only a handful of companies (alongside Facebook and Microsoft) that were essentially forced to deal with extreme scaling and distributed systems challenges many years before anyone else. Interested parties should take 30 minutes to read the following short book (only 100 pages, and seminal) written at Google in 2009. The book’s authors include Urs Hölzle (Google employee #8 and global head of infrastructure), Luiz André Barroso (Google Fellow) and Jimmy Clidaras (distinguished data center engineer), and even Jeff Dean made significant contributions.

“The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines”: http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006

After digesting this book, readers will understand how Google holistically addressed the complexities of cluster-level scheduling and multi-datacenter resource management at global scale. Google has been containerizing all of its applications and workloads for over a decade now. It even runs many of its VMs in containers (for security reasons; the shared-kernel attack surface problem still persists). On its public cloud, all of its containerized services run in VMs.

We also have Google to thank for the rise of the modern containerization movement these past few years. In 2007, Google contributed a core building block for modern containers, control groups (cgroups), to the Linux kernel itself. Docker is built on top of cgroups, and indeed most of the extremely large-scale container deployments that power production Internet services use cgroups directly for containerization, not Docker: http://en.wikipedia.org/wiki/Cgroups

Of late, Google has inspired some highly valuable software infrastructure industries (including Hadoop, thanks to GFS and MapReduce, and NoSQL, thanks to BigTable). The latest transformative industry primed to hit the enterprise is containerization, again inspired by and spawned out of Google.

It is extremely exciting to see Google launch the fully open source Kubernetes project, a distillation of its work over the past 15 years in building a bespoke multi-datacenter operating system called Borg. Released in June 2014, Kubernetes is the second-fastest-growing Go project, behind only Docker: https://github.com/GoogleCloudPlatform/kubernetes


