Monday, May 18, 2009

The juice and the cloud

You'll recall my May 8 post "FriendFeed hungry, Twitter pining for the fjords - does this qualify as a perfect storm?", written in response to the same-day outages of FriendFeed and Twitter. The FriendFeed outage was clearly unplanned, as Bret Taylor noted:

Our data center, SVColo, lost power (and apparently all generators as well) this afternoon, causing our site to be completely unavailable for a couple of hours. We apologize for the extended outage....

This outage impacted our site as well as a number of other web sites hosted at SVColo. We are obviously fairly frustrated by the incident, and we are working hard to get everything else back online now.

The SVColo website does not have a news or blog section, so the company has no obvious way to communicate with customers about bad (or good) news. And to my knowledge, SVColo has not made a public statement on the May 8 outage. But people certainly said a lot about SVColo, including this Nicholas James joke:

@scobleizer You need to get the #friendfeed guys to host at Rackspace ;)

Of course Robert Scoble, a new Rackspace employee, is well aware that his employer has had outages of its own. An example from November 2007:

First, Rackspace had a “maintenance failure” at its Dallas data center on Sunday. Then a truck driver hit a transformer feeding power to the Rackspace data center on Monday.

There are all sorts of directions in which I can go from here, but for now I'm going to confine myself to the fact that any cloud-based service is dependent upon electrical power, or some other form of power. And no matter how many redundant systems or backup generators you set up, no one can provide 100% uptime.
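For the curious, here's a back-of-the-envelope sketch of why redundancy gets you closer to 100% but never all the way there. It assumes each power source is available a fixed fraction of the time and that the sources fail independently of one another. The 99% figure and the function names are made up for illustration; they are not numbers from SVColo or Rackspace.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant sources, assuming they fail
    independently and the system is up if at least one source is up."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The system is down only when every source is down at the same time.
    return 1.0 - (1.0 - single) ** copies


def expected_downtime_hours(availability: float) -> float:
    """Average hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical: every source is up 99% of the time.
    for copies in range(1, 5):
        a = combined_availability(0.99, copies)
        hours = expected_downtime_hours(a)
        print(f"{copies} source(s): {a:.8f} available, "
              f"about {hours:.4f} hours down per year")
```

Each extra source shrinks the expected downtime, but the result stays below 1. And the independence assumption is generous: when utility power and the generators fail together, as reportedly happened on May 8, the real figure is worse than this arithmetic suggests.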

Speaking of uptime, there was that Uptime Institute/McKinsey report a month ago, which the New York Times covered:

“The industry has assumed the financial benefits of cloud computing and, in our view, that’s a faulty assumption,” said Will Forrest, a principal at McKinsey, who led the study.

Owning the hardware, McKinsey states, is actually cost-effective for most corporations when the depreciation write-offs for tax purposes are included. And the labor savings from moving to the cloud model has been greatly exaggerated, Mr. Forrest says. The care and feeding of a company’s software, regardless of where it’s hosted, and providing help to users both remain labor-intensive endeavors.

Clouds, Mr. Forrest notes, can make a lot of sense for small and medium-sized companies, typically with revenue of $500 million or less.

But one response picked at the underlying assumptions:

McKinsey repeats a very common mistake made by people skeptical about cloud computing: confusing the marginal cost of a single server in a company's own data center with the total cost of a server hosted by a cloud provider. In my research, the cost for data center construction runs $600 to $1000 per square foot. Some portion of that amount needs to be assigned to the internal server instance; furthermore, owning a data center is not a one-time expense—there's maintenance as well, which adds to the monthly cost of an internal server. That doesn't even address the capital expense assignment of additional capital assets like network equipment, storage arrays and the like.
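To make the marginal-versus-total distinction concrete, here's a rough sketch of the arithmetic. Only the $600 to $1000 per square foot construction range comes from the quote above. Everything else (the floor space per server, the amortization period, the monthly maintenance share, the marginal cost, and the function name) is a hypothetical placeholder that you'd replace with your own figures.

```python
def monthly_total_cost(
    marginal_monthly: float,
    construction_per_sqft: float,
    sqft_per_server: float,
    amortization_years: int,
    maintenance_monthly: float,
) -> float:
    """Monthly cost of one in-house server once a share of the data
    center's construction and upkeep is assigned to it."""
    months = amortization_years * 12
    # Spread this server's slice of the building cost over its lifetime.
    construction_share = construction_per_sqft * sqft_per_server / months
    return marginal_monthly + construction_share + maintenance_monthly


if __name__ == "__main__":
    marginal = 100.0  # hypothetical marginal cost per server per month
    for per_sqft in (600.0, 1000.0):  # construction range from the quote
        total = monthly_total_cost(
            marginal_monthly=marginal,
            construction_per_sqft=per_sqft,
            sqft_per_server=2.0,       # hypothetical floor space share
            amortization_years=15,     # hypothetical building lifetime
            maintenance_monthly=20.0,  # hypothetical upkeep share
        )
        print(f"${per_sqft:.0f}/sq ft: marginal ${marginal:.2f}, "
              f"total ${total:.2f} per month")
```

Whatever numbers you plug in, the total is always higher than the marginal figure, and the total is the one that belongs in a comparison against a cloud provider's price. This sketch also leaves out the network equipment and storage arrays that the quote mentions.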

But cost is only one aspect of a cloud vs. in-house decision. Regarding the point of this post, what of reliability?

I don't think these periodic failures are endemic to cloud computing as a whole; indeed, I think it should be realized as a solution which is much less susceptible to such outages and requiring fewer user-level redundancies than more conventional hosting practices. This is contrary to conventional wisdom, but I think conventional wisdom has lost its head on this one. Businesses have long since figured out how to solve single-point-of-failure problems in their own internal systems. It's not easy or cheap, but it's possible, and what you are buying when you buy cloud services should be just that solution. The entire point of the model is to defray the costs of doing so to a point where businesses otherwise unable to afford such power and redundancy can cost-effectively share the benefits of such engineering.

Assume for the moment that FriendFeed prized reliability above all else, including cost. (For a startup with no revenue, that's a faulty assumption, but we'll make it anyway.) Could the FriendFeeders have built an in-house system that is more reliable than the service that SVColo offers? Considering FriendFeed's reputation for having very few outages, I doubt it.