
Archive for the ‘software fluidity’ Category

The enterprise "barrier-to-exit" to cloud computing

December 2, 2008

An interesting discussion ensued on Twitter this weekend between George Reese of Valtira and me. George–who recently published some thought-provoking posts on O’Reilly Broadcast about cloud security, and is writing a book on cloud computing–argued strongly that the benefits gained from moving to the cloud outweigh any additional costs that might ensue. In fact, in one tweet he noted:

IT is a barrier to getting things done for most businesses; the Cloud reduces or eliminates that barrier.

I reacted strongly to that statement; I don’t buy that IT is that bad in all cases (though some certainly is), nor do I buy that simply eliminating a barrier to getting something done makes it worthwhile. Besides, the barrier being removed isn’t strictly financial; it is corporate IT policy. I can build a kick-butt home entertainment system for my house for $50,000; that doesn’t mean it’s the right thing to do.

However, as the conversation unfolded, it became clear that George and I were coming at the problem from two different angles. George was talking about many SMB organizations, which really can’t justify the cost of building their own IT infrastructure, but have been faced with a choice of doing just that, turning to (expensive and often rigid) managed hosting, or putting a server in a colo space somewhere (and maintaining that server). Not very happy choices.

Enter the cloud. Now these same businesses can simply grab capacity on demand, start and stop billing at their leisure, and get truly world-class power, virtualization and networking infrastructure without having to put an ounce of thought into it. Yeah, it costs more than simply running a server would, but when you add in the infrastructure, managed hosting fees and colo leases, the cloud almost always looks like the better deal. At least that’s what George claims his numbers show, and I’m willing to accept that. It makes sense to me.

I, on the other hand, was thinking of medium to large enterprises that already own significant data center infrastructure, and already have sunk costs in power, cooling and assorted facilities. For this class of business, those existing investments have to be factored in when weighing the cost of running servers in-house against the cost of gaining the same services from the cloud. Because the facilities are already paid for, the balance often tips the other way, and it becomes much cheaper to use existing infrastructure (though with some automation) to deliver fixed-capacity loads. As I discussed recently, the cloud generally only gets interesting for loads that are not running 24X7.

(George actually notes a class of applications that sadly are also good candidates, though they shouldn’t necessarily be: applications that IT just can’t or won’t get to on behalf of a business unit. George claims his business makes good money meeting the needs of marketing organizations that have this problem. Just make sure the ROI is really worth it before taking this option, however.)

This existing investment in infrastructure therefore acts almost as a “barrier-to-exit” for these enterprises when they consider moving to the cloud. It seems to me highly ironic, and perhaps somewhat unique, that certain trails in the cloud computing market will be blazed not by organizations with multiple data centers and thousands upon thousands of servers, but by the little mom-and-pop shop that used to own a couple of servers in a colo somewhere, finally shut them down, and turned to Amazon. How cool is that?

The good news, as I hinted at earlier, is that there is technology that can be justified financially–through capital equipment and energy savings–and that in turn can “grease the skids” for cloud adoption in the future. Ask the guys at 3tera. They’ll tell you that their cloud infrastructure allows an enterprise to optimize infrastructure usage while enabling workload portability (though not portability of running workloads) between cloud providers running their software. VMWare introduced their vCloud initiative specifically to make enterprises aware of the work they are doing to allow workload portability across data centers running their platform. Cisco (my employer) is addressing the problem as well. In fact, there are several great products out there that can give you cloud technology in your enterprise data center, opening the door to cloud adoption now (with things like cloudbursting) and in the future.

If you aren’t considering how to “cloud enable” your entire infrastructure today, you ought to be getting nervous. Your competitors are probably looking closely at these technologies, and when the time is right, their barrier-to-exit will be lower than yours. Then the true costs of moving an existing data center infrastructure to the cloud will become painfully obvious.

Many thanks to George for the excellent discussion. Twitter is becoming a great venue for cloud discussions.


Do Your Cloud Applications Need To Be Elastic?

November 22, 2008

I got to spend a few hours at Sys-Con’s Cloud Computing Expo yesterday, and I have to say it was an intellectually stimulating day. Not only was just about every US cloud startup represented in one way or another, but the day also included an unusual conference session and a meetup of CloudCamp fans.

While listening in on a session, I overheard one participant ask how the cloud would scale their application if they couldn’t replicate it. This triggered a strong response in me, as I really feel for those who confuse autonomic infrastructures with magic applied to scaling unscalable applications. Let me be clear: the cloud can’t scale your application (much, at least) if you didn’t design it to be scaled. Period.
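To make that point concrete, here is a minimal, purely illustrative sketch (the names and structure are hypothetical, not taken from any real application) of the design decision that determines whether adding instances helps at all: keeping state in process memory versus pushing it to a store every instance can reach.

# Illustrative sketch only (hypothetical names): why adding instances
# doesn't help an application that keeps its state in process memory.

# Instance-local state: each copy of the app has its own counter, so two
# instances behind a load balancer give inconsistent answers and the
# extra instance adds no real capacity for existing sessions.
request_count = 0

def handle_request_stateful():
    global request_count
    request_count += 1
    return request_count

# Horizontally scalable variant: the instance holds no state of its own;
# shared state lives in an external store that every instance can reach.
# A plain dict stands in here for that store (a database, memcached, etc.).
def handle_request_stateless(shared_store):
    shared_store["requests"] = shared_store.get("requests", 0) + 1
    return shared_store["requests"]

An autonomic infrastructure can clone the second version all day long; cloning the first just gives you several copies that disagree with each other.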

However, that exchange caused me to ask myself whether an application has to be horizontally scalable in order to benefit economically from running in an Infrastructure as a Service (IaaS) cloud. The answer, I think, is that it depends.

Chris Fleck of Citrix wrote up a pretty decent two-part explanation of this on his blog a few weeks ago. He starts out with the basic costs of acquiring and running 5 quad-core servers–either on-premises (amortized over 3 years at 5%) or in a colocation data center–against the cost of running equivalent “high CPU” servers 24X7 on Amazon’s EC2. The short of his initial post is that it is much more expensive to run full time on EC2 than it is to run on-premises or in the colo facility.

How much more expensive?

  • On-premises: $7800/year
  • Colocation: $13,800/year
  • Amazon EC2: $35,040/year

I tend to believe this reflects the truth, even if it’s not 100% accurate. First, while you may think “ah, Amazon…that’s 10¢ a CPU hour”, in point of fact most production applications that you read about in the cloud-o-sphere are using the larger instances. Chris is right to use high-CPU instances in his comparison at 80¢/CPU hour. Second, while it’s tempting to think in terms of upfront costs, your accounting department will in fact spread the capital costs out over several years, usually 3 years for a server.
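As a quick sanity check on those figures, here is a back-of-the-envelope calculation, assuming the 80¢/hour high-CPU rate and the five-server, 24X7 scenario above (the on-premises and colo numbers are taken as quoted, since the hardware and facility prices behind them are not broken out here):

# Back-of-the-envelope check of the full-time comparison above, assuming
# 5 "high CPU" EC2 instances at $0.80 per instance-hour, running 24x7.
HOURS_PER_YEAR = 24 * 365   # 8,760
EC2_RATE = 0.80             # dollars per instance-hour
SERVERS = 5

ec2_annual = EC2_RATE * HOURS_PER_YEAR * SERVERS
print(f"EC2, full time: ${ec2_annual:,.0f}/year")   # $35,040, matching the figure quoted

# Figures quoted from Chris's post (not derived here):
on_premises_annual = 7_800
colo_annual = 13_800
print(f"Premium vs. on-premises: {ec2_annual / on_premises_annual:.1f}x")   # ~4.5x
print(f"Premium vs. colocation:  {ec2_annual / colo_annual:.1f}x")          # ~2.5x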

In the second part of his analysis, however, Chris notes that the cost of the same Amazon instances varies based on the amount of time they are actually used, as opposed to the physical infrastructure that must be paid for whether it is used or not (with the possible exception of power and AC costs). This comes into play in a big way if the same instances are used judiciously for varying workloads, such as the hybrid fixed/cloud approach he uses as an example.

In other words, if you have an elastic load, plan for “standard” variances on-premises, and allow “excessive” spikes in load to trigger instances on EC2, you suddenly have a very compelling case relative to buying enough physical infrastructure to handle excessive peaks yourself. As Chris notes:

“To put some simple numbers to it based on the original example, let’s assume that the constant workload is roughly equal to 5 Quadcore server capacity. The variable workload on the other hand peaks at 160% of the base requirement, however it is required only about 400 hours per year, which could translate to 12 hours a day for the month of December or 33 hours per month for peak loads such as test or batch loads. The cost for a premise only solution for this situation comes to roughly 2X or $ 15,600 per year assuming existing space and a 20% factor of safety above peak load. If on the other hand you were able to utilize a Cloud for only the peak loads the incremental cost would be only $1,000. ( Based on Amazon EC2 )

Premise Only
  $15,600 annual cost (2 x $7,800 from Part 1)
Premise Plus Cloud
  $7,800 annual cost from Part 1
  $1,000 Cloud EC2 (400 x $.80 x 3)
  $8,800 annual cost, Premise Plus Cloud”
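Running the quoted numbers yourself is straightforward. A rough check, assuming the peak needs about three extra high-CPU instances (160% of the five-server base) for roughly 400 hours a year at 80¢/hour:

# Rough check of the hybrid "premise plus cloud" arithmetic quoted above.
peak_hours = 400        # hours per year the extra capacity is needed
extra_instances = 3     # ~160% of a 5-server base means ~3 extra instances
rate = 0.80             # dollars per instance-hour (high-CPU)

cloud_burst = peak_hours * extra_instances * rate
print(f"Cloud burst cost: ${cloud_burst:,.0f}/year")    # ~$960, rounded to $1,000 in the quote

premise_base = 7_800    # annual cost of the on-premises base from Part 1
premise_only = 15_600   # 2x the base, sized to cover the peak with headroom
hybrid_total = premise_base + cloud_burst
print(f"Premise only:       ${premise_only:,}/year")
print(f"Premise plus cloud: ${hybrid_total:,.0f}/year") # ~$8,760 vs. the quoted $8,800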

The lesson of our story? Using the cloud makes the most sense when you have an elastic load. I would postulate that another good candidate is a load that is not powered on at full strength 100% of the time. Some examples might include:

  • Dev/test lab server instances
  • Scale-out applications, especially web application architectures
  • Seasonal load applications, such as personal income tax processing systems or retail accounting systems

On the other hand, you probably would not use Infrastructure as a Service today for:

  • That little accounting application that has to run at all times, but has at most 20 concurrent users
  • The MS Exchange server for your 10-person company. (Microsoft’s multi-tenant Exchange Online offering is different–I’m talking about hosting your own instance in EC2)
  • Your network monitoring infrastructure

Now, the managed hosting guys will probably jump down my throat with counterarguments about the level of service provided by (at least their) hosting clouds, but my experience is that all of these clouds treat self-service as exactly that, and that there really is very little difference between do-it-yourself on-premises and do-it-yourself in the cloud.

What would change these economics to the point that it would make sense to run any or all of your applications in an IaaS cloud? Well, I personally think you need to see a real commodity market for compute and storage capacity before you see pricing that tips the economics in favor of running fixed loads in the cloud. There has been a wide variety of posts about what it would take [pdf] to establish a cloud market, so I won’t go back over that subject here. However, if you are considering “moving my data center to the cloud”, please keep these simple economics in mind.
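To put a number on where that line sits today, here is a rough, purely illustrative break-even calculation reusing the figures from Chris’s example (about $7,800 a year, all-in, for the five on-premises servers versus 80¢ an hour for each of five high-CPU EC2 instances):

# Illustrative break-even point, reusing the numbers from the example above.
on_prem_annual = 7_800          # 5 on-premises quad-core servers, all-in
ec2_hourly_for_five = 0.80 * 5  # 5 high-CPU instances at $0.80/hour

break_even_hours = on_prem_annual / ec2_hourly_for_five
utilization = break_even_hours / (24 * 365)
print(f"Break-even: ~{break_even_hours:,.0f} hours/year (~{utilization:.0%} of the year)")
# Roughly 1,950 hours, or about 22% utilization: below that, renting the
# capacity wins; above it, the gear you already own does.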

Why I Think CohesiveFT’s VPN-Cubed Matters

October 28, 2008

You may have seen some news about CohesiveFT’s new product today–in large part thanks to the excellent online marketing push they made in the days preceding the announcement. (I had a great conversation with Patrick Kerpan, their CTO.) Normally, I would get a little suspicious about how big a deal such an announcement really is, but I have to say this one may be for real. And so do others, like Krishnan Subramanian of CloudAve.

CohesiveFT’s VPN-Cubed is targeting what I call “the last great frontier of the cloud”: networking. Specifically, it is focusing on a key problem–data security and control–in a unique way. The idea is that VPN-Cubed gives you software that allows you to create a VPN of sorts that is under your personal control, regardless of where the endpoints reside, on or off the cloud. Think of it as creating a private cloud network, capable of tying systems together across a plethora of cloud providers, as well as your own network.

The use case architecture is really very simple.


Diagram courtesy of CohesiveFT

VPN-Cubed Manager VMs are run in the network infrastructure that you wish to add to your cloud VPN. The manager then acts as a VPN gateway for the other VMs in that network, which can then communicate with other systems on the VPN via virtual NICs assigned to the VPN. I’ll stop there, because networking is not my thing, but I will say it is important to note that this is a portable VPN infrastructure, which you can run on any compatible cloud, and CohesiveFT’s business is to create images that will run on as many clouds as possible.

Patrick made a point of using the word “control” a lot in our conversation. I think this is where VPN-Cubed is a game changer. It is one of the first products I’ve seen target isolating your stuff in someone else’s cloud, protecting access and encryption in a way that leaves you in command–assuming it works as advertised…and I have no reason to suspect otherwise.

Now, will this work with PaaS? No. SaaS? No. But if you are managing your applications in the cloud, even a hybrid cloud, and are concerned about network security, VPN-Cubed is worth a look.

What are the negatives here? Well, first, I think VPN is a feature of a larger cloud networking story. This is the first and only product of its kind on the market, but I have a feeling other network vendors looking at this problem will address it with a more comprehensive solution.

Still, CohesiveFT has something here: it’s simple, it is entirely under your control, and it serves a big immediate need. I think we’ll see a lot more about this product as word gets out.

The PaaS Spectrum: Choosing Your Coding Cloud

October 12, 2008

Platform as a Service is a fascinating space to me. As I noted in one of my reviews of Google AppEngine when it was released, there is a certain development experience that comes with a good distributed platform, one that supports simple development-test cycles yet also reduces the complexity of delivering highly scalable and reliable applications to a complex data center. With the right platform, and there are many, a development team can leapfrog the hours of pain and effort required to stitch together hardware, software, networking and storage to create a bulletproof web application.

At Cloud Camp Silicon Valley earlier this month, a group of us discussed this in some depth. A crowd of about thirty assorted representatives of cloud vendors and customers alike engaged in a lively discussion of what the elements of cloud-oriented architectures are, and how one chooses the right architecture.

I spoke (perhaps too much) about software fluidity, and it was noted that many PaaS platforms limit that fluidity rather than enable it. Think Google AppEngine or Force.com or Bungee Connect. Great products, but not exactly built to make your application portable and dynamic. (Google and others are open sourcing all or part of their platforms, but the ecosystem to port applications isn’t there yet. See below.) So, the conclusion went, perhaps you choose some PaaS offerings when time-to-market is the central concern of your project, not portability. Others (possibly including IaaS, if you want to be technical) make sense when portability is your primary concern, but you’ll have to do more work to get your application out the door.

This creates a spectrum on which PaaS offerings fall, with time-to-market at one end and portability at the other.

This makes perfect sense to me. Choose an “all-in-one” platform from a single software+service+infrastructure vendor, and they can hide much of the complexity of coding, deploying and operating your application from you. On the other hand, select an infrastructure-only vendor using “standard” OS images (typically based on one of the major Linux distros) but little else, and you can port your applications to your heart’s content, but you’ll have to do all of the configuration of database connections, middleware memory parameters, etc. yourself. Many platforms will lie somewhere in the middle of this spectrum, but the difference between the edges is striking, and most platforms will fall towards one end or the other.

For an example of a relatively “extreme right” platform, take a look at Force.com, the application platform provided by, and tied closely to, Salesforce.com, and its Apex language. How much do they provide in the way of productivity? Well, Rich Unger, an engineer at Salesforce.com (and one of the participants in the CloudCamp SV discussion), has an excellent blog that covers Apex. Here’s one example that he gives:

“Database operations don’t have to establish a connection (much less manage a connection pool) [in Apex]. Also, the object-relational mapping is built into the language. It’s even statically typed. Let’s say you want the first and last names of all your contacts. In Java, there are many ways to set this up, depending on whether you’re using JPA, straight JDBC, entity beans, etc. In general, you must do at least these four things:

  1. Write an entity class, and annotate it or map it to a DB table in an XML file
  2. Configure the DB and connection pool
  3. Acquire a connection in the client code
  4. Perform the query

In Apex, you’d just do #4:

Contact[] mycontacts = [select firstname, lastname from Contact];
for (Contact c : mycontacts) {
    System.debug(c.firstname + ' ' + c.lastname);
}

That’s it. You could even shorten this by putting the query right into the for loop. The language knows how to connect to the DB. There’s no configuration involved. I’m not hiding any XML files. Contact is a standard data type. If you wanted a custom data type, you’d configure that through the Salesforce.com UI (no coding). Adding the new type to the DB automatically configures the O-R mapping. Furthermore, if you tried:

Account[] myaccounts = [select firstname, lastname from Contact];

…it wouldn’t even compile. Static typing right on down to the query. Try that by passing strings into JDBC!”

Freakin’ brilliant! That is, as long as you want to write an application that uses the Salesforce.com database and runs on the Force.com infrastructure. It’s not code that you can run on AppEngine or EC2.
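For contrast, here is roughly what the boilerplate Rich describes looks like elsewhere. This is only a hedged sketch, in Python with the standard-library sqlite3 module rather than the Java/JDBC stack he mentions, and the database file, table and columns are made up for illustration, but the shape of steps 2 through 4 is the same:

# A sketch of the connection/query boilerplate that Apex hides, using
# Python's standard-library sqlite3 module (not Java/JDBC) purely for
# illustration. The database file, table and columns are hypothetical.
import sqlite3

# Stand-in for step 1: create the schema the entity mapping would describe.
conn = sqlite3.connect("crm_example.db")      # step 2: configure/connect to the DB
conn.execute("CREATE TABLE IF NOT EXISTS contact (firstname TEXT, lastname TEXT)")
conn.execute("INSERT INTO contact VALUES ('Ada', 'Lovelace')")

cur = conn.cursor()                           # step 3: acquire a connection/cursor

# Step 4: finally, the query itself, which is all Apex asks of you.
cur.execute("SELECT firstname, lastname FROM contact")
for firstname, lastname in cur.fetchall():
    print(firstname, lastname)

conn.close()

Note also that the query here is just a string; nothing checks at compile time that the contact table has those columns, which is exactly the static-typing point Rich is making.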

On the other hand, I’ve been working with GoGrid for a little while, getting Alfresco to run in a clustered configuration on their standard images. It has been amazing, helped along both by the fact that GoGrid gives you root access to the virtual server (very cool!) and by the fact that the standard Alfresco Enterprise download (trial version available for free) contains a Tomcat instance and installs with a tar command, a single properties file change and a database script. So, combine a CentOS 64-bit image with Alfresco 2.2 Enterprise, make sure iptables has port 8080 open, and away you go. The best thing is that–in theory–I should be able to grab the relevant files from that CentOS image, copy them to a similar image on, say, Flexiscale, and be up and running in minutes. However, I did have to manage some very techie things; I had to edit iptables, for instance, and know how to confirm that I had the right Java version for Tomcat.

By the way, long-term operational issues are similarly affected by your choice of PaaS provider. If you have root access to the server, you must handle a measurable percentage of the issues driven by configuration changes in your system over time. On the other hand, if your code is running on a complete stack that the vendor maintains for backward compatibility, and that hides configuration issues from you from the get-go, you may not have to do much of anything to keep your system running at reasonable service levels.

Today, the choice is up to you.

I wonder, though, if this spectrum has to be so spread out. For example, as I wrote recently, I see a huge opportunity for application middleware vendors, such as GigaSpaces, BEA and JBOSS, to provide a “portability layer” that would both reduce configuration on prebuilt app server/OS images and allow the application on top of the app server to be portable to just about any instance of that server in the cloud. (There would likely be more configuration required with the middleware option than in the Apex example earlier. For instance, the application server and/or application itself would have to be “pointed” to the database server, as in the sketch below.)
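To give a feel for the kind of pointing I mean, here is a minimal, hypothetical sketch (the environment variable names are my own invention, not any vendor’s convention) of an application reading its database location from the environment at startup, so the same image can be dropped onto any cloud that supplies those values:

# Illustrative only: externalizing the database location so a prebuilt
# app server/OS image stays portable. The environment variable names
# here are hypothetical, not any vendor's convention.
import os

db_host = os.environ.get("APP_DB_HOST", "localhost")
db_port = int(os.environ.get("APP_DB_PORT", "5432"))
db_name = os.environ.get("APP_DB_NAME", "appdb")

print(f"Would connect to {db_name} at {db_host}:{db_port}")
# The actual connection code is omitted; the point is only that the
# "pointing" lives outside the image, in whatever the target cloud provides.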

Google AppEngine should, in theory, be on this list. However, while they open sourced the API and development “simulator”, they have not provided source to the true middleware itself–the so-called Google “magic dust”. Implementing a truly scalable alternative AppEngine platform is an exercise left up to the reader. Has anyone built a true alternative AppEngine-compatible infrastructure yet? I hear rumors of what’s to come, but as far as I know, nothing exists today. So, AppEngine is not yet portable. To be fair, there is no JBOSS “cloud edition” yet, either. GigaSpaces is the only vendor I’ve seen actively pursue this route.

While we are waiting for more flexible options, you are left with a choice to make. Do you need it now at all costs? Do you need it portable at all costs? Or do you need something in between? Where you fall in the PaaS spectrum is entirely up to you.

Cisco’s Nexus 1000v and the Cloud: Is it really a big deal?

September 17, 2008

Yesterday, the big announcements at VMWorld 2008 were about Cloud OSes. Today, the big news seemed to be Maritz’s keynote (where he apparently laid out an amazing vision of what VMWare thinks it can achieve in the coming year), and the long-rumored Cisco virtual switch.

The latter looks to be better than I had hoped for functionally, though perhaps a little more locked in to VMWare than I’d like. There is an explanation for the lock-in, however, so it may not be so bad…see below.

I’ve already explained why I love the Nexus concept so much. Today, Cisco and VMWare jointly announced the Nexus 1000v virtual machine access switch, a fully VI-compatible software switch that…well, I’ll let Cisco’s data sheet explain it:

“The Cisco Nexus™ 1000V virtual machine access switch is an intelligent software switch implementation for VMware ESX environments. Running inside of the VMware ESX hypervisor, the Cisco Nexus 1000V supports Cisco® VN-Link server virtualization technology, providing

  • Policy-based virtual machine (VM) connectivity
  • Mobile VM security and network policy
  • Non-disruptive operational model for your server virtualization and networking teams

When server virtualization is deployed in the data center, virtual servers typically are not managed the same way as physical servers. Server virtualization is treated as a special deployment, leading to longer deployment time with a greater degree of coordination among server, network, storage, and security administrators. But with the Cisco Nexus 1000V you can have a consistent networking feature set and provisioning process all the way from the VM to the access, aggregation, and core switches. Your virtual servers can use the same network configuration, security policy, tools, and operational models as physical servers. Virtualization administrators can leverage predefined network policy that follows the nomadic VM and focus on virtual machine administration. This comprehensive set of capabilities helps you to deploy server virtualization faster and realize its benefits sooner.”

In other words, the 1000v is a completely equal player in a Cisco fabric, and can fully leverage all of the skill sets and policy management available in its other switches. Think “my sys admins can do what they do best, and my network admins can do what they do best”. Furthermore, it supports VN-Link, which allows VMWare systems running on Cisco fabric to VMotion without losing any network or security configuration. Read that last sentence again.

(I wrote some time ago about network administrators facing the most change from this whole pooled-resource thing–this feature seals the deal. Those static network maps they used to hang on the wall, showing exactly what system was connected to what switch port with what IP address, are now almost entirely obsolete.)

I love that feature. I will love it even more if it functions in its entirety in the vCloud concept that VMWare is pitching, and all indications are that it will. So, to tell the story here as simply as possible:

  • You create a group of VMs for a distributed application in VConsole
  • You assign network security and policy via Cisco tools, using the same interface as on the physical switches
  • You configure VMWare to allow VMs for the application to get capacity from an external vendor–one of dozens supporting vCloud
  • When an unexpected peak hits, your VM cluster grabs additional capacity as required in the external cloud, without losing network policy and security configurations.

Cloud computing nirvana.

Now, there are some disappointments, as I hinted above. First, the switch is not stackable, as originally hoped, though the interconnectivity of VN-Link probably overrides that. (Is VN-Link just another way to “stack” switches? Networking is not my strong point.)

Update: In the comments below, Omar Sultan of Cisco notes that the switches are, in fact, “virtually stackable”, meaning they can be distributed across multiple physical systems, creating a single network domain for a cluster of machines. I understand that just enough to be dangerous, so I’ll stop there.

More importantly, I was initially kind of ticked off that Cisco partnered so closely with VMWare without being careful to note that they would be releasing similar technologies with Citrix and Red Hat at a minimum. But, as I thought about it, Citrix hitched its wagon to 3TERA, and 3TERA owns every aspect of the logical infrastructure an application runs on. In AppLogic, you have to use their network representation, load balancers, and so on as a part of your application infrastructure definition, and 3TERA maps those to real resources as it sees fit. For network connections, it relies on a “Logical Connection Manager (LCM)”:

“The logical connection manager implements a key service that abstracts intercomponent communications. It enables AppLogic to define all interactions between components of an application in terms of point-to-point logical connections between virtual appliances. The interactions are controlled and tunneled across physical networks, allowing AppLogic to enforce interaction protocols, detect security breaches and migrate live TCP connections from one IP network to another transparently.”

(from the AppLogic Grid Operating System Technical Overview: System Services)

Thus, there is no concept of a virtual switch, per se, in AppLogic. A quick look at their site shows no other partners in the virtual networking or load balancing space (though Nirvanix is a virtual storage partner), so perhaps Cisco simply hasn’t been given the opportunity or the hooks to participate in the Xen/3TERA Cloud OS.

(If anyone at 3TERA would like to clarify, I would be extremely grateful. If Cisco should be partnering here, I would be happy to add some pressure to them to do so.)

As for Red Hat, I honestly don’t know anything about their VMM, so I can’t guess at why Cisco didn’t do anything there…although my gut tells me that I won’t be waiting long to hear about a partnership between those two.

This switch makes VMWare VMs equal players in the data center network, and that alone is going to disrupt a lot of traditional IT practices. While I was at Cassatt, I remember a colleague predicting that absolutely everything would run in a VM by the end of this decade. That still seems a little aggressive to me, but a lot less so than it did yesterday.

Let the Cloud Computing OS wars begin!

September 15, 2008

Today is a big day in the cloud computing world. VMWorld is turning out to be a core cloud industry conference, where many of the biggest announcements of the year are taking place. Take, for instance, the announcement that VMWare has created the vCloud initiative, an interesting-looking program that aims to build a partner community around cloud computing with VMWare. (Thanks to the increasingly essential cloud news leader, On-Demand Enterprise, for this link and most others in this post.) This is huge, in that it signals a commitment by VMWare to standardize cloud computing on VI3, and to provide an ecosystem for anyone looking to build a public, private or hybrid cloud.

The biggest news, however, is the bevy of press releases signaling that three of the bigger names in virtualization are each delivering a “cloud OS” platform using their technology at the core. Here are the three announcements:

  • VMWare is announcing a comprehensive roadmap for a Virtual Datacenter Operating System (VDC-OS), consisting of technologies to allow enterprise data centers to virtualize and pool storage, network and servers to create a platform “where applications are automatically guaranteed the right quality of service at the lowest TCO by harnessing internal and external computing capacity.”

  • Citrix announces C3, “its strategy for cloud computing”, which appears to be a collection of products aimed at cloud providers and enterprises wishing to build their own clouds. Specific focus is on the virtualization platform, the deployment and management systems, orchestration, and–interestingly enough–wide area network (WAN) optimization. In the end, this looks very “Cloud OS”-like to me.

  • Virtual Iron and vmSight announce a partnership in which they plan to deliver “cloud infrastructure” to managed hosting providers and cloud providers. Included in this vision are Virtual Iron’s virtualization platform, virtualization management tools, and vmSight’s “end user experience assurance solution” technology to allow for “operating system independence, high-availability, resource optimization and power conservation, along with the ability to monitor and manage application performance and end user experience.” Again, sounds vaguely Cloud OS to me.

Three established vendors, three similar approaches to solving some real issues in the cloud, and three attacks on any entrenched interests in this space. All three focus on providing comprehensive management and infrastructure tools, including automated scaling and failover, and consistent execution environments to allow for image portability. The VMWare and Citrix announcements go further, however, in announcing technologies to support “cloudbursting”, in which overflow processing needs in the data center are met by cloud providers on demand. VMWare specifically calls out OVF as the standard that enables this in their release; OVF is not mentioned by Citrix, but they have done significant work in this space as well.

Overall, VMWare has made the most comprehensive announcement, and has a lot of existing products to back up its feature list. However, much of the work needed to tightly integrate these products appears yet to be done. I base this on the fact that they highlight the need for a “comprehensive roadmap”–I could be wrong about this. They have also introduced a virtual distributed switch, which is a key component for migration between and within clouds. Citrix doesn’t mention such a thing, but of course the rumor is that Cisco will quite likely provide it. Whether such a switch will enable migration across networks, as VMWare’s does (er, will?), is yet to be seen, however (see VMWare’s VDC-OS press release). Citrix does, however, have a decent stable of existing applications to support its current vision.

By the way, Sun is working feverishly on their own Cloud OS. No sign of Microsoft, yet…

The long and the short of it is that we have entered into a new era, in which data centers will no longer simply be collections of servers, but will actually be computing units in and of themselves–often made up of similar computing units (e.g. containers) in a sort of fractal arrangement. Virtualization is key to make this happen (though server virtualization itself is not technically absolutely necessary). So are powerful management tools, policy and workflow automation, data and compute load portability, and utility-type monitoring and metering systems.

I worry now about my alma mater, Cassatt, which has chosen to go it largely alone until today. It’s a very mature, very applicable technology that would form the basis of a hell of a cloud OS management platform. Here’s hoping there are some big announcements waiting in the wings, as the war begins to rage around them.

Update: No sooner do I express this concern than Ken posts an excellent analysis of the VMWare announcement with Cassatt in mind. I think he misses the boat on the importance of OVF, but he is right that Cassatt has been doing this a lot longer than VMWare has.

Cloud Computing and the Constitution

September 8, 2008

A few weeks ago, Mark Rasch of SecurityFocus wrote an article for The Register in which he described in detail the deterioration of the legal protections that individuals and enterprises have come to expect from online services that house their data. I’ll let you read the article to get the whole story of Stephen Warshak vs. United States of America, but suffice it to say the case opened Rasch’s eyes (and mine) to a series of laws and court decisions that I believe seriously weaken the case for storing your data in the cloud in the United States:

  • The Stored Communications Act, which was used to allow the FBI to access Warshak’s email communications without a warrant, his consent, or any form of notification.

  • The appeals court decisions in the case that argue:

    1. Even if the Stored Communications Act is unconstitutional, Warshak cannot block introduction of the evidence as “the cops reasonably relied on it”
    2. Regardless of that outcome, the court could not determine if “emails potentially seized by the government without a warrant would be subject to any expectation of privacy”
  • The Supreme Court decision in Smith v. Maryland, in which the court argued that people generally gave up an expectation of privacy with regards to their phone records simply through the act of dialing their phone–which potentially translates to removing privacy expectation on any data sent to and accessible by a third party.

Rasch notes that in cloud computing, because most terms of service and license agreements are written to give the providers some right of access in various circumstances, all data stored at a provider is subject to the same legal treatment.

This is a serious flaw in the constitutional protections against illegal search and seizure, in my opinion, and may be a reason why US data centers will lose out completely on the cloud computing opportunity. Think about it. Why the heck would I commit my sensitive corporate data to the cloud if the government can argue that a) doing so removes my protections against search and seizure, and b) all expectations of privacy are further removed should my terms of service allow anyone other than myself or my organization to access the data? Especially when I can maintain both privileges simply by storing and processing my data on my own premises?

Couple this with the fact that the Patriot Act is keeping many foreign organizations from even considering US-based cloud storage or processing, and you see how it becomes nearly impossible to guarantee to the world market the same security for data outside the firewall as can be guaranteed inside.

It is my belief that this is the number one issue that darkens the otherwise bright future of cloud computing in the United States. Simple technical security of data, communications and facilities is a solvable problem. Portability of data, processing and services across applications, organizations or geographies is also technically solvable. But, if the US government chooses to destroy all sense of constitutional protection of assets in the cloud, there will be no technology that can save US-based clouds for critical security sensitive applications.

It may be too late to do the right thing here; to declare a cloud storage or processing facility the equivalent of a rented office space or an apartment building–leased spaces where all constitutional protections against illegal search and seizure remain in full force. When I was younger and rented an apartment, I had every right to expect that law enforcement wishing to access my personal spaces would be required to obtain a warrant and present it to me as they began their search. The same, in my opinion, should apply to data I store in the cloud. I should rest assured that the data will not be accessed without the same stringent requirements for a search warrant and notification.

Still, there are a few things individuals and companies can do today that appear workable for thwarting attempts to secretly access private data.

  1. Encrypt your data before sending it to your cloud provider, and under no circumstances provide your provider with the keys to that encryption (see the sketch after this list). This means that the worst a provider can be compelled to do is hand over the encrypted files. You may even be able to argue that your expectations of privacy were maintained, as you handed over no accessible information to the provider, simply ones and zeros.

  2. Require that your provider modify their EULA/ToS to disavow ANY right to directly access your data or associated metadata for any reason. The exception might be file lengths, etc., required to run the hardware and management software, but certainly no core content or metadata that might reveal the relevant details about that content. This would also weaken the government’s case that you gave up privacy expectations when you handed your data to that particular cloud provider.

  3. Store your data and do your processing outside of the United States. It kills me to say that, but you may be forced into that corner.
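As a concrete illustration of point 1, here is a minimal sketch of encrypting locally before anything is uploaded. It assumes the third-party cryptography package is installed; the file names and the upload step itself are placeholders, since the point is simply that only ciphertext, and never the key, leaves your premises.

# Minimal sketch of point 1: encrypt locally, keep the key locally, and
# hand the provider nothing but ciphertext. Requires the third-party
# "cryptography" package (pip install cryptography); file names are
# placeholders for whatever your upload process uses.
from cryptography.fernet import Fernet

# Generate the key once and store it on your own systems, never the provider's.
key = Fernet.generate_key()
with open("local_secret.key", "wb") as key_file:
    key_file.write(key)

cipher = Fernet(key)

plaintext = b"sensitive corporate data"
ciphertext = cipher.encrypt(plaintext)

# Only this opaque token is ever handed to the cloud provider
# (the upload call itself is out of scope for this sketch).
with open("upload_me.bin", "wb") as outfile:
    outfile.write(ciphertext)

# Later, back on your premises, the same key recovers the data.
assert cipher.decrypt(ciphertext) == plaintext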

If there are others who have looked at this issue and see other approaches (both political and technical) to solving this (IMHO) crisis, I’d love to hear them. I have to admit I’m a little down on the cloud right now (at least US-based cloud services) because of the legal and constitutional issues that have yet to be worked out in a cloud consumer’s favor.

Oh, and this issue isn’t even close to being on the radar screen of either of the major presidential candidates at this point. I’m beginning to consider what it would take to get it into their faces. Anyone have Lawrence Lessig’s number handy?