Archive for May, 2008

It just keeps getting cloudier and cloudier

Looking for inspiration, I checked out my latest Google Alerts for “cloud computing” and found an interesting–perhaps even disturbing–trend: people are locking in their definitions of cloud computing. The problem is these definitions are largely inconsistent.

First, allow me to make a confession. In my own storied attempt to define cloud computing, I certainly sounded definitive in my definition. For example, I stated:

Cloud computing describes a systems architecture. Period. This particular architecture assumes nothing about the physical location, internal composition or ownership of its component parts. It represents the entire computing stack from software to hardware, though system boundaries (e.g. where does one system stop and another begin) may be difficult to define. Components are simply integrated or consumed as need requires and economics allow.

For what it’s worth, I have found myself shifting a little; not so much on the definition, but on what exactly it defines. Given the broad consensus that cloud computing refers to a service model, I am willing to concede that the description above really describes a “Cloud Oriented Architecture” for a complex integrated environment. The true definition of cloud computing is still evolving in my mind.

Now, back to the posts at hand. What I believe I am seeing these days is a split between two camps: the “cloud computing is only about services” camp, and the “cloud computing is getting whatever you need from the Internet” camp.

An example of the former comes from Randy Bias at NeoTactics:

“There seems to be a group myopia around so-called ‘cloud computing’ and it’s definitions. What we’re really talking about are ‘cloud services’ of which, ‘computing’, is only a subset. It gets worse when you have people talking about Software as a Service (SaaS) as a ‘cloud’ service. Things continue to become murkier when the SaaS crowd, bloggers, and reporters start making up new definitions for cloud services using SaaS-like terms such as Platform as a Service (PaaS) and Infrastructure as a Service (IaaS).”

Scott Wilson of The CIO Weblog adds the following:

“When I think of a service as cloud computing, it is characterized by being an offering of nearly unlimited capacity (although it may be billed differently at different utilizations) which has some sort of generic utility but beyond certain minimal architectural requirements there should be no inherent specificity in what it may or should do. It may be a service of a certain type of utility, perhaps storage, raw processing capability, or data storage, but in the same way that a datacenter does not restrict what servers you may host with them, it should not restrict what sort of data you store, process, or serve.”

[Some definition links removed]

Sort of a “cloud services have a cloudy definition” kind of definition.

One of the best examples of the latter comes from ProductionScale’s Joseph Kent Langley:

“Cloud Computing (Figure 1.0) is a commercial extension of computing resources like computation cycles and storage offered as a metered service similar to a physical public utility like electricity, water, natural gas, or telephone network. It enables a computing system to acquire or release computing resources on demand in a manner such that the loss of any one component of the system will not cause total system failure. Cloud computing also allows the deployment of software applications into an environment running the necessary technology stack for the purposes of development, staging, or production of a software application. It does all this in a way that minimizes the necessary interaction with the underlying layers of the technology stack. In this way cloud computing obfuscates much of the complexity that underlies Software as a Service (SaaS) or batch computing software applications. To explain better though, let’s simplify that and break it down this definition to it’s constituent parts.”

Langley’s definition is more closely aligned with utility computing, but may be best summarized as an “if you can run it on the Internet, it’s a cloud” definition.

Of course, there is also James Governor’s famous list of requirements.

All of which leads to a gap in terminology that gets filled by whatever reaches the vacuum at the moment: what do you call a “cloud-like” infrastructure in a private data center? As I noted to the Google Groups Cloud Computing alias:

“[H]ere (is) how I arrived at that conclusion:

  • If “grid computing” is about running job-based tasks in a MPP model (e.g. HPC) (as it seems to be defined for many), and
  • If “utility computing” is a business model for providing computing on an as-needed, bill-for-what-you-use basis, and
  • If “cloud computing” is a market model describing services provided over the Internet (which it is for most of the Web 2.0 world), and
  • If “virtualization” describes providing software layers in the execution stack to decouple software from the hard resources it depends on (and it is important to note for the purposes of this argument that “resource-pooled” does NOT require virtualization in this sense; it is quite possible to run your software on bare metal server pools, as we did at Cassatt)
  • Then, what do we call the systems/infrastructure model where resources are pooled together, and used for a variety of workloads, including both job-based and “always running” tasks (such as web applications, management and monitoring applications, security applications, etc.)?

Do we redefine “grid” to cover the expanded role of resource-pooled computing (as 3TERA seems wont to do)? Do we leverage “utility computing” as an adjective for platforms that can deliver that business model for those that own infrastructure (as Cassatt and IBM tend to do)? Does the term “virtualization” represent a broader view than how VMWare, Microsoft and Citrix are defining it? Is there another term (such as “resource-pooled computing”–ugh) that would better serve the discussion?”

I’m still hunting for the answer to that one.

However, in terms of my definition of cloud computing, I have to say I lean towards the “anything you can run on the Internet” camp, as it–to me–best represents what an actual drawing of a cloud means in a system diagram. Just “go to the cloud” and get what you need, whether it’s a complete CRM system or a simple purchasing service. This eliminates a million potential grey areas at the boundaries of the “only about services” definition. Is PayPal a cloud service? Why or why not?

I’d love to hear from those of you that are beginning to see some consensus in online communities about what constitutes a cloud or cloud service and what doesn’t. In the meantime, I am settling down for another long summer of fog (this is the Bay Area, after all), though I’ll have plenty of company, I’m sure.


Off Topic: Scratch that!

Turns out that Luis has set up a group blog that I feel is much better targeted at the Alfresco audience in general, so I am going to kill MiningAlfresco before it really gets started. Sorry for the false alarm, but check out The Gang: Thoughts from the Alfresco Field Technical Team if you remain interested in the topic.

Categories: blogs, personal

Off Topic: Introducing "Mining Alfresco"

I didn’t want to sully this blog by introducing a whole bunch of ECM/Alfresco stuff here, so I created a second blog for that content. Mining Alfresco will cover my experiences in learning the ECM market, Alfresco (look for a lot of technical postings), and how all of that relates to the topic of this blog, Cloud Computing. If you have an interest in ECM or “Content in the Cloud”, you may want to check it out and subscribe.

I also want to apologize for the “dead time” in my posting, but as you can imagine spinning up a new job takes a lot of focus. I’ll try to fit in several new posts in the coming days.

Categories: blogs, PaaS, personal, SaaS

Cassatt Announces Active Response 5.1 with Demand-Based Policies

Ken Oestreich blogged recently about the very cool, probably landmark release of Cassatt that just became available, Cassatt Active Response 5.1. He very eloquently runs down the biggest feature–demand-based policies–so I won’t repeat all of that here. What I thought I would do instead is relate my personal thoughts on monitoring-based policies and why they are the key disruptive technology for data centers today.

To be sure, everyone is talking about server virtualization in the data center market today, and that’s fine. Its core short-term benefits–physical system consolidation and increased utilization–are key for cost-constrained IT departments, and features such as live migration and automatic backup are creating new opportunities that should be carefully considered. However, virtualization alone is limited in its applications, and does little to actually optimize a data center over time. (This is why VMware is emphasizing management over just virtualizing servers these days.)

The technology that will make the long term difference is resource optimization: applying automation technologies to tuning how and when physical and virtual infrastructure is used to solve specific business needs. It is the automation software that will really change the “deploy and babysit” culture of most data centers and labs today. The new description will be more like “deploy and ignore”.

To really optimize resource usage in real time, the automation software must use a combination of monitoring (aka “measure”), a policy engine or other logic system (aka “analyze”) and interfaces to the control systems of the equipment and software it is managing (aka “respond”). It turns out that the “respond” part of the equation is actually pretty straightforward–lots of work, but straightforward. Just write “driver”-like components that know how to talk to various data center equipment (e.g. Windows, DRAC, Cisco NX-OS, NetApp Data ONTAP, etc.), as well as handle error conditions by directly responding or forwarding the information to the policy engine.
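The measure/analyze/respond loop described above can be sketched in a few lines. This is purely illustrative–the function names, metrics and policy here are my own inventions, not Cassatt’s actual API:

```python
# Minimal sketch of a measure/analyze/respond automation loop.
# All names and metrics are hypothetical, not Cassatt's product API.

def measure(server):
    """Collect raw metrics from the managed server (the 'measure' step)."""
    return {"cpu_pct": server["cpu_pct"], "active_users": server["active_users"]}

def analyze(metrics, policy):
    """Evaluate the user-defined policy against the metrics (the 'analyze' step)."""
    return policy(metrics)

def respond(server, action):
    """Invoke the appropriate 'driver' for the equipment (the 'respond' step)."""
    server["state"] = action
    return action

def control_loop(server, policy):
    """One pass of the loop: measure, analyze, respond."""
    return respond(server, analyze(measure(server), policy))

# Example policy: one possible definition of "idle" for a given box.
idle_policy = lambda m: ("power_off"
                         if m["cpu_pct"] < 5 and m["active_users"] == 0
                         else "keep_running")

server = {"cpu_pct": 2, "active_users": 0, "state": "running"}
print(control_loop(server, idle_policy))  # -> power_off
```

The point of the sketch is the separation of concerns: the “respond” drivers stay generic, while all the customer-specific intelligence lives in the policy passed to “analyze”.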

The other two, however, require more direct configuration by the end user. Measure and analyze, in fact, are where the entire set of Service Level Automation (SLAuto) parameters is defined and executed. So this is where the key interface between the SLAuto system and the end user has to happen.

What Cassatt has announced is a new user interface for defining demand-based policies as the end user sees fit. For example, what defines an idle server? Some systems use very little CPU while they wait for something to happen (at which point they get much busier), so simply measuring CPU isn’t good enough in those cases. Ditto for memory in systems that are compute intensive but handle very little state.

What Cassatt did that is so brilliant (and so unique) is allow the end user to leverage the full range of SNMP attributes for their OS, as well as JMX and even scripts running on the monitored system, to create expressions that define an idle metric that is right for that system. For example, on a test system you may say that a box is idle when the master test controller software indicates that no test is being run on it. On another system, you may say it’s idle when no user accounts are currently active. It’s up to you to define when to attempt to shut down a box, or reduce capacity for a scale-out application.

Even when such an “idle” system is identified, Cassatt gives you the ability to go further and write some “spot checks” to make sure the system is actually OK to shut down. For example, in the aforementioned test system, Cassatt may determine that it’s worth trying to power down a box, but a spot check could find that a given process is still running, or that an administrator account is actively logged in, indicating to Cassatt that it should ignore that system for now.
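To make the two stages concrete, here is a rough sketch of how a user-defined idle expression and a spot-check veto might combine. The metric names and logic are my own hypothetical examples, not Cassatt’s configuration model; in the real product the inputs would come from SNMP, JMX or scripts:

```python
# Illustrative sketch only: metric names and decision logic are
# hypothetical, not Cassatt Active Response's actual configuration.

def is_idle(metrics):
    """User-defined idle expression built from SNMP/JMX/script-sourced
    values. Here: no test running AND no active user sessions."""
    return metrics["tests_running"] == 0 and metrics["active_sessions"] == 0

def spot_check_ok(metrics):
    """Secondary checks run only after a box is flagged idle: veto the
    shutdown if a critical process survives or an admin is logged in."""
    return not metrics["critical_process_up"] and not metrics["admin_logged_in"]

def shutdown_decision(metrics):
    """Power down only when the idle expression AND the spot checks agree."""
    if is_idle(metrics) and spot_check_ok(metrics):
        return "power_down"
    return "leave_alone"

metrics = {"tests_running": 0, "active_sessions": 0,
           "critical_process_up": False, "admin_logged_in": True}
print(shutdown_decision(metrics))  # admin still logged in -> leave_alone
```

Note the asymmetry: the idle expression nominates candidates for shutdown, while the spot checks can only veto, never initiate–which is what makes the policy safe to automate.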

I know of no one else that has this level of GUI-configurable monitor/analyze/respond sophistication today. If anyone wants to challenge that, feel free. Now that I no longer work at Cassatt, I’d be happy to learn about (and write about) alternatives in the marketplace. Just remember that it has to be easy to configure and execute these policies; scripting the policies themselves is not good enough.

It is clear from the rush to release resource optimization products for the cloud, such as RightScale, Scalr, and others, that this will be a key feature for distributed systems moving forward. In my opinion, Cassatt has launched itself into the lead spot for on-premises enterprise utility computing. I can’t wait to see who responds with the next great advancement.

Disclaimer: I am a Cassatt shareholder (or soon will be).

A Funny Thing Happened On The Way to the Apple Store…

Part of the fun of joining my new employer is their open policy for selecting the laptop of your choice. Of course, being a lover of technologies that enable one to be technically lazy, I chose a MacBook Pro. It should arrive in a few days.

However, I was beginning to feel like I needed another beefed-up system of my own at home to act as a multi-guest virtual “server farm” for various experiments: scale-out benchmarking, interesting integration issues, etc. My initial thought was an 8-core Mac Pro loaded with memory and disk, which would have set me back about $6500. So I asked Luis what he thought, and he said, “Don’t bother. Whenever I need a bunch of servers to test with, I generally find [Amazon] EC2 works perfectly fine.”

You could have heard the head slap a mile away.

With all of my focus being on enterprise computing the last two years, I had totally lost sight of the “individual” applications of a cloud like EC2. I no longer have to think about building up a server farm of my own, purchasing a big honkin’ dual quad-core tower, or even reserving space on the corporate “cluster library”. I just need my credit card, my Amazon account, and a little time with the “Getting Started” tutorial, and I have all the server resources I need at a fraction of the price of buying the big box, with billing that allows me to easily expense work-related computing. Damn, I love the modern world!
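A quick back-of-the-envelope calculation shows just how lopsided the economics are. I am assuming EC2’s 2008 list price of $0.80/hour for an extra-large instance here–treat the figures as an illustration, not a quote:

```python
# Back-of-the-envelope comparison; the EC2 rate is an assumption
# based on the 2008 price list, not an authoritative quote.
mac_pro_cost = 6500.00   # loaded 8-core Mac Pro, per the figure above
ec2_xl_per_hour = 0.80   # assumed extra-large instance hourly rate

hours = mac_pro_cost / ec2_xl_per_hour
print(f"{hours:.0f} hours")                  # 8125 hours
print(f"{hours / 24:.0f} days of 24x7 use")  # ~339 days
```

In other words, the Mac Pro’s purchase price alone buys nearly a year of round-the-clock rented capacity–before you even count the tower’s power, cooling and obsolescence, or the fact that my experiments run hours at a time, not 24x7.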

Now, all of this probably seems obvious to all of you out there, and it probably cracks you up to see a cloud computing blogger miss this opportunity to “reach for the clouds”, so to speak. However, I think this is indicative of the change that both individuals and enterprises must go through to take advantage of this new breed of technologies.

I, like many Fortune 500 IT departments, am an old-school client-server/SOA guy. I have a “use the right tool for the job” mentality, driven by years of pain trying to force procedural pegs into SOA holes. That mentality leads to a “best of breed” bias that makes one worry about the ground-up implementation of any software solution. If a tool was found that reliably hid some of that implementation, that was awesome and incredibly helpful to productivity. However, one still needed to understand how the server worked with the OS, the OS with the middleware, and the middleware with the application implementation, to be comfortable going to production.

To me, Amazon, Mosso, Cassatt and others are indicative of a major change in this mentality. With reliable shared configurations of systems (or a reliable systematic infrastructure for matching compute tasks to disparate resources that can handle those tasks), application developers now need to know less and less about the server, networking and storage part of the equation. Now, with the focus from the OS on up the stack, developers can start shopping for the infrastructure that makes economic sense for the problem they are trying to solve. The trick, of course, is to remember there are alternatives to buying your own servers.

So, this week I started to play with Amazon EC2, S3 and Cloud Services’ new instance management tool, Cloud Studio. Let me just say, I am incredibly impressed with what I’ve done so far, which is little more than creating, starting and terminating instances (with a little between-machine networking thrown in for fun). Even using Amazon’s command line tools, it is a pretty straightforward process to get either a 32-bit or 64-bit server, but when you add the visual cues of Cloud Studio, it just becomes so simple it boggles the mind.

Now, there are definitely disadvantages to using Amazon for some problems. Windows support is out, for instance. (Anyone have a good suggestion for a true on-demand pricing option for Windows? Mosso would work, I hear, but they have a fixed upfront price that is a little steep for my general needs.) Also, any work that involves large amounts of data transfer ups the ante greatly. (Kevin Burton talked about this some time ago–see his note about bandwidth pricing just below the last quote, about halfway down.) However, I will never again forget to consider the cloud before “own your own” for any computing task I have in my personal world.

Hmmm. I wonder if I can get my wife to use Zoho now…

Blog Title Change: Leveraging the Wisdom of Clouds

As I discussed in my last post, the change of jobs gives me the opportunity to broaden the coverage of this blog somewhat beyond the basic topic of delivering SLAuto to enterprise data centers. To more completely reflect this, and (quite frankly) to increase visibility to those searching for information about cloud computing and utility computing, I have changed the title and description of this blog.

Now titled “The Wisdom of Clouds” (with absolute apologies to James Surowiecki and his great book, The Wisdom of Crowds) this blog will discuss cloud computing, utility computing, SaaS, PaaS and HaaS as they relate to both the enterprise and individual users. This really isn’t much of a departure from the topics covered in the last year or so–in fact, I considered sub-titling the blog “Covering your *aaSes since 2006”–but the explicit description allows more people to more readily discover my ramblings.

For those who have been following this blog for some time, as well as those who have just discovered it, I thank you. I hope you will join me in creating and shaping “the wisdom of clouds”.

Sometimes change must happen…

It is with mixed feelings that I announce that I am leaving Cassatt, effective COB tomorrow. I want to state first and foremost that this change was for personal and family reasons, and NOT because of issues with either the company or technology at Cassatt. I had a phenomenal two years with the company, and will remain in touch with much of the organization in the coming years. Cassatt still has the most technology-independent solution for data center optimization and on-premises utility computing infrastructure. I still firmly believe in the vision and opportunity that is Cassatt.

That being said, the commute was killing me, and with Owen in preschool, Emery home with a nanny who rightfully deserves reasonable working hours each day, and Mia in clinicals for her sonography program, something had to give. Not being in a hurry to make a change, I dabbled in conversation with a few traditional enterprise sales companies, but none of them blew me away. Then, unexpectedly, Matt Asay contacted me via LinkedIn (the world’s BEST professional network), and asked me if I would be interested in his current open source endeavor, Alfresco.

Now, Enterprise Content Management (ECM) has never been my gig, but I thought Matt’s pitch was interesting, so I took a call with him. As he covered the company, the technology and the opportunity, I found myself getting more and more excited. After then talking to the VP of Alliances, Martin Musierowicz, and the Senior Director of Solutions Engineering, Luis Sala, and downloading and playing with the technology, I felt it was an opportunity that met both my immediate and long-term needs. It was almost a no-brainer to accept the offer when it arrived.

The position, a solutions engineering role working with their Alliances group, lets me work from home, so the commute couldn’t be better (though there are clients all around the Bay Area, and there will be occasional travel both domestically and internationally). It is also with a growing open source company, which I am extremely excited about. Oh…and I get a Mac!

In that first conversation with Matt, I asked him a key question: “Matt, you’ve seen my blog; I remain keenly interested in cloud/utility computing. How does that fit in with Alfresco’s needs, and would I be able to continue exploring the subject as part of my job?” In response, Matt said flat out that Alfresco very much encourages the building of personal brands, and that cloud computing is indeed a subject of interest to Alfresco. So, I will continue this blog, though I will probably rename it later this week. (Mostly for better search visibility…)

As for ECM, I intend to build some real expertise in the space very quickly, so I will create a separate blog to cover that space and my experience with Alfresco. There will be some cross-linking between the blogs as I discover how the two technologies can help each other, but I will endeavor to keep each on topic as much as possible.

(One interesting side effect of creating an ECM blog is that I risk landing in the crosshairs of James McGovern. How fun would that be?)

I will miss my friends and colleagues at Cassatt, but I am very much looking forward to this new future. As always, I can be contacted on LinkedIn, FriendFeed or at jurquhart (no spammers) at yahoo dot com.

Categories: Uncategorized