Archive for March, 2008

Rumbling from distant clouds

For those with their heads in the sand this last week, life was busy in the IT weather system. A few key announcements, combined with some very credible rumors, set the stage for a thrilling Q2CY08.

In the meantime, I was knee-deep in family (thanks for coming, Dad and Sheila), working my butt off to manage four separate customer pilots, two of which I am working on directly, and generally having to choose between sleep and blogging on a day-to-day basis. (Luckily I’m obsessive enough about blogging that sleep only wins about half the time.)

Thank god blogs like, and others are providing stellar coverage of the cloud computing space.

By the way, I’ve had a very good boost in subscribers in the last week. Thanks to all who have joined the conversation, and please feel free to comment at any time. If you have a blog related to service level automation, utility/cloud computing or distributed systems technologies you’d like me to follow, drop me a line at james dot urquhart at cassatt dot com.

Categories: Uncategorized

MapReduce reaches adolescence

March 29, 2008

I have to admit I find myself growing more impressed with the MapReduce (and related algorithm) community every day. I spent the better part of an hour watching Stu Hood of Rackspace/Mailtrust discuss MapReduce and Mailtrust’s use of it for daily log processing, and compare it to SQL. I’m a MapReduce newbie, so I was happy to find Stu’s overview clear, careful and at a level I could grasp.
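For fellow newbies, here is a minimal sketch of the idea in plain Python, with no Hadoop involved. It is not taken from Stu’s talk: the log format, the sample lines and every function name are assumptions made up for illustration. It counts log lines per delivery status, which is roughly what a SQL `SELECT status, COUNT(*) ... GROUP BY status` would do.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical log format (an assumption): "<date> <status> <recipient-domain>"
LOG_LINES = [
    "2008-03-28 delivered example.com",
    "2008-03-28 bounced example.org",
    "2008-03-28 delivered example.com",
    "2008-03-29 delivered example.org",
]


def map_line(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit one (status, 1) pair per log line."""
    _date, status, _domain = line.split()
    yield status, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, the job a framework does between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in a single process."""
    pairs = (pair for line in lines for pair in map_line(line))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(run_job(LOG_LINES))
```

In a real framework the map and reduce calls would be spread across many machines; the sketch only shows the shape of the computation.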

His overview of Hadoop (an open source implementation of a MapReduce framework) was equally enlightening, and I learned that Hadoop is more than the framework: it includes a distributed file system as well. This is where I think SLAuto starts to become important, as it will be critical not only to monitor which systems in a Hadoop cluster are alive at any time (thus providing access to their storage), but also to correct failures by remounting disks on additional nodes, provisioning new nodes to meet increased data loads, etc. Granted, I know just enough to be dangerous here, but I would bet that I could sell the value of SLAuto in a MapReduce environment.
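To make the monitoring-and-correcting idea a little more concrete, here is a rough sketch of one decision step in such an automation loop. Everything in it is hypothetical: the heartbeat table, the timeout and the `plan_repairs` function are assumptions for illustration, not part of Hadoop or of any SLAuto product. It only decides which nodes look dead and how many replacements to provision. Actually remounting disks or provisioning machines is left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Plan:
    """What the automation layer should do next (hypothetical structure)."""
    dead_nodes: List[str]
    nodes_to_provision: int


def plan_repairs(
    last_heartbeat: Dict[str, float],  # node name -> time of last heartbeat (seconds)
    now: float,                        # current time, same clock as the heartbeats
    timeout_s: float,                  # silence longer than this means "dead" (assumed policy)
    desired_nodes: int,                # cluster size needed to meet the service level
) -> Plan:
    """Compare the observed cluster with the desired one and plan the difference."""
    dead = sorted(
        node for node, seen in last_heartbeat.items() if now - seen > timeout_s
    )
    alive = len(last_heartbeat) - len(dead)
    return Plan(dead_nodes=dead, nodes_to_provision=max(0, desired_nodes - alive))


if __name__ == "__main__":
    heartbeats = {"node-a": 100.0, "node-b": 40.0, "node-c": 95.0}
    print(plan_repairs(heartbeats, now=105.0, timeout_s=30.0, desired_nodes=4))
```

Keeping the decision separate from the action makes the policy easy to test without touching real machines.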

Another interesting overview of the MapReduce space comes from Greg Linden. (Damn, now I’ve mentioned Greg twice in a row…my groupie tendencies are really showing these days! ;-) Greg points us to notes taken at the Hadoop Summit by James Hamilton, an architect on the Windows Live Platform Services team. I haven’t read through them all yet, but I like the breakdown of many of the big projects getting a lot of coverage among techies these days: Yahoo’s PIG and HBase, as well as Microsoft’s DRYAD. Missing is CouchDB, but I plan to watch Jan Lehnardt’s talks [1][2] on that as soon as I get a moment.

Again, the reason MapReduce is being covered in a blog about Service Level Automation and utility computing is that as soon as I see “tens of thousands of nodes”, I also see “no way human beings can meet the SLAs without automation”. At least not without significant costs compared to automating. System provisioning, monitoring, autonomic scaling and fail-resistance are not built into Hadoop; they are simply easy to support. Something else is needed to provide SLAuto support at the infrastructure layers.

Greg Linden on the Cloud

Greg Linden, of Geeking with Greg fame, was interviewed on Mix about his work in search personalization, recommendation engines and cloud computing. Most of the interview is only sort of interesting, but what really perked my ears up was Greg’s observation that anyone scaling a software environment to thousands or tens of thousands of servers will likely continue to run their own data centers, if only because they will want to tweak the hardware to meet their specific needs.

Initially, I thought of this as just another example of a class of data center that will not be quickly (if ever) moved to a third party capacity vendor. Based on examples like Kevin Burton’s fine tuning of Spinn3r’s infrastructure using Solid State Drives (SSD) instead of RAID and traditional disks, it even seems like there would be many such applications. Ta da! It is proven that there will always be private data centers!

Yet, the more I think about it, I wonder if I wouldn’t pay Google’s staff to run my Map/Reduce infrastructure, even if it used tens of thousands of servers. I mean, where is the economic boundary between when it is cheaper to purchase your computing from clouds that already have your needed expertise versus hiring staff with specialized skills to meet those same needs?

Alternatively, is this kind of thing a business opportunity for a “boutique” cloud vendor? “Come to Bob’s MapReduce Heaven. We’ll keep your Hadoop systems running for $99.95, or my name isn’t Bob Smith!”

I’ll just leave it at that. I’m tired tonight, and coherence has left the building.

An amazing resource for scalable systems architectures

I don’t know why I hadn’t heard of these guys before, but I’m in love with the content at In post after post, feature after feature, there is more to learn here about everything from architecting software to optimize Amazon Web Services costs, to possibly the greatest collection of articles on real-life scalable architectures ever assembled. I have a feeling I will lose a few hours of sleep in the next few nights trying to read everything I can here.

I noted the inevitability of architecting specifically for utility (or cloud) computing some months ago.

Eric Schmidt: Please believe me…

ZDNet Asia covered comments from Eric Schmidt of Google regarding the trust issues that enterprises must address before adopting cloud computing. He made these comments during a recent visit to Sydney, Australia. I find the comments interesting, because they signal for me the first public acknowledgment of the challenges that Google faces in selling the enterprise on the cloud vs. in-house applications.

Of course, he couches it in terms of how to choose Google Apps over Microsoft Office, but the heart of the issue–trust–applies to just about any choice between traditional “I own it all” IT, and “renting” from the cloud–including compute capacity. (By the way, is anyone still claiming that Google Apps does not compete with Microsoft Office?)

As Eric notes for the Apps/Office debate:

“At some point in your firm, someone is going to say: ‘Well maybe there is an alternative in the enterprise’, and they’re going to do an evaluation. And they’re going to say the cloud computing model has its strengths and weaknesses.”

This seems consistent for all cloud computing choices: in each case, the IT organization (or even the business) will need to evaluate the costs/benefits of moving data and functionality to the cloud versus maintaining traditional desktop/server systems. Up to now, I agree with Eric, but then he goes on to say:

“What assurances [do you have] that the information you have in your computer is safe–that it is properly stored and so forth? So it’s important to understand that you really are making trade offs of one versus the other.”

Assuming I am understanding this right, Eric seems to be saying, “Hey, your data isn’t really all that secure on your PC, so why don’t you just trust us that we will do better?” Ah, there is the rub.

I believe most enterprises would answer,

“Well, if data is misappropriated on my in-house systems, I can hunt down and fire those responsible, and the original copy of the data is still in my control. If Google (or someone who compromises Google) misappropriates my data in the cloud, I can go after the guilty parties, but if I no longer trust Google, I now have a legal battle on my hands to get my data back and get Google to completely delete it from their systems.”

This partially gets to data portability, which some are trying to address, but it is not a solved problem yet. However, even with portability, it’s the “completely delete it from their systems” part that I may never trust without clear and explicit legal consequences and vendor auditing. Until I have full control over where my data resides (at least in terms of vendors), when and where I can move it, and how it gets removed from storage that I no longer wish to utilize, I am putting a lot at risk by moving data outside of my firewalls.

At its heart, I think Eric’s statement gets at the core of what Google has ahead of them in terms of delivering Apps to large, established enterprises. I don’t doubt that Google will both develop and acquire technology that overcomes many of the security concerns that large enterprises have, but I continue to believe that we will see a major legal case in the next 5 years where a large corporation has to fight in court to get their data from a SaaS/cloud computing provider.

If it were me, I’d look to get cloud-like economics from my existing infrastructure. This is done by utilizing software architectures that are multitenant-capable (SOA is a good place to start), and by implementing utility computing type infrastructure in your own data center. No matter how nicely Eric asks, be careful of what you are getting into if you put your sensitive data in the cloud.

The Social Enterprise Opportunity

March 19, 2008

I want to begin today with a quick shout-out to my fellow bloggers at Data Center Knowledge. In a recent post, they identified me as one of the bloggers they follow for cloud and utility computing, and I’m honored to be included among such a strong list of bloggers. (Rich Miller, who posted the list, is no slouch himself.) Update: I violated the cardinal rule of Internet social networking: assuming a given name applies to one person. Rich Miller from Data Center Knowledge is not the same Rich Miller that writes Telematique. My apologies to both.

One of those bloggers is Phil Wainwright, whose Software as Services blog is one of my regular reads. He is the most aggressive, forward thinker in the SaaS space, and he very often sees opportunities that most of us miss. (Phil’s blog is also a great way to stay on top of the companies and technologies that specifically support the SaaS market.)

Phil recently wrote an interesting post about SaaS and Web 2.0 concepts, titled “Enter the socialprise”, in which he points out that the very nature of an “enterprise” is changing thanks to the Internet and cloud computing concepts. He notes that loyalty between individuals is replacing corporate loyalty, and that social networking on the Internet is creating a new work economy for individual knowledge workers.

He then goes on to challenge enterprise computing models:

But enterprise computing is still designed for the old, stovepipe model in which every transaction took place within the same firm. There’s no connection with the social automation that’s happening between individuals. Many enterprises even resist talking about social networking. And even when an application vendor adds some kind of social networking features, there’s always the suspicion that they’re just painting social lipstick on a stovepipe pig.

This yawning chasm is an opportunity for a new class of applications to emerge that can harness the social networks between individuals and make them relevant to the enterprise. Or perhaps reinvent a new kind of enterprise, better suited to the low-friction reality of the connected Web. Enter the socialprise.

The example he gives of a company leveraging this is InsideView, which is creating a very cool sales intelligence application that integrates with major SaaS CRM vendor products to aggregate information from a variety of online sources into a single prospect activity dashboard. This is an incredibly cool example of how rich data about individuals within and across firms can be used at an enterprise level.

Another similar product that struck me was JobScience, which is one of the companies whose blog is in the Data Center Knowledge list referenced above. JobScience is using to create a rich social intelligence engine for customers. Their product, aptly called Genius, is an excellent example of what they are able to do. Read the post for all the features, but my favorite is:

The Genius Tracker. Not only does the tracker pop up to tell me an email recipient has just opened my email, or is visiting my web site, but the more important intelligence this gives me is that this prospect is online and engaged with our solution. If a sales rep can call 40 people in a day, and a blast to 5000 prospects shows me that 40 of those prospects are online and engaged, it doesn’t take a genius to figure out who to call. That rep’s going to have a much more productive day calling people who they know are in the office. Less voicemails, less brushoffs, less calls to people who don’t work there anymore.

Bordering on privacy issues, I know, but an amazing level of detail, and invaluable if used wisely. More importantly, it goes to show what is possible in a stable, shared application environment.

By the way, this direct integration with a given CRM platform by a “value added extender” is an interesting twist to the dependency issues that Bob Warfield writes about on the SmoothSpan blog. JobScience’s products are services that become a feature of the destination both visually as well as functionally. Bob’s point about being a component provider to the actual product is well taken, and I wonder if the only exit strategy for these guys is acquisition by Salesforce. What else can they hope for as a company dependent on Talk about cloud lock-in.

How to find me these days…

March 17, 2008

I must apologize for my continued absence (8 days now) in the blogosphere, but I have a “perfect storm” of time-demanding things in my life right now. Blogging is taking the hit, unfortunately. I should be back to regular posting in the next week or two.

In the meantime, you can follow what I am reading/doing at If you are on FriendFeed already, just subscribe to me and I’ll return the favor.

Also, if you haven’t watched Simon’s pre- and post-conference talks from the Enterprise 2.0 Summit at CeBIT, they are a must-watch. Simon is really tightening up this talk, and I hope he gets a chance to present it soon in sunny CA.

Finally, I am using my page now to highlight key articles and posts from my Google Alerts emails (most importantly, alerts on “utility computing” and “cloud computing”). This will also show up on FriendFeed, however.

Lots going on. I’m itching to comment on Ray Ozzie at MIX, James Governor on what makes cloud computing cloud computing, and what I hope to achieve at GreenDevCamp this year.
