What does it mean? Microsoft upping efforts on Infrastructure-as-a-service

Wired is reporting a rumor that Microsoft will launch a new Infrastructure-as-a-Service offering in June, to compete with Amazon EC2.

What Does it Mean?

I have no idea whether the “rumor” is true, or even what it really means. My speculation: the bottom line is that we’ll be able to upload arbitrary VHDs to Azure. Right now Microsoft allows people to upload VHDs that run Windows Server 2008. With this change they may support “anything”. Because a VHD is a virtual hard drive, and its creator has full control over what goes into it, an Azure customer would be able to provision VMs in the Microsoft cloud that run any OS, including Linux. This would also represent a departure from the stateless model that Windows Azure currently supports for the VM role: VHDs running in the Windows Azure cloud would be able to save local state across stop/restart.

Should we be Surprised?

Is this revolutionary?  Windows Azure already offers compute nodes; it’s beta today but it’s there, and billable.  So there is some degree of Infrastructure-as-a-service capability today.

For my purposes “infrastructure as a service”  implies raw compute and storage, which is something like Amazon’s EC2 and S3. A “platform as a service” walks up the stack a little, and offers some additional facilities for use in applications. This might include application management and monitoring, enhancements to the storage model or service, messaging, access control, and so on. All of those are general-purpose things, usable in a large variety of applications, and we’d say they are “higher level” than storage and compute. In fact those services are built upon the compute+storage baseline.

For generations in the software business, Microsoft has been a major provider of platforms. With the launch of Windows 3.0 in 1990, Windows itself was arguably the first broadly adopted “application platform”. Since the early 90’s, specialization and evolution have resulted in a proliferation of platforms in the industry – we have client platforms, server platforms (expanding to include the hypervisor), web platforms (IIS+ASP.NET, Apache+PHP), data platforms, mobile platforms and so on. And beyond app platforms, since Dynamics, Microsoft has also been in the business of offering applications, and it’s here we see the fractal nature of the space. The applications can act as platforms for a particular set of extensions. In any case, it’s clear that Microsoft has offerings in all those spaces, and more.

Beneath the applications like Dynamics, and beneath the traditional application platforms like Windows + SQL Server + IIS + .NET, Microsoft has continued to deliver the foundational infrastructure, specifically to enable other alternative platforms. Oracle RDBMS and Tomcat running on Windows is a great example of what I mean here. Sure, Microsoft would like to entice customers to adopt the entirety of their higher-level platforms, but the company is willing to make money by supplying lower-level infrastructure for alternative platforms.

Considering that history, the rumor that Microsoft is “upping efforts on infrastructure as a service” should not be surprising.  Microsoft has long provided offerings at many levels of “the stack”.  I expect that customers have clearly told Microsoft they want to run VHDs, just like on EC2, and Microsoft is responding to that.  Not everyone will want this; most people who want this will also want higher-level services.  I still believe strongly in the value of higher-level cloud-based platforms.

Platform differentiation in the Age of Clouds

It used to be that differentiation in server platforms was dominated by the hardware. There were real, though fluctuating and short-lived, performance differences between Sun’s Sparc, HP’s PA-RISC, IBM’s RIOS and Intel’s x86. But for the moment, the industry has found an answer to the hardware question; servers use x64.

With standard high-volume servers, the next dominant factor for differentiation was the application programming model. We had a parade of players like CORBA, COM, Java, EJB, J2EE, .NET. More recently we have PHP, node.js, Ruby, and Python. The competition in that space has not settled on a single, decisive winner, and in my judgment, that is not likely to happen. Multiple viable options will remain, and the options that enjoy relatively more success do share some common attributes: ease of programming (e.g., building an ASP.NET service or a PHP page) is favored over raw performance (building an ISAPI extension or an Apache module in C/C++). Also, flexibility of the model (JSP/Tomcat/RESTlets) is favored over more heavily prescriptive metaphors (J2EE). I expect the many options in the server platform space to continue; the low cost of developing and extending these platform alternatives means there is no natural economic pressure toward convergence, as there was in server hardware, where R&D costs are relatively high.

Every option in the space will offer its own unique combination of strengths, and enterprises will choose among them. One company might prefer strong support for running REST services, while another might prefer the application metaphor  of Ruby on Rails.  Competition will continue.

But programmer-oriented features will not be the key differentiator in the world of cloud-hosted platforms. Instead, I expect operational and deployment issues to dominate:

  • How reliable is the service?
  • How difficult is it to provision a new batch of servers?
  • How flexible is the hosting model? Sometimes I want raw VMs, sometimes I want higher-level abstractions. I might want to manage a “farm” of servers at a time, or even better, I might want to manage my application without regard for how many VMs back it.
  • How extensive are the complementary services – access control, messaging, data, analysis, and so on?
  • What kind of operational data do I get out of that farm of servers? I want to see usage statistics and patterns of user activity.

It won’t be ease of development that wins the day.

Amazon has been very disruptive with its AWS, and Microsoft is warming to the competition. This is all good news for the industry. It means more choices, better options, and lower costs, all of which promotes innovation.

What drives the demand for continuous change?

Lately, it seems, no system is ever “finished”.  You are only running “this week’s build”.  And this is how we want it!  What drives the demand for continuous evolution of information systems?

In my opinion, it’s the possibilities: the possibility of interconnections among disparate systems, stakeholders, and devices. This model of extreme interconnectivity is enabled by standard protocols and data formats, and it is the single most striking change in IT from four years ago. There was a time when you needed to buy your CPU and your hard disk drive from the same manufacturer, or they wouldn’t work together. And can you believe we actually had vendor-specific networking technology? Does anyone remember DECnet and IBM’s Token Ring?!

Just ten years ago, Scott McNealy, then CEO of Sun Microsystems, was criticizing .NET as “Not yet” or “Dot Not”. His line was that .NET was a “lock in” strategy. Lock in!  Remember that?  Java was proposed as the way to avoid “vendor lock in”.  Does anyone really think about vendor lock-in any more?

We have come a long, long way. Rather than worrying about evading vendor leverage, CIOs are interested in proactively solving business problems, and they realize that means interconnecting disparate systems. It means buying what they can buy, and building the rest, and forging as many connections as the business needs.  It means relying on JSON, XML and REST – messaging, rather than elegant distributed object models like CORBA or Java everywhere, as the preferred way to connect systems.
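To make the messaging style concrete, here is a minimal sketch of two systems exchanging a JSON document. The order payload and field names are purely illustrative – not from any real API – but the point stands: the receiving side needs nothing more than a JSON parser, with no shared object model or vendor-specific stack.

```python
import json

# A hypothetical order message, as one system might send it to another.
# The field names here are illustrative, not from any real API.
order_message = json.dumps({
    "orderId": "A-1001",
    "customer": {"name": "Contoso Ltd", "country": "US"},
    "items": [
        {"sku": "WIDGET-7", "qty": 3, "unitPrice": 19.99},
        {"sku": "GADGET-2", "qty": 1, "unitPrice": 249.00},
    ],
})

# The receiving system parses the message and works with plain data --
# no IDL, no stub generation, no distributed object runtime.
order = json.loads(order_message)
total = sum(item["qty"] * item["unitPrice"] for item in order["items"])
print(order["orderId"], round(total, 2))  # A-1001 308.97
```

The same document could travel over HTTP as a REST request body, sit in a message queue, or land in a log file; the loose coupling is the point.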

The interconnectivity enabled by that practical approach is the impetus for continuous change.

Open standards and defined data formats allow the interconnections that produced the explosion in possibilities for building software systems. Any developer today can perform lookups on Google’s data services to do geolocation.  It is straightforward to use Bing maps to display a color-coded map of sales results by country, or a map of the density of clients by county.  This stuff was exotic or expensive just a few years ago, and now, because we can interconnect systems so easily, the state of the art has advanced to the point where the business demands this sort of analysis and intelligence.  Look at Tableau Software – they are a terrific example of a company exploiting this trend.

Analysis of business data, right now: when a new opportunity opens up, I want to be able to analyze it, right now.

But there is still so much more upside. Just the other day I was speaking to a sales manager who bemoaned the inability of his IT staff to produce a report he needed. But that situation is as unnecessary as it is frustrating.  Why is he relying on someone else to produce his reports? He should have access to his own business data, the way he wants it!  He should have desktop business intelligence. He shouldn’t have to wait for his monthly staff meeting to see the data.

There’s lots more to come.


The Quiet Revolution in Software Development

There’s a natural human resistance to change. Everyone has it; everyone is subject to it. Some of us are more aware than others of our own tendency to resist change unconsciously. But by and large, all of us like to minimize surprises and to feel that we are in control. We have enough going on, right? Especially in a work environment, where compensation is dictated by achievement and performance is judged and weighed, we don’t like to push the envelope lest we fail. We might lose that pay raise; we might even lose our jobs.

So when a new approach to project management comes along, it’s not surprising to find resistance.  It’s the conservative approach, and there’s a lot to be said for being consciously conservative in business.

On the other hand, software project management is just screaming for a new approach. The domain is novel enough that the analogues we’ve tried to apply – software as system design, software development as building architecture, distributed systems development as city planning – have always been less than satisfactory. Yes, software development is a little bit like those things, but it is a lot unlike them too. If we blindly attempt to lay models from those domains onto software development, we’ll fail.

Not only is software unique, it is also evolving rapidly. This is a cliche, but the implications are sometimes overlooked. Developing a software project today is much, much different than it was 15 years ago, even in the same industry. In 1997, the web was hot, and everyone wanted to figure out how to web-enable their business systems. These days, the web is the platform. Where before we were delighted to be free of green screens, now we demand integration with mobile consumer-oriented devices. Building inspectors want to bring their iPads to job sites to fill out forms, take pictures, and submit their reports over the cell network. These use cases were firmly in the realm of miracle only a few years ago. Now they are de rigueur.

And the ever-expanding list of demands – for more and more connections, more integration, front-ends, back-ends, reporting systems, feedback systems – this explosion of possibility has implications for how we execute software projects. Not only is the list expanding, but it is also ever-shifting.  This is why the building analogy fails: buildings last for years, while we design software expecting to re-design it or extend it in 4 months. We expect it!  There is a demand for constant change, a demand for more or less continuous evolution of business systems.

The waterfall – the comfortable, conservative, well-known approach where there are clear handoffs, lots of documents describing exactly what is happening when, lots of reports, formalized requirements documents, many review meetings – that model simply cannot work any longer, not with the changes in software we’ve seen. This is a model that made sense in projects where testing was expensive and slow, driven by humans. With those economics, it made sense to make sure the plan was rock solid and air tight before we took the first step.

But that model no longer serves us. There’s been a slow but undeniable revolution in software development processes, driven not by hype or vendor-manufactured demand, but by a real improvement in results. I’m talking about Scrum and Agile methods: iterative approaches that favor learning as you go, with lots of automated testing driving many small corrections, rather than a rigorous, lengthy planning process upfront. Software projects that use these methods are more likely to succeed today than projects using old-school waterfall methods, if we judge success as on-time, on-budget, and meeting requirements.

Software companies – Google, Microsoft, games companies, and other organizations that make their money mostly or wholly from software – know this. They’ve been steadily and quietly increasing their commitment to test-driven development, sprints, and Scrum-style project management. This isn’t about new products – it’s about new practices.

But larger companies that aren’t in the software business – the ones that think of themselves as manufacturing companies, or financial services companies, or healthcare providers, or telecom – some of these have been slower to adopt these practices. Conservative business people run these companies and they have good reason to tread carefully.

But I’ve got news for you: Scrum is now the conservative choice. It just works better. It’s not hard to do, though it does require some new thinking. You don’t need a squad of A players to pull this off. You don’t need to raid Microsoft’s dev teams. You can do this with competent developers and competent project managers – with the B and C people most companies in the world are stocked with. In light of this, any software project manager or CIO who leans toward waterfall methods for new development efforts is taking on unnecessary risk.

Yes, there’s a hesitancy to embrace new things when large sums of money are at stake. Rightly so.  But Agile and Scrum are no longer new.  They are no longer unproven.  You’ve been standing by the side of the pool long enough.  It’s time to jump in the water.


Big Data: Real benefits or Hype?

I’m a technologist. I believe technology, well utilized, can advance business goals. A business can derive a significant advantage from making the right technology moves, exploiting information in just the right way.

But I am a bit skeptical of the excitement in the industry around Big Data, MapReduce, and Hadoop. While Google has obviously derived great benefit from MapReduce over the years, Google is special. Most businesses do not look like Google, and do not have information management requirements similar to Google’s. Google builds its own custom servers. At Google, the unit of computer hardware deployment is “the warehouse.”

If you underwrite insurance, process medical records, scan transactions for fraud, optimize logistics, do statistical process control, or perform any one of a variety of other typical business information tasks, your company is very much not like Google. If you don’t have hundreds of millions of users generating billions of transactions, then you’re not like Google, and you should not try to emulate its technology strategy. BigTable is not for you; MapReduce is not something that will give you a strategic advantage.
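For readers who haven’t looked under the hood, here is the entire shape of the MapReduce model, as a toy word count in plain Python. Real Hadoop distributes the map, shuffle, and reduce phases across a cluster; this in-memory sketch exists only to show how simple the programming model is – and, by extension, how little magic there is to justify adopting a cluster you don’t need.

```python
from collections import defaultdict

# Toy word count in the MapReduce style, entirely in memory.
# The log lines are made up; the phases mirror Hadoop's contract.

def map_phase(records):
    """Map: emit (key, value) pairs from each input record."""
    for line in records:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Shuffle + reduce: group pairs by key, then combine the values."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

logs = ["error disk full", "error timeout", "ok"]
counts = reduce_phase(map_phase(logs))
print(counts["error"])  # 2
```

The value of Hadoop is not in this logic – it is in running it across thousands of disks at once. If your data fits on a few disks, the model buys you nothing.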

Big Data seems to be the industry’s next touchstone. Everyone feels they need to “check the box.” There’s lots of interest from buyers, so vendors believe they need to talk about it. The tech press, with its persistently positive view of Google, encourages this. Breathless analyst reports fuel the flames. CS programs at universities teach MapReduce in first-year courses. Devs put MapReduce on their resumes. All of this combines into a self-reinforcing cycle.

But for most CIOs, MapReduce is a distraction. In this view, I am persuaded by DeWitt, Stonebraker et al. CIOs should be focusing on how to better utilize the databases they already have. Figure out cloud, and figure out how to improve management and governance of IT projects. Are you agile enough? Are you doing Scrum? Figure out what major pieces you can buy from your key technology partners.

I have read user stories of people using MapReduce to scan through log files – tens of gigabytes of log files. Seriously? Tens of gigabytes fits on a laptop hard drive. Unless you are talking about multiple terabytes of information, MapReduce is probably the wrong tool.
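To put a point on that: scanning a log file of that size is a sequential read, and a plain script does it in constant memory on one machine. The sketch below is illustrative – the file name and the match string are made up – but the technique is exactly what those MapReduce jobs were doing, minus the cluster.

```python
# Counting matching lines in a log file with a plain sequential scan.
# A single disk streams tens of gigabytes in well under an hour;
# no cluster, no job scheduler, no distributed filesystem.
# "app.log" and the "ERROR" marker are illustrative names.

def count_matches(path, needle):
    hits = 0
    with open(path, "r", errors="replace") as f:
        for line in f:          # streams the file; constant memory
            if needle in line:
                hits += 1
    return hits

# Demonstrate with a small synthetic log file.
with open("app.log", "w") as f:
    f.write("INFO start\nERROR disk full\nINFO ok\nERROR timeout\n")

print(count_matches("app.log", "ERROR"))  # 2
```

When the data really does span terabytes across many machines, this loop stops scaling and the MapReduce machinery starts earning its keep – but not before.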

If you are doing analysis of the human genome, or weather modelling, or if you work for the NSA or Baidu, then yes, you need MapReduce. Otherwise, Big Data is not yet mainstream.