Cloud Run instances may shut down at any time

A bit of a surprise for me, so I thought I’d note it down. I’ve been using Cloud Run for serverless hosting of containerized apps. The whole experience is really slick. I can provide a source directory for something like a Java app or a Node.js app, and Cloud Run will build the container for me, when I use a command like:

gcloud run deploy my-service --source . --allow-unauthenticated --region us-west1

You can also set minimum and maximum instances with these options:

--min-instances 1
--max-instances 2

But what is “minimum” anyway? What I did not realize is this fact, taken from the Cloud Run documentation:

For Cloud Run services, an idle instance can be shut down at any time, including instances kept warm via a minimum number of instances.

You have been warned.

What this means to me is: I need to design my services so that they can always start up and re-configure themselves. Any state they maintain in memory needs to be persisted somewhere, so that after a shutdown and restart, that state can be re-applied.
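For me, that suggests a pattern like the following – a minimal sketch in Python, assuming a hypothetical Cloud Storage bucket named my-service-state and the google-cloud-storage client library. Any durable store – Firestore, Cloud SQL, Redis – would do just as well:

    import json
    from google.cloud import storage

    BUCKET = "my-service-state"   # hypothetical bucket name
    blob = storage.Client().bucket(BUCKET).blob("state.json")

    def load_state():
        """Re-hydrate state at startup; tolerate a cold, empty store."""
        try:
            return json.loads(blob.download_as_text())
        except Exception:
            return {"count": 0}

    state = load_state()  # runs on every instance start, warm or cold

    def increment():
        # Mutate in memory, then persist immediately -- the instance
        # may be shut down at any time, even with min-instances set.
        state["count"] += 1
        blob.upload_from_string(json.dumps(state))
        return state["count"]

The point is that the in-memory copy is just a cache; the durable copy is the truth.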

I should not have been surprised by this.

Yes, it’s trite, but we really are at an Inflection Point

It may sound like a platitude, but…the industry is now in the midst of an inflection point.

Behind us is the technology of client-server, with web goo glommed-on:

  • UI rendered to fixed computers, implemented using HTML(5) and Javascript.
  • Application logic built in Servlet/JSP, PHP, or ASP.NET.
  • Relational databases as a store. Data is accessed via datastore-specific protocols.

Ahead are pure web technologies:

  • UI rendered to mobile computers, and optimized for device capability. Android, iPhone, iPad, and Windows 8 are the key options, but more will emerge. The Kindle, Xbox, and PS3 are the up-and-comers. The HTML-based web-browser UI will remain as a least common denominator for some time, but there’s a steady trend away from it.
  • Application logic built in dynamic languages: Ruby-on-Rails, PHP, Python. Javascript was the first web app server language (Netscape’s LiveWire server in 1995, and ASP Classic in 1996) and it is now back, with Node.js.
  • Data stores using NoSQL databases with massive scaleout. Data is accessed over HTTP, via REST.

Remember when “Scale” meant a really large box with lots of CPUs in it? We’ve moved to farms of managed computers that accomplish the same thing. Rather than depending on the hardware design to support the scale-out, we’ve now done it in software. Rather than relying on the CPU front-side bus to move data around, we’re depending on 40 Gbps or even 100 Gbps Ethernet, with software-based, data-dependent prioritization and routing.

The force behind the economy of scale of standard high-volume components has not abated. If you wanted to build a superfast computer for one moment in time, you might resort to some custom hardware. But the pace of evolution and improvement in CPU, memory, storage, and networking is such that the value of any dedicated hardware declines rapidly, even during design. It makes no economic sense to pursue the scale-up course. Designs need to accommodate evolution in the ecosystem. Just as the “integrated” vendor-specific computers of the late ’80s gave way to “open systems”, the integrated single-computer model is giving way to the “farm of resources” model.

This is all obvious, and has been for some time. Companies like Google were ahead of the curve, and dragged the rest of the industry with them; now, architectures based on the idea that “the datacenter is the computer” are available at low cost to just about everyone. These architectures have elastic compute, network, and storage, along with the software architecture to exploit it. The upshot is you can just add resources and you get additional, usable performance. Unlike the old “scale up” machines, this approach is not limited to 16 CPUs, or 64, or 1024. Just keep going. People call it “cloud technology”, but the main point is elasticity.

The inflection point I spoke about is not defined by a particular month, like November 2012, or even a year. But over the past 6 years, this transition has been slowly, inexorably proceeding.

The one missing piece of the puzzle has been management skills and tools: the gear was there, and the software has continued to improve to exploit the gear, but people were initially not comfortable with managing it. This is dissipating over time, as people embrace the cloud. We’re realizing that we no longer need to perform {daily,weekly} backups, because the data is already stored redundantly in Cassandra.

Even as cloud technology gets democratized, drops in price, and becomes more manageable, the capability of a single high-volume server computer continues to ramp upward on a log scale. This means that the old “automation” tasks – order tracking, ERP systems (whether custom or not) – will be fulfilled by single machines, with optional redundancy.

Cloud technology therefore presents a couple of opportunities:

  • For technology conservatives, where IT is a cost center, the maturation of cloud tech drops the cost of deploying new systems, and of handling peak load. A company can purchase short-term options for compute to handle the proverbial “Black Friday” or “Victoria’s Secret Fashion Show” load. This opportunity is somewhat conflated with the ongoing drop in the cost of technology. Basically, the cost is dropping, and it drops even more if you let someone else host your servers.
  • For companies that view technology as a business enabler, cloud tech allows them to pursue innovative new approaches for relatively low cost and risk. New partner-enabling initiatives; new channels; new markets or new approaches to the markets they already play in.

Without a doubt, the big payoffs come from the latter, expansive approach. You can’t grow by cutting costs. You *can* grow by increasing speed or efficiency – let’s say, dropping your turn-time on commercial loan approvals from 52 days to 22 days – but the big growth is in entirely new approaches.

But you don’t know what you don’t know. To uncover and develop the opportunities, companies need to dive in. They need to be pushing beyond their normal competencies, learning new things.

One billion smart phones, and counting

Via C|Net: Strategy Analytics reports that there are now more than 1 billion smartphones in use on the planet. The story has been picked up by other places, too.

One billion is a big number, there’s no question about that. But what people are glossing over is that the number is up 46% in one year. That is a HUGE jump.

Apple, Samsung, and Nokia, with their partners, are obviously making things that people want. Somehow HTC is missing out on the gravy train: while the market is thumping along, HTC has reported its lowest profit since 2006. (Time for the board to take action.)

Aside from the mobile handset space, with that kind of scale it is easy to imagine some pretty outsize opportunities in complementary markets. Like, for example, in connecting all those phones to cloud-based resources. I guess there would have to be a way for software programs running on the handsets to connect with the cloud. Some kind of programmatic interface – an API, perhaps?

And then on the cloud side – and by cloud I mean anything that is not tethered to the phone, anything that is reached over the internet – there oughtta be some way for cloud-based service providers to manage the deluge of API access: to measure it, rate-limit it, examine it.

Hmmmm, I wonder if there’s a company that could help with that?

Impressive factoids on Facebook and Hadoop

It’s common knowledge that Facebook runs Hadoop – the largest Hadoop cluster on the planet.

Here are some stats, courtesy of HighScalability, which scraped them from Twitter during the Velocity conference:

  • 6 billion mobile messages every 30 minutes
  • 3.8 trillion cache operations in 30 minutes
  • 160M newsfeeds, 5B realtime messages, 10B profile pics, and 108B MySQL queries, all in 30 minutes

Now, some questions of interest:

  1. How close is the typical enterprise to that level of scale?
  2. How likely is it that a typical enterprise would be able to take advantage of such scale to improve their core business, assuming reasonable time and money budgets?

Let’s say you are CIO of a financial services company with $500M in annual revenue. Let’s suppose that you make an average of $10 per business transaction, and further suppose that each business transaction requires 24 database operations, including queries and updates.

At that rate, you’d run 50M × 24 = about 1.2B database operations … per year.

Scroll back up. What does Facebook do? 3.8B in 30 minutes. Whereas 1.2B per year works out to about 68,000 per 30 minutes. Facebook does roughly 55,000 times as many database operations as the hypothetical financial services company.
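Just to sanity-check that arithmetic, here’s the back-of-the-envelope in Python, using the figures above:

    revenue = 500e6                  # $500M in annual revenue
    dollars_per_txn = 10             # $10 per business transaction
    db_ops_per_txn = 24              # database operations per transaction

    txns_per_year = revenue / dollars_per_txn           # 50M transactions
    db_ops_per_year = txns_per_year * db_ops_per_txn    # 1.2B database ops

    half_hours_per_year = 365 * 24 * 2                  # 17,520 windows
    ops_per_half_hour = db_ops_per_year / half_hours_per_year
    print(round(ops_per_half_hour))            # ~68,500 per 30 minutes
    print(round(3.8e9 / ops_per_half_hour))    # ~55,000x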

Now, let me repeat those questions:

  • If you run that hypothetical company, do you need Hadoop?
  • If you had Hadoop, would you be able to drive enough data through it to justify the effort of adoption?

 

Do you think “Cloud Computing” is vaguely defined? Dell disagrees.

I was reading through some of the Microsoft chatter surrounding TechEd, the company’s massive annual technical conference.

One of the blog posts got my attention with its headline. I wasn’t clear on just what it meant, so I clicked through, of course.

The biggest news there is about System Center, which is an IT Pro management tool. Apparently the latest offering has some new features that help sysadmins manage “private clouds”, which I suppose refers to on-premises pools of computers that are allocated out in some flexible manner. Sounds like useful stuff. The proof is in the pudding, of course.

But the thing that caught my eye was one of the partner news items. A “cloud offering” from Dell.

[Photo: a Dell rack filled with storage and compute. Caption: “Dell’s Impression of a Cloud”]

OK, now I know what Dell thinks a cloud looks like. 🙂

Mark Russinovich #TechEd on Windows Azure VM hosting and Virtual Networking

A video of his one-hour presentation with slides + demo.

Originally Windows Azure was a Platform-as-a-Service offering; a cloud-hosted platform. This was a new platform, something like Windows Server, but not Windows Server. There was a new application model, a new set of constraints. With the recent announcement, Microsoft has committed to running arbitrary VMs. This is a big shift towards what people in the industry call Infrastructure-as-a-Service.

Russinovich said this with a straight face:

One of the things that we quickly realized as people started to use the [Azure] platform is that they had lots of existing apps and components that they wanted to bring onto the platform…

It sure seems to me that Russinovich has put some spin into that statement.  It’s not the case that Microsoft “realized” customers would want VM hosting.  Microsoft knew very well that customers,  enterprises in particular, would feel confidence in a Microsoft OS hosting service, and would want to evaluate such an offering as a re-deployment target for existing systems.

This would obviously be disruptive, both to Microsoft partners (specifically hosting companies) and to Microsoft’s existing software licensing business.  It’s not that Microsoft “realized” that people would want to host arbitrary VMs. They knew it all along, but delayed offering it to allow time for the partners and its own businesses to catch up.

Aside from that rotational verbiage, Russinovich gives a good overview of some of the new VM features, how they work and how to exploit them.

Is Microsoft a Cloud-first Company?

Microsoft is a Cloud-first company, asserts Jonathan Hassell.

Not sure that’s completely accurate, or helpful. He’s right that Microsoft is accentuating its cloud offerings these days, and is really pushing to exploit what is a once-every-two-decades kind of disruptive development in the industry.

On the other hand the lion’s share of Microsoft’s revenue still derives from on-premises software, in its “traditional strongholds.”

Does Metcalfe’s Law apply to Cloud Platforms and Big Data?

Cloud Platforms and Big Data – Have we reached a tipping point?  To think about this, I want to take a look back in history.

Metcalfe’s Law was named for Robert Metcalfe, one of the true internet pioneers, by George Gilder, in an article that appeared in a 1993 issue of Forbes Magazine. It states that the value of a network increases with the square of the number of nodes. The name was chosen in the spirit of “Moore’s Law” – the popular aphorism, attributed to Gordon Moore, stating that the density of transistors on a chip roughly doubles every 18 months. Moore’s Law succinctly captured why computers grew more powerful by the day.
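(In formula terms: a network of n nodes has n(n−1)/2 possible pairwise connections, so if each connection contributes equal value, the total value grows roughly as n²/2 – hence “the square of the number of nodes.”)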

With the success of “Moore’s Law”, people looked for other “Laws” to guide their thinking about a technology industry that seemed to grow exponentially and evolve chaotically, and “Metcalfe’s Law” was one of them. That these “laws” were not really laws at all, but just arguments, predictions, and opinions, was easily forgotten. People grabbed hold of them.

Generalizing a Specific Argument

Gilder’s full name for the “law” was “Metcalfe’s Law of the Telecosm”, and in naming it, he was thinking specifically of the competition between telecommunications network standards, ATM (Asynchronous Transfer Mode) and Ethernet. Many people were convinced that ATM would eventually “win”, because of its superior switching performance, for applications like voice, video, and data. Gilder did not agree. He thought Ethernet would win, because of the massive momentum behind it.

Gilder was right about that, and for the right reasons. And so Metcalfe’s Law was right!  Since then though, people have argued that Metcalfe’s Law applies equally well to any network.  For example, a network of business partners, a network of retail stores, a network of television broadcast affiliates, a “network” of tools and partners surrounding a platform.  But generalizing Gilder’s specific argument this way is sloppy.

A 2006 article in IEEE Spectrum on Metcalfe’s Law says flatly that the law is “Wrong”, and explains why: not all “connections” in a network contribute equally to the value of the network. Think of Twitter – most subscribers publish very little information, and to very limited circles of friends and family. Twitter is valuable, and it grows in value as more people sign up, but Metcalfe’s Law does not provide the metaphor for valuing it. Or think of a telephone network: most people spend most of their time on the phone with around 10 people. Adding more people to that network does not increase the value of the network, for those people. Adding more people does not cause revenue to rise according to the O(n²) metric implicit in Metcalfe’s Law.

Clearly the direction of the “law” is correct – as a network grows, its value grows faster.  We all feel that to be implicitly true, and so we latch on to Gilder’s aphorism as a quick way to describe it. But clearly also, the law is wrong generally.

Alternative “Laws” also Fail

The IEEE article tries to offer other valuation formulae, suggesting that the true value is not O(n²) but instead O(n·log n), and specifically suggests this as a basis for valuation of markets, companies, and startups.
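To see how much hangs on the choice of formula, compare the two valuations at scale – a quick sketch:

    import math

    for n in (1_000, 1_000_000):
        metcalfe = n * (n - 1) // 2     # pairwise connections, ~n^2/2
        n_log_n = n * math.log(n)
        print(n, metcalfe, round(n_log_n))
    # n=1,000:      ~500 thousand vs ~6,900
    # n=1,000,000:  ~500 billion  vs ~14 million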

That suggestion is arbitrary. I find the mathematical argument presented in the IEEE article hand-wavey and unpersuasive. The bottom line is that networks are different, and there is no one law – not Metcalfe’s, nor Reed’s, nor Zipf’s, as suggested by the authors of that IEEE article – that applies generally to all of them. Metcalfe’s Law applied specifically but loosely to the economics of Ethernet, just as Moore’s Law applied specifically to transistor density. Moore’s Law was never general to every manufacturing process, nor is Metcalfe’s Law general to every network.

Sorry, there is no “law”; one needs to understand the economic costs and potential benefits of a network, and the actual conditions in the market, in order to assign a value to that network.

Prescription: Practical Analysis

Ethernet enjoyed economic advantages in cost of production and in generalized R&D. Ethernet reached an economic tipping point, and beyond that, other factors – like the improved switching performance of alternatives – were simply not enough to overcome the existing investment in tools, technology, and understanding of Ethernet.

We all need to apply that sort of practical thinking to computing platform technology. For some time now, people have been saying that Cloud Platform technology is the Next Big Thing. There have been some skeptics, notably Larry Ellison, but even he has come around and is investing.

Cloud platforms will “win” over existing on-premises platform options when it makes economic sense to do so. In practice this means: when tools for building, deploying, and managing systems in the cloud become widely available, and just as good as those that exist and are in wide use for on-premises platforms.

Likewise, “Big Data” will win when it is simply better than traditional data analysis for mainstream data analysis workloads. Sure, Facebook and Yahoo use MapReduce for analysis, but, news flash: unless you have 100M users, your company is not like Facebook or Yahoo. You do not have the same analysis needs. You might want to analyze lots of data, even terabytes of it. But the big boys are doing petabytes. Chances are, you’re not like them.

This is why Microsoft’s Azure is so critical to the evolution of cloud offerings. Microsoft brought computing to the masses, and the company understands the network effects of partners, tools providers, and developers. It’s true that Amazon has a lead in cloud-hosted platforms, and it’s true that even today, startups prefer cloud to on-premises. But EC2 and S3 are still not commonly considered as general options by conservative businesses. Most banks, when revising their loan processing systems, are not putting EC2 on their short list. Microsoft’s work in bringing cloud platforms to the masses will make a huge difference in the marketplace.

I don’t mean to predict that Microsoft will “win” over Amazon in cloud platforms; I mean only to say that Microsoft’s expanded presence in the space will legitimize Cloud and make it much more accessible. Mainstream. It remains to be seen how soon or how strongly Microsoft will push on Big Data, and whether we should expect to see the same effect there.

The Bottom Line

Robert Metcalfe, the internet pioneer, himself apparently went so far as to predict that by 2013, ATM would “prevail” in the battle of the network standards. Gilder did not subscribe to such views. He felt that Ethernet would win, and Metcalfe’s Law was why. He was right.

But applying Gilder’s reasoning blindly makes no sense. Cloud and Big Data will ultimately “win” when they mature as platforms and deliver better economic value than the existing alternatives.

 

Windows Azure goes SSD

In a previous post I described DynamoDB, the SSD-backed storage service from Amazon, as a sort of half-step toward better scalability.

With the launch of new Azure services from Microsoft, it appears that Microsoft will offer SSD, too. Based on the language used in that report – “the new storage hardware is thought to include solid state drives (SSDs)” – this isn’t confirmed, but it sure looks likely.

I haven’t looked at the developer model for Azure to find out if the storage provisioning is done automatically and transparently, as I suggested it should be in my prior post.  I’ll be interested to compare Microsoft’s offering with DynamoDB in that regard.

In any case, notice is now given to mag disk drives: do not ask for whom the bell tolls.