Evernote’s argument for delivering a REST-less API leaves me unimpressed.

The Evernote API is notable because it is not based on REST. The defense of that decision leaves me unimpressed.

When the world is moving to REST and fully open, usable APIs, why would Evernote go the other way? They ought to have a good reason. Evernote’s VP of Platform Strategy, Seth Hitchings, has something to say about it. According to the article on ProgrammableWeb:

Hitchings concedes that compared to the RESTful APIs, developers have to endure a bit of a learning curve to make use of the SDKs’ core functionality; to create, read, update, search, and delete Evernote content. But then again, according to Hitchings, Evernote is a special needs case.

OK, so it’s more work for the consuming developers. It’s also more work for the company, because they have to support all the various “SDKs”, as they call them. [Evernote delivers libraries for various platforms including iOS, Android, C#, PHP, JavaScript, and more. They call these things “SDKs”, but they’re really not SDKs. An SDK is a Kit: it includes libraries, documentation, example code, tools, and other stuff. When Evernote uses the word “SDK” they mean “library.”] So… why? Why do it if everyone has to do more work?

Seeking the least compromise to data-transfer performance, Evernote needed a solution that could shuffle large quantities of data with minimal overhead. Despite its superior efficiency over XML, REST still wasn’t good enough.

Whoa. REST has “superior efficiency over XML”? That’s just nonsense. REST is not a data format. REST is an architectural approach. REST does not mean “not XML”. If you want to transfer XML data using the REST approach, go ahead. That’s why Roy Fielding, Tim Berners-Lee, and Henrik Frystyk Nielsen put the Content-Type header into HTTP. That’s what MIME types are for. You can transfer XML, or binary, or any sort of data with REST.

The implicit and incorrect assumption is that REST implies JSON, or that REST implies not binary. That’s false. There is no need to avoid REST in order to attain reasonable data transfer performance.
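
To make that concrete, here is a hypothetical exchange; the resource path and payload are made up for illustration. The same resource can be served as XML or as a compact binary representation, and the only thing that changes is the Content-Type:

GET /notes/1234 HTTP/1.1
Accept: application/xml

HTTP/1.1 200 OK
Content-Type: application/xml

<note><title>Groceries</title></note>

Ask for application/octet-stream (or some vendor-specific binary type) instead, and the very same URI can return a binary payload. Nothing about REST forbids that.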

According to the article, that faulty reasoning is why Evernote selected Apache Thrift. Furthermore, presented as a benefit (!), Thrift has tools to generate libraries for many platforms:

Thrift’s code-generating ability to write-once-and-deploy-to-many is also the reason Evernote is able to offer SDKs for so many platforms.

Yippee! But guess what! If you used REST, you wouldn’t need to generate all those libraries. And you’d have even broader platform support.

Just for fun, let’s have a look at the API that is being generated via Thrift. The Evernote API Reference looks like this:

OMG, the horror. Look at all that stuff. The reason people like REST is that they can figure out the data model just by perusing the URLs. It’s obviously not possible to do so in this case.

Evernote’s is not a modern API. It is a mass of complexity.

Not impressed.


API-First Development


POP.co explains why they went to an API-first development model.

This is going to be a continuing trend. More and more places are already doing this, even if they’re not blogging or talking about it. As mobile devices continue to grow, becoming co-equal with the website, if not the predominant way that people interact with companies, there is an imperative to go API-first.

Why?

  • Consistency is key. Companies will want to deliver the same or comparable capabilities, to the extent possible, across websites and mobile apps. Without an API, there’s obvious duplication of effort. APIs allow companies to reap economies of scale across the two platforms: one API can support both, and new features can be released to both simultaneously using the same service-layer infrastructure.
  • Agile is not pure hype. This is how smart software developers work. And APIs suit the Agile philosophy: they get versioned and updated rapidly, and there’s no rigid, fixed schema, no “WSDL” to update and fiddle with. Think about it: that is just a clean restatement of the “minimal docs” plank of the Agile philosophy. If you’re agile, you want APIs. They support your work rather than fight it.

Everyone is going to be doing this.

I don’t agree with everything in that POP.co post. For example, POP lists “Scalability” as a reason for going to APIs. I don’t see it. Their theory is that separation of concerns leads to scalability, meaning they can have web servers and API servers and scale them independently. But Microsoft refuted this as a requirement for high performance long ago. ASP.NET proved that you can build a stateless layer of web/app servers, backed by a fast datastore layer, that screams. There’s no real performance benefit to separating the HTML server from the “API server”.

On the other hand, the other reasons that POP.co offers for going API-first make a ton of sense to me.

Must Love Clouds

Apigee is looking for a few good SEs.

An SE is a Sales Engineer. At Apigee, this is a member of the sales team who’s very technically adept, loves technology, and loves talking to other technologists about how to apply cool technology to solve hard problems.

This person needs a good developer background in server-side app programming, with good C++, Java, or C# skills, and probably one or more of Python, Ruby, PHP, and NodeJS. Of course the person should know APIs, REST, and SOAP very well, and should be handy with JSON, XML, and the various tools around them. Ought to know who Roy Fielding is, why jQuery is named jQuery, and what RFC 2616 is; must love clouds and big data. Experience with AWS, Azure, RackSpace, VMware, or IBM SmartCloud is a big plus. Beyond all that, the person’s got to love dealing with smart people with different perspectives.

If you know someone who fits this bill, lives near one of the NFL cities in the USA (you know, New York, Boston, Atlanta, Dallas, Houston, Chicago, Denver, San Fran, Los Angeles, etc), and wants to work for an ambitious late stage startup, send em my way: @dpchiesa on twitter.

The way Azure should have done it – A better Synonyms Service

This is a followup from my previous post, in which I critiqued the simple Synonyms Service available on the Azure Datamarket.

To repeat, the existing URI structure for the service is like this:

GET https://api.datamarket.azure.com/Bing/Synonyms/GetSynonyms?Query=%27idiotic%27

How would I do things differently?

The hostname is just fine – there’s nothing wrong with that. So let’s focus on the URI path and the other parts.

GET /Bing/Synonyms/GetSynonyms?Query=%27idiotic%27

Here’s what I would do differently.

  1. Simplify. The URI structure should be simpler. Eliminate Bing and GetSynonyms from the URI path, as they are completely extraneous. Simplify the query parameter. Eliminate the url-encoded quotes when they are not necessary. Result: GET /Synonyms?w=improved
  2. Add some allowance for versioning. GET /v1/Synonyms?w=positive
  3. Allow the caller to specify the API Key in the URI. (Eliminate the distorted use of HTTP Basic Auth to pass this information). GET /v1/Synonyms?w=easy&key=0011EEBB4477

What this gets you, as an API provider:

  1. This approach allows users to try the API from a browser or console without registering. The service could allow 3 requests per minute, or up to 30 requests per day, for keyless access. Allowing low-cost or no-cost exploration is critical for adoption. (A rough sketch of this kind of keyless throttling appears just after this list.)
  2. The query is as simple as necessary and no simpler. There is no extraneous Bing or GetSynonyms or anything else. It’s very clear from the URI structure what is being requested. It’s “real” REST.
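
To illustrate point 1, here is roughly what the keyless throttle could look like in PHP. This is purely a sketch: it assumes the APCu extension is available, and the limits are just the ones mentioned above.

// hypothetical sketch: if no API key was supplied, throttle by client IP
// at 3 requests per minute (the per-day limit would work the same way)
if (!isset($_GET['key'])) {
  $bucket = 'keyless_' . $_SERVER['REMOTE_ADDR'] . '_' . floor(time() / 60);
  $count = apcu_fetch($bucket);
  if ($count === false) {
    apcu_store($bucket, 1, 120);   // new counter for this minute; expires later
  } elseif ($count >= 3) {
    header('HTTP/1.1 429 Too Many Requests');
    exit;
  } else {
    apcu_inc($bucket);             // count this keyless request
  }
}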

What about multi-word queries? Easy: just URL-encode the space.
GET /v1/Synonyms?w=Jennifer%20Lopez&key=0011EEBB4477

There’s no need to add in url-encoded quotes for every query, in order to satisfy the 20% case where the query involves more than one word. In fact I don’t think multi-word would even be 20%. Maybe more like 5%.

For extra credit, do basic content negotiation: look at the incoming Accept header and modify the format of the result based on that header. As an alternative, you could include a suffix in the URI path to indicate the desired output data format, as Twitter and the other big guys do:

GET /v1/Synonyms.xml?w=adaptive&key=0011EEBB4477

GET /v1/Synonyms.json?w=adaptive&key=0011EEBB4477
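
For what it’s worth, here’s a rough PHP sketch of that kind of dispatch. The function name, the suffix handling, and the stand-in result are all made up for illustration, not anyone’s actual implementation.

// crude content negotiation: honor a .json/.xml suffix first,
// then fall back to the Accept header, then default to JSON
function choose_format() {
  $path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
  if (preg_match('/\.(json|xml)$/', $path, $m)) {
    return $m[1];
  }
  $accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
  return (strpos($accept, 'application/xml') !== false) ? 'xml' : 'json';
}

$synonyms = array('adjustable', 'flexible', 'responsive');   // stand-in result
if (choose_format() == 'xml') {
  header('Content-Type: application/xml');
  echo '<synonyms><w>' . implode('</w><w>', $synonyms) . '</w></synonyms>';
} else {
  header('Content-Type: application/json');
  echo json_encode(array('synonyms' => $synonyms));
}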

As an API provider, conforming to a “pragmatic REST” approach means you will deliver an API that is immediately familiar to developers regardless of the platform they use to submit requests. That means you have a better chance to establish a relationship with those developers, and a better chance to deepen that relationship.

That’s why it’s so important to get the basic things right.

Azure Synonyms Service – How NOT to do REST.

Recently, I looked on the Azure data market place (or whatever it’s called) to see what sort of data services are available there. I didn’t find anything super compelling. There were a few premium, for-fee services that sounded potentially interesting but nothing that I felt like spending money on before I could try things out.

As I was perusing, I found a synonyms service. Nice, but this is nothing earth-shaking. There are already a number of viable, programmable synonyms services out there. Surely Thesaurus.com has one. I think Wolfram Alpha has one. Wordnik has one. BigHugeLabs has one that I integrated with emacs. But let’s look a little closer.

Let me show you the URL structure for the “Synonyms” service available (as “Community Technical Preview”!) on Azure.


https://api.datamarket.azure.com/Bing/Synonyms/GetSynonyms?Query=%27idiotic%27

Oh, Azure Synonyms API, how do I NOT love thee? Let me count the ways…

  1. There’s no version number. What if the API gets revised? Rookie mistake.
  2. GetSynonyms? Why put a verb in the URI path, when the HTTP verb “GET” is already implied by the request? Useless redundancy. If I call GET on a URI path with the word “Synonyms” in it, then surely I am trying to get synonyms, no?
  3. Why is the word Bing in there at all?
  4. Notice that the word to get synonyms of must be passed with the query param named “Query”. Why use Query? Why not “word” or “term” or something that vaguely corresponds to the actual thing we’re trying to do here? Why pass it as a query param at all? Why not simply as part of the URL path?
  5. Also notice that the word must be enclosed in quotes, which themselves must be URL-encoded. That seems like an awkward design.
  6. What you cannot see in that URL is the authentication required. Azure says the authentication is “HTTP Basic Auth” which means you pass a username and password pair, joined by a colon then base64 encoded, as an HTTP Header. But… there is no username and password. Bing/Azure/Microsoft gives you an API Key, not a user name. And there’s no password. So you need to double the API key then base64 encode *that*, and pretend that it’s HTTP Basic Auth.
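
To make that last point concrete, here is a rough PHP sketch of the client-side contortion just described. The key shown is a placeholder, not a real one; this only illustrates the workaround, it doesn’t endorse it.

// "double" the API key (key:key), base64-encode it, and present it as Basic Auth
$apiKey = '0011EEBB4477';   // placeholder, not a real key
$auth = base64_encode($apiKey . ':' . $apiKey);
$context = stream_context_create(array(
  'http' => array('header' => "Authorization: Basic $auth\r\n")
));
$url = 'https://api.datamarket.azure.com/Bing/Synonyms/GetSynonyms?Query=%27idiotic%27';
$response = file_get_contents($url, false, $context);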

If readers aren’t persuaded that the above are evidence of poor API design, then you might consider heading over to the API Craft discussion group on Google Groups to talk it over.

Alternatively, or in addition, spend some time reading “the REST Manifesto,” Roy Fielding’s PhD dissertation, specifically chapter 5 of that document. It’s about 18 printed pages, so not too big a commitment.

The problem with releasing a poorly-designed API is that it can do long-term damage. As soon as a real developer takes a look at your service, he will not walk, he’ll RUN away to an alternative service. If your API is a pain to use, or is poorly designed, you are guaranteed to drive developers somewhere else. And they won’t come back! They might come just to poke around, but if they see a bad service, like this Synonyms service, they will flee, never to return. They will quickly conclude that you just don’t get it, and who could blame them?

So learn from Azure’s mistakes, and learn from the success of others. Take the time to get it right.

And now a word from my sponsor: Apigee offers a Rapid API Workshop service where we can send in experts to collaborate with your team on API design principles and practice. Contact us at sales@Apigee.com for more information.

Hmmm, SOA is a bad word? So let’s call it APIs!

David Linthicum, in his InfoWorld article, observes that SOA is a “bad word”, noting that SOA Companies are now rebranding themselves as API Management companies.

Some editor at InfoWorld apparently chose the article title to be
“Service governance morphs into cloud API management”. But I don’t think that’s an accurate summary of the article. That’s not the gist of it.

The gist is more accurately captured in Linthicum’s subtitle, to wit: More evidence that SOA is a bad word: Traditional SOA service governance technology rebrands itself for the cloud.

Re-branding is the first baby step in the “morphing” process, I guess. But sayin’ it don’t make it so.

The problem with the rest of Linthicum’s analysis is that it looks at everything through an SOA lens. API Management exposes APIs: “that’s SOA,” he says. API Management platforms enforce security controls: “That’s SOA governance.”

Great, I can see the parallels. But what about the things API Management platforms do that are completely outside the domain of SOA? What about developer enablement and engagement? What about analytics? What about versioning, or cloud-based scale-out?

Linthicum is an SOA guy, and when he looks around, he sees everything through SOA-colored glasses. I have a different perspective. SOA solved some big problems. As a metaphor for interconnecting enterprises, it was a huge advance, a huge improvement.

Even so, with SOA there were problems. Poor support for mobile devices. Still a great deal of complexity. Hard for developers to get connected and productive. Poor visibility by business owners into the impact of application inter-connect traffic. API Management platforms present an opportunity to address all those challenges.

Despite what Linthicum sees, API Management is not simply re-branded SOA.

APIs within the Enterprise – a Webinar

Recently I did a web chat with colleague Greg Brail discussing the use of APIs in the Enterprise.

Quick summary: SOA has been used with success within enterprises to interconnect systems. APIs address a different set of problems, and there is real value to be gained by using APIs to interconnect systems within the enterprise, as well as to provide external or partner access into enterprise systems.

Preflight CORS check in PHP

I was reading up on CORS today; apparently my previous understanding of it was flawed.

Found a worthwhile article by Remy. Also found a problem in the PHP code he offered there: server-side code shown to illustrate how to handle a CORS preflight request.

The “preflight” is an HTTP OPTIONS request that the user-agent makes in some cases, to check that the server is prepared to serve a request from XmlHttpRequest. The preflight request carries with it the special HTTP Header, Origin.

His suggested code to handle the preflight was:

// respond to preflights
if ($_SERVER['REQUEST_METHOD'] == 'OPTIONS') {
  // return only the headers and not the content
  // only allow CORS if we're doing a GET - i.e. no saving for now.
  if (isset($_SERVER['HTTP_ACCESS_CONTROL_REQUEST_METHOD']) &&
      $_SERVER['HTTP_ACCESS_CONTROL_REQUEST_METHOD'] == 'GET') {
    header('Access-Control-Allow-Origin: *');
    header('Access-Control-Allow-Headers: X-Requested-With');
  }
  exit;
}

But according to my reading of the CORS spec, the Access-Control-* headers should not be included in a response if the request does not include the Origin header.

See section 6.2 of the CORS doc.

The corrected code is something like this:

// respond to preflights
if ($_SERVER['REQUEST_METHOD'] == 'OPTIONS') {
  // return only the headers and not the content
  // only allow CORS if we're doing a GET - i.e. no saving for now.
  if (isset($_SERVER['HTTP_ACCESS_CONTROL_REQUEST_METHOD']) &&
       $_SERVER['HTTP_ACCESS_CONTROL_REQUEST_METHOD'] == 'GET' &&
       isset($_SERVER['HTTP_ORIGIN']) &&
       is_approved($_SERVER['HTTP_ORIGIN'])) {
    header('Access-Control-Allow-Origin: *');
    header('Access-Control-Allow-Headers: X-Requested-With');
  }
  exit;
}

Implementing the is_approved() method is left as an exercise for the reader!
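
If you want a starting point, one possible shape for that hypothetical helper is a simple whitelist check. The origins listed here are made up:

function is_approved($origin) {
  // hypothetical whitelist of origins allowed to make CORS requests here
  $approved = array(
    'https://app.example.com',
    'https://staging.example.com'
  );
  return in_array($origin, $approved, true);
}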

A more general approach is to do as this article on HTML5 security suggests: perform a lookup in a table on the value passed in the Origin header. The lookup can be generalized so that it responds with different Access-Control-* headers when the preflight comes from different origins, and for different resources. That might look like this:

// respond to preflights
if ($_SERVER['REQUEST_METHOD'] == 'OPTIONS') {
  // return only the headers and not the content
  // only allow CORS if we're doing a GET - i.e. no saving for now.
  if (isset($_SERVER['HTTP_ACCESS_CONTROL_REQUEST_METHOD']) &&
      $_SERVER['HTTP_ACCESS_CONTROL_REQUEST_METHOD'] == 'GET' &&
      isset($_SERVER['HTTP_ORIGIN']) &&
      is_approved($_SERVER['HTTP_ORIGIN'])) {
    $allowedOrigin = $_SERVER['HTTP_ORIGIN'];
    $allowedHeaders = get_allowed_headers($allowedOrigin);
    header('Access-Control-Allow-Methods: GET, POST, OPTIONS'); //...
    header('Access-Control-Allow-Origin: ' . $allowedOrigin);
    header('Access-Control-Allow-Headers: ' . $allowedHeaders);
    header('Access-Control-Max-Age: 3600');
  }
  exit;
}
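
The get_allowed_headers() helper above is equally hypothetical. A minimal version might simply map each approved origin to the request headers you’re willing to accept from it; again, the values are made up:

function get_allowed_headers($origin) {
  // hypothetical per-origin header policy
  $policy = array(
    'https://app.example.com'     => 'X-Requested-With, Content-Type',
    'https://staging.example.com' => 'X-Requested-With'
  );
  return isset($policy[$origin]) ? $policy[$origin] : 'X-Requested-With';
}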


Yes, it’s trite, but we really are at an Inflection Point

It may sound like a platitude, but…the industry is now in the midst of an inflection point.

Behind us is the technology of client-server, with web goo glommed-on:

  • UI rendered to fixed computers, implemented using HTML(5) and JavaScript.
  • Application logic built in Servlet/JSP, PHP, or ASP.NET.
  • Relational databases as a store. Data is accessed via datastore-specific protocols.

Ahead are pure web technologies:

  • UI rendered to mobile computers, and optimized for device capability. Android, iPhone, iPad, and Windows 8 are the key options, but more will emerge. The Kindle, Xbox, and PS3 are the up-and-comers. The HTML-based web-browser UI will remain as a least-common denominator for some time, but there’s a steady trend away from it.
  • Application logic built in dynamic languages: Ruby on Rails, PHP, Python. JavaScript was the first web app server language (Netscape LiveWire in 1995 and ASP Classic in 1996) and it is now back, with Node.js.
  • Data stores using NoSQL databases with massive scaleout. Data is accessed over HTTP, via REST.

Remember when “Scale” meant a really large box with lots of CPUs in it? We’ve moved to farms of managed computers that accomplish the same thing. Rather than depending on the hardware design to support the scale out, we’ve now done it in software. Rather than relying on the CPU front-side bus to move data around, we’re depending on 40Gbps or even 100Gbps Ethernet and software-based, data-dependent prioritization and routing.

The force behind the economy of scale of standard high-volume components has not abated. If you wanted to build a superfast computer for one moment in time you might resort to some custom hardware. But the pace of evolution and improvement in CPU, memory, storage, and networking is such that the value of any dedicated hardware declines rapidly, even during design. It makes no economic sense to pursue the scale-up course. Designs need to accommodate evolution in the ecosystem. Just as the “Integrated” vendor-specific computers of the late 80’s gave way to “open systems”, the integrated single computer model is giving way to the “farm of resources” model.

This is all obvious, and has been for some time. Companies like Google were ahead of the curve, and dragged the rest of the industry with them, but now architectures based on the idea that “the datacenter is the computer” are available for low cost to just about everyone. These architectures have elastic compute, network, and storage, along with the software architecture to exploit it. The upshot is you can just add resources and you get additional, usable performance. Unlike the old “scale up” machines, this approach is not limited to 16 CPUs or 64 or 1024. Just keep going. People call it “cloud technology”, but the main point is elasticity.

The inflection point I spoke about is not defined by a particular month, like November 2012, or even a year. But over the past 6 years, this transition has been slowly, inexorably proceeding.

The one missing piece to the puzzle has been management skills and tools: the gear was there, and the software has continued to improve to exploit the gear, but people were initially not comfortable with managing it. This is dissipating over time, as people embrace the cloud. We’re realizing that we no longer need to perform {daily,weekly} backups because the data is already stored redundantly in Cassandra.

Even as cloud technology gets democratized, drops in price, and becomes more manageable, the capability of a single high-volume server computer continues to ramp upward on a log scale. This means that the old “automation” tasks (tracking orders, ERP systems, whether custom or not) will be fulfilled by single machines, with optional redundancy.

Cloud technology therefore presents a couple opportunities:

  • For technology conservatives, where IT is a cost center, the maturation of cloud tech drops the cost of deploying new systems, and of handling peak load. A company can purchase short-term options for compute to handle the proverbial “black friday” or “Victoria’s Secret Fashion show” load. This opportunity is somewhat conflated with the ongoing drop in the cost of technology. Basically, the cost is dropping, and it is dropping even more if you let someone else host your servers.
  • For companies that view technology as a business enabler, cloud tech allows them to pursue innovative new approaches for relatively low cost and risk. New partner-enabling initiatives; new channels; new markets or new approaches to the markets they already play in.

Without a doubt, the big payoffs come from the latter, expansive approach. You can’t grow by cutting costs. You *can* grow by increasing speed or efficiency – let’s say, dropping your turn-time on commercial loan approvals from 52 days to 22 days – but the big growth is in entirely new approaches.

But you don’t know what you don’t know. To uncover and develop the opportunities, companies need to dive in. They need to be pushing beyond their normal competencies, learning new things.

Are DDoS attacks a novel threat to API servers? Nope.

Mark O’Neill, CTO at Vordel, published a post on programmableweb regarding DDoS attacks and the implications for APIs.

For those who learned programming before Friends became a hot TV show, the term of art “application programming interface” referred to the function names and signatures that you’d link your program to. These days, the term API refers most often to a Web API, in other words a network interface, often a REST-based network interface. One program sends another program an HTTP request, and gets a reply of a given form in response.

I think O’Neill made things sound waaaay more dramatic than they actually are. The term that he used, “soft underbelly,” was intended to imply that APIs represent a special vulnerability on the Web. That’s simply not accurate. API interfaces are just a “regular underbelly”, to coin a phrase; JSON access is just like HTML access. DDoS is a risk, and it can affect JSON servers and HTML servers alike. O’Neill doesn’t provide any specific advice on why API servers are different, or what special steps need to be taken to protect API resources.

He does make some reasonable points: (a) that API access was given short shrift in the original reports; (b) that APIs are likely to rise in importance as the usage of mobile apps grows; and (c) that hosting APIs separately from www traffic (on api.mybank.com vs www.mybank.com) might have mitigated problems.

But API management platforms, such as the one sold by O’Neill’s company, are not likely to be effective against any non-naive DDoS. In fact, the existing DDoS mitigation techniques, using network devices, are all we need to protect APIs. “Nothing to see here, move along.”

I understand that hype will attract attention to the post and to O’Neill’s company. On balance though, I think he’s doing more of a disservice to APIs by exaggerating or even mischaracterizing the risks.


Reference: Intro to Distributed Denial of Service attacks

Disclaimer: I work for Apigee, which is a purveyor of API Management solutions. These opinions are my own.