Sauce Labs explains its move from NoSQL CouchDB to old-skool MySQL

Sauce Labs has rendered a valuable service to the community by documenting the factors that went into a decision to change infrastructure – to replace CouchDB with MySQL.

Originally the company had committed  to CouchDB, which is a novel NoSQL store originating from a team out of MIT.  CouchDB is termed a “document store” and if you are a fan of REST and JSON, this is the NoSQL store for you.

Every item in CouchDB is a map of key-value pairs of arbitrarily deep nesting.  Apps retrieve objects via a clean REST api, and the data is JSON.  Very nice, easy to adopt, and, with the ubiquity of json parsers, CouchDB is easy to access from any language or platform environment. Speaking of old-school, I built a demo connecting from Classic ASP / Javascript to CouchDB – it was very easy to produce.  I also did a small client in PHP, C#, Python – all of them are 1st class clients in the world of CouchDB.

It really is a very enjoyable model for a developer or systems architect.

For Sauce Labs, though, the bottom line was – drumroll, please – that CouchDB was immature.  The performance was not good. Life with incremental indexes was … full of surprises.  The reliability of the underlying data manager was substandard.

Is any of this surprising?

And, MySQL is not the recognized leader in database reliability and enterprise readiness, which makes the move by Sauce Labs even more telling.

Building blocks of infrastructure earn maturity and enterprise-readiness through repeated trials.  Traditional relational data stores, even open source options, have been tested in real-world,  high-load, I-don’t-want-to-be-woken-up-at-4am scenarios. Apparently CouchDB has not.

I suspect something similar is true for other NoSQL technologies, including MongoDB, Hadoop, and Cassandra. I don’t imagine they would suffer from the same class of reliability problems reported by Sauce Labs. Instead, these pieces lack maturity and fit-and-finish in other ways. How difficult is it to partition your data? What are the performance implications of structuring a column family a certain way? What kind of network load should I expect for a given deployment architecture? These are questions that are not well understood in the NoSQL world.  Not yet.

Yes, some big companies run their businesses on Hadoop and other NoSQL products. But chances are, those companies are not like yours. They employ high-dollar experts dedicated to making those things hum. They’ve pioneered much of the expertise of using these pieces in high-stress environments, and they paid well for the privilege.

Is NoSQL ready for the enterprise?

Ready to be tested, yes. Ready to run the business?  Not yet.

In any case, it’s very valuable for the industry to get access to such public feedback. Thanks, Sauce Labs.