Recently, there has been a lot of buzz about “No SQL” databases. In fact there are at least two conferences on the topic in 2009, one on each coast. Seemingly this buzz comes from people who are proponents of:
• document-style stores in which a database record consists of a collection of (key, value) pairs plus a payload. Examples of this class of system include CouchDB and MongoDB, and we call such systems document stores for simplicity
• key-value stores whose records consist of (key, payload) pairs. Usually, these are implemented by distributed hash tables (DHTs), and we call these key-value stores for simplicity. Examples include Memcachedb and Dynamo.
In either case, one usually gets a low-level record-at-a-time DBMS interface, instead of SQL. Hence, this group identifies itself as advocating “No SQL.”
There are two possible reasons to move to either of these alternate DBMS technologies: performance and flexibility.
The performance argument goes something like the following. I started with MySQL for my data storage needs and over time found performance to be inadequate. My options were:
1. “Shard” my data to partition it across several sites, giving me a serious headache managing distributed data in my application
or
2. Abandon MySQL and pay big licensing fees for an enterprise SQL DBMS or move to something other than a SQL DBMS.
The flexibility argument goes something like the following. My data does not conform to a rigid relational schema. Hence, I can’t be bound by the structure of a RDBMS and need something more flexible.
This blog posting considers the performance argument; a subsequent posting will address the flexibility argument.
For simplicity, we will focus this discussion on the workloads for which NoSQL databases are most often considered: update- and lookup-intensive OLTP workloads, not query-intensive data warehousing workloads. We do not consider document repositories or other specialized workloads for which NoSQL systems may be well suited.
There are two ways to improve OLTP performance; namely, provide automatic “sharding” over a shared-nothing processing environment and improve per-server OLTP performance.
In the first case, one improves performance by providing scalability as nodes are added to a computing environment; in the second case, one improves performance of individual nodes. Every serious SQL DBMS (e.g., Greenplum, Asterdata, Vertica, Paraccel, etc.) written in the last 10 years has provided shared nothing scalability, and any new effort would be remiss if it did not do likewise. Hence, this component of performance should be “table stakes” for any DBMS. In my opinion, nobody should ever run a DBMS that does not provide automatic sharding over computing nodes.
As a result, this posting continues with the other component, namely, single node OLTP performance. The overhead associated with OLTP databases in traditional SQL systems has little to do with SQL, which is why “NoSQL” is such a misnomer.
Instead, the major overhead in an OLTP SQL DBMS is communicating with the DBMS using ODBC or JDBC. Essentially all applications that are performance sensitive use a stored-procedure interface to run application logic inside the DBMS and avoid the crippling overhead of back-and-forth communication between the application and the DBMS. The other alternative is to run the DBMS in the same address space as the application, thereby giving up any pretense of access control or security. Such embeddable DBMSs are reasonable in some environments, but not for mainstream OLTP, where security is a big deal.
Using either stored procedures or embedding, the useful work component is a very small percentage of total transaction cost, for today’s OLTP data bases which usually fit in main memory. Instead, a recent paper [1] calculated that total OLTP time was divided almost equally between the following four overhead components:
Logging: Traditional databases write everything twice; once to the database and once to the log. Moreover, the log must be forced to disk, to guarantee transaction durability. Logging is, therefore, an expensive operation.
Locking: Before touching a record, a transaction must set a lock on it in the lock table. This is an overhead-intensive operation.
Latching: Updates to shared data structures (B-trees, the lock table, resource tables, etc.) must be done carefully in a multi-threaded environment. Typically, this is done with short-term duration latches, which are another considerable source of overhead.
Buffer Management: Data in traditional systems is stored on fixed-size disk pages. A buffer pool manages which set of disk pages is cached in memory at any given time. Moreover, records must be located on pages and the field boundaries identified. Again, these operations are overhead intensive.
If one eliminates any one of the above overhead components, one speeds up a DBMS by 25%. Eliminate three and your speedup is limited by a factor of two. You must get rid of all four to run a lot faster.
Although the No SQL systems have a variety of different features, there are some common themes. First, many manage data that is distributed across multiple sites, and provide the “table stakes” noted above. Obviously, a well-designed multi-site system, whether based on SQL or something else, is way more scalable than a single-site system.
Second, many No SQL systems are disk-based and retain a buffer pool as well as a multi-threaded architecture. This will leave intact two of the four sources of overhead above.
Concerning transactions, there is often support for only single record transactions and an eventual consistency replica system, which assumes that transactions are commutative. In effect the “gold standard” of ACID transactions is sacrificed for performance.
However, the net-net is that the single-node performance of a NoSQL, disk-based, non-ACID, multithreaded system is limited to be a modest factor faster than a well-designed stored-procedure SQL OLTP engine. In essence, ACID transactions are jettisoned for a modest performance boost, and this performance boost has nothing to do with SQL.
However, it is possible to have one’s cake and eat it too. To go fast, one needs to have a stored procedure interface to a run-time system, which compiles a high-level language (for example, SQL) into low level code. Moreover, one has to get rid of all of the above four sources of overhead.
A recent project [2] clearly indicated that this is doable, and showed blazing performance on TPC-C. Watch for commercial versions of these and similar ideas with open source packaging. Hence, I fully expect very high speed, open-source SQL engines in the near future that provide automatic sharding. Moreover, they will continue to provide ACID transactions along with the increased programmer productivity, lower maintenance, and better data independence afforded by SQL. Hence, high performance does not require jettisoning either SQL or ACID transactions.
In summary, blinding performance depends on removing overhead. Such overhead has nothing to do with SQL, but instead revolves around traditional implementations of ACID transactions, multi-threading, and disk management. To go wildly faster, one must remove all four sources of overhead, discussed above. This is possible in either a SQL context or some other context.
References
[1] S. Harizopoulos, et. al., “Through the Looking Glass, and What We Found There,” Proc. 2008 SIGMOD Conference, Vancouver, B.C., June 2008
[2] M. Stonebraker, et. al., “The End of an Architectural Era (It’s Time for a Complete Rewrite),” Proc 2007 VLDB Conference, Vienna, Austria, Sept. 2007
Disclosure: Michael Stonebraker is associated with four startups that are either producers or consumers of data base technology. Hence, his opinions should be considered in this light.
It appears that Racine is proposing a return to a hierarchical data model (popularized by IBM's IMS in the 1960s). In such a system joins are avoided by being "prebuilt" into the definition of the hierarchy. Hence, the line items of a purchase order would be clustered together and stored with their parent purchase order. A hierarchical data model will work well for certain kinds of data (for example documents); however, history has shown it to be unsuitable for a general purpose data model. Instead of writing a long-winded response, I will just make two quick points:
Codd's 1970 CACM paper presents a very clear exposition of why hierarchies don't work in general. In fact, Codd's work was mainly motivated by the problems that IMS customers were having "shoe-horning" their data into a hierarchical model.
IMS was extended in the 1970's with so-called logical data bases, which extended IMS to support non-hierarchical data. Hence, the limitations of hierarchies were well understood long ago.
In summary, there has been about 40 years of DBMS research and experience on data models. Much of this is well recorded in the literature (especially in the 1970's) and is discussed in all major DBMS textbooks. I would refer hierarchical enthusiasts to these sources for a comparison of data models.
Michael Stonebraker, July 6, 2010
Ahhh this is a very clarifying article on the alleged differences between sql and nosql.
the way nosql is hyped it's as though it's found a way to operate without the "impedance mismatch" between the constraints imposed by hardware and the need for speed. I always wondered what that way might have been.
Data is to be transmitted, written, stored and retrieved. Just how much performance is to be squeezed out of any approach given those requirements and the requirement that a solution actually manifest itself in the real world? ?
Thanks so much for providing this no-specialist with a solid foundation on which to evaluate competing claims.
Displaying comments 11 - 12 of 12 in total