With the rise of big data technologies like Apache Hadoop, MapReduce and the bevy of open source products growing up around them, the good old traditional SQL database has been losing mindshare with some fairly influential application developers.
And it's not difficult to understand why. Big data is hot, and there have been plenty of headlines over the last few years questioning the long-term viability of SQL in the era of unstructured data. It's no surprise that many developers want to follow the lead of the big data pioneers at Google and Facebook -- but the desire to go big isn't always a practical one.
Just ask Tim O'Brien, an author and independent consultant who specializes in helping companies work more effectively with developers. O'Brien, who spoke about the future of relational databases at the recent O'Reilly Strata Conference in Santa Clara, believes that when one looks at the history of IT over the last several years, it's easy to understand why attitudes have changed.
"There is a certain kind of developer that is really focused on the trends that are being set by that group of 50 people that do big architecture at a place like Facebook or Google," O'Brien said during a phone call after the conference. "The conclusion that they came to in the last couple of years is: 'We would never use a relational database. Relational databases don't scale.'"
While well-funded startups and organizations that crunch large volumes of data, such as the Chicago Mercantile Exchange, NASDAQ and the Internal Revenue Service, will follow the lead of the likes of Google, the average company will continue to find that SQL is the right tool for most development projects for the foreseeable future, according to O'Brien.
O'Brien offers three main reasons why organizations in general "can't escape SQL development." For starters, it's a language that has a great deal of inertia, he said. The majority of development tools and platforms, such as Ruby on Rails, use SQL. Secondly, it's the best query language available. Lastly, SQL was originally created as a way to help organizations work more easily with multiple vendors' databases -- and O'Brien predicts that SQL's ability to unify will continue to be important for years to come.
"I think the big data community is focused on creating this perception that the world is changing right now, and if you continue to use that old relational database technology, you're just going to be an old useless man working on old useless systems," O'Brien said. "And I think that's false."
O'Brien went on to suggest that in the next few years the "traditional" SQL database may evolve into something better and far more scalable -- something that blurs the lines between big data technology and more familiar database management systems. He pointed to Google's Spanner database as one possible example of things to come.
"I think Spanner points the way toward the future of big data for most companies," he said. "The important thing about Spanner is that it's SQL-based, it provides transactions, it is horizontally scalable -- and that's the big difference."
For more on the 2013 O'Reilly Strata conference and big data vs. SQL:
Find out why one Strata speaker called for a greater emphasis on data scientist education
Learn the key characteristics for success in data science
Another company that offers a possible glimpse of how SQL fits into the future is Drawn to Scale, which bills its Spire product as "the first database for large, user-facing applications built on Hadoop." Spire supports SQL and MongoDB queries in addition to MapReduce, and is built to power large-scale websites, mobile deployments and other applications.
"There is no reason why you can't use SQL to query everything, right? That is already happening. People are using SQL to query Hadoop," O'Brien said. "Fast forward 20 years and I don't care how the database is deployed to me as a developer. I'm just executing a SQL query and getting a result back. It's like the difference between a cloud-based Linux machine and a real Linux machine. It's the interface that defines the experience."
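O'Brien's point about the interface can be illustrated with a minimal sketch. It assumes Python's standard database API, with the built-in sqlite3 module standing in for whatever engine actually sits behind the connection. The visits table, its columns and the top_pages helper are hypothetical and do not come from any product mentioned in this article.

```python
import sqlite3


def top_pages(conn, limit):
    """Return the most-visited pages using plain SQL.

    The function relies only on the connection accepting SQL text;
    it does not know or care how the data is stored underneath.
    """
    cur = conn.cursor()
    cur.execute(
        "SELECT page, COUNT(*) AS hits "
        "FROM visits GROUP BY page "
        "ORDER BY hits DESC, page ASC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


# Stand-in backend: an in-memory SQLite database. This is an assumption
# made for the sketch; a horizontally scalable SQL engine would be
# swapped in at this point.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (page TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("/home", 1), ("/home", 2), ("/pricing", 1),
        ("/home", 3), ("/pricing", 4), ("/docs", 2),
    ],
)

print(top_pages(conn, 2))
```

In principle only the connection setup would change if a different SQL engine were used, although details such as the `?` placeholder style vary from one database driver to another.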
When it comes time to develop a big application or website, it's important to avoid the hype and simply pick the right tool for the job. While there may be a temptation to discount relational databases altogether and go straight to big data technologies, it's important to weigh both approaches against the needs of the job at hand. Conference attendee Felix Giguere Villegas, a distributed systems specialist who runs the Big Data Montreal user group, said he agrees with that point.
"For analyzing logs, you're probably better off with a tool like Hadoop," he said. "But for a lot of use cases, SQL does the trick quite well, especially at the scale most of us run at and especially considering the skills that are available in the marketplace at the moment."
Giguere Villegas went on to say that he would welcome any big data technologies that incorporate SQL. He said some of the SQL engines that run on top of Hadoop -- such as Cloudera Impala -- are proving that horizontal scalability for SQL is possible. The only problem is that these offerings do not boast the same level of maturity as the popular relational databases of today.
"SQL is a very useful abstraction and, of course, there is momentum behind the fact that a bunch of people know it," Giguere Villegas said. "But it's not just momentum that is going to keep it there. It is genuinely useful to have SQL, and if we can have a mature, working, interactive, scalable SQL solution on top of a big data platform, that would be a big boon for everyone."
About the author:
Mark Brunelli is news director for the Business Applications and Architecture Media Group at TechTarget. After a five-year stint covering crime and local politics for The Boston Globe, Brunelli joined TechTarget in 2000. He has since covered a wide range of technologies, running the gamut from hardware to software. Email him at firstname.lastname@example.org and follow him on Twitter: @Brunola88.