Advantages of Software Database Management Systems

November 19, 2012

insideanalysis insideanalysis.com

The data warehousing community has always made room for high performance database management systems (DBMSs) that used proprietary hardware, because massive ingest rates and fast response times for big data analytics were not achievable on standard hardware. Today, however, standard x86 hardware combined with next generation software DBMSs can deliver the goods at a much lower cost, along with many other advantages inherent to software running on standard hardware.
 
Evolution of Standard Hardware
With few exceptions, big data warehouse appliances (i.e., DBMS software configured on vendor-engineered hardware) use massively parallel processing (MPP) architectures to reliably deliver performance for big data analytics and information management. By “big” we’re talking about systems handling terabytes to petabytes of data. MPP architectures divvy up a workload among independent processors, each working on its own slice of the data, so that adding nodes can deliver near-linear speed-up gains.
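The MPP idea described above can be sketched in a few lines of Python. This is a toy illustration only, not how any particular appliance or DBMS is implemented: the function names (`partition`, `local_sum`, `mpp_sum`) are hypothetical, the "nodes" are simulated with threads in one process, and CRC32 is an assumed stand-in for whatever distribution hash a real system uses. It shows the data flow only (partition rows by key, aggregate each partition independently, merge the partial results), not real speed-up.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(rows, num_nodes):
    """Assign each (key, value) row to one simulated node by hashing its key."""
    shards = [[] for _ in range(num_nodes)]
    for key, value in rows:
        # CRC32 is used here only because it is stable across runs (assumption).
        node = zlib.crc32(key.encode("utf-8")) % num_nodes
        shards[node].append((key, value))
    return shards


def local_sum(shard):
    """Work done independently on one node: sum the values per key."""
    totals = Counter()
    for key, value in shard:
        totals[key] += value
    return totals


def mpp_sum(rows, num_nodes):
    """Partition the rows, aggregate each shard separately, then merge."""
    shards = partition(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_sum, shards))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # adds the per-key partial sums together
    return dict(merged)


if __name__ == "__main__":
    sales = [("east", 10), ("west", 5), ("east", 7), ("north", 1)]
    print(mpp_sum(sales, num_nodes=4))
```

Because every row for a given key lands on the same node, the result is the same whether the work runs on one node or many; in a real MPP system the nodes would be separate servers, which is where the speed-up comes from.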
 
The secret sauce of the first generation of MPP data warehouse appliances in the 1990s (e.g., Teradata and Tandem) was their proprietary high speed networking between processing nodes. But today, affordable high speed network solutions, including 10 Gigabit Ethernet (10GbE) and InfiniBand, make proprietary network solutions unnecessary.
 
The next generation of turbo-charged appliances (e.g., Netezza and Kickfire) amped up performance even further with field programmable gate arrays (FPGAs), reconfigurable chips programmed to accelerate specific DBMS tasks. After the early 2000s, however, CPU evolution turned toward multi-core designs. After a slow start, this movement picked up momentum, first with two-core, then four-core and now eight-core CPUs shipping in volume. This is only the beginning, as further advances are on the way. With more cores, the advantages of co-processors such as FPGAs for computation decrease, and the disadvantages of moving data to and from co-processors increase. Memory bandwidth has increased in step with multi-core processors, making the CPU a much more powerful computation engine.
 
Shipments of servers and storage to host Hadoop farms are growing at a rate of nearly 60% per year: rack upon rack of off-the-shelf, relatively inexpensive x86 servers running Linux. Much of this gear is being used to land and assimilate big data, both structured and unstructured. The newer class of software-only MPP DBMSs can run on the same hardware. These solutions offer a variety of methods for sharing data between Hadoop and database processes. Like Hadoop, software-based DBMSs allow customers to scale in whatever increments they need, for example by adding a single node or 100 nodes.
 
Business Intelligence and Data Warehousing in the Cloud
Cloud computing is the biggest IT game-changer in decades. Estimates of the market growth rates for public cloud products and services keep getting revised upward. In fact, conservative estimates predict 20% annual growth on a base of roughly $110 billion in 2012 in the U.S. alone. On the face of it, the case for moving to the cloud is killer: pay for use, capacity on demand, always-on operations, economies of scale through resource pooling and sharing, price and performance competition between cloud providers (versus captive data centers), IT headcount reduction, and more.
 
Countless young companies are growing up knowing only public clouds as their IT infrastructure. Their IT departments are skeletal or non-existent, relying instead on software as a service (SaaS), platform as a service (PaaS) and database as a service (DBaaS). Established companies are moving toward public clouds more cautiously but measurably, starting with one-time projects and shadow IT operations. Private clouds are growing rapidly as well: spending on private cloud infrastructure in 2011 was about $11 billion, compared to $6 billion just a year earlier.
 
New BI tools are emerging that are built strictly for cloud-based business intelligence, and several established tools are pivoting hard into the cloud business. For a data mart, a data warehouse or an analytic sandbox to run in the cloud, public or private, the DBMS must be software that runs on virtualized hardware. Data warehouse appliances, which are tied to vendor-engineered hardware, don’t fit the cloud paradigm. The good news is that there are MPP software solutions out there to fit the bill.
 
Empowering the Front Line
The era of cloud computing and open source software has ushered in a renaissance in business technology sales methods as well. It’s characterized by free software downloads, try-before-you-buy products and free community editions. Consumer-workers are on the front line of the procurement process, and they are likely to become users of a product before becoming customers. Software-based DBMS solutions empower the consumer-worker when vendors let users download the software or provision it in the cloud at no cost.
 
Conclusion
“Cost take-out” is a widespread mandate for IT leadership. Standard hardware and cloud computing help to drive down costs massively. Data warehouse appliances have generated abundant revenue streams for their parent companies, and their value in enabling data-driven organizations is worth even more than their cost. Now, however, customers can look to a new generation of software MPP DBMSs to deliver big data-driven solutions on standard hardware and in the cloud.