Every business wants real-time analytics. Why should you wait overnight for batch analytics and data-integration jobs to understand your customer data better? Failure to adapt to today’s data-driven economy, where the game changes minute to minute, can mean business failure for a startup or for a new innovation project in a public corporation. But should the mere potential for better business results really cost millions of dollars in upfront investment?
That’s exactly what we heard this week with Oracle’s Big Memory Machine announcement. As the saying goes, “when you make hammers, everything is a nail.” And when you’ve spent billions on a proprietary hardware company, you’ll do anything to create demand :-). Especially when former friends you’ve ticked off, like SAP and HP, are eating your lunch with columnar analytics databases like HANA and Vertica, respectively.
Here are the top three reasons that the Oracle M6 Big Memory Machine is simply the wrong answer for the future of the database.
1) Specialized Compute vs. Standard Compute.
No one except Oracle is optimizing their software for SPARC, as much as I admire their technology. Hey, my products at Veritas made gazillions on the Sun Platform 12 years ago, when the internet was in its first big commercialization wave. But the reality is that x86 multi-core processors can now be clustered together in modular building blocks for high performance, and they will continue their relentless commoditization of big RISC machines. While Oracle’s announcement carefully compared its server price-performance to 32×2-core Intel servers, what really matters is that your database architecture can take advantage of modular, pay-as-you-grow building blocks for linear scale. That’s the difference between legacy scale-up databases and the new breed of scale-out databases that are optimized for cloud computing.
2) RAM vs. FLASH Memory for Real-Time Analytics.
In-Memory is all the rage nowadays as a way to work around fundamental software architecture limitations, and that’s not just true of Oracle. The theory goes “someday RAM will be cheap, so put everything in one big memory image,” and hey, the benchmarks look great! Who cares if you lose valuable data when the lights go out? The reality is that FLASH Memory, especially in-server FLASH, is cheaper for terabytes of data and it’s ultra-reliable, so it’s far better to use a lot of FLASH Memory and to size your RAM for hot, frequently accessed data. A particularly good analysis of why FLASH memory is the right answer for the future of the database is this article by David Floyer on Wikibon.
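To make the "RAM for hot data, FLASH for everything else" idea concrete, here is a minimal Python sketch. It is a toy model, not how any particular database implements tiering: the `TieredStore` class and its method names are made up for illustration, and a plain dictionary stands in for persistent flash storage. RAM holds only the most recently used keys, every write also lands on the durable tier, and a simulated power loss wipes RAM without losing data.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier store: a small RAM cache in front of a durable tier.

    Hypothetical illustration only. The `flash` dict is a stand-in for
    persistent flash storage; a real system would write to a device.
    """

    def __init__(self, ram_capacity):
        self.ram_capacity = ram_capacity
        self.ram = OrderedDict()  # hot tier, least recently used first
        self.flash = {}           # durable tier holding the full data set
        self.ram_hits = 0
        self.flash_reads = 0

    def put(self, key, value):
        # Write through, so the durable tier always has every record.
        self.flash[key] = value
        self._cache(key, value)

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)  # mark as most recently used
            self.ram_hits += 1
            return self.ram[key]
        value = self.flash[key]        # cold read from the durable tier
        self.flash_reads += 1
        self._cache(key, value)
        return value

    def _cache(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)
        # Evict least recently used entries once RAM is over capacity.
        while len(self.ram) > self.ram_capacity:
            self.ram.popitem(last=False)

    def simulate_power_loss(self):
        # RAM contents vanish; the durable tier is untouched.
        self.ram.clear()
```

Sizing `ram_capacity` to the working set keeps most reads in RAM, while the full data set lives on the cheaper, persistent tier.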
3) “Real-Time” Analytics Means Live Operational Data.
One thing we think Oracle got right is using Hadoop to process fast-moving streams of unstructured customer data from social media and then storing the results in a relational database, where they can be joined with structured customer data for analysis. We just think that, wherever possible, this should be done in a live operational database that can handle high-concurrency OLTP and split-second analytics responses, because it is radically simpler. The biggest companies in the world will always invest in specialized data warehouses, but 80% of the market needs a better answer.
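As a rough sketch of that join, the Python snippet below uses the standard-library `sqlite3` module as a stand-in for an operational SQL database. Everything in it is an assumption for illustration: the `customers` and `social_sentiment` tables, their columns, and the demo rows are invented, and the sentiment table simply plays the role of results produced by an upstream Hadoop job.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name TEXT,
            lifetime_spend REAL
        );
        -- Hypothetical output of an upstream batch job over social media data.
        CREATE TABLE social_sentiment (
            customer_id INTEGER,
            mentions INTEGER,
            avg_sentiment REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Ada", 5200.0), (2, "Grace", 310.0), (3, "Linus", 8900.0)],
    )
    conn.executemany(
        "INSERT INTO social_sentiment VALUES (?, ?, ?)",
        [(1, 14, -0.6), (2, 3, -0.8), (3, 22, 0.4)],
    )
    return conn


def unhappy_high_value_customers(conn, min_spend, max_sentiment):
    """Join structured customer data with the derived sentiment results."""
    cursor = conn.execute(
        """
        SELECT c.name, c.lifetime_spend, s.avg_sentiment
        FROM customers AS c
        JOIN social_sentiment AS s ON s.customer_id = c.customer_id
        WHERE c.lifetime_spend >= ? AND s.avg_sentiment <= ?
        ORDER BY c.lifetime_spend DESC
        """,
        (min_spend, max_sentiment),
    )
    return cursor.fetchall()
```

Calling `unhappy_high_value_customers(build_demo_db(), 1000, 0)` returns the high-spend customers whose social sentiment is negative, which is the kind of question that benefits from running against live operational data.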
At Clustrix we believe the real database platforms of the future are made up of distributed databases + Flash Memory + commodity compute. The combination of NoSQL, NewSQL, and Hadoop presents a real alternative to Oracle for the first time in 20 years, and this paradigm is gaining speed in the enterprise.
The expensive hardware tricks we heard about yet again from Oracle are really only about extending the life of its scale-up database architecture for its top customers. Over 20 years of engineering have gone into optimizations for SMP architectures with local memory, so it makes sense for some customers to throw hardware and money at the problem. But it is clear by now that distributed scale-out databases are the right answer for any new hyper-scale, data-rich application.
At Clustrix we’ve been doing this for years for real-world customers and we believe scale-out SQL is the right answer. What’s really cool is Google’s recent announcement that their franchise AdWords application runs on a home-grown F1 hybrid system, i.e. a scalable SQL database over 100TB in size that serves up trillions of ads and scans tens of trillions of data rows every day.
It runs on the commodity hardware in Google’s datacenters and has five nines of availability. It is nice to keep good company.