Clustrix just exhibited at two good shows: Structure and Velocity. I spent my time at Structure. One of the highlights of the show for Clustrix was the "The Future of SQL In the Cloud" panel that Paul was on. There was some good discussion about SQL vs. NoSQL and the different products available for scaling your database. The title is a bit of a misnomer, however. Database challenges are orthogonal to the cloud. The database is a critical challenge no matter how the application is deployed.
But this isn’t what surprised me the most at Structure. That came in the "How Does a Company Scale in Real Time?" panel. This panel had representatives from PayPal, Engine Yard, Yahoo!, Facebook, and Zynga. These panelists are all responsible for ensuring their products continue to function smoothly, and their companies are all recognizable and successful. I was really hoping to hear some good insight into operating large-scale web properties and perhaps some good advice for new web startups. I think the money question came at around 16:30 in the video. Jonathan asked the panelists to rewind the clock and say what they would do differently, and to advise startups on what to focus on to avoid application bottlenecks. There was a lot of advice offered. There was talk of establishing a single sign-on infrastructure. There was emphasis on avoiding SQL entirely, with the claim that it is far and away the biggest scalability problem in web architectures today. There was a suggestion that you should invest up front in separating out your architecture: think hard about the abstraction layers in key parts of your system and get the infrastructure right first. Another panelist talked about getting the right instrumentation and metrics into your application. You should put in the right levels of abstraction and the right levels of caching. You need to force yourself to run your application on two servers. Think about sharding and replicating at the beginning, and force your scaling challenges to come up in advance, before you have to scale.
None of this advice resonated with me. Startups have enough challenges finding the right business model and securing the initial set of customers. Why would you spend your very limited money and, more importantly, time on the infrastructure and architecture of the back end before you’ve proved your business? Matthew Mengerink, VP of Customer Quality, Engineering Services, and Site Operations at PayPal, finally stepped in and made some sense. He said he would do nothing differently. To him, a startup working on architecture is a waste of a dollar. He advised spending your time and money on making the business model work. Thank you, Matthew!
This discussion reminded me of a post titled "Start In the Middle" that I read a while ago. It offers a bit of advice that should be obvious but so often is not: solve the interesting bit of the problem first. Prove there’s value in the core idea, and then flesh out the infrastructure around it. Do what makes your business unique first and put the majority of your time into that. Delay building the rest of the surrounding architecture until it’s really needed or, when possible, just buy those bits. The rest of that BS is not what your customers see and not what makes you money.
So what’s my advice? Make the code and architecture as simple as you possibly can, but no simpler. Don’t abstract too early and don’t make premature optimizations. Rewriting code and adding abstractions later is not a sign of coding failure; rather, it’s a sign the product is successful. Choose the data model that fits your problem. That is frequently SQL. Full transactional and relational SQL is exceptionally expressive, and it fits many problems well. Why wouldn’t you use it? Don’t reinvent things that are not part of your core business. When the product becomes successful and the load starts to ramp, collect hard data to find the bottlenecks and fix them when they become a problem. When your data shows that your database is that bottleneck, perhaps Clustrix can help.