Dropbox Moving Data In-House Shows Everyone It’s About the Best Solution, Not the Newest Buzzword

Today Dropbox released a blog post announcing their multi-year effort to bring their file storage in-house and almost completely off of Amazon S3. They have kept their metadata services in-house since the beginning. Had they not announced it, nobody except Amazon and Dropbox would have known.

The very definition of a successful IT Engineering project.

In this simple blog post, they also highlighted what truly competent IT Operations staff do every day.

  1. They used the best tools for the job to solve their problem.

    When Dropbox started, they had nowhere near the expertise to build out a storage infrastructure that would scale at a pace to suit their customers’ needs. So they used Amazon S3 to solve their scalability issues. It worked like a dream.

    And all the while, they were gathering data and figuring out what was important to their application from an infrastructure point of view.

  2. They adapted as they learned more.

    Dropbox has grown to over half-a-billion users and 500 petabytes of data since 2008.

    They started their ‘Magic Pocket’ project to bring their data in-house in 2013. That means they have SPENT HALF OF THEIR COMPANY’S LIFE WORKING ON THIS SOLUTION.

    They spent years gathering data on how to build the best-performing infrastructure for their unique use case. And then they spent years developing it. In a world consumed with ‘release early, release often’, they took the tack that defined success for them.

  3. They didn’t get caught up in buzzwords.

    Obviously they haven’t released details on this new infrastructure. But I would bet anyone lunch that this infrastructure isn’t ‘hyperconverged’. It’s going to end up being a properly tuned, layered, robust infrastructure.

    They also didn’t ‘rush to the cloud’. They actually moved AWAY from Amazon. While they will adopt a Hybrid Cloud approach for at least some of their regions (details were a little fuzzy), they have brought 90% of their data in-house at this point.

Dropbox IS a cloud application. It has an API, and mobile apps, and a clean interface and everything else a cloudy thing is supposed to have. But here they are solving their problems with a good solution tailored to their needs instead of the latest buzzword. There’s no school like the old school. Sometimes. As long as the old school is in a container to future-proof it.

Atomic Host and Kubernetes Clusters Made Easy(ish)

Recently I got to go out to visit a customer and talk about containers. Even though I call containers parlor tricks, talking about them is (seriously) one of my favorite things to do. They had some questions about container performance tuning as well as how to run an internal registry.

So I came up with a ~2-hour workshop to run with them. I put it out on GitHub so they could access the code afterward if they wanted. I had a few realizations while I was putting this together.

  1. Atomic Host is getting really easy to configure. Back in the 7.0 days you really had to be double-jointed to configure a kubernetes cluster. In 7.2, you edit 3 files per cluster member (master or node), and the total comes to around 8 lines. That doesn’t include flannel or your SDN solution of choice. (A rough sketch of what those edits look like follows this list.)
  2. NFS as persistent storage for a multi-node docker-registry replication controller is way harder than it should be. There are several bugs out there (Red Hat as well as upstream) describing issues when you have a multi-container docker-registry rc and have it use NFS to store the registry data.

    Once I thought it through, it made sense. NFS (especially NFSv4) uses client-side caching to make writes more efficient. Since both pods are in play for these writes, the verification in the registry code barfs all over itself when container A looks for data that is still sitting in the NFS write cache inside container B.

    There are work-arounds involving the NFS server settings as well as the k8s service definition to influence how kubernetes places pods and routes traffic. It works for demos, but I would have mountains of fear trying this in a production environment. (A hedged sketch of one such mitigation also follows this list.)

  3. OMG ANSIBLE IS AWESOME. I hadn’t really had a chance to use ansible to solve a problem before, so I used this project to start getting used to the technology a little. I watched some videos where the ansible folks said it had become the de facto language for defining an infrastructure. I totally see that now.

    I included the ansible playbook as well as all of the templates in the GitHub repo, along with the asciidoc for the workshop itself. I intentionally kept it simple, so people who hadn’t used ansible before could see what work was happening and where it was coming from. I can’t wait to get deeper into it. (The last sketch after this list shows the rough shape such a playbook takes.)
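
To make item 1 concrete, here is a minimal sketch of the kind of edits involved, written as Ansible lineinfile tasks against a stock RHEL Atomic Host 7.2 layout. The group names, hostnames, and exact values are illustrative assumptions, not the settings from the workshop repo.

    # Sketch only: the handful of sysconfig lines that turn stock Atomic hosts
    # into a basic kubernetes cluster. kube-master.example.com is a placeholder.
    - hosts: masters
      tasks:
        - name: etcd listens on all interfaces
          lineinfile:
            dest: /etc/etcd/etcd.conf
            regexp: '^ETCD_LISTEN_CLIENT_URLS='
            line: 'ETCD_LISTEN_CLIENT_URLS="http://0.0.0.0:2379"'
        - name: API server listens on all interfaces
          lineinfile:
            dest: /etc/kubernetes/apiserver
            regexp: '^KUBE_API_ADDRESS='
            line: 'KUBE_API_ADDRESS="--address=0.0.0.0"'
        - name: API server talks to etcd on the master
          lineinfile:
            dest: /etc/kubernetes/apiserver
            regexp: '^KUBE_ETCD_SERVERS='
            line: 'KUBE_ETCD_SERVERS="--etcd-servers=http://kube-master.example.com:2379"'

    - hosts: nodes
      tasks:
        - name: kubelet and proxy point at the master
          lineinfile:
            dest: /etc/kubernetes/config
            regexp: '^KUBE_MASTER='
            line: 'KUBE_MASTER="--master=http://kube-master.example.com:8080"'
        - name: kubelet registers with the API server
          lineinfile:
            dest: /etc/kubernetes/kubelet
            regexp: '^KUBELET_API_SERVER='
            line: 'KUBELET_API_SERVER="--api-servers=http://kube-master.example.com:8080"'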
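
For item 2, the sketch below shows one partial mitigation, not necessarily the one the workshop uses: back the registry with an NFS PersistentVolume and set sessionAffinity on the registry Service so a given client keeps hitting the same registry pod instead of racing another pod’s NFS client cache. The server name, export path, sizes, and labels are placeholders.

    # Hypothetical NFS-backed volume for the registry data.
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: registry-storage
    spec:
      capacity:
        storage: 20Gi
      accessModes:
        - ReadWriteMany
      nfs:
        server: nfs.example.com      # placeholder NFS server
        path: /exports/registry      # placeholder export
    ---
    # Pin each client to a single registry pod so a layer pushed through pod A
    # is never looked up in pod B while it still sits in A's NFS write cache.
    apiVersion: v1
    kind: Service
    metadata:
      name: docker-registry
    spec:
      selector:
        app: docker-registry
      sessionAffinity: ClientIP
      ports:
        - port: 5000
          targetPort: 5000

Even with that, two different clients can still see two different pods’ views of the backend, which is why this stays firmly in demo territory for me.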
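
And for item 3, this is roughly the shape the playbook takes: describe the hosts and the desired state as data, and let ansible converge them. The group names, template names, and paths here are made up for illustration; the real playbook and templates live in the workshop’s GitHub repo.

    # Rough playbook shape, illustrative only: one play per host group,
    # templates for the kubernetes objects, services running and enabled.
    - hosts: kube_masters
      become: yes
      tasks:
        - name: master services are running and enabled
          service:
            name: "{{ item }}"
            state: started
            enabled: yes
          with_items:
            - etcd
            - kube-apiserver
            - kube-controller-manager
            - kube-scheduler

    - hosts: kube_nodes
      become: yes
      tasks:
        - name: render the docker-registry definitions from templates
          template:
            src: "{{ item }}.j2"
            dest: "/root/{{ item }}"
          with_items:
            - docker-registry-rc.yaml
            - docker-registry-service.yaml

        - name: node services are running and enabled
          service:
            name: "{{ item }}"
            state: started
            enabled: yes
          with_items:
            - docker
            - kubelet
            - kube-proxy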


VMWare – A Cautionary Tale for Docker?

Of course VMWare has made a ton of money over the last ~12 years. They won every battle in ‘The Hypervisor Wars’. Now, at the turn of 2015, it looks to me like they’ve lost the wars themselves.

What? Am I crazy? VMWare has made stockholders a TON of money over the years. There’s certainly no denying that. They also have a stable, robust core product. So how did they lose? They lost because there’s not a war to fight anymore.

Virtualization has become a commodity. The workflows and business processes surrounding virtualization are where VMWare has spent the lion’s share of their R&D budget over the years. And now that is the least important part of virtualization. With KVM being the default hypervisor for OpenStack, those workflows have been abstracted higher up the Operations tool chain. Sure, there will always be profit margins in commodities like virtualization. But the sizzle is gone. And in IT today, if your company doesn’t have sizzle, you’re a target for the wolves.

Of course Docker and VMWare are very different companies. Docker, Inc. has had its code out as an open source project for ages. They also have an incredibly engaged (if not always listened-to) community around it. They had the genius idea, not of containers, but of making containers easily portable between systems. It’s a once-in-a-lifetime idea, and it is revolutionizing how we create and deliver software.

But as an idea, there isn’t a ton of money in it. Sure, Docker got a ton of VC money to go out and build a business around this idea. But where are they building that business?

I’m not saying the products they’ve built around it aren’t good. Most of them have value. But they are all business-process improvements on top of their original idea (docker-style containers).

VMWare had a good (some would call great) run by wrapping business process improvements around their take on a hypervisor. Unfortunately, they now find themselves trying to play catch-up as they shoehorn new ideas like IaaS and containers into their suddenly antiquated business model.

I don’t have an answer here, because I’m no closer to internal Docker, Inc. strategy meetings than I am to Mars. But I do wonder if they are working on their next great idea, or if they are focused on taking a great idea and making a decent business around it. It has proven to be penny-wise for them. But will it be pound-foolish? VMWare may have some interesting insights on that.