Dropbox Moving Data IN-House shows everyone it’s about the best solution, not the newest buzzword

Today Dropbox released a blog post announcing their multi-year effort to bring their file storage in-house and almost completely off of Amazon S3. They have kept their metadata services in-house since the beginning. Had they not announced it, nobody except Amazon and Dropbox would have known.

The very definition of a successful IT Engineering project.

In this simple blog post they also highlighted what truly competent IT Operations staff do every day.

  1. They used the best tools for the job to solve their problem.

    When Dropbox started they had nowhere near the expertise to build out a storage infrastructure that would scale at a pace to suit their customers’ needs. So they used Amazon S3 to solve their scalability issues. It worked like a dream.

    And all the while they were gathering data and figuring out what was important to their application from the view of their infrastructure.

  2. They adapted as they learned more

    Dropbox has grown to over half-a-billion users and 500 petabytes of data since 2008.

    They started their ‘Magic Pocket’ project to bring their data in-house in 2013. That means they have SPENT HALF OF THEIR COMPANY’S LIFE WORKING ON THIS SOLUTION.

    They spent years cultivating data on how to build the best-performing infrastructure for their unique use case. And then they spent years developing it. In a world consumed with ‘release early, release often’ they decided to take the tack that defined success for them.

  3. They didn’t get caught up in buzzwords.

    Obviously they haven’t released details on this new infrastructure. But I would bet anyone lunch that this infrastructure isn’t ‘hyperconverged’. It’s going to end up being a properly tuned, layered, robust infrastructure.

    They also didn’t ‘rush to the cloud’. They actually moved AWAY from Amazon. While they will adopt a Hybrid Cloud approach for at least some of their regions (details were a little fuzzy), they have in-housed 90% of their data at this point.

Dropbox IS a cloud application. It has an API, and mobile apps, and a clean interface and everything else a cloudy thing is supposed to have. But here they are solving their problems with a good solution tailored to their needs instead of the latest buzzword. There’s no school like the old school. Sometimes. As long as the old school is in a container to future-proof it.

Atomic Host and Kubernetes Clusters Made Easy(ish)

Recently I got to go out to visit a customer and talk about containers. Even though I call containers parlor tricks, it (seriously) is one of my favorite things to do. They had some questions about container performance tuning as well as how to run an internal registry.

So I came up with a ~2 hour workshop to have with them. I put it out on GitHub so they could access the code afterward if they wanted. I had a few realizations while I was putting this together.

  1. Atomic Host is getting really easy to configure. Back in the 7.0 days you really had to be double-jointed to configure a kubernetes cluster. In 7.2, you edit 3 files per cluster member (master or node). The total lines edited is around 8. That doesn’t include flannel or your SDN solution of choice.
  2. NFS as persistent storage for a multi-node replication controller for docker-registry is way harder than it should be. There are several bugs out there (Red Hat as well as upstream) that show issues when you have a multi-container docker-registry rc and have it use NFS to store the registry data.

    Once I thought this through it made sense. NFS (especially NFSv4) uses client-side caching to make writes more efficient. Since both pods are in play for these writes, the confirmation in the registry code barfs all over itself when container A looks for data that is still in the NFS write cache inside container B.

    There are work-arounds with the NFS server settings as well as the k8s service definition to tweak the kubernetes scheduler. It works for demos, but I would have mountains of fear trying this for a production environment.

  3. OMG ANSIBLE IS AWESOME. I hadn’t really had a chance to use ansible to solve a problem. So I used this project to start to get used to the technology a little. I watched some videos where the ansible folks said it had become the de facto language to define an infrastructure. I totally see that now. I can’t wait to learn more about it.

    I included the ansible playbook as well as all of the templates in the github repo along with the asciidoc for the workshop itself. I intentionally kept it simple, so people who hadn’t used it before could see what work was happening and where it was coming from. I can’t wait to need to get deeper into ansible.
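
The NFS problem from item 2 is easier to see in a toy model. The sketch below is not how NFS is actually implemented. The `SharedStore` and `CachingClient` classes are invented purely for illustration, and the only assumption is the one described above: a write is acknowledged from a client-side cache before the server has it.

```python
class SharedStore:
    """Stands in for the NFS export that both registry pods mount."""

    def __init__(self):
        self.blobs = {}


class CachingClient:
    """Stands in for one pod's NFS client with a write-back cache."""

    def __init__(self, store):
        self.store = store
        self.pending = {}  # writes that have not reached the server yet

    def write(self, name, data):
        # The write is acknowledged locally before the server has it.
        self.pending[name] = data

    def flush(self):
        # Push the cached writes out to the shared store.
        self.store.blobs.update(self.pending)
        self.pending.clear()

    def exists(self, name):
        # A client sees its own pending writes plus whatever is on the server.
        return name in self.pending or name in self.store.blobs


store = SharedStore()
pod_a = CachingClient(store)
pod_b = CachingClient(store)

pod_b.write("layer-1", b"image layer bytes")
seen_before_flush = pod_a.exists("layer-1")  # data is still in pod B's cache
pod_b.flush()
seen_after_flush = pod_a.exists("layer-1")   # data has reached the server

print(seen_before_flush, seen_after_flush)
```

With a single registry pod the two views never diverge, which is why the problem only shows up once the replication controller runs more than one container against the same export.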

 

VMWare – A Cautionary Tale for Docker?

Of course VMWare has made a ton of money over the last ~12 years. They won every battle in ‘The Hypervisor Wars‘.  Now, at the turn of 2015 it looks to me like they’ve lost the wars themselves.

What? Am I crazy? VMWare has made stockholders a TON of money over the years. There’s certainly no denying that. They also have a stable, robust core product. So how did they lose? They lost because there’s not a war to fight anymore.

Virtualization has become a commodity. The workflows and business processes surrounding virtualization are where VMWare has spent the lion’s share of their R&D budgets over the years. And now that is the least important part of virtualization. With kvm being the default hypervisor for OpenStack, those workflows have been abstracted higher up the Operations tool chain. Sure there will always be profit margins in commodities like virtualization. But the sizzle is gone. And in IT today, if your company doesn’t have sizzle, you’re a target for the wolves.

Of course docker and VMWare are very different companies. Docker, inc. has released its code as an open source project for ages. They also have an incredibly engaged (if not always listened to) community around it. They had the genius idea, not of containers, but of making containers easily portable between systems. It’s a once-in-a-lifetime idea, and it is revolutionizing how we create and deliver software.

But as an idea, there isn’t a ton of money in it.  Sure Docker got a ton of VC to go out and build a business around this idea. But where are they building that business?

I’m not saying these aren’t good products. Most of them have value. But they are all business process improvements for their original idea (docker-style containers).

VMWare had a good (some would call great) run by wrapping business process improvements around their take on a hypervisor. Unfortunately they now find themselves trying to play catch-up as they shoehorn new ideas like IaaS and Containers into their suddenly antiquated business model.

I don’t have an answer here, because I’m no closer to internal Docker, Inc. strategy meetings than I am Mars. But I do wonder if they are working on their next great idea, or if they are focused on taking a great idea and making a decent business around it. It has proven to be pennywise for them. But will it be pound-foolish? VMWare may have some interesting insights on that.

 

RSS from Trello with Jenkins

Trello is a pretty nice web site. It is (sort of) a kanban board that is very useful when organizing groups of people in situations where a full agile framework would be too cumbersome. Kanban is used a lot in IT Operations. If you want a great story on it, go check out The Phoenix Project.

One thing Trello is lacking, however, is the ability to tap into an RSS-style feed for one or more of your boards. But, where there is an API, there’s a way. This took me about 30 minutes to iron out, and is heavily borrowed from the basic example in the documentation for trello-rss.

Step One – Three-legged OAuth

Trello uses OAuth. So you will need to get your developer API keys from Trello. You will also need to get permanent (or expiring whenever you want) OAuth tokens from them. This process is a little cloudy, but I found a post on StackOverflow that got me over the hump.
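
For orientation, the three legs are: get a request token, send the user to an authorize page, then exchange the approved request token for an access token. The sketch below only covers the middle leg, building the URL you open in a browser. The endpoint and the `name`, `expiration` and `scope` parameters are how I remember them from Trello's OAuth documentation, so treat them as assumptions and check them before relying on this. The function name, the placeholder token and the app name are made up.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

# Assumed Trello OAuth 1.0 authorize endpoint (second leg of the flow).
AUTHORIZE_URL = "https://trello.com/1/OAuthAuthorizeToken"


def build_authorize_url(request_token, app_name, expiration="never", scope="read"):
    """Return the URL the user visits to approve a request token."""
    params = [
        ("oauth_token", request_token),  # from the request-token leg
        ("name", app_name),              # label shown on the approval page
        ("expiration", expiration),      # "never" asks for a permanent token
        ("scope", scope),                # read-only is enough for an RSS feed
    ]
    return AUTHORIZE_URL + "?" + urlencode(params)


# "REQUEST_TOKEN" is a placeholder, not a real token.
url = build_authorize_url("REQUEST_TOKEN", "trello-rss-feed")
print(url)
```

Passing a different `expiration` value is where the "expiring whenever you want" part comes in.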

Step Two – a little python

I created a little bit of python to handle this for me. Bear in mind it’s still VERY rough. My thought is to start to incorporate other Trello automation and time-savers into it down the road. If that happens I’ll stick it out on github.

#!/usr/bin/env python
import sys
from optparse import OptionParser

from trello_rss.trellorss import TrelloRSS


class TrelloAutomate:
    '''
    Used for basic automation tasks with Trello,
    particularly with CI/CD platforms like Jenkins.
    Author: jduncan
    Licence: GPL2+
    Dependencies (py modules):
    - httplib2
    - oauthlib / oauth2
    '''

    def __init__(self):
        # Python 2 only: make utf8 the default encoding so card text
        # with non-ASCII characters can be written out.
        if sys.version_info[0] == 2:
            reload(sys)
            sys.setdefaultencoding('utf8')
        # Placeholders: fill in your own Trello keys and tokens.
        self.oauth_token = 'my_token'
        self.oauth_token_secret = 'my_token_secret'
        self.oauth_apikey = 'my_api_key'
        self.oauth_api_private_key = 'my_api_private_key'

    def _get_rss_data(self):
        # Build the feed object and return the rendered RSS.
        rss = TrelloRSS(self.oauth_apikey,
                        self.oauth_api_private_key,
                        self.oauth_token,
                        channel_title="My RSS Title",
                        rss_channel_link="https://trello.com/b/XXX/board_name",
                        description="My Description")
        rss.get_all(50)
        return rss.rss

    def create_rss_file(self, filename):
        # Write the feed out; the file is closed even if a write fails.
        data = self._get_rss_data()
        with open(filename, 'w') as fh:
            fh.writelines(data)


def main():
    parser = OptionParser(usage="%prog [options]", version="%prog 0.1")
    parser.add_option("-r", "--rss",
                      action="store_true",
                      dest="rss",
                      help="create the rss feed")
    # The default now matches the help text (it was trellorss.xml).
    parser.add_option("-f", "--file",
                      dest="filename",
                      default="trello.xml",
                      help="output filename. default = trello.xml",
                      metavar="FILENAME")
    (options, args) = parser.parse_args()
    trello = TrelloAutomate()
    if options.rss:
        trello.create_rss_file(options.filename)


if __name__ == '__main__':
    main()

Step Three – Jenkins Automation

At this point I could stick this little script on a web server and have it generate my feed for me with a cron job. But that would mean my web server would have to build content instead of just serving it. I don’t like that.

Instead I will build my content on a build server (Jenkins) and then deploy it to my web server so people can access my RSS feed easily.

Put your python on your build server

Get your python script to your build server, and make sure you satisfy all of the needed dependencies. You will know if you haven’t, because your script won’t work. :) For one-off scripts like this I tend to put them in /usr/local/bin/$appname/. But that’s just my take on the FHS.

Create your build job

This is a simple build job, especially since it’s not pulling anything out of source control. You just tell it what command to run, how often to run it, and where to put what is generated.

[Screenshot: trello-rss-1]
The key at the beginning is to not keep all of these builds. If you run this frequently you could fill up lots of things on your system with old cruft from 1023483248 builds ago. I run mine every 15 minutes (you’ll see later) and keep output from the last 10.

[Screenshot: trello-rss-2]
Here I tell Jenkins to run this job every 15 minutes. The syntax is sorta’ like a crontab, but not exactly. The help icon is your friend here.

[Screenshot: trello-rss-3]
I have previously defined where to send my web docs (see my previous post about automating documentation). If you don’t specify a filename, the script above saves the RSS feed as ‘trello.xml’. I just take the default here and send trello.xml to the root directory on my web server.

[Screenshot: trello-rss-4]
And this is the actual command to run. You can see the -f and -r options I define in the script above. $WORKSPACE is a Jenkins variable that is the filesystem location for the current build workspace. I just output the file there.

Summary

So using a little python and my trusty Jenkins server, I now have an RSS Feed at $mywebserver/trello.xml that is updated every 15 minutes (or however often you want).

Of course this code could get way more involved. The py-trello module that it uses is very robust and easy to use for all of your Trello needs. I highly recommend it.

If I have time to expand on this idea I’ll post a link to the github where I upload it.

-jduncan

 

CI/CD Documentation for people who hate writing docs

I like solving problems. I hate writing up the documentation that comes along with it. Since I took a new position within Red Hat, I have found an increasing amount of my time taken up with writing docs. I decided the time had come for some innovation and workflow improvement.

Problem Statement

  • Our current document store of record is Google Drive. So any solution has to keep the final product in there. It can keep them in other locations, but this one is a requirement.
  • I don’t want to have to transcribe notes from calls and meetings. It’s annoying enough to take notes. It’s doubly-annoying to have to then transcribe them into another format for consumption by other people. A little clean-up is OK, but nothing far beyond that.
  • Copy/Paste into multiple platforms isn’t something I want to do. I want to take my notes, perform an action and have them published.
  • I need a universal format. PDF, HTML. Something.

My Solution

Available Tools

Internally, Red Hat uses the following self-service tools that I am utilizing for this time-saver.

  • DDNS (Dynamic DNS)
  • OpenStack
  • GitLab for version control
  • Jenkins for CI/CD
  • Google Drive for docs store

Asciidoc

I looked around quite a bit before settling on using asciidoctor to process asciidoc files for me. It will take extremely light markup and use it to render really pretty HTML. I’m a huge fan of it. I won’t be providing a primer on it here, but the one from my bookmarks that I use the most is found on the asciidoctor website.

The biggest benefit is that I can generate it almost as fast as someone can talk. So after a meeting, it’s just a minute or two of clean up and clarification and BANG, I have a consumable record of the event.

The Workflow

Having settled on asciidoc as my format, the workflow cleared up a lot.

  1. Generate asciidoc files during meetings / events / whatever.
  2. Manage them per-project / customer in git repos on GitLab.
  3. When a push is made to a given repo, have GitLab trigger a Jenkins build job.
  4. The Jenkins build job will take the updated repo, render the finished HTML, and upload it to Google Drive as well as a secondary web server that I will maintain (my choice, not a hard requirement).
  5. Profit and World Domination

Challenges

I ran into a few obstacles when I started bringing this to life:

  • I had never really used the more advanced features in GitLab.
  • I hadn’t used Jenkins in years.
  • I am not familiar with the Google Drive API.

Dynamic DNS

I used our internal DynamicDNS for my web server and my Jenkins server. I don’t control any DNS zones inside Red Hat, so this was a quick and easy solution.

We have an internal registration page, as well as an RPM that configures a system. I just edit a file with the host, domain and hash and POOF, I have DDNS wherever I want it.

Setting up GitLab

The GitLab instance I’m using is 7.2.2, and is maintained by our internal IT team. So I won’t be covering how to set it up. I have done this in the past and it was dead simple, however. I followed their walkthrough and it worked like a champ.

Installing and Configuring Jenkins

We do have multiple internal Jenkins servers for our Engineers. However, I decided to go with my own so I could play around with plugins and break it without incurring the wrath of some project manager or delaying a major product release. The process was very straight-forward. I followed their wiki to get it up and running in approximately 20 minutes.

Of course, adding a job to an existing Jenkins environment is possible, too. You just need the correct plugins installed.

Jenkins Plugins

I am utilizing a handful of Jenkins plugins to produce this workflow. Note: A few of these may come installed in a default install. I simply don’t remember. It comes with quite a few plugins to enable the default configuration.

  • Git plugin (this may be pulled in with the GitLab plugin, but I installed it first while experimenting)
  • GitLab plugin – for integration with our GitLab instance
  • Publish Over SSH plugin – for publishing to a simple web server

Helpful Tip I forgot about Jenkins

Make sure your Jenkins server has any needed build software installed. In my case, git and asciidoctor are very important. That is 20 minutes I’ll never get back.

Integrating with Google Drive

This turned out to be the biggest challenge. There is no good glue out there for this already. The biggest obstacle is OAuth. It’s just designed for user interaction. I didn’t want to enable less secure passwords, so I decided to try and tackle this.

This is the only place I had to write any new code. I ended up using PyDrive to access the Google Drive API more easily because I’m not very familiar with the API itself. It worked well. Since GDrive is really an object store more than anything else, updating a document, instead of just adding another copy of it, takes some extra work. This is a first attempt to deal with that cleanly. I worked on it for about an hour, so there is no concept of polish there as of yet. Think of it more as a POC that it’s doable.

The code is in a public Github repo. Ideas and code-heckling are welcome.
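
To show the shape of the update-instead-of-duplicate problem, here is a minimal sketch. It is not the code from the repo. The function names `upload_or_update` and `connect` are made up, and it assumes a PyDrive setup where `settings.yaml` and `client_secrets.json` sit in the current working directory. The idea is to search for a file with the same title and, if one exists, reuse its id so the upload replaces it.

```python
def upload_or_update(drive, local_path, title):
    """Replace the Drive file called `title` if it exists, else create it.

    `drive` is expected to behave like a pydrive.drive.GoogleDrive object.
    """
    # Escape single quotes so the title is safe inside the search query.
    query = "title = '%s' and trashed = false" % title.replace("'", "\\'")
    matches = drive.ListFile({"q": query}).GetList()
    if matches:
        # Reusing the id makes Upload() replace the existing file
        # instead of adding a second one with the same name.
        gfile = drive.CreateFile({"id": matches[0]["id"], "title": title})
        action = "updated"
    else:
        gfile = drive.CreateFile({"title": title})
        action = "created"
    gfile.SetContentFile(local_path)
    gfile.Upload()
    return action


def connect():
    """Build a GoogleDrive client from the saved OAuth settings (needs PyDrive)."""
    from pydrive.auth import GoogleAuth
    from pydrive.drive import GoogleDrive

    gauth = GoogleAuth()     # reads settings.yaml from the current directory
    gauth.CommandLineAuth()  # reuses saved credentials if settings.yaml allows it
    return GoogleDrive(gauth)
```

Usage would be something like `upload_or_update(connect(), "index.html", "index.html")`. This sketch ignores folders entirely and just takes the first title match, so it is only a starting point.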

Gluing it all together

I now have my asciidoc code, GitLab, Jenkins, GDrive, and a web server. Now I need to glue them together to make my life easier.

The local git repo itself doesn’t get changed at all. No post-commit hooks, although that method would work as well I’m sure.

  1. Create a GitLab repo
    1. This is outside of this blog’s scope, and more importantly it’s dead easy.
  2. Getting Jenkins Connected
    1. Define a server to SSH your finished HTML to
      1. Manage Jenkins > Configure System
        1. Publish Over SSH
          1. Key
            1. a private key that will work on your web server. Since I’m using a VM from our internal OpenStack instance, I am using the same key I use for ‘cloud-user’ on those VMs.
          2. SSH Server
            1. Name – anything you like. I used ‘Web Docs Server’
            2. Hostname – I used the DDNS name I set up for this system
            3. Username – cloud-user (the key is already there)
            4. Remote Directory – /var/www/html
              1. Since this is a standard RHEL 7.1 install, that is where the default DocRoot is for apache. Since I am the only one using this system and it has no external visibility I chown’d /var/www/html to cloud-user.cloud-user. I know that’s a total hack, but this is also just a POC. I promise, I do know a little bit about web security.
  3. Setting up the Google Drive update code to work on your Jenkins server
    1. This was put together from several PyDrive tutorials, especially this one.
    2. Since this code uses OAuth2 to handle authentication, you have to set that up for your Google Account.
    3. Go into the Google API Control Panel and create a new Application. Their instructions are pretty solid.
      1. You will need the ‘Client ID’ and ‘Client Secret’ for that application.
      2. Inside the App Details, click the ‘Download JSON’. Save this file as ‘client_secrets.json’ (what PyDrive looks for by default)
      3. Create a file called ‘settings.yaml’ and populate it like the sample file in the links above. All you need to change are the values for your Client ID and Client Secret.
      4. At this point I used their demo code to generate an additional file named ‘credentials.json’. This is the active token that is referenced during the login session. It is refreshed by the OAuth code in PyDrive. Take these 4 files and upload them somewhere easily read on your Jenkins server. I placed them all in /usr/local/bin/gdriveupdate. Make sure the gdriveupdate file itself is executable. It is what will be called during the Jenkins build.
        1. I’m not sure how long this token will be refreshed. I guess I will ultimately know that once the build fails because of it. Hopefully I’ll have conquered that little challenge by then. Feel free to file an issue on GitHub.
  4. Create a Jenkins job
    1. Fill in the GitLab Repository Name (user/project)
    2. Source Control Management
      1. Git
      2. Repository URL – the ssh compact version for your project
      3. Credentials
        1. I’m using an ssh key for this one. It’s associated with my Jenkins user credentials
      4. Repository Browser – gitlab
        1. URL – URL for your project
        2. Version – this auto-populated for me
    3. Build Triggers
      1. Build when a change is pushed to GitLab
        1. make note of the CI Service URL
      2. I took the default values
    4. Build Environment
      1. Select the Server you created previously
      2. Source Files – index.html (or more if you’re generating other stuff)
      3. Remote Directory – this will be auto-created in the remote server’s root directory. You can name it anything that makes sense for your project
    5. Build – Execute Shell
      1. /usr/bin/asciidoctor -dbook index.adoc
        cp -r /usr/local/bin/gdriveupdate/* $WORKSPACE
        /usr/local/bin/gdriveupdate/gdriveupdate -f $WORKSPACE/index.html -g CSA_Philips_Home_Monitoring
      2. index.adoc is just the convention I’ve adopted. You can redirect the output name via the command line and call it whatever you like.
      3. copying everything for the GDrive into the build workspace is a total hack. I know that. PyDrive can’t find the json and secrets files unless they’re in the current working directory for some reason. Some weird pathing issue that I don’t yet feel like debugging in the project.
  5. Configure GitLab to trigger a Jenkins build
    1. This is based on the GitLab plugin for Jenkins documentation
      1. It is very version specific, but I found that the simple instructions for version 8.0 and higher worked just fine for me. You just have to create a web hook for push and merge events.
      2. The Jenkins URL for the webhook is in the Project Config page in the Build Trigger section where you select the GitLab option.
      3. Go to your GitLab project > Settings > Web Hooks
        1. Select Merge Request and Push events
        2. Paste in the URL from your Jenkins project.
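
If you would rather keep the build step in Python than in the Execute Shell box, the three shell lines from step 4.5 translate roughly as below. This is a sketch, not something I run: the function names and the `My_Drive_Folder` default are made up, and unlike `cp -r` it only copies plain files out of the gdriveupdate directory.

```python
import os
import shutil
import subprocess

# Where the PyDrive script and its credential files live in this write-up.
GDRIVE_DIR = "/usr/local/bin/gdriveupdate"


def build_commands(workspace, source="index.adoc", folder="My_Drive_Folder"):
    """Return the render and upload commands as argument lists."""
    html = os.path.join(workspace, os.path.splitext(source)[0] + ".html")
    return [
        ["/usr/bin/asciidoctor", "-dbook", source],
        [os.path.join(GDRIVE_DIR, "gdriveupdate"), "-f", html, "-g", folder],
    ]


def run_build(workspace, **kwargs):
    """Render the asciidoc, stage the PyDrive files, then upload the HTML."""
    render, upload = build_commands(workspace, **kwargs)
    subprocess.check_call(render, cwd=workspace)
    # Same hack as the shell version: PyDrive wants its files in the cwd.
    for name in os.listdir(GDRIVE_DIR):
        src = os.path.join(GDRIVE_DIR, name)
        if os.path.isfile(src):
            shutil.copy(src, workspace)
    subprocess.check_call(upload, cwd=workspace)
```

In a Jenkins job you would call `run_build(os.environ["WORKSPACE"])`. `check_call` raises on a non-zero exit, so a failed render or upload fails the build the same way the shell step would.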

Summary

And that’s it. You write about 100 lines of Python to incorporate Google Drive, create a GitLab repo and Jenkins build job. You then link GitLab to Jenkins with a web hook. The Jenkins build then creates your HTML (or your desired format) from your asciidoc and uploads it to Google Drive and your web server.

Now, when I make a push to my GitLab repo after taking notes or writing docs for a given project or customer, the workflow kicks off and publishes my docs in both locations. The build takes < 10 seconds on average. And since it’s a push and not a poll-driven event, they are available almost instantly.

Ratings

Technical Knowledge Needed – 8/10. You’re not writing kernel modules, but it is gluing together several large tools.
Time Requirement – 4/10. This is less than a full day’s work once you have the answers in front of you. To generate the workflow took me about 2 days, all told.

Virtualization vs. Containers – a fight you should never have

I have some time to while away before I get on to a plane to head back home to my amazing wife and often-amazing zoo of animals. Am I at the Tokyo Zoo, or in some ancient temple looking for a speed date with spiritual enlightenment? Of course not. I came to the airport early to find good wifi and work on an OpenStack deployment lab I’m running in a few weeks with some co-workers. Good grief, I’m a geek. But anyway.

Auburn the beagle in her natural habitat

But before I get into my lab install I wanted to talk a little bit about something I saw way too much at OpenStack Summit. For some reason, people have decided that they are going to try to make money by making Linux Containers ‘the next Virtualization’. Canonical/Ubuntu is certainly the worst example of this, but they are certainly not the only one. To repeat a line I often use when I’m talking about the nuts and bolts of how containers work:

If you are trying to replace your virtualization solution with a container solution, you are almost certainly doing both of them wrong.

First off, at the end of the day it’s about money. And the biggest sinkhole inside a datacenter is not fully utilizing your hardware.

Think about how datacenter density has evolved:

  1. The good ole days – a new project meant we racked a new server so we could segregate it from other resources (read: use port 80 again). If the new project only needed 20% of the box we racked for it, we just warmed the datacenter with the other 80%.
  2. The Virtualization Wars – Instead of a new pizza box or blade, we spun up a new VM. This gives us finer-grained control of our resources. We are filling in those resource utilization gaps with smaller units. So that same 20% could be set up multiple times on the same pizza box, giving us closer to 100% resource consumption. But even then, admins tended to err on the side of wasted heat, and we were only using a fraction of the VM’s allocated resources.
  3. The Golden Age of Containers – Now we can confidently take a VM and run multiple apps on it (zomg! multiple port 80s!) So we can take that VM and utilize much more of it much more of the time without the fear that we’ll topple something over and crash a server or a service.

This is where someone always shoves their hand in the air and says

<stinkeye>But I need better performance than a VM can give me so I’m running MY CONTAINERS on BAREMETAL.</stink_eye>

My response is always the same. “Awesome. Those use cases DO exist. But what performance do you need?”.

Here’s the short version.

A properly tuned KVM virtual machine can get you within 3-4% of bare-metal speed.

Leaving out the VM layer of your datacenter means that once you consume that extra 3-4% of your baremetal system that KVM was consuming, you have to go rack another system to get past it. You lose a lot of the elastic scalability that virtualization gives you. You also lose a common interface for your systems that allows you to have relatively homogeneous solutions across multiple providers like your own datacenter and AWS.

Containers bring some of that flexibility back, but they only account for the Dev side of the DevOps paradigm. What happens when you need to harden an AWS system and all you care about are the containers?

Secondly, hypervisors and Linux containers are FUNDAMENTALLY DIFFERENT TECHNOLOGIES.

A hypervisor virtualizes hardware (with QEMU in the case of KVM), and runs a completely independent kernel on that virtual hardware.

A container isolates a process with SELinux, kernel namespaces, and kernel control groups. There are still portions of the kernel that are shared among all the containers. And there is ONLY ONE KERNEL.
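
You can see the one-kernel point for yourself with a few lines of Python. This is a small sketch that assumes a Linux system with /proc mounted; the helper names are made up. Run it on a host and again inside a container on that host: the kernel release should be identical, while the namespace identifiers the container was given will differ.

```python
import os
import platform


def namespace_ids(pid="self"):
    """Map each namespace name to its identifier for one process."""
    ns_dir = "/proc/%s/ns" % pid
    ids = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return ids  # not Linux, or /proc is not available
    for name in names:
        try:
            # Each entry is a symlink whose target names the namespace.
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            pass  # skip entries we are not allowed to read
    return ids


def cgroup_lines(pid="self"):
    """Return the control group memberships for one process."""
    try:
        with open("/proc/%s/cgroup" % pid) as fh:
            return [line.strip() for line in fh]
    except (IOError, OSError):
        return []


print("kernel: %s" % platform.release())
for name, ident in sorted(namespace_ids().items()):
    print("%s -> %s" % (name, ident))
for line in cgroup_lines():
    print(line)
```

Do the same comparison between a host and a VM and the kernel release is free to differ, because the VM boots its own kernel.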

All of that to say they are not interchangeable parts. They may feel that way to the end user. But they don’t feel that way to the end user’s application.

So take the time to look at what your application needs to do. But also take some time to figure out how it needs to do it. All of the use cases are valid under the right circumstances.

  • Straight virtualization
  • Straight baremetal
  • containers on VMs
  • containers on baremetal

A well-thought-out IT infrastructure is likely to have a combination of all of these things. Tools like OpenStack, Project Atomic, and CloudForms make that all much much easier to do these days.

Open Stack Summit Day 3 – Closing Thoughts

wow. I’m exhausted.

OpenStack Summit 2015, Tokyo Edition is over. It was amazing. I have a handful of ideas for follow up technical posts after I have time to get home and dig into them a little bit. But I want to get a few thoughts down on the conference as a whole while I’m sitting in my incredibly small room in Tokyo being too tired to go out on the town.

There could have been a container summit inside OpenStack Summit. Everywhere I turned, people were talking about containers. How to use them effectively and innovate around scaling them. It was awesome. These 2 technologies (IaaS and Containers) are going to collide somewhere not very far up the road. When they do it is going to be something to behold. I can’t wait to be part of it.

The conference on the whole was incredible. I can’t give enough credit to the team who put it all together. It was stretched out across (at least) 4 buildings on multiple floors, and it worked the vast majority of the time. The rooms were a little over-crowded for the biggest talks (or any talk that had the words ‘container’ or ‘kubernetes’ or ‘nfv’ in the title), and they tended to be a little too warm. The warm seems to be common for most public areas in Japan. I guess that’s just how they roll here.

Probably my biggest criticism of the conference is angled at most of the keynote speakers. They were, on the whole, not great. When I am at a large IT conference like this, I expect the keynote presentations to be motivational and polished. Too many of these were history lessons and needed a few more rounds in front of a mirror. There were exceptions of course (particular kudos to the IBM BlueBox folks!). But that was my biggest ‘needs improvement’ factor for the OpenStack Summit in Tokyo.

Out of 10, I would give this conference a solid 8. My score for Tokyo would be similar, if not higher.

I can’t wait to see what happens in Austin. I’m already working on ideas for talks. :)