Cloud Brokerage: Applications and service delivery with Docker
BEE PART OF THE CHANGE [email protected]
1. Introduction
2. What is Docker
 2.1. Dockerized applications
3. Docker Ecosystem and Orchestration
 3.1. Considerations of the POC
 3.2. Juju
  3.2.1. Deploying a Juju Cluster
  3.2.2. Conclusions
 3.3. Helios
  3.3.1. Deploying a Helios Cluster
  3.3.2. Conclusions
 3.4. Fleet
  3.4.1. Ambassadord and Consul
  3.4.2. Deploying a Fleet Cluster
  3.4.3. Conclusions
4. Continuous Delivery/Integration with Docker
 4.1. Life Cycle with Docker
5. Conclusions
1. INTRODUCTION This document is intended to provide a technical review of Docker, the leading container technology, and how it can be used efficiently in production environments. In order to test the benefits and restrictions of a multicloud, multinode Docker deployment, we created containers for each component of an existing sample application we built (frontend, DB, worker nodes, queue), then installed and configured different orchestration tools with Docker support (Juju, Helios, Fleet) on AWS and HP Helion (OpenStack), and deployed the full stack of our sample application.
2. WHAT IS DOCKER? Docker is an open source tool for DevOps that enables the creation and deployment of portable, lightweight and isolated applications that run inside containers. The applications live inside a container and can run on any hardware in a virtualized sandbox, without the need for custom builds for different environments. It is based on Linux Containers (LXC), which have proven robust and secure for running applications. The main benefit of Docker is that once a container is created it can be easily deployed on the developer's computer or on private or public cloud infrastructure. Docker can run on top of virtual machines or bare metal. This eases application deployment in DevOps environments with cloud infrastructures, since virtual machine technology is excellent for provisioning boxes with a running OS, while Docker simplifies application provisioning because all applications are self-contained. There is no need to configure the underlying OS where the application will be deployed: all containers share the same host OS, and all libraries are included in the container. This means that even applications that require different library versions can live on the same Docker host.
Docker has been designed to easily control and deploy containers on single nodes; it does not provide tools to solve complex networking between several Docker nodes, or application provisioning on a cluster. Docker is a native Linux application written in Go that is installed on each node where we want to host containers. In order to share containers across the infrastructure, we need to install a Docker Registry server, which acts as a private repository from which the Docker nodes can fetch the containers to be deployed. One of the most interesting features of Docker is that when you push changes to an existing container, the image is treated as in a version control system where only deltas are pushed/pulled (git-like, tiny footprint, lightning fast performance). This means it is possible to cook and provision containers faster than with standard VMs, because the "golden image" model is avoided; that model causes image sprawl on large systems where you end up with several images in varying states of versioning.
It is also important to note that with Docker it is possible to increase the service density per machine: you can deploy multiple applications on the same server with much less overhead, since containers do not carry a full guest OS. These features are attracting a lot of big names and have turned Docker into one of the most successful open source projects of the last year.
2.1. DOCKERIZED APPLICATIONS In order to create a container with an application (Tomcat 7, Apache 2, NodeJS, PHP, C++, etc.) or a specific service (MySQL, RabbitMQ) you must create a Dockerfile: a plain text file that defines the steps required to build a container with that particular application, such as installing packages (apache), pulling source code from repositories (git), or setting environment variables and application configuration properties. The following is a sample Dockerfile to create an Apache container.
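A minimal version of such a Dockerfile might look like the following sketch (the base image version is illustrative):

```dockerfile
# Start from a stock Ubuntu base image
FROM ubuntu:14.04

# Install Apache from the distribution repositories
RUN apt-get update && apt-get install -y apache2

# Expose the HTTP port of the container
EXPOSE 80

# Run Apache in the foreground so the container stays alive
CMD ["/usr/sbin/apache2ctl", "-D", "FOREGROUND"]
```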
This Dockerfile, when used with the docker build command, generates an image from which containers with those particular applications and services can be run. Once the container is set up, it can be committed into a new version, and the image can then be distributed to the different docker nodes via docker push/pull in order to be deployed. When a new version of the application needs to be built, the last commit can be used as the base of a new image; this generates a new commit in the Docker Registry (a new version of the container), meaning only deltas are transferred to the docker hosts in order to deploy the new version of the application.
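The build-and-distribute cycle just described can be sketched with the standard docker CLI (the registry address and image names here are hypothetical):

```shell
# Build an image from the Dockerfile in the current directory
docker build -t registry.example.com/myteam/apache:1.0 .

# Push the image to the private registry (only new layers/deltas travel)
docker push registry.example.com/myteam/apache:1.0

# On any docker node: pull the image and run the container
docker pull registry.example.com/myteam/apache:1.0
docker run -d -p 80:80 registry.example.com/myteam/apache:1.0
```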
3. DOCKER ECOSYSTEM AND ORCHESTRATION As mentioned before, Docker is software to create and manage LXC containers on a single box, and while it is very good at doing so, it is not practical for running a group of servers, since you would have to operate them manually. This has pushed the community to create tools to orchestrate a docker cluster in which you have a fleet of docker hosts where you deploy containers; existing cluster platforms and hosting services are also adding Docker support. Here is an incomplete list of the most notable open source projects created specifically for Docker or that have included some form of support for Docker containers (1):
• Kubernetes by Google.
• Fleet by CoreOS.
• Centurion by New Relic.
• Helios by Spotify.
• Geard by Red Hat.
• Consul by HashiCorp.
• Shipper by Rackspace.
• Maestro by Jumpbox.
• Ferry by OpenCore.
• Mesos by Apache (added support).
• OpenStack by Red Hat (added support).
• vSphere by VMware (added support).
• Juju by Canonical (added support).
There are also many tools that were open sourced and showcased at the first DockerCon (summer 2014) that improve Docker in some way or create new systems and processes around it.
3.1. CONSIDERATIONS OF THE POC For this POC we wanted to be able to deploy an application with several decoupled services that would normally be deployed on separate machines. The application consists of a group of crawlers subscribed to specific topics or hashtags in the Twitter Streaming API. These nodes ingest the raw tweets and clean them according to some basic rules; it is even possible to filter tweets out. The tweets are then inserted into a queue in order to be processed by the worker nodes. For the sake of the example we created a word counter worker node: the worker takes one tweet from the queue, splits it by words, groups and counts them, and then inserts this information into the database. The information in the database can then be shown to the user on a dashboard in the WebApp console. For the POC we created a container for each component, meaning that our application uses at least 5 different containers, each of which needs to communicate over particular ports with at least one other container or service, so networking was a consideration for the tests. We didn't want to publish all ports to all IP addresses, but only to grant the least privileged network permissions to each node. On the other hand, we wanted to test the multicloud capabilities of Docker on public clouds, so we chose AWS and HP Helion, an OpenStack based infrastructure service provider.
(1) For a more comprehensive list check this mind map of the ecosystem.
3.2. JUJU Juju is an orchestration platform from Canonical. Currently it only supports Ubuntu Server, and it is compatible with several infrastructure providers. It has both a CLI and a GUI to orchestrate the deployment of services and applications on top of any public or private cloud. In order to use Juju, you must configure an infrastructure provider such as Amazon Web Services, Windows Azure, HP Public Cloud, Joyent, OpenStack, MAAS (bare metal) or Vagrant; then you are able to deploy services using recipes called charms. Charms are self contained definitions of the steps required to configure and deploy applications. This means that if you want to deploy an application with Juju, you need to create a charm for it. Charms define the relations and actions allowed when linking that particular service to other services. Once a service is deployed, it can be linked to other services through its relations. For example, if you have a charm for a web application such as WordPress (PHP) and another charm for a MySQL database, after you deploy them these services will expose all their relations as a provider or a consumer of other services. In this case, WordPress requires a database and MySQL provides one, so both services can be linked, since both have a relation of the same type (db) with the correct dependency direction (one provides, the other requires). Once all required relations are met, the service is deployed and can be exposed; in the case of WordPress, the expose action will open port 80.
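The WordPress/MySQL example above maps to a short Juju CLI session (charm names come from the public charm store):

```shell
# Deploy the two charms from the charm store
juju deploy wordpress
juju deploy mysql

# Link them through their common db relation
juju add-relation wordpress mysql

# Open the service to the outside world (port 80 for WordPress)
juju expose wordpress
```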
Juju requires you to install a master node, where the environments and providers are configured, and then you can start deploying services. The master node with the juju-core package doesn't need to be on the same infrastructure provider as the service nodes. Each service is deployed on a single node, defined by an instance type (size, memory, CPU). Some services can also be manually scaled out (2) using Juju; for example, a fleet of web servers listening for HTTP connections behind a load balancer, slave/replica nodes of a distributed database or shard, or worker nodes of a computing cluster like Hadoop. To do this, we issue an add units command to the service. This will provision new machines, install all the required packages and software on them, and finally add them (with whatever logic is required) to the service.
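The add units command mentioned above is a single CLI call; for example, to grow the web tier of the WordPress example by three units:

```shell
# Provision three new machines and join them to the wordpress service
juju add-unit -n 3 wordpress
```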
3.2.1. DEPLOYING A JUJU CLUSTER Since we wanted to test Juju with Docker, each application component was created in a docker container. Then, in order to deploy it with Juju, we needed to create a charm for each component. For each charm we defined all 9 hooks:
• config-changed, executed whenever any config parameter is changed.
• crawler-relation-broken, executed when a dependency relation is broken.
• crawler-relation-joined, executed when a new relation is created between services.
• install, executed once during the bootstrap of the service on the VM.
• relation-departed / relation-joined, executed when a relation is destroyed/created with another service.
• start / stop, executed when those actions are sent to the service.
• upgrade-charm, executed to update the current charm on the node.
Hooks are the scripts executed when certain actions occur; for example, when a new relation is made between services, or when we need to install all the required packages and configuration for a particular service. Hooks can be written in any scripting language, and there are helpers (3) for Python. The following example is the install hook for the crawler container, which bootstraps the machine with docker and the charm helpers and then pulls the docker image from our private registry. All the hooks must be defined inside a charm.
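The original figure with the install hook is not reproduced here; a simplified sketch of what it does could look like the following (the registry address and image name are assumptions):

```shell
#!/bin/bash
# install hook: bootstrap the machine with docker and the charm helpers
set -e

# Install docker and pip, then the Python charm helpers
apt-get update
apt-get install -y docker.io python-pip
pip install charmhelpers

# Pull the crawler image from our private registry (hypothetical address)
docker pull registry.example.com/poc/crawler:latest
```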
(2) Scaling out with Juju. (3) Helpers are small programs/libraries that encapsulate common operations through a simple and usable interface.
After this hook is executed on a pristine virtual machine, we can deploy the docker container inside it. In the first version of our charms, we ran all the docker containers on the same machine: we provisioned a single node, pulled all the docker images into it and then started the containers. Everything worked as expected but, since our application components talk to each other through several ports, we wanted to test deploying the containers on distinct machines, so we could play with the Juju charms and auto configure the firewall rules; for example, to let the crawler write to the RabbitMQ queue, or let the slave node read from the queue and then write the results into MySQL. The tests and configurations were completed with little effort from our team. We wrote a couple of charms and configured the Juju environments for each cloud provider. We were able to deploy the same application with all its components (5 containers with separate services and applications) on both AWS and HP Helion (OpenStack), enabling easy portability between cloud providers.
3.2.2. CONCLUSIONS Juju is a great tool to easily deploy complex applications and services. For example, with the available charms you can create a full OpenStack cluster (19 services on 23 machines) or a full Mongo shard with one click. The GUI provides a clean visualization of the deployed services, displaying the state and capacity of each node and its relations with the rest of the services, plus a search tool to find existing charms in the charm store; the CLI is also powerful and easy to use. However, we see some restrictions when using Juju. The most important is that it only supports Ubuntu Server, and it seems that Canonical won't be adding support for other distributions. We see its usage in our dev teams whenever they need to quickly deploy and install services like databases or application clusters (Hadoop, Storm, Spark, OpenStack, etc.), since little or no system administration knowledge is needed to use it. This enables developers to quickly prepare a full stack infrastructure in a few clicks without the help of a system operator, which is excellent for POCs, demos or exploration of new software that already has charms in the charm store. We think Juju is very useful for creating complex stacks or quickly deploying and configuring services but, in our case, we won't be using it in production environments because we use RHEL rather than Ubuntu Server to deploy our applications and services. Another restriction is that you cannot mix providers in the same environment; for example, you can't deploy services in the same environment on both AWS and OpenStack, since you need to create a separate environment for each provider. This means that if you want to use Juju to deploy and create relations between services and applications, all of them must be on a single provider: Juju enables portability, but all of your services have to live on the same provider.
During our tests we preferred the CLI over the GUI, since it is faster and can be fully automated through scripting. The Juju architecture requires you to have the environment configuration files on the machine from which you want to submit commands, meaning that an additional node is required if that environment has multiple system administrators.
3.3. HELIOS Helios is a Docker orchestration platform for deploying and managing containers across one or many data centers. It was released by Spotify, which has used it in production with great success, enabling their squads to quickly provision services. Internally it uses ZooKeeper to save the state of the docker fleet. It requires a master node, where the provision commands are submitted to deploy a container, and the Helios agent needs to be installed on each node where you want to deploy docker containers. This means the node needs to be bootstrapped and started with docker beforehand, since Helios does not provision new virtual machines; its sole purpose is to ship containers through the docker cluster. We managed to easily deploy our sample application and many other services from publicly available images. But Helios has some restrictions, since it has been developed for internal use at Spotify: we think it lacks some basic functionality required in use cases where the application development and deployment process differs from the one Spotify squads use. For instance, we were not able to use the links (4) functionality or Ambassadord (5). This meant we had to fully expose all the traffic between our cluster nodes to be able to connect to the different services.
3.3.1. DEPLOYING A HELIOS CLUSTER As explained before, Helios doesn't provision any new machines; instead, it uses running machines from its pool to deploy containers. This idea plays well with on-prem infrastructure where your machines are provisioned 24x7. It doesn't mean Helios can't be used with cloud providers like AWS or HP Helion but, unlike with Juju, you need to provision and bootstrap the machines with docker and the Helios agent yourself (or automate it). In order to deploy a container in Helios you first create a job, which is the logical description of the container you want to deploy (which image, tag name, version, etc.); you also specify other parameters, like the ports you want open on the host. After the job is defined, you can submit it to a host; this commands the host to pull the docker image and start it with the specified parameters.
(4) Docker links are used to create a sort of local network in the docker host. (5) Ambassadord is a containerized application that supports static forwards, DNS-based forwards (with SRV records), and Consul+etcd based forwards.
All these commands must be issued from the Helios master node, which uses ZooKeeper to store the state and then calls the Helios agents on the provisioned nodes through a REST API to deploy a container.
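The create/deploy workflow described above looks roughly like this with the helios CLI (job name, image and host are illustrative; the exact port-mapping flag syntax should be checked against the Helios documentation):

```shell
# Define a job: name:version, image, and a port mapping (container 80 -> host 8080)
helios create crawler:v1 registry.example.com/poc/crawler:latest -p http=80:8080

# Deploy the job on a specific agent host
helios deploy crawler:v1 docker-node-01.example.com

# Check the deployment status of the job
helios status --job crawler:v1
```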
3.3.2. CONCLUSIONS Helios is a great open source platform. It is easy to install and use, and it does allow you to deploy services simultaneously on different cloud providers. This means that, if we had a local OpenStack infrastructure, we could spill the extra capacity over to AWS, or simply deploy services there with the same tool; there would be some network restrictions to address, but once those were solved we should be able to seamlessly deploy containers on both infrastructures. The most annoying restriction we found for our use case was that we couldn't create links between containers, which forced us to explicitly configure ports on each container; this could be solved with the ambassador pattern. Helios is well suited for orchestrating a multi node, multi datacenter docker cluster with a focus on on-prem private cloud. It also supports any underlying Linux distribution (Ubuntu Server, CentOS, Debian, RHEL, SUSE): as long as you can install Java and docker, you can use Helios.
3.4. FLEET Fleet is a container management and deployment tool for a docker cluster, similar to Helios. It allows the orchestration of docker containers, meaning you can deploy/undeploy services with any configuration you need. One key difference from Helios is that in Fleet you don't need to define a host to deploy the container on (though you can); instead, you define a set of rules and the scheduler decides where the container can be deployed. For example, you can specify that a container should not be deployed more than once in the same region, to gain high availability; or that certain containers should always be deployed on different machines, to spread the network traffic evenly. The fleet scheduler will try to find the best strategy to deploy the containers based on the defined rules.
Fleet is a headless cluster: there is no need for a master or central node. Instead, it uses an etcd instance to store its state and the metadata of all docker nodes in the cluster, and the fleet daemon (fleetd) must be installed on each node of the cluster. fleetd encapsulates the fleet engine and the fleet agent. The agent is responsible for pulling and deploying the docker containers on that particular instance, while the engine is responsible for scheduling containers using a least-loaded algorithm, which prefers the agent running the smallest number of units. The engine process exists on every node of the cluster, but it only executes on one node at a time. This is ensured with a reconciliation model, where the nodes poll to decide which node should run the reconciliation process for that particular loop; if the node currently running the scheduler is lost, the cluster waits until the next reconciliation loop, the scheduling runs again on another node, and the cluster continues functioning as normal. The scheduler decides where and how jobs should be resolved based on the information stored in etcd, which is updated continuously by each fleetd in the cluster. fleetd also exposes an (experimental) REST API, which can be used instead of the fleetctl CLI to issue deploy commands remotely to the fleet cluster.
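A fleet unit is a plain systemd unit file plus an [X-Fleet] section carrying the scheduling rules described above; a sketch for one of our containers might look like this (the image name is hypothetical):

```ini
[Unit]
Description=Twitter crawler container
After=docker.service
Requires=docker.service

[Service]
# Clean up any previous instance before starting
ExecStartPre=-/usr/bin/docker kill crawler
ExecStartPre=-/usr/bin/docker rm crawler
ExecStart=/usr/bin/docker run --name crawler registry.example.com/poc/crawler:latest
ExecStop=/usr/bin/docker stop crawler

[X-Fleet]
# Never schedule two crawler units on the same machine
X-Conflicts=crawler@*.service
```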
3.4.1. AMBASSADORD AND CONSUL Ambassadord is a containerized docker application that can be used as a TCP reverse proxy or forwarder. It supports static forwards, DNS-based forwards, and Consul+etcd based forwards. Its main purpose is to route network traffic between containers. During the process of learning and understanding docker, we realized that when you deploy docker containers on different hosts that need to communicate with each other, the container networking has to be configured manually: you must define inside the container the hostname or IP/port of the source or destination of every connection that the container needs.
The interesting thing is that the ambassador is easy to configure, as it is just a small application that only routes traffic between services. It has a tiny footprint and opens up a whole new set of ideas when architecting applications, since it is computationally cheap to deploy a tiny application that serves other programs in your cluster. Each ambassador knows how to route all network communication thanks to Consul, which knows the IP address and ports of each service. Consul is a service discovery application that makes it simple for services to announce their availability to the cluster, so that other nodes that consume that type of service can find it. This can be done via HTTP or via DNS (SRV records). Consul is really good at detecting failures and removing failed services from the healthy pool. Ambassadord can be configured with Consul to learn how to route the traffic. This means that if a new container announces itself to its local Consul agent, every Consul agent in the docker cluster will know about the availability of that service. Likewise, if a container goes down, Consul will notice; when the ambassador of a service tries to consume the dead service, it will be redirected to another healthy unit within the cluster that provides the required service.
1. The MySQL service has just started. The Consul client on that docker host detects it (via health checks) and announces to the Consul server that there is a MySQL service on that node; this adds a new service to the cluster.
2. A web server wants to connect to a MySQL service; since its Ambassadord doesn't know how to reach this service, it asks the Consul client, which gets the answer from the Consul server.
3. Once it has the service information, the Ambassadord request is resolved and it communicates with the MySQL service's Ambassadord to complete the request.
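Step 1 relies on a Consul service definition with a health check; a minimal one for the MySQL container could look like this (the check command and interval are illustrative):

```json
{
  "service": {
    "name": "mysql",
    "port": 3306,
    "check": {
      "script": "nc -z localhost 3306",
      "interval": "10s"
    }
  }
}
```

Once registered, the service can also be resolved through Consul's DNS interface, e.g. by querying the SRV record for mysql.service.consul.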
3.4.2. DEPLOYING A FLEET CLUSTER We used all the previously described components in the following way:
• Fleet was used to provision all the application containers (red) in our docker cluster. These containers were previously built and stored in a private Docker Registry we deployed. With fleet we defined rules so that RabbitMQ and MySQL were deployed on the same node, and no web or Java application was deployed on that particular node. Then we asked fleet to deploy the front application (Twitter master) on a node other than the worker nodes (Twitter slave, Twitter crawler). Fleet uses etcd as the persistence layer for the cluster.
• Four Ambassadord containers were deployed as clients and consumers, so that containers can find the route to the service they need (e.g. the Twitter crawler puts a message on RabbitMQ; the Twitter slave reads a message from the queue and inserts it into MySQL).
• A Consul client was deployed on each node, so that services are announced to the whole cluster; by using the Consul mode of Ambassadord, traffic was routed to the right host node.
The Consul web application provides a nice interface to check the health of nodes and services alike. It gives fine-grained detail of all the host nodes that are part of the docker cluster and the number of services deployed on each of them.
It was possible to deploy the whole application in less than a minute and, since it is dockerized, it is fully portable to any public cloud provider or local infrastructure, given that the docker/fleet stack is already configured. In addition, we created a web console for deploying the docker containers with fleet, since fleet only provides a CLI out of the box. We stored the fleet units in a database so they can be reused.
Other sections of the web console display status information about the fleet cluster, and you can also select, edit and send fleet units from the GUI; this deploys containers across the cluster following all the rules and configuration defined in the submitted fleet unit. The console was built with Twitter Bootstrap and deployed on a separate host in a Tomcat server (Spring MVC). It uses Fleet's REST API to perform all the commands and queries.
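The web console talks to the same endpoints one can exercise by hand; for example, with the experimental fleet API enabled (host and port are illustrative):

```shell
# List all units known to the cluster
curl http://fleet-node.example.com:49153/fleet/v1/units

# List the machines that form the cluster
curl http://fleet-node.example.com:49153/fleet/v1/machines
```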
3.4.3. CONCLUSIONS Service discovery is a very interesting pattern for deploying large distributed decoupled applications. We gained application portability and deployment agility. High availability can be easily achieved by using Fleet deployment rules (x-conflicts). Autoscaling is still a missing piece, but we think it will be addressed shortly, as the rest of the foundations needed for it already exist (the ability to monitor resources and to automate scaling up/down). We still need to test other pieces, like the recovery and storage of logs from the application containers (i.e. syslogs). We modified, configured and deployed a five component application painlessly. This combination of technologies (Docker, Fleet, Consul, Ambassadord, etcd) and processes (continuous delivery/integration) can be successfully used to deploy distributed applications with lots of backend services very quickly; also, by splitting services into smaller pieces it is easier to achieve a high degree of decoupling, scalability and fault tolerance. Containers seem the natural way to deploy microservices: you can have a fleet of bare metal machines or provisioned VMs into which you throw a large number of containers (each one with a single service) and start serving requests in a very short time.
4. CONTINUOUS DELIVERY / INTEGRATION WITH DOCKER It is common for modern application projects, especially big ones publicly available to thousands of clients on the Internet, to be tested and deployed using CI and CD, because this reduces the time between deployments, therefore helping to reduce the time to market of features and bug fixes. This is where docker really shines. In traditional deployment processes the application is pre-cooked into a virtual machine disk image (i.e. an AMI in AWS), either manually or with a CM tool like Puppet/Chef. It is not easy to move a file of this size (several GB) across several data centers, especially if you want deployments on the order of seconds. For example, companies like eBay or Amazon Retail have several deploys per second; those deployments can be text corrections or small increments of features and fixes, and it is very difficult to achieve low deploy times when you need to ship big files to every corner of your network, when you have many of them for different applications, and when VMs can take several minutes to actually boot and start serving. This well known approach, the golden image (or foil ball) pattern, in which you have a clean, pristine, up-to-date OS distribution on top of which you bootstrap the applications and services using Puppet/Chef, can quickly become unmanageable as the number of images grows: at best you only have to rebuild the golden image, rebuild all your application images and redistribute them, which means lots of bandwidth and time. With containers the deployment process gains complexity, but that just means you need to automate your deployments more.
4.1. LIFE CYCLE WITH DOCKER The process of deploying a change (or a tag, or a branch) can start when it is pushed to the source code repository. After that, Jenkins will pull and build the changes; if the build is successful, Jenkins will call Puppet (or a Maven plugin for docker) to build the new version of the container image and push it to the docker registry. Once the container is created, we can hook in a script to ship that container using fleet. Of course, we will need to create the fleet unit file; all of this will deploy the container in our docker cluster. After the service in the container has fully started, Consul will detect it and announce it to the cluster. All this sounds very good, but it is certainly not trivial to set up this pipeline; once the pipeline is working, though, it should be very easy to deploy changes and features. The interesting thing is that by using all of these open source technologies, tuning them, and making the process as automatic as possible, we should save a great amount of time when delivering new features or fixes to our systems.
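The ship step of such a pipeline could be a small post-build script like the following sketch (the registry address, image and unit names are assumptions; the fleet unit file is assumed to already exist alongside the job):

```shell
#!/bin/bash
# Jenkins post-build step: package the new version and ship it with fleet
set -e

VERSION="$1"   # e.g. the Jenkins build number

# Build and push the new image; only changed layers travel to the registry
docker build -t registry.example.com/poc/webapp:"$VERSION" .
docker push registry.example.com/poc/webapp:"$VERSION"

# Replace the running unit with one pointing at the new version
fleetctl destroy webapp.service || true
fleetctl submit webapp.service
fleetctl start webapp.service
```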
5. CONCLUSIONS
Portability. One of the most mentioned benefits of Docker is that if you containerize your applications you gain a high degree of portability between cloud infrastructure providers, virtualization solutions, or just a laptop (you could deploy a datacenter on your computer). While this is true and portability is improved, we don't think it is the most interesting feature of Docker (read below).
Service Density. Another interesting thing you can achieve with a container technology is increasing the number of services you can deploy (with less OS overhead) on the same bare metal machine or VM, meaning that in the long run it could potentially reduce infrastructure costs by a small but reasonable amount.
Lightning Fast Deployments. Fast and nimble deployments are, from our point of view, the most interesting feat, as they make true continuous deployment possible. You can ship your containers around many datacenters faster than bundled VM images; combined with the fact that booting a container is cheaper than booting a VM, this makes the whole deployment pipeline faster.
Microservices Synergy. Docker plays well with huge applications composed of many small decoupled services that change fast, meaning services that need several deploys per day, or even per minute.
Service Discovery. Combined with Consul + Ambassadord + Fleet, Docker provides a very effective way to deal with the auto configuration of large systems, since service discovery encourages application configuration at runtime. This makes a big distributed system more resilient to failures and less complicated to orchestrate (configuring load balancers, DB connections or endpoints of backend services, routing network services).
More complex deployments. One of the downsides of using docker is that it adds complexity to the definition of the deployment process, but that just means you need to automate all the steps in your deployment pipeline. Once the pipeline is set up, deployments are easy and very fast, and you can fully automate canary testing or A/B testing.
Docker is not for everyone. We think that if you have a big fat application (read: legacy) that can't be split up, or a project with scheduled weekly/monthly deployments, it may not pay off all the trouble and effort needed to set up the whole Docker pipeline. Again, there are no silver bullets: you should always try your own use case and adapt the available tools to your processes in order to see the real benefits (or the lack of them).