If your data center were a beehive, Docker Swarm would be the pheromone that keeps all the bees working efficiently together.
Here’s what I mean by that. In some ways, Docker containers are like bumblebees. Just as an individual bee can’t carry much of anything on her own, a single container won’t have a big impact on your data center’s bottom line.
It’s only by deploying hundreds or thousands of containers in tandem that you can leverage their power, just like a colony of bees prospers because of the collective efforts of each of its individual members.
Unlike bumblebees, however, Docker containers don’t have pheromones that help them coordinate with one another instinctively. They don’t automatically know how to pool their resources in a way that most efficiently meets the needs of the colony (data center). Instead, containers on their own are designed to operate independently.
So, how do you make containers work together effectively, even when you’re dealing with many thousands of them? That’s where Docker Swarm comes in.
Swarm is a cluster orchestration tool for Docker containers. It provides an easy way to configure and manage large numbers of containers across a cluster of servers by pooling those servers into a single virtual Docker host. It’s the hive mind that lets your containers swarm like busy bees, as it were.
Why Use Swarm for Cluster Configuration?
There are lots of similar cluster orchestration tools beyond Swarm. Kubernetes and Mesos are among the most popular alternatives, but the full list of options is long.
Deciding which orchestrator is right for you is fodder for a different post. I won’t delve too deeply into that discussion here. But it’s worth briefly noting a couple of characteristics about Swarm.
First, know that Swarm happens to be Docker’s homegrown cluster orchestration platform. That means it’s as tightly integrated into the rest of the Docker ecosystem as it can be. If you like consistency, and you have built the rest of your container infrastructure with Docker components, Swarm is probably a good choice for you.
Docker also recently published data claiming that Swarm outperforms Kubernetes. The results of that study do not necessarily apply to all real-world data centers, however. (Kelsey Hightower, an employee of Google, the company where Kubernetes has its roots, has published a critique of Docker’s performance claims.) But if your data center is similar in scale to the one used in the benchmarks, you might find that Swarm performs well for you, too.
Setting Up a Docker Swarm Cluster
Configuring Swarm to manage a cluster involves a little bit of technical know-how. But as long as you have some basic familiarity with the Docker CLI and Unix command-line tools, it’s nothing you can’t handle.
Here’s a rundown of the basic steps for setting up a Swarm cluster:
Step 0. Set up hosts. This is more a prerequisite than an actual step. (That’s why I labeled it step 0!) You can’t orchestrate a cluster till you have a cluster to orchestrate. So before all else, create your Docker hosts: both the servers that will run your production containers and at least one host for Swarm and related services.
You should also make sure your networking is configured to allow SSH connections to your Swarm host(s), since you’ll use SSH later on to access them.
Step 1. Install Docker Engine. Docker Engine is the Docker component that runs on each host and lets that host communicate with Swarm via a CLI or API. If it’s not already installed on your hosts, install it with:
curl -sSL https://get.docker.com/ | sh
Then start Engine to listen for Swarm connections on port 2375 with a command like this (on newer Docker releases, the daemon is started with dockerd rather than docker daemon):
sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
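Before moving on, it can help to confirm that Engine really is answering on the TCP port. Here’s a minimal sketch, assuming the port 2375 from the command above. check_engine is a hypothetical helper name, not part of Docker, and the DOCKER variable exists only so you can point it at something else to preview the command.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): ask a remote Engine for its version.
# Override DOCKER (e.g. DOCKER=echo) to preview the command instead of running it.
DOCKER="${DOCKER:-docker}"

check_engine() {
    host="$1"            # IP address or hostname of the Engine host
    port="${2:-2375}"    # defaults to the port used in the daemon command above
    "$DOCKER" -H "tcp://${host}:${port}" version
}
```

Call it as check_engine followed by the host’s IP address. If the daemon is reachable, docker should print version details for both the client and the server.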
Step 2. Create a discovery backend. Next, you need to launch a discovery service that Swarm can use to find the hosts that are part of the cluster. This example uses Consul, running in a container.
To do this, SSH into the host that you want to use for the discovery backend. Then run this command:
docker run -d -p 8500:8500 --name=consul progrium/consul -server -bootstrap
This will fire up the discovery backend on port 8500 on the host.
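To check that the backend is up, you can query Consul’s HTTP API on the same port. This is a rough sketch, assuming curl is installed and port 8500 is reachable from where you run it. check_consul is a hypothetical helper name; /v1/status/leader is the Consul endpoint that reports the current leader.

```shell
#!/bin/sh
# Hypothetical helper (not part of Consul or Docker): query the Consul leader.
# Override CURL to preview the command instead of running it.
CURL="${CURL:-curl}"

check_consul() {
    host="$1"            # IP address of the discovery backend host
    port="${2:-8500}"    # port published by the docker run command above
    # -f: fail on HTTP errors, -sS: quiet but still show errors
    "$CURL" -fsS "http://${host}:${port}/v1/status/leader"
}
```

A non-empty address in the response suggests the Consul server has bootstrapped and is ready for Swarm to use.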
Step 3. Start Swarm. With that out of the way, the last big step is to start the Swarm manager. For this, SSH into the host you want to use for Swarm. Then run:
docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise <manager_ip>:4000 consul://<consul_ip>:8500
Replace <manager_ip> and <consul_ip> in the command above with the IP addresses of the hosts you used in steps 1 and 2 for setting up Engine and the discovery backend, respectively. (It’s fine if you do all these using the same server, but you can use different ones if you like.)
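If you’d rather not edit the command by hand, the two addresses can be passed as arguments to a small function. This is a sketch, assuming the first address is the manager host’s own IP and the second is the discovery backend listening on port 8500 (the port published in step 2). start_manager is a hypothetical name, not a Docker or Swarm command.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Swarm): start a Swarm manager container.
# Override DOCKER (e.g. DOCKER=echo) to preview the command instead of running it.
DOCKER="${DOCKER:-docker}"

start_manager() {
    manager_ip="$1"    # address other hosts use to reach this manager
    consul_ip="$2"     # address of the discovery backend from step 2
    "$DOCKER" run -d -p 4000:4000 swarm manage -H :4000 --replication \
        --advertise "${manager_ip}:4000" "consul://${consul_ip}:8500"
}
```

For example, start_manager 192.0.2.10 192.0.2.20 would start a manager that advertises itself at 192.0.2.10:4000 and registers with Consul at 192.0.2.20 (both addresses are placeholders).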
Step 4. Connect to Swarm. The final step is to join your nodes (the hosts that will run containers) to Swarm. You do that by running a command like this on each node:
docker run -d swarm join --advertise=<node_ip>:2375 consul://<consul_ip>:8500
<node_ip> is the IP address of the node, and <consul_ip> is the discovery backend IP used in steps 2 and 3 above.
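With more than a handful of nodes, running the join command by hand gets tedious. Here’s one way it might be scripted, assuming each node accepts SSH logins at its bare IP address and already has Engine listening on port 2375. join_nodes is a hypothetical helper name, and the SSH variable is only there so the commands can be previewed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Swarm): run "swarm join" on each node over SSH.
# Override SSH (e.g. SSH=echo) to preview the commands instead of running them.
SSH="${SSH:-ssh}"

join_nodes() {
    consul_ip="$1"    # discovery backend address; remaining arguments are node IPs
    shift
    for node_ip in "$@"; do
        # Each node advertises its own Engine address to the discovery backend.
        "$SSH" "$node_ip" docker run -d swarm join \
            "--advertise=${node_ip}:2375" "consul://${consul_ip}:8500"
    done
}
```

Usage would look like join_nodes 192.0.2.20 192.0.2.11 192.0.2.12, where the first address is the discovery backend and the rest are nodes (all placeholders).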
Using Swarm: Commands
The hard part’s done! Once Swarm is set up as per the instructions above, using it to manage clusters is easy. Just run the docker command with the -H flag and the Swarm port number to monitor and control your Swarm instance.
For example, this command would give information about your cluster if it is configured to listen on port 4000:
docker -H :4000 info
You can also use a command like this to start an app on your cluster directly from Swarm, which will automatically decide where to place it based on its scheduling strategy and the resources available on each node:
docker -H :4000 run some-app
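If typing -H on every command gets old, the docker CLI also reads the target address from the DOCKER_HOST environment variable. A small sketch, assuming the manager listens on port 4000 as above; use_swarm is a hypothetical helper name, not a Docker command.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): point the docker CLI at a Swarm manager.
use_swarm() {
    manager_ip="$1"      # IP address of the Swarm manager host
    port="${2:-4000}"    # manager port used throughout this post
    # The docker CLI uses DOCKER_HOST when no -H flag is given.
    export DOCKER_HOST="tcp://${manager_ip}:${port}"
}
```

After calling use_swarm with the manager’s IP address in your current shell, plain docker info and docker run some-app go to Swarm. Run unset DOCKER_HOST to talk to the local Engine again.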
Getting the Most out of Swarm
Here are some quick pointers for getting the best performance out of Swarm at massive scale:
- Consider creating multiple Swarm managers and nodes to increase reliability.
- Make sure your discovery backend is running on a highly available host, since it needs to be up for Swarm to work.
- Lock down networking so that connections are allowed only for the ports and services (namely, SSH, HTTP and the Swarm services themselves) that you need. This will increase security.
- If you have a lot of nodes to manage, you can use a more sophisticated method for allowing Swarm to discover them. Docker’s Swarm documentation on discovery explains the options in detail.
If you’re really into Swarm, you might also want to have a look at the Swarm API documentation. The API is a great resource if you need to build custom container-based apps that integrate seamlessly with the rest of your cluster (and that don’t already have seamless integration built-in, like the Sumo Logic log collector does).
How to Configure a Docker Cluster Using Swarm is published by the Sumo Logic DevOps Community. If you’d like to learn more or contribute, visit devops.sumologic.com. Also, be sure to check out Sumo Logic Developers for free tools and code that will enable you to monitor and troubleshoot applications from code to production.