Of all the technologies to emerge over the past decade, Kubernetes is one of the most important. By automating management tasks that would not be feasible to perform by hand in most situations, it plays a critical role in deploying containerized applications both in the cloud and on-premises.
But Kubernetes is also a complex technology. Getting started with Kubernetes requires becoming familiar with several types of tools and concepts (like nodes, pods, clusters, and services). And, depending on exactly how you are using Kubernetes, the specific approach you take to getting started will vary.
If that sounds intimidating, keep reading. This page explains all of the essentials you need to know to begin your Kubernetes journey.
What Is Kubernetes?
Kubernetes is an orchestrator, which means that it manages application environments by automating tasks that human operators would otherwise have to perform manually. Those tasks include operations such as starting and stopping different infrastructure components; providing load-balancing to ensure that requests are distributed evenly across an environment; and managing the exchange of information between different parts of an application environment.
Kubernetes is most often used to orchestrate containers. However, Kubernetes can also be used to orchestrate other types of application infrastructures, including virtual machines.
What Does Kubernetes Do?
The main reason to use Kubernetes is to eliminate the need to perform tedious tasks, like manually starting and stopping containers or assigning containers to individual servers.
Indeed, if you have a large-scale container deployment, Kubernetes (or a similar orchestration tool) is essential for making it practical to manage the environment. You can get away with managing perhaps a half-dozen container instances by hand, but beyond that point, it becomes infeasible to manage an application environment without the automation provided by Kubernetes.
Beyond its automation benefits, Kubernetes provides some other valuable features. Although Kubernetes is not a security tool, it lets you implement protections (using features like role-based access control and pod security policies) that harden containerized application environments. Kubernetes also makes it easy to migrate an application deployment from one infrastructure to another, since Kubernetes configurations and data are portable across different infrastructures.
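To give a flavor of the role-based access control mentioned above, here is a minimal hedged sketch: a Role that can only read pods in a single namespace, and a RoleBinding granting it to a user. The namespace, role, and user names here are hypothetical placeholders, not something from this article.

```yaml
# Hypothetical example: a read-only Role for pods in a "dev" namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: pod-reader
rules:
- apiGroups: [""]            # "" refers to the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
# Bind the Role to a hypothetical user named "jane".
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Applied with kubectl, this would let "jane" list and inspect pods in the dev namespace but not modify them, and grant her nothing at all in other namespaces.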
Kubernetes Core Components
Kubernetes is a broad platform that consists of more than a dozen different tools and components. Among the most important are:
- Kube-scheduler: This tool runs on the Kubernetes “master” node (see the following section for more on this) and decides which servers should host groups of containers.
- Kubelet: An agent that runs on each node, registers the node with the cluster, and ensures that the containers assigned to that node are running and healthy.
- Kube-proxy: An agent installed on each worker node that maintains network rules, allowing traffic from inside and outside the cluster to reach the pods running on that node.
- Etcd: A key-value store that houses the data required to run a Kubernetes cluster.
- Kubectl: The command-line tool that you use to manage Kubernetes clusters.
- Kube-apiserver: The service that exposes the Kubernetes API.
Using Kubernetes to manage containers also requires a container runtime, which is the software that runs the individual containers. Kubernetes supports a number of container runtimes; the most popular are Docker, containerd, and CRI-O.
There are several other Kubernetes components (such as a Web interface and a monitoring service) that you might choose to deploy, depending on your needs and configuration. The official Kubernetes documentation describes these components in more detail.
Key Kubernetes Concepts
To get started with Kubernetes, you should familiarize yourself with the essential concepts that Kubernetes uses to manage the different components of a deployment. They include:
- Nodes: Nodes are servers that host Kubernetes environments. They can be physical or virtual machines. It’s possible to run Kubernetes with just a single node (which you might do if you are testing Kubernetes locally), but production-level deployments almost always consist of multiple nodes.
- Master vs. worker nodes: Nodes can be either “masters” or “workers.” Master nodes host the processes (like kube-scheduler) that manage the rest of the Kubernetes environment. Worker nodes host the containers that power your actual application. Worker nodes were known as "minions" in early versions of Kubernetes, and sometimes you may still hear them referred to as such.
- Pods: Groups of containers that are deployed together. Typically, the containers in a pod provide functions that are complementary to each other; for instance, one container might host an application frontend while another provides a logging service. It’s possible to have a pod that consists of just one container, too.
- Services: A Service exposes a group of pods over the network. Each Service can be assigned a stable IP address and a resolvable domain name, so that its pods remain reachable even as individual pods are created and destroyed.
- Clusters: A cluster is what you get when you combine nodes together (technically, a single node could also constitute a cluster). It’s most common to have one cluster per deployment and, if desired, workloads divided within the cluster using namespaces. However, in certain cases you might choose to have multiple clusters; for instance, you might use different clusters for hosting a test and a production version of the same application. That way, if something goes catastrophically wrong with your test cluster, your production cluster will remain unaffected.
- Namespace: You can define namespaces in Kubernetes to separate a Kubernetes cluster into different parts and allow only certain resources to be accessible from certain namespaces. For example, you might create a single Kubernetes cluster for your entire company, but configure a different namespace for each department in the company to use to deploy its workloads. Generally speaking, using namespaces to divide clusters into virtually segmented parts is better than creating a separate cluster for each unit.
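To make these concepts concrete, the following hedged sketch ties several of them together: a Namespace, a two-container Pod in that namespace, and a Service that exposes it. All of the names (the "billing" namespace, the "frontend" app, the image tags) are illustrative assumptions rather than anything prescribed by Kubernetes.

```yaml
# Illustrative manifest combining a Namespace, a Pod, and a Service.
apiVersion: v1
kind: Namespace
metadata:
  name: billing              # hypothetical per-department namespace
---
apiVersion: v1
kind: Pod
metadata:
  name: frontend
  namespace: billing
  labels:
    app: frontend
spec:
  containers:
  - name: web                # main application container
    image: nginx:1.25
    ports:
    - containerPort: 80
  - name: log-shipper        # complementary sidecar container, as described above
    image: busybox:1.36
    command: ["sh", "-c", "tail -f /dev/null"]
---
apiVersion: v1
kind: Service
metadata:
  name: frontend
  namespace: billing
spec:
  selector:
    app: frontend            # routes traffic to pods carrying this label
  ports:
  - port: 80
    targetPort: 80
```

Applying this file with `kubectl apply -f <file>` would create all three objects; inside the cluster, the Service would then be reachable at a DNS name of the form frontend.billing.svc.cluster.local.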
Kubernetes Distributions
Kubernetes is open source. You can download the Kubernetes source code from GitHub and compile it yourself if you wish. However, installing Kubernetes this way is complicated, and so is keeping it updated, because you would have to recompile from source for every upgrade. Unless you want to build Kubernetes from source to teach yourself the ins and outs of the platform, or you are using a host environment where prebuilt Kubernetes distributions are unavailable, compiling Kubernetes from source is usually not worth the trouble.
For most teams, using a Kubernetes distribution makes more sense. A Kubernetes distribution is a prebuilt version of Kubernetes that you can install using packages instead of having to compile from source. Most Kubernetes distributions are also preconfigured in certain ways to make installation and setup easier, and many come with additional tools or integrations that add functionality to the core Kubernetes platform.
In this way, you can think of Kubernetes distributions as being akin to Linux distributions. While it's possible to install a Linux-based operating system from scratch, almost no one does that. Most people use Linux distributions that come prebuilt and preconfigured to serve different purposes (like powering desktops, servers, or networking equipment).
Popular Kubernetes distributions include Red Hat OpenShift, Rancher, Canonical's Kubernetes distribution for Ubuntu, and SUSE's CaaS platform. These distributions can be installed on-premises or on a cloud-based infrastructure that you provision yourself. As noted below, there are also special Kubernetes distributions designed for different types of deployments.
In addition, all of the major public cloud providers offer hosted Kubernetes services, such as AWS EKS and Azure AKS. These cloud-based services allow you to set up a Kubernetes cluster without having to maintain or manage your own infrastructure, although they typically offer fewer opportunities for configuration tweaks.
Kubernetes Host Operating Systems: Linux vs. Windows
Kubernetes is primarily a Linux-based technology: the master nodes that run the core Kubernetes infrastructure must be configured using some kind of Linux distribution. However, starting with Kubernetes version 1.14, it is possible to include Windows machines within Kubernetes clusters, although those servers are limited to operating as worker nodes. In this way, Kubernetes can be used to orchestrate containerized applications that are hosted using Windows containers as well as Linux ones.
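In a mixed cluster, you need to make sure Windows workloads are scheduled onto Windows worker nodes. One way to do that on recent Kubernetes versions is the well-known kubernetes.io/os node label, as in this hedged sketch (the pod name and container command are illustrative assumptions):

```yaml
# Hedged sketch: steer a pod onto a Windows worker node using the
# well-known kubernetes.io/os node label.
apiVersion: v1
kind: Pod
metadata:
  name: win-sample           # illustrative name
spec:
  nodeSelector:
    kubernetes.io/os: windows
  containers:
  - name: app
    image: mcr.microsoft.com/windows/servercore:ltsc2022
    command: ["cmd", "/c", "ping -t localhost"]   # keeps the container running
```

Without a selector like this, the scheduler could place the pod on a Linux node, where a Windows container image cannot run.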