Of all the technologies to emerge over the past decade, Kubernetes is one of the most important. By automating management tasks that would not be feasible to perform by hand in most situations, it plays a critical role in deploying containerized applications both in the cloud and on-premises.
But Kubernetes is also a complex technology. Getting started with Kubernetes requires becoming familiar with several types of tools and concepts (like nodes, pods, clusters, and services). And, depending on exactly how you are using Kubernetes, the specific approach you take to getting started will vary.
If that sounds intimidating, keep reading. This page explains all of the essentials you need to know to begin your Kubernetes journey.
What Is Kubernetes?
Kubernetes is an orchestrator, which means that it manages application environments by automating tasks that human operators would otherwise have to perform manually. Those tasks include operations such as starting and stopping different infrastructure components; providing load-balancing to ensure that requests are distributed evenly across an environment; and managing the exchange of information between different parts of an application environment.
Kubernetes is most often used to orchestrate containers. However, Kubernetes can also be used to orchestrate other types of application infrastructures, including virtual machines.
What Does Kubernetes Do?
The main reason to use Kubernetes is to eliminate the need to perform tedious tasks, like manually starting and stopping containers or assigning containers to individual servers.
Indeed, if you have a large-scale container deployment, Kubernetes (or a similar orchestration tool) is essential for making it practical to manage the environment. You can get away with managing perhaps a half-dozen container instances by hand, but beyond that point, it becomes infeasible to manage an application environment without the automation provided by Kubernetes.
Beyond its automation benefits, Kubernetes provides some other valuable features. Although Kubernetes is not a security tool, it lets you implement some security protections (using features like role-based access control and Pod Security Standards, which replaced the pod security policies removed in Kubernetes 1.25) that harden containerized application environments. Kubernetes also makes it easier to migrate an application deployment from one infrastructure to another, since Kubernetes configurations are largely portable across different infrastructures.
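As a rough illustration of role-based access control, the sketch below writes a Role that grants read-only access to pods in one namespace, plus a RoleBinding that assigns the Role to a user. The role name pod-reader, the binding name read-pods, and the user jane are hypothetical placeholders, and the guard at the end is only there so the snippet does nothing harmful on a machine without a cluster.

```shell
# Write an illustrative RBAC manifest (all names are hypothetical placeholders).
cat > pod-reader-rbac.yaml <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
  - apiGroups: [""]            # "" is the core API group, where pods live
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: default
  name: read-pods
subjects:
  - kind: User
    name: jane                 # hypothetical user name
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
EOF

# Apply only when kubectl can reach a cluster; otherwise just keep the file.
if command -v kubectl >/dev/null 2>&1 && kubectl cluster-info --request-timeout=5s >/dev/null 2>&1; then
  kubectl apply -f pod-reader-rbac.yaml
else
  echo "No reachable cluster; wrote pod-reader-rbac.yaml without applying it."
fi
```

With this in place, the hypothetical user could list pods in the default namespace but could not modify or delete them.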
Kubernetes Core Components
Kubernetes is a broad platform that consists of more than a dozen different tools and components. Among the most important are:
- Kube-scheduler: This component runs on the Kubernetes “master” node (see the following section for more on this) and decides which nodes should host each pod (a group of containers).
- Kubelet: An agent that runs on each node, registers that node with the cluster, and makes sure the containers assigned to the node are running.
- Kube-proxy: This agent runs on each worker node and maintains the network rules that route traffic addressed to a Service to the pods behind it.
- Etcd: A key-value store that houses the data required to run a Kubernetes cluster.
- Kubectl: The command-line tool that you use to manage Kubernetes clusters.
- Kube-apiserver: The service that exposes the Kubernetes API.
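If you use Kubernetes to manage containers, you also need a container runtime, which is the software that runs individual containers. Kubernetes supports a number of container runtimes; the most popular are containerd and CRI-O. Docker Engine can still be used, but because Kubernetes 1.24 removed its built-in Docker support (dockershim), it now requires the separate cri-dockerd adapter.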
There are several other Kubernetes components (such as a Web interface and a monitoring service) that you might choose to deploy, depending on your needs and configuration. The official Kubernetes documentation describes these components in more detail.
Key Kubernetes Concepts
In order to get started with Kubernetes, you should familiarize yourself with the essential concepts that Kubernetes uses to manage the different components of a Kubernetes deployment. They include:
- Nodes: Nodes are servers that host Kubernetes environments. They can be physical or virtual machines. It’s possible to run Kubernetes with just a single node (which you might do if you are testing Kubernetes locally), but production-level deployments almost always consist of multiple nodes.
- Master vs. worker nodes: Nodes can be either “masters” or “workers.” Master nodes (which current Kubernetes documentation calls control plane nodes) host the processes (like kube-scheduler) that manage the rest of the Kubernetes environment. Worker nodes host the containers that power your actual application. Worker nodes were known as “minions” in early versions of Kubernetes, and you may still hear them referred to as such.
- Pods: Groups of containers that are deployed together. Typically, the containers in a pod provide functions that are complementary to each other; for instance, one container might host an application frontend while another provides a logging service. It’s possible to have a pod that consists of just one container, too.
- Services: A Service is a stable network endpoint for a group of pods. Each Service can be assigned an IP address and a resolvable domain name in order to make its pods accessible via the network, even as individual pods are replaced.
- Clusters: A cluster is what you get when you combine nodes together (technically, a single node could also constitute a cluster). It’s most common to have one cluster per deployment and, if desired, workloads divided within the cluster using namespaces. However, in certain cases you might choose to have multiple clusters; for instance, you might use different clusters for hosting a test and a production version of the same application. That way, if something goes catastrophically wrong with your test cluster, your production cluster will remain unaffected.
- Namespaces: You can define namespaces in Kubernetes to separate a Kubernetes cluster into different parts and allow only certain resources to be accessible from certain namespaces. For example, you might create a single Kubernetes cluster for your entire company, but configure a different namespace for each department to use when deploying its workloads. Generally speaking, using namespaces to divide a cluster into virtually segmented parts is simpler than creating a separate cluster for each unit.
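To make the concepts above more concrete, here is a minimal sketch that describes a namespace, a two-container pod, and a Service in one manifest file. The names demo-team and web, and the nginx and busybox images, are illustrative assumptions rather than anything Kubernetes prescribes.

```shell
# One file, three objects: a Namespace, a Pod, and a Service (names are placeholders).
cat > concepts-demo.yaml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: demo-team
---
apiVersion: v1
kind: Pod
metadata:
  name: web
  namespace: demo-team
  labels:
    app: web
spec:
  containers:
    - name: frontend             # serves the application
      image: nginx:1.27
      ports:
        - containerPort: 80
    - name: log-helper           # complementary container in the same pod
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date; sleep 30; done"]
---
apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: demo-team
spec:
  selector:
    app: web                     # matches the pod's label above
  ports:
    - port: 80
      targetPort: 80
EOF

# Create the objects only when kubectl can reach a cluster.
if command -v kubectl >/dev/null 2>&1 && kubectl cluster-info --request-timeout=5s >/dev/null 2>&1; then
  kubectl apply -f concepts-demo.yaml
  kubectl get pods,services -n demo-team
else
  echo "No reachable cluster; wrote concepts-demo.yaml without applying it."
fi
```

The Service finds its pods through the app: web label, not through the pod name, which is why replacement pods carrying the same label are picked up automatically.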
Kubernetes Distributions
Kubernetes is open source. You can download the Kubernetes source code from GitHub and compile it yourself if you wish. However, installing Kubernetes this way is complicated, and so is keeping it updated, because you would have to recompile from source every time you wanted to upgrade. Unless you want to build Kubernetes from source to teach yourself the ins and outs of the platform, or you are using a host environment where prebuilt Kubernetes distributions are not available, compiling Kubernetes from source is usually not worth the effort.
For most teams, using a Kubernetes distribution makes more sense. A Kubernetes distribution is a prebuilt version of Kubernetes that you can install using packages instead of having to compile from source. Most Kubernetes distributions are also preconfigured in certain ways to make installation and setup easier, and many come with additional tools or integrations that add functionality to the core Kubernetes platform.
In this way, you can think of Kubernetes distributions as being akin to Linux distributions. While it's possible to install a Linux-based operating system from scratch, almost no one does that. Most people use Linux distributions that come prebuilt and preconfigured to serve different purposes (like powering desktops, servers, or networking equipment).
Popular Kubernetes distributions include Red Hat OpenShift, Rancher, Canonical's Kubernetes distribution for Ubuntu, and SUSE's CaaS platform. These distributions can be installed on-premises or on a cloud-based infrastructure that you provision yourself. As described below, there are also special Kubernetes distributions designed for different types of deployments.
In addition, all of the major public cloud providers offer hosted Kubernetes services, such as AWS EKS and Azure AKS. These cloud-based services allow you to set up a Kubernetes cluster without having to maintain or manage your own infrastructure, although they typically offer fewer opportunities for configuration tweaks.
Kubernetes Host Operating Systems: Linux vs. Windows
Kubernetes is primarily a Linux-based technology. The core infrastructure on which Kubernetes runs must be configured using some kind of Linux distribution. However, starting with Kubernetes version 1.14, it is possible to include Windows machines within Kubernetes clusters, although those servers are limited to operating as worker nodes. In this way, Kubernetes can be used to orchestrate containerized applications that are hosted using Windows containers as well as Linux ones.
Getting Started with Kubernetes: Installation and Setup
The approach you take to getting started with Kubernetes will depend on which type of deployment you are setting up: a local Kubernetes environment for learning purposes or a large-scale, distributed Kubernetes cluster for production deployment.
Setting up a Kubernetes Learning Environment
If your goal is to run Kubernetes locally for learning purposes, the most seamless approach is to use a Kubernetes distribution designed specifically for this purpose. MicroK8s, Minikube, and K3s are popular options. Installation methods vary depending on which distribution you choose and which Linux-based operating system is hosting your installation, but the process is typically quite simple.
For instance, you can install MicroK8s on Ubuntu with a single short command:
sudo snap install microk8s --classic
After that, you can start interacting with your Kubernetes environment using the microk8s kubectl command (also available as microk8s.kubectl). You may wish to install additional packages later to add functionality, but if you're just getting started with Kubernetes, this is all you need to get the bare essentials up and running on a local, single-host Kubernetes machine.
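A first session might look like the sketch below. The wrapper function k is an arbitrary convenience name chosen for this example, and depending on how MicroK8s was installed you may need sudo or membership in the microk8s group for these commands to succeed.

```shell
# Convenience wrapper so the rest of the session can type "k get nodes".
# The name "k" is an arbitrary choice for this sketch.
k() { microk8s kubectl "$@"; }

if command -v microk8s >/dev/null 2>&1; then
  microk8s status --wait-ready   # block until MicroK8s reports it is running
  k get nodes                    # a single-host install should list one node
  k get pods --all-namespaces    # system pods that MicroK8s started
else
  echo "MicroK8s is not installed on this machine."
fi
```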
Setting up a Production Kubernetes Cluster
Not surprisingly, things are a bit more complicated if you are setting up a Kubernetes cluster that runs on multiple servers and needs production-grade reliability and functionality.
To set up a production Kubernetes cluster, you follow these steps:
- Acquire and provision host infrastructure: You need hardware to host your Kubernetes cluster. The hardware could be on-premises physical or virtual servers, or it could be a cloud-based environment. Either way, you'll need to set up the servers and install Linux-based operating systems on them as the first step toward getting started with Kubernetes. (As noted above, you can also use Windows for machines that you intend to operate as worker nodes.)
- Install a container runtime: In addition to installing an operating system, you also need to provision your host infrastructure with a container runtime such as containerd or CRI-O.
- Install Kubernetes: Once your host infrastructure is ready, you can begin installing Kubernetes. Since Kubernetes (as noted above) is composed of multiple components, you'll need to install each part individually. Start by installing the components you'll need on your master node (such as etcd, kube-scheduler, and kube-apiserver). Then install kubelet and kube-proxy on your worker nodes. The exact method varies depending on which Kubernetes distribution you use; in general, however, each component can be installed through your Linux distribution's package manager.
- Install kubectl: In order to interact with your cluster, you'll need to install kubectl on whichever machines you plan to use to manage the cluster.
- Configure Kubernetes: With all of the Kubernetes components installed on your hardware, you need to configure your installation so that the tools and nodes can talk to each other. A full discussion of configuring Kubernetes is beyond the scope of this page, but in a nutshell, you'll be editing various configuration files that are stored (in most Kubernetes distributions) under the /etc directory of your Linux file system. The configuration files will need to be modified with the IP addresses of your various nodes. You will also likely want to change various other configuration options from their defaults.
The Kubernetes setup process described above represents everything you have to do if you set up Kubernetes manually. Fortunately, Kubernetes distributions provide interactive installation tools (such as Canonical's Charmed Kubernetes tool for Ubuntu and the atomic-openshift-installer tool for OpenShift) that will walk you through the process of installing and configuring the various Kubernetes components. Or, if you use a fully managed Kubernetes service in the cloud, most of this setup is already done for you.
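However the cluster was installed, a few read-only kubectl commands can confirm that the nodes have joined and the system components are running. The sketch below wraps them in a helper function; the name check_cluster is hypothetical, and on managed cloud services some control-plane components may not appear in the kube-system namespace.

```shell
# Read-only health check; the function name is a hypothetical helper.
check_cluster() {
  kubectl cluster-info                 # address of the API server
  kubectl get nodes -o wide            # each node should report STATUS "Ready"
  kubectl get pods -n kube-system      # system components such as kube-proxy
}

# Run the check only when kubectl is installed and a cluster answers.
if command -v kubectl >/dev/null 2>&1 && kubectl cluster-info --request-timeout=5s >/dev/null 2>&1; then
  check_cluster
else
  echo "kubectl is not installed or no cluster is reachable."
fi
```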
Getting Started with Kubernetes: Deploying Apps
With your Kubernetes cluster up and running, you are ready to start deploying applications. You can deploy as many apps as you want using a single cluster (up to the limits of what your hardware resources can reasonably support). As noted above, apps can be separated from one another using namespaces, which makes it easier to deploy many applications on the same infrastructure without one app's resources colliding with another's. (Namespaces do not block network traffic between apps by themselves; that requires network policies.)
As with most things related to Kubernetes, the exact approach you take for deploying an app depends on which app you are deploying and how your Kubernetes cluster is set up. But in most cases, the process looks like the following:
Containerize the app
First, you need a containerized image of the app that you want to run. Prebuilt container images for popular apps (such as WordPress, Node.js, or MySQL, to name just a few examples) are available from Docker Hub or other public container registries. If you are deploying a custom app, you will need to package it as a container and upload it to a registry yourself.
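For a custom app, packaging and uploading could look roughly like the sketch below. It assumes a hypothetical Node.js app whose entry point is server.js and which listens on port 3000; the registry address registry.example.com/demo is a placeholder, and the file is named Dockerfile.example so it does not overwrite an existing Dockerfile.

```shell
# Minimal Dockerfile for a hypothetical Node.js app (entry point: server.js).
cat > Dockerfile.example <<'EOF'
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]
EOF

# Placeholder image reference: replace the registry and repository with your own.
IMAGE="registry.example.com/demo/my-app:1.0.0"

# Build and upload only when Docker and the app source are present.
if command -v docker >/dev/null 2>&1 && [ -f package.json ]; then
  docker build -f Dockerfile.example -t "$IMAGE" .
  docker push "$IMAGE"       # requires a prior "docker login" to that registry
else
  echo "Docker or the app source is missing; wrote Dockerfile.example only."
fi
```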
Deploy the app
Use kubectl to deploy the app to your cluster. There are two main ways to do this. One is to use the kubectl create command, which tells kubectl to deploy the app based on configurations you specify on the command line. The other is to use kubectl apply; with this approach, you first create a configuration file telling Kubernetes how to deploy the app, and then you use the kubectl apply command to tell Kubernetes to put that configuration into effect.
The former strategy is an example of imperative management, while the latter is declarative management. Both approaches have their benefits and drawbacks, but if you’re just getting started, the imperative approach (with kubectl create) is simpler.
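The two approaches can be compared side by side in the sketch below. The deployment name web and the nginx image are illustrative assumptions. The imperative command is shown as a comment so that it does not conflict with the declarative version when the snippet runs.

```shell
# Imperative: describe the deployment directly on the command line.
#   kubectl create deployment web --image=nginx:1.27 --replicas=2

# Declarative: record the same desired state in a file, then apply it.
cat > web-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  labels:
    app: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.27
          ports:
            - containerPort: 80
EOF

# Apply only when kubectl can reach a cluster.
if command -v kubectl >/dev/null 2>&1 && kubectl cluster-info --request-timeout=5s >/dev/null 2>&1; then
  kubectl apply -f web-deployment.yaml
  kubectl rollout status deployment/web   # wait until the replicas are ready
  kubectl get pods -l app=web -o wide     # shows which node each pod landed on
else
  echo "No reachable cluster; wrote web-deployment.yaml without applying it."
fi
```

One practical difference is that the manifest file can be kept in version control and re-applied after edits, whereas the imperative command leaves no record beyond your shell history.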
Once your app is deployed, Kubernetes does all of the dirty work required to keep it running healthily. If one of the nodes hosting the app fails, Kubernetes will automatically start replacement pods on another node (provided the app runs under a controller such as a Deployment). If compute resources in one part of the cluster become constrained, Kubernetes schedules new pods onto nodes that still have capacity to ensure that things keep running smoothly.
In most Kubernetes setups, apps are not exposed to the Internet by default. If that is the case, and you want the app to be accessible over the Internet, you can use the kubectl expose command, which creates a Service for the app; choosing a Service type such as LoadBalancer or NodePort makes the app reachable from outside the cluster.
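As a sketch of what exposing an app involves, the snippet below shows the kubectl expose command as a comment and writes the equivalent Service manifest. It assumes a Deployment named web whose pods carry the label app: web already exists; both names are illustrative. A LoadBalancer Service only receives an external address in environments that can provision one (typically cloud providers), so a local cluster may need the NodePort type instead.

```shell
# Imperative form (assumes a Deployment named "web" already exists):
#   kubectl expose deployment web --type=LoadBalancer --port=80 --target-port=80

# The same Service written declaratively.
cat > web-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer      # asks the environment for an external address
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
EOF

# Apply only when a cluster is reachable and the assumed Deployment exists.
if command -v kubectl >/dev/null 2>&1 && kubectl get deployment web --request-timeout=5s >/dev/null 2>&1; then
  kubectl apply -f web-service.yaml
  kubectl get service web     # EXTERNAL-IP shows <pending> until one is assigned
else
  echo "No reachable cluster or no 'web' Deployment; wrote web-service.yaml only."
fi
```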
GUI-based Kubernetes management
As an alternative to working from the command line with kubectl, many of the operations described above can also be performed using graphical user interfaces (GUIs). The most commonly used tool for this purpose is the Kubernetes Dashboard, an official Web UI developed as part of the Kubernetes project. In addition, many third-party Kubernetes GUIs are available as part of Kubernetes distributions.
Kubernetes GUIs do not provide all of the functionality of kubectl, so it’s wise to teach yourself how to use kubectl, too. But for common tasks like deploying an application or seeing which applications are running on a cluster, the GUI solutions come in handy.
Conclusion
Kubernetes is a powerful tool or, perhaps more accurately, a powerful set of tools. Given the complexity of the platform, taking your first steps toward using Kubernetes may seem daunting. But Kubernetes becomes much easier to use once you understand the core concepts behind its architecture and familiarize yourself with key tools like kubectl. Installation and configuration tools provided by certain Kubernetes distributions, as well as GUI management tools, make getting started with Kubernetes even easier.
Getting Started with Integrating Kubernetes with Sumo Logic
While Kubernetes provides some basic functionality for exposing log data to log collection tools, it does not come close to providing everything you need to collect, analyze, and manage log data on its own. That’s where Sumo Logic comes in. Paired with a logging agent like Fluent Bit, Sumo Logic can serve as the logging backend for Kubernetes, allowing you to collect and interpret critical monitoring data about the health of your Kubernetes host infrastructure as well as the Kubernetes environment and the services themselves.