Just-in-Time Kubernetes: A Beginner’s Guide to Understanding Kubernetes Core Concepts

Adriana Villela
Dzero Labs
13 min read · Apr 22, 2021

A snail spotted in the West Toronto Rail Path. Image by Dzero Labs

The Elephant in the Room

So, you want to learn yourself some Kubernetes. That buzzwordy bundle of techy goodness that everyone seems to be talking about. I can’t tell you how many recruiters have approached me with Kubernetes work. Kubernetes is definitely the cool kid in town!

Perhaps one of the following applies to you:

  • You’ve heard all about Kubernetes and finally decided that it’s high time to see what all the fuss is about.
  • You are a seasoned software engineer who has been working with Kubernetes for a while à la just-in-time learning. That is, you learn what you need to get your job done. Now you’re getting more into the weeds, so you’ve decided that it’s high time to dig into some of the fundamentals. (Shhhh…I won’t tell — I’ve been there myself!)
  • You are a tech leader, and you have a team working on Kubernetes (or perhaps your team interacts with a team working on Kubernetes), and you would like to educate yourself.

Whatever brought you here, welcome! I’m glad that you’re here!

My aim here is to keep things simple and to provide you with the basic building blocks for learning and understanding Kubernetes. Kubernetes can seem big and scary. Let’s face it…any new tech is scary. So much stuff to learn. So many nuances. Especially when you’re starting from zero. Are you scared yet? 😱

Don’t be…because I’m here to guide you! My goal here is to show you that Kubernetes is not so scary, and to inspire you to take your knowledge further. (And check out one of my other advanced posts when you do!) If you’ve been doing Kubernetes for a while, let this be your refresher! Maybe you’ve been doing things a certain way, taking it for granted that “that’s just the way it works”. Maybe you’ll have that “A-ha!” moment, whereby you gain deeper insight into why something works the way it does.

Let’s get started!

Overview

In this post, I’ll be covering the following topics:

  1. What is Kubernetes?
  2. High-level view: Kubernetes Architecture
  3. Digging deeper: Resources, Controllers, and Operators
  4. The kubectl command-line tool for interacting with Kubernetes

If you’re interested in certain bits here and there, feel free to skip ahead to the topic that tickles your fancy. I won’t mind! 😊

Note: I’m assuming that you have a basic knowledge of containers.

What is Kubernetes?

First things first…what the heck is Kubernetes? Kubernetes is a container orchestration system that lets you deploy, scale, and manage containerized applications. It is also known in its abbreviated form as k8s. Cuz we software folk are laaaaazy. So if you see the term “k8s”, and have been wondering what it is, now you know!

Fun fact: Kubernetes traces its roots to the Borg project, which was originally developed by Google.

Image credit: memegenerator.net

Now, you might be wondering what the heck container orchestration is. Let’s look at an example.

Say you’ve got a bunch of Docker containers running on your computer. Maybe some of these containers need to talk to each other. Or maybe your friend Nancy is trying to access an API endpoint running on one of these containers. How do you ensure that the containers can talk to each other? How can you ensure that Nancy can hit that API endpoint securely? What happens when one of your containers dies? What happens if you need to scale your containers because all of a sudden, it’s not just Nancy needing to access that API endpoint…it’s Nancy and all her dev friends who are taking part in a hackathon? Then what?

That’s where container orchestration comes in. It can help you with all that, and more. Container orchestration is defined as the capability to define, deploy, and operate a compute cluster consisting of multiple virtual machines or physical servers to launch containers and manage their lifecycle.

Frenemies of Kubernetes

Now, while Kubernetes is super cool and popular, you may be shocked to find out that it’s not the only container orchestrator out there! It’s true. So it’s only fair to mention a few of Kubernetes’ competitors:

Flavours of Kubernetes

Some of the flavours that Kubernetes comes in — yum!

Though not as exciting as ice cream flavours, it’s worth mentioning that there are a few different ways to run Kubernetes.

Local

First off, you can run Kubernetes locally on your machine. Here are a few tools which make this possible:

Cloud Vendor Solutions

Running k8s locally is all well and good for app development and local testing, but we do eventually need this thing to run in prod, and at scale. This is where it’s nice to know that most of the major cloud vendors have their own flavours of k8s. These include:

Each cloud vendor will typically have a nice little CLI command set for automagically creating a k8s cluster in that cloud. It’s quite a magical experience to issue a CLI command which brings up an entire cluster in 5–10 minutes. We’re talking about not only spinning up virtual machines on-demand, but also starting up all the services that make k8s tick. When you think about it, it’s super impressive!

They also provide you with some basic k8s GUIs, though nothing super fancy.

Enterprise Solutions

For the enterprise-minded folks who like pretty admin GUIs and like to have their hands held (no judgment — there’s actually a huge market for this type of thing), you might want to look at some of these vendors:

These tend to add a management layer on top of Kubernetes, catering to the enterprise crowd. As I mentioned earlier, they tend to have fancy admin consoles. They also tend to have a ton of plugins available through their respective “marketplaces”, and may have opinionated implementations of k8s (e.g. service mesh selection). These vendors also give you the option of running their k8s clusters in public clouds or self-hosted private clouds.

DIY

And finally, if you’re a glutton for punishment, you could always create your own k8s clusters from scratch. Not my personal cup of tea, but if you really want to understand how Kubernetes works, this is definitely a way to make that happen!

Kubernetes Components

Now that we have a little bit of background on Kubernetes, let’s look under the hood.

Kubernetes master and worker (minion) nodes

A Kubernetes cluster is made up of nodes. These nodes can be either physical machines, or virtual machines. You can have a cluster of one or more nodes, though ideally, you’ll want at least two nodes in a non-dev scenario.

A cluster typically has a master node, and a bunch of worker (or minion) nodes.

Note: This is why you’ll want at least two nodes — because it would suck to have a one-node cluster whereby the one node acts as both master and worker in a non-dev setup.

The master node is responsible for watching the workers and performing the orchestration (think of what a manager does). The worker nodes are responsible for running the containers.

The diagram below shows us what goes on inside the master node and worker nodes:

Kubernetes components. Image from kubernetes.io

Master Node

As the manager of this whole operation, the master node has quite a few components:

  • etcd
  • API Server
  • Controller Manager
  • Scheduler

Let’s dig into these.

etcd

etcd is a distributed key-value database. It is Kubernetes’ source of truth. Every time you make a change to Kubernetes (e.g. by sending it YAML or JSON via k8s’ REST API or via the kubectl CLI tool, more on that later), that change is persisted in etcd. By default, built-in resources are stored in a binary protobuf encoding, while custom resources are stored as JSON. Every write also gets a new revision number, so you have some serious version control action going on.

One of the key features of etcd is its ability to keep an eye out for changes. That is, clients can watch keys in etcd and get notified whenever they change. In Kubernetes, this is how an incoming change sent via REST API call or kubectl gets noticed and compared against what’s currently configured in the system.
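
To make the “versioned” and “keep an eye out for changes” ideas a bit more concrete, here’s a toy Python sketch of a key-value store that bumps a revision number on every write and notifies anyone watching a key. To be clear, this is not the etcd API (or the etcd3 library). The ToyKeyValueStore class, its method names, and the /pods/nginx key are all made up for illustration.

```python
from collections import defaultdict


class ToyKeyValueStore:
    """A tiny in-memory stand-in for a versioned key-value store with watches.

    This is NOT the etcd API; it only illustrates revisions and watches.
    """

    def __init__(self):
        self.revision = 0                  # bumped on every write
        self.data = {}                     # key -> (value, revision of last write)
        self.watchers = defaultdict(list)  # key -> callbacks to notify

    def watch(self, key, callback):
        """Register a callback that fires whenever `key` is written."""
        self.watchers[key].append(callback)

    def put(self, key, value):
        """Store a value, bump the revision, and notify watchers of that key."""
        self.revision += 1
        self.data[key] = (value, self.revision)
        for callback in self.watchers[key]:
            callback(key, value, self.revision)
        return self.revision

    def get(self, key):
        """Return (value, revision) for `key`, or None if it was never written."""
        return self.data.get(key)


store = ToyKeyValueStore()
events = []

# Hypothetical key; someone is now "keeping an eye" on it.
store.watch("/pods/nginx", lambda key, value, rev: events.append((key, value, rev)))

store.put("/pods/nginx", '{"replicas": 1}')
store.put("/pods/nginx", '{"replicas": 3}')

print(store.get("/pods/nginx"))  # latest value plus the revision that wrote it
print(events)                    # one event per write
```

Each write produces a new revision and one notification, which is the basic mechanism that lets other components react when the stored state changes.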

Nerd alert: I personally think that etcd is one of the coolest Kubernetes components. You can actually install etcd on your local machine (OSX and Ubuntu, for example), and play around with it by using the etcdctl CLI tool. Python even has a few libraries for interacting with etcd programmatically. I’ve personally played around with the Python etcd3 library, and I highly recommend exploring etcd for yourself!

API Server

The API server is responsible for serving up the Kubernetes API. Wanna talk to Kubernetes and tell it to do things for you? This is your direct line, whether you’re a user, a program, or kubectl. The API Server is also what’s responsible for sending data to and pulling data from etcd.

Controller Manager

The controller manager is the brain behind the orchestration. Kubernetes has multiple controllers, each responsible for different things. Controllers watch the state of your cluster, then make or request changes where needed. The controller manager makes sure that it tells the right controllers to do the right things. For example, there are controllers for:

  • Taking action when a pod goes down
  • Connecting services to pods
  • Creating default service accounts and API access tokens for new namespaces

And many others…

Note: In case you’re wondering what a pod is, it’s basically a wrapper around one or more containers.

Scheduler

The scheduler distributes work across multiple nodes. It looks at resource requirements (e.g. CPU and memory requirements) to figure out when to run a pod, and what node to run it on.
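
As a rough mental model, scheduling boils down to “filter out the nodes that can’t fit the pod, then pick the best of what’s left”. The Python sketch below assumes a simplified world where nodes only advertise free CPU and memory. The pick_node function, the node names, and the “most CPU left over” rule are my own illustrative assumptions, not the real scheduler’s algorithm.

```python
def pick_node(pod_request, nodes):
    """Return the name of a node with enough free CPU and memory, or None.

    Among the nodes that fit, pick the one with the most CPU left over.
    This is a simplification; the real scheduler weighs many more factors.
    """
    candidates = [
        node for node in nodes
        if node["free_cpu"] >= pod_request["cpu"]
        and node["free_memory_mb"] >= pod_request["memory_mb"]
    ]
    if not candidates:
        return None  # nothing fits, so the pod has to wait
    best = max(candidates, key=lambda node: node["free_cpu"] - pod_request["cpu"])
    return best["name"]


# Hypothetical worker nodes and their spare capacity.
nodes = [
    {"name": "worker-1", "free_cpu": 0.5, "free_memory_mb": 2048},
    {"name": "worker-2", "free_cpu": 2.0, "free_memory_mb": 1024},
    {"name": "worker-3", "free_cpu": 4.0, "free_memory_mb": 256},
]

print(pick_node({"cpu": 1.0, "memory_mb": 512}, nodes))  # worker-2
print(pick_node({"cpu": 8.0, "memory_mb": 512}, nodes))  # None
```

In the first call, worker-1 is short on CPU and worker-3 is short on memory, so worker-2 is the only node that fits. In the second call, nothing fits, which is the toy version of a pod waiting to be scheduled.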

Worker Nodes

Clearly the master node does a lot, but like a manager, it’s nothing without the workers. Worker nodes contain 2 main components:

  • Kubelet
  • Kube-proxy

Kubelet

The kubelet is an agent (small app) that runs on each worker node in the cluster. Its main job is to make sure that containers are running in a pod (wrapper of one or more containers). But who tells it to run these containers? That comes from the control plane (where our good friend the controller manager resides). When the control plane needs something to happen in a node, the kubelet makes it happen.

The kubelet works with a container runtime, which, as its name implies, is responsible for actually running containers. More specifically, the container runtime manages the complete lifecycle of a container: container image pulling (from a container registry such as Docker Hub) and storage, container execution, network attachment, etc. Docker is a popular container runtime; however, there are others, such as containerd and CRI-O.

Note: Kubernetes has historically supported Docker as a container runtime; however, that built-in Docker support (known as dockershim) was deprecated in Kubernetes v1.20 and removed in v1.24, in favor of runtimes that implement the Container Runtime Interface (CRI) created for Kubernetes. Docker-produced images will continue to work in your cluster with all runtimes, as they always have, so no need to panic!

Kube-proxy

The kube-proxy handles network communications inside and outside your cluster. This means that if pods need to talk to each other, or if some external service needs to talk to a pod, kube-proxy helps make that happen.

Resources, Controllers, and Operators

Now that we understand the basics of Kubernetes components, let’s get into some other key Kubernetes concepts and terminology.

Resources

A Kubernetes resource refers to either an object or operation in Kubernetes, accessed via the Kubernetes API. A resource type is known as a kind, and is represented as a JSON object. This object is stored (and versioned) in our good friend, etcd.

There are two categories of resources: primitive resources, and custom resources. Primitive resources come “out of the box” with Kubernetes. Primitive resources include Pod, Service, Deployment, ServiceAccount, PersistentVolumeClaim, RoleBinding…I could go on.

Sample primitive resource in Kubernetes

Custom resources are an extension of the Kubernetes API, and are therefore not necessarily available in the default Kubernetes installation. Once a custom resource has been installed in your cluster, you can use the kubectl CLI tool to create and access it, just like a primitive resource (more on kubectl later). Custom resources are created when you want/need Kubernetes to do some stuff that you don’t get out of the box with k8s.

Here’s an example of what a custom resource looks like:

Sample custom resource in Kubernetes

You’ll notice that for the most part, both the primitive resource and the custom resource have the same main fields:

  • apiVersion: Version of the Kubernetes API that you’re using to create your resource
  • kind: Type of resource to create
  • metadata: Data that uniquely identifies the resource
  • spec: Tells Kubernetes the desired state of the resource

Note: Not all resources have a spec field (e.g. ServiceAccount, Role, RoleBinding).
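
If it helps to see those four fields in one place, here’s a small Python sketch that builds a Pod as a plain dictionary and prints it as JSON. The field layout follows the list above. The names hello-pod and hello-container are made up, and nginx:1.25 is just an example image.

```python
import json

# A Pod described as a Python dictionary. "hello-pod" and "hello-container"
# are made-up names; "nginx:1.25" is just an example image tag.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "hello-pod", "labels": {"app": "hello"}},
    "spec": {
        "containers": [
            {"name": "hello-container", "image": "nginx:1.25"}
        ]
    },
}

# The same resource serialized as JSON, one of the formats the Kubernetes
# API accepts.
print(json.dumps(pod, indent=2))

# The four top-level fields from the list above.
print(sorted(pod.keys()))
```

A custom resource has the same outer shape. What changes is the apiVersion and kind, plus whatever its author decided should go inside spec.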

Controllers

As discussed earlier, a controller watches the state of your cluster, then makes or requests changes where needed, to achieve the desired state. Blah blah blah. But what exactly does that mean?

Let me give you a fun example. Suppose you’re a lifeguard at a pool. Your job is to keep everyone safe at the pool. To do this, you must constantly scan the pool to make sure that no swimmers are in distress. This is the desired state. Therefore, if you see someone who’s drowning, for example, you go and pull them out of the water, and perform any other necessary life-saving tasks to ensure that they are no longer in distress. A controller does the same thing for your cluster: it keeps comparing the actual state with the desired state, and takes action whenever the two don’t match.
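
In code, the lifeguard’s job is a loop that compares desired state with actual state and acts on the difference. Here’s a minimal Python sketch, assuming a pretend cluster that is just a list of pod names. The reconcile and apply_actions functions are hypothetical helpers, not part of any Kubernetes library.

```python
def reconcile(desired_replicas, running_pods):
    """Compare desired state with actual state and return the actions needed."""
    actions = []
    known = list(running_pods)
    diff = desired_replicas - len(running_pods)
    for _ in range(diff):
        # Too few pods: pick the first unused name and ask for a new pod.
        index = 0
        while f"pod-{index}" in known:
            index += 1
        known.append(f"pod-{index}")
        actions.append(("create", f"pod-{index}"))
    if diff < 0:
        # Too many pods: remove the extras from the end of the list.
        actions = [("delete", name) for name in running_pods[diff:]]
    return actions


def apply_actions(actions, running_pods):
    """Apply actions to our pretend cluster and return the new list of pods."""
    pods = list(running_pods)
    for verb, name in actions:
        if verb == "create":
            pods.append(name)
        else:
            pods.remove(name)
    return pods


desired = 3
pods = ["pod-0", "pod-1", "pod-2"]

pods.remove("pod-1")                 # a pod "goes down"
actions = reconcile(desired, pods)   # the controller notices the gap...
pods = apply_actions(actions, pods)  # ...and closes it

print(actions)
print(pods)
print(reconcile(desired, pods))      # nothing left to do
```

A real controller runs this kind of comparison over and over, reacting to changes as they come in, rather than once.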

Operators

An operator is a type of controller. Remember those custom resources that I talked about earlier? It’s all well and good to define a custom resource, but at the end of the day, how do you get Kubernetes to do something useful with it? That’s where operators come in — they’re the code behind the scenes that make those custom resources do that useful something.

While all operators are controllers, not all controllers are operators. This can be rather confusing, because at first glance, they seem to be pretty much the same thing. The main difference is that operators extend Kubernetes functionality, and they work alongside custom resources to make that happen.
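
Here’s a hedged Python sketch of that pairing: a custom resource on one side, and operator logic that turns it into primitive resources on the other. Everything custom here is invented for illustration. There is no Website kind or example.com/v1 API in Kubernetes, and website_operator is a hypothetical function. A real operator would also watch the API server and create these resources through it, rather than just returning them.

```python
# A hypothetical custom resource. Neither the "Website" kind nor the
# "example.com/v1" API version exists in Kubernetes; both are made up.
website = {
    "apiVersion": "example.com/v1",
    "kind": "Website",
    "metadata": {"name": "my-blog"},
    "spec": {"image": "nginx:1.25", "replicas": 2},
}


def website_operator(custom_resource):
    """Translate a Website custom resource into primitive resources.

    This sketch only builds and returns the resources; it talks to no cluster.
    """
    name = custom_resource["metadata"]["name"]
    spec = custom_resource["spec"]
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": spec["replicas"],
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": spec["image"]}]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"selector": labels, "ports": [{"port": 80}]},
    }
    return [deployment, service]


for resource in website_operator(website):
    print(resource["kind"], resource["metadata"]["name"])
```

The custom resource stays short and says what you want (a website with two replicas), while the operator carries the knowledge of which primitive resources that requires.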

Fun fact: Operators can be written in any language, and there are a few frameworks out there that set up some boilerplate code to help you write your own operator.

Anyone can create operators: you, your org, or some external vendor. For example, Red Hat OpenShift has its own set of operators (along with accompanying custom resources) that are part of its core product, which it runs on top of plain ol’ Kubernetes. And thanks to the wonderful open source community, many operators are made available for sharing. You can check some of these out on OperatorHub.io.

kubectl

You’ve heard me talk a lot about kubectl throughout this post, and now we finally get to see what all the fuss is about! kubectl is a command-line interface (CLI) for managing operations on Kubernetes clusters. That’s it! It communicates with our good friend, the API Server, to get information about our cluster, and to tell Kubernetes to do stuff for us, like create a new resource, or modify an existing one. As I’ve mentioned before, when you extend the Kubernetes API using the magical combination of operators and custom resource definitions (CRDs), you can use kubectl to access/update those resources too!

Since it’s a command-line tool, kubectl runs on your local machine. It works in conjunction with the kubeconfig file, which is a YAML file that’s by default installed in $HOME/.kube/config. Here’s what a sample kubeconfig file might look like:

Screen shot of a sample kubeconfig file

Before you panic trying to understand this garble, I want to tell you that the entries in the kubeconfig file are automagically generated when you connect to an existing Kubernetes cluster.

For example, if I wanted to connect to an Azure Kubernetes cluster, I would use Azure’s az CLI to run a command like this:

az aks get-credentials --resource-group my-resource-group --name my-aks-cluster

Which would populate my kubeconfig file for me. Magic!

Similarly, I could use Google Cloud’s gcloud CLI to connect to a Google Kubernetes cluster like this:

gcloud container clusters get-credentials my-gke-cluster --zone=us-central1-a

The point is, each cloud provider has its own cloud CLI and command set for connecting to a Kubernetes cluster in that cloud, and updating your kubeconfig file accordingly.

Once you have that cluster registered in your kubeconfig file, you can run various commands against your cluster. You can also use kubectl to learn about the clusters that you currently have registered, and to update individual cluster configs.

Note: You absolutely can register multiple Kubernetes clusters from different clouds in the same kubeconfig file. Alternatively, you can also have multiple kubeconfig files, if that’s how you like to roll.
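
To give you a feel for how a kubeconfig file hangs together, here’s a Python sketch of its overall shape as a dictionary, with two clusters registered. The top-level keys (clusters, users, contexts, current-context) mirror the real file, but every name and server URL below is a placeholder, credentials are left out entirely, and current_server is a hypothetical helper of my own.

```python
# A kubeconfig-shaped dictionary with two clusters registered. Every name
# and server URL is a placeholder, and credentials are left out.
kubeconfig = {
    "apiVersion": "v1",
    "kind": "Config",
    "current-context": "my-aks-cluster",
    "clusters": [
        {"name": "my-aks-cluster", "cluster": {"server": "https://aks.example.invalid"}},
        {"name": "my-gke-cluster", "cluster": {"server": "https://gke.example.invalid"}},
    ],
    "users": [
        {"name": "aks-user", "user": {}},
        {"name": "gke-user", "user": {}},
    ],
    "contexts": [
        {"name": "my-aks-cluster", "context": {"cluster": "my-aks-cluster", "user": "aks-user"}},
        {"name": "my-gke-cluster", "context": {"cluster": "my-gke-cluster", "user": "gke-user"}},
    ],
}


def current_server(config):
    """Follow current-context -> context -> cluster to find the API server URL."""
    context_name = config["current-context"]
    context = next(c["context"] for c in config["contexts"] if c["name"] == context_name)
    cluster = next(c["cluster"] for c in config["clusters"] if c["name"] == context["cluster"])
    return cluster["server"]


print(current_server(kubeconfig))

# Pointing current-context at another entry is roughly what switching
# clusters amounts to.
kubeconfig["current-context"] = "my-gke-cluster"
print(current_server(kubeconfig))
```

On a real machine you wouldn’t edit this by hand: kubectl config get-contexts lists the contexts in your kubeconfig file, and kubectl config use-context switches between them.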

If you’re interested in learning more about kubectl and kubeconfig, check out the Recommended Reading section below.

Conclusion

Whew! That was a lot to take in! This was by no means meant to be a technical deep-dive into Kubernetes. Rather, my goal was to introduce you to basic Kubernetes concepts, to give you an appreciation for what makes Kubernetes tick, and to hopefully inspire you to dig a little deeper into this buzzworthy tech that everyone seems to be talking about.

You should also now be able to rhyme off some Kubernetes fun facts to your friends and family on your next Zoom call. You’ll be able to tell them things like:

  • What Kubernetes is and why you need it
  • The difference between a master node and a worker node, and all the goodies that make each tick
  • What resources are
  • The difference between controllers and operators
  • What kubectl does

If anything in this post requires further clarification, please reach out in the comments, and let me know, so I can update this post accordingly. Many thanks. ❤️

And now, I shall reward you with a picture of some cute ducklings.

Photo by Olivia Colacicco on Unsplash

Peace, love, and code.

Recommended Reading
