Kubernetes

Training & Certification Path

KUBERNETES AND CLOUD NATIVE ESSENTIALS (LFS250)

This wiki page focuses on that course. The chapters of the course are listed below.

Cloud Native Architecture

Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.

The characteristics of cloud native architecture are:

  • High level of automation

    • (A reliable automated system also allows for much easier disaster recovery if you have to rebuild your whole system.)

  • Self-healing

  • Scalable

    • Just like scaling up your application for high traffic situations, scaling down your application and the usage-based pricing models of cloud providers can save costs if traffic is low

  • (Cost-) Efficient

  • Easy to Maintain

  • Secure by default

    • Patterns like zero trust computing mitigate risk by requiring authentication from every user and process.

A very good set of best practices when developing an application as a service: https://12factor.net

Auto-Scaling

Vertical scaling (add more CPU/RAM to an existing VM) and horizontal scaling (add more servers to take the load, behind a load balancer).

Serverless

The idea of serverless computing is to relieve developers of complicated tasks (preparing and configuring resources like a network, virtual machines, operating systems and load balancers just to run a simple web application). In a nutshell, you provide only the application code, while the cloud provider chooses the right environment to run it.

Function as a Service (FaaS) is an emerging subclass of serverless -> https://www.youtube.com/watch?v=EOIja7yFScs. It allows for very fast deployments and makes for excellent testing and sandbox environments.

Systems like Knative that are built on top of Kubernetes make it possible to extend existing platforms with serverless computing abilities -> https://knative.dev/docs/

Although there are many advantages to serverless technology, it initially struggled with standardization. Many cloud providers have proprietary offerings that make it difficult to switch between different platforms. To address these problems, the CloudEvents project was founded and provides a specification of how event data should be structured.

Open Standard

While Docker is often used synonymously with container technologies, the community has committed to the open industry standard of the Open Container Initiative (OCI).

Under the umbrella of the Linux Foundation, the Open Container Initiative provides two standards which define how to build and run containers. The image-spec defines how to build and package container images, while the runtime-spec specifies the configuration, execution environment and lifecycle of containers. A more recent addition to the OCI project is the distribution-spec, which provides a standard for the distribution of content in general and container images in particular.

Other systems like Prometheus or OpenTelemetry evolved and thrived in this ecosystem and provide additional standards for monitoring and observability.

Cloud Native Roles & Site Reliability Engineering

New cloud native roles:

  • Cloud Architect

    • Responsible for adoption of cloud technologies, designing application landscape and infrastructure, with a focus on security, scalability and deployment mechanisms.

  • DevOps Engineer

    • DevOps engineers use tools and processes that balance out software development and operations. Starting with approaches to writing, building, and testing software throughout the deployment lifecycle.

  • Security Engineer

  • DevSecOps Engineer

    • Bridges the two previous roles, combining DevOps Engineer and Security Engineer in one.

  • Data Engineer

    • Data engineers face the challenge of collecting, storing, and analyzing the vast amounts of data that are being or can be collected in large systems.

  • Full Stack Developer

  • Site Reliability Engineer (SRE)

    • The overarching goal of SRE is to create and maintain software that is reliable and scalable. To achieve this, software engineering approaches are used to solve operational problems and automate operation tasks.

    • To measure performance and reliability, SREs use three main metrics:

      • Service Level Objectives (SLO): “Specify a target level for the reliability of your service.” A goal that is set, for example reaching a service latency of less than 100ms.

      • Service Level Indicators (SLI): “A carefully defined quantitative measure of some aspect of the level of service that is provided” - For example how long a request actually needs to be answered.

      • Service Level Agreements (SLA): “An explicit or implicit contract with your users that includes consequences of meeting (or missing) the SLOs they contain. The consequences are most easily recognized when they are financial – a rebate or a penalty – but they can take other forms.” - Answers the question what happens if SLOs are not met.

Community and Governance

Nice visual of the community actors: https://landscape.cncf.io

Additional Resources

Cloud Native Architecture

Well-Architected Framework

Microservices

Serverless

Site Reliability Engineering

Container Orchestration

Containers can be used to solve both of these problems: managing the dependencies of an application, and running much more efficiently than spinning up a lot of virtual machines.

One of the earliest ancestors of modern container technologies is the chroot command that was introduced in Version 7 Unix in 1979. The chroot command could be used to isolate a process from the root filesystem, basically "hiding" the files from the process and simulating a new root directory. The isolated environment is a so-called chroot jail, where the files can't be accessed by the process but are still present on the system. A chroot directory can be created anywhere in the filesystem.
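
A minimal sketch of a chroot jail, assuming a Debian/Ubuntu host with debootstrap available to populate the new root filesystem:

// create a directory and install a minimal root filesystem into it
$ mkdir /tmp/newroot
$ sudo debootstrap stable /tmp/newroot
// start a shell whose root directory is /tmp/newroot - the rest of the host filesystem is hidden
$ sudo chroot /tmp/newroot /bin/bash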

Container technologies that we have today still embody this very concept, but in a modernized version and with a lot of features on top.

To isolate a process even more than chroot can do, current Linux kernels provide features like namespaces and cgroups:

  • Namespaces are used to isolate various resources

  • cgroups are used to organize processes in hierarchical groups and assign them resources like memory and CPU. When you want to limit your application container to let’s say 4GB of memory, cgroups are used under the hood to ensure these limits.
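
As a hedged illustration, Docker exposes these cgroup limits as simple flags (the values here are arbitrary):

// limit the container to 4GB of memory and 2 CPUs; Docker configures cgroups under the hood
$ docker run -d --memory=4g --cpus=2 nginx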

The Linux Kernel 5.6 currently provides 8 namespaces:

  • pid - process ID provides a process with its own set of process IDs.

  • net - network allows the processes to have their own network stack, including the IP address.

  • mnt - mount abstracts the filesystem view and manages mount points.

  • ipc - inter-process communication provides separation of named shared memory segments.

  • user - provides processes with their own set of user IDs and group IDs.

  • uts - Unix time sharing allows processes to have their own hostname and domain name.

  • cgroup - a newer namespace that allows a process to have its own set of cgroup root directories.

  • time - the newest namespace can be used to virtualize the clock of the system.
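
A small sketch to explore namespaces on a Linux machine, assuming the util-linux tools lsns and unshare are installed:

// list the namespaces the current shell belongs to
$ lsns --task $$
// start a shell in new UTS, PID and mount namespaces (requires root)
$ sudo unshare --uts --pid --mount --fork /bin/bash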

Launched in 2013, Docker became synonymous with building and running containers. Although Docker did not invent the technologies that are used to run containers, they stitched together existing technologies in a smart way to make containers more user friendly and accessible.

While virtual machines emulate a complete machine, including the operating system and a kernel, containers share the kernel of the host machine and, as explained, are only isolated processes.

Virtual machines come with some overhead, be it boot time, size or resource usage to run the operating system. Containers on the other hand are literally processes, like the browser you can start on your machine, therefore they start a lot faster and have a smaller footprint.

Reminder:

  • An operating system is a system program that provides an interface to the user so they can easily operate the computer.

  • The kernel is a system program that controls all programs running on the computer; it is basically the bridge between the software and the hardware of the system.

Running Containers

The OCI (Open Container Initiative) runtime-spec: https://github.com/opencontainers/runtime-spec. There is a reference implementation of it, runc (https://github.com/opencontainers/runc), which is used by the Docker product.

// Run a docker container
$ docker run nginx

// Check the version
$ docker version

// To look at available command
$ docker --help

// To run the nginx container detached from the command line
// to see the possible parameters of the run command:
$ docker run --help
$ docker run --detach --publish-all nginx
//or 
$ docker run -d -P nginx
// --publish-all = Publish all exposed ports to random ports
// --detach = Run container in background and print container ID

// To list running containers
$ docker ps

// To stop a container
$ docker stop container-id

The runtime-spec goes hand in hand with the image-spec, which we will cover in the next chapter, since it describes how to unpack a container image and then manage the complete container lifecycle, from creating the container environment, to starting the process, stopping and deleting it.

For your local machine, there are plenty of alternatives to choose from, some of which are only for building images like buildah or kaniko, while others line up as full alternatives to Docker, like podman does.

Documentation on Docker: https://docs.docker.com

To search for a public docker image: https://hub.docker.com/search?q=

Building Container Images

Container images are what makes containers portable and easy to reuse on a variety of systems. Docker describes a container image as follows:

“A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings.”

In 2015, the image format made popular by Docker was donated to the newly founded Open Container Initiative and is known also as the OCI image-spec that can be found on GitHub. Images consist of a filesystem bundle and metadata.

From Code -> layer + image index + config

Build an image

Images can be built by reading the instructions from a buildfile called Dockerfile.

Example of a Dockerfile below:

# Every container image starts with a base image. 
# This could be your favorite linux distribution 
FROM ubuntu:20.04 

# Run commands to add software and libraries to your image 
# Here we install python3 and the pip package manager 
RUN apt-get update && \ 
    apt-get -y install python3 python3-pip 

# The copy command can be used to copy your code to the image 
# Here we copy a script called "my-app.py" to the containers filesystem 
COPY my-app.py /app/ 

# Defines the workdir in which the application runs 
# From this point on everything will be executed in /app 
WORKDIR /app

# The process that should be started when the container runs 
# In this case we start our python app "my-app.py" 
CMD ["python3","my-app.py"]
// If Docker is installed on your PC, build the image from the current directory
// (note the trailing dot, which is the build context)
$ docker build -t my-python-image -f Dockerfile .
// -t my-python-image --> specify a name tag for your image
// with -f Dockerfile you specify where your Dockerfile can be found.

// To Push image to a docker Registry
$ docker push my-registry.com/my-python-image

// To download image from a docker Registry
$ docker pull my-registry.com/my-python-image

Other useful commands

// List docker images on your PC
$ docker images

// to delete a specific image
$ docker image rm imageID

Example Exercise: build a new Docker image

Go to https://github.com/docker/getting-started

// Clone the Repo
$ git clone https://github.com/docker/getting-started.git

// change into the getting-started/app directory
// create a Dockerfile and copy the following into it
# syntax=docker/dockerfile:1
FROM node:12-alpine
RUN apk add --no-cache python2 g++ make
WORKDIR /app
COPY . .
RUN yarn install --production
CMD ["node", "src/index.js"]
EXPOSE 3000

// Then build the docker image from the /app repo location
$ docker build -t getting-started .

// Then run the built image
$ docker run -dp 3000:3000 getting-started

Security

Sysdig has a great blog article on how to avoid a lot of security issues and build secure container images.

The 4C's of Cloud Native Security: https://kubernetes.io/docs/concepts/security/overview/

Container Orchestration Fundamentals

Problems to be solved can include:

  • Providing compute resources like virtual machines where containers can run on

  • Scheduling containers onto servers in an efficient way

  • Allocating resources like CPU and memory to containers

  • Managing the availability of containers and replacing them if they fail

  • Scaling containers if load increases

  • Providing networking to connect containers together

  • Provisioning storage if containers need to persist data.

Container orchestration systems provide a way to build a cluster of multiple servers and host the containers on top. Most container orchestration systems consist of two parts: a control plane that is responsible for the management of the containers and worker nodes that actually host the containers.

Networking

Most modern implementations of container networking are based on the Container Network Interface (CNI). CNI is a standard that can be used to write or configure network plugins and makes it very easy to swap out different plugins in various container orchestration platforms.

Service Discovery & DNS

The solution to the problem again is automation. Instead of having a manually maintained list of servers (or in this case containers), all the information is put in a Service Registry. Finding other services in the network and requesting information about them is called Service Discovery.

Approaches to Service Discovery:

  • DNS: Modern DNS servers that have a service API can be used to register new services as they are created.

  • Key-Value: Using a strongly consistent datastore especially to store information about services. A lot of systems are able to operate highly available with strong failover mechanisms. Popular choices, especially for clustering, are etcd, Consul or Apache Zookeeper.

For info:

  • etcd: https://github.com/etcd-io/etcd, etcd is a distributed reliable key-value store for the most critical data of a distributed system, with a focus on being:

    • Simple: well-defined, user-facing API (gRPC)

    • Secure: automatic TLS with optional client cert authentication

    • Fast: benchmarked 10,000 writes/sec

    • Reliable: properly distributed using Raft

    • etcd is written in Go and uses the Raft consensus algorithm to manage a highly-available replicated log.

  • Consul from Hashicorp: https://www.consul.io, A modern service networking solution requires that we answer four specific questions: Where are my services running? How do I secure the communication between them? How do I automate routine networking tasks? How do I control access to my environments?

  • Apache ZooKeeper: https://zookeeper.apache.org, ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications.
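
A hedged sketch of the key-value approach described above, using etcdctl (v3 API) with a made-up key layout for a service registry:

// register a service instance under a well-known prefix
$ etcdctl put /services/payments/10.0.0.12 '{"port": 8080}'
// discover all instances of the service by listing the prefix
$ etcdctl get --prefix /services/payments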

Service Mesh

Because the networking is such a crucial part of microservices and containers, the networking can get very complex and opaque for developers and administrators. In addition to that, a lot of functionality like monitoring, access control or encryption of the networking traffic is desired when containers communicate with each other.

Instead of implementing all of this functionality into your application, you can just start a second container that has this functionality implemented. The software you can use to manage network traffic is called a proxy. This is a server application that sits between a client and server and can modify or filter network traffic before it reaches the server. Popular representatives are nginx, haproxy or envoy.

Taking this idea a step further, a service mesh adds a proxy server to every container that you have in your architecture.

When a service mesh is used, applications don’t talk to each other directly; instead, the traffic is routed through the proxies. The most popular service meshes at the moment are Istio and Linkerd. While they have differences in implementation, the architecture is the same.

The proxies in a service mesh form the data plane. This is where networking rules are implemented and shape the traffic flow.

These rules are managed centrally in the control plane of the service mesh. This is where you define how traffic flows from service A to service B and what configuration should be applied to the proxies.

Istio service mesh: https://istio.io/v1.10/docs/ops/deployment/architecture/
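
As a hedged sketch (the service names are illustrative), such a control plane rule could be expressed in Istio as a VirtualService that splits traffic between two versions of a service:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-service
spec:
  hosts:
    - my-service
  http:
    - route:
        # 90% of requests go to the current version, 10% to the new one
        - destination:
            host: my-service-v1
          weight: 90
        - destination:
            host: my-service-v2
          weight: 10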

Istio uses an extended version of the Envoy proxy. Envoy is a high-performance proxy developed in C++ to mediate all inbound and outbound traffic for all services in the service mesh. Envoy proxies are the only Istio components that interact with data plane traffic. Envoy proxies are deployed as sidecars to services.

The Service Mesh Interface (SMI) project aims at defining a specification on how a service mesh from various providers can be implemented. With a strong focus on Kubernetes, their goal is to standardize the end user experience for service meshes, as well as a standard for the providers that want to integrate with Kubernetes. You can find the current specification on GitHub.

Envoy: https://www.envoyproxy.io

SMI: Service Mesh Interface specification: https://smi-spec.io, https://github.com/servicemeshinterface/smi-spec

Storage

In order to keep up with the unbroken growth of various storage implementations, again, the solution was to implement a standard. The Container Storage Interface (CSI) came up to offer a uniform interface which allows attaching different storage systems no matter if it’s cloud or on-premises storage.

Additional Resources

The History of Containers

Chroot

Container Performance

Best Practices on How to Build Container Images

Alternatives to Classic Dockerfile Container Building

Service Discovery

Container Networking

Container Storage

Container and Kubernetes Security

Docker Container Playground

Kubernetes Fundamentals

Originally designed and developed by Google, Kubernetes was open-sourced in 2014, and along with the v1.0 release it was donated to the newly formed Cloud Native Computing Foundation as the very first project. A lot of cloud native technologies evolve around Kubernetes, be it low-level tools like container runtimes, monitoring or application delivery tools.

Kubernetes Architecture

Kubernetes is typically run as a cluster, meaning it spans multiple servers that work on different tasks and distribute the load of the system.

From a high-level perspective, Kubernetes clusters consist of two different server node types that make up a cluster:

  • Control plane node(s) These are the brains of the operation. Control plane nodes contain various components which manage the cluster and control various tasks like deployment, scheduling and self-healing of containerized workloads.

  • Worker nodes The worker nodes are where applications run in your cluster. This is the only job of worker nodes and they don’t have any further logic implemented. Their behavior, like if they should start a container, is completely controlled by the control plane node.

Similar to a microservice architecture you would choose for your own application, Kubernetes incorporates multiple smaller services that need to be installed on the nodes.

Control plane nodes typically host the following services:

  • kube-apiserver

    • This is the centerpiece of Kubernetes. All other components interact with the api-server and this is where users would access the cluster.

  • etcd

    • A database which holds the state of the cluster. etcd is a standalone project and not an official part of Kubernetes.

  • kube-scheduler

    • When a new workload should be scheduled, the kube-scheduler chooses a worker node that could fit, based on different properties like CPU and memory.

  • kube-controller-manager

    • Contains different non-terminating control loops that manage the state of the cluster. For example, one of these control loops can make sure that a desired number of your application is available all the time.

  • cloud-controller-manager (optional)

    • Can be used to interact with the API of cloud providers, to create external resources like load balancers, storage or security groups.

Components of worker nodes:

  • container runtime

    • The container runtime is responsible for running the containers on the worker node. For a long time, Docker was the most popular choice, but is now replaced in favor of other runtimes like containerd.

  • kubelet

    • A small agent that runs on every worker node in the cluster. The kubelet talks to the api-server and the container runtime to handle the final stage of starting containers.

  • kube-proxy

    • A network proxy that handles inside and outside communication of your cluster. Instead of managing traffic flow on its own, the kube-proxy tries to rely on the networking capabilities of the underlying operating system if possible.

Kubernetes also has a concept of namespaces, which are not to be confused with kernel namespaces that are used to isolate containers. A Kubernetes namespace can be used to divide a cluster into multiple virtual clusters, which can be used for multi-tenancy when multiple teams share a cluster. Please note that Kubernetes namespaces are not suitable for strong isolation and should more be viewed like a directory on a computer where you can organize objects and manage which user has access to which folder.
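
A small sketch of working with Kubernetes namespaces (the namespace name is arbitrary):

// create a namespace and run a pod inside it
$ kubectl create namespace team-a
$ kubectl run nginx --image=nginx --namespace=team-a
// list pods of that namespace only
$ kubectl get pods -n team-a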

containerd is an industry standard container runtime: https://containerd.io

Setup Kubernetes

Setting up a Kubernetes cluster can be achieved with a lot of different methods. Creating a test "cluster" can be very easy with the right tools:

If you want to set up a production-grade cluster on your own hardware or virtual machines, you can choose one of the various installers:

A few vendors started packaging Kubernetes into a distribution and even offer commercial support:

The distributions often choose an opinionated approach and offer additional tools while using Kubernetes as the central piece of their framework.

If you don’t want to install and manage it yourself, you can consume it from a cloud provider:

You can learn how to set up your own Kubernetes cluster with Minikube in this interactive tutorial.

// To see the POD running on the Kubernetes Cluster
$ kubectl get po -A
// Create a sample deployment and expose it on port 8080:
$ kubectl create deployment hello-minikube --image=k8s.gcr.io/echoserver:1.4
$ kubectl expose deployment hello-minikube --type=NodePort --port=8080
// It may take a moment, but your deployment will soon show up when you run:
$ kubectl get services hello-minikube
// Alternatively, use kubectl to forward the port:
$ kubectl port-forward service/hello-minikube 7080:8080

Different useful kubectl commands

// see version
$ kubectl version
// view the cluster details
$ kubectl cluster-info
// shows all nodes that can be used to host our applications
$ kubectl get nodes

To install a full production cluster on several nodes (i.e. build an AKS-like cluster yourself) -> https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

You also need to install an overlay network for the cluster (Calico, for example): https://projectcalico.docs.tigera.io/about/about-calico
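
A hedged sketch of a kubeadm-based setup (the pod network CIDR and the Calico manifest URL may differ for your versions):

// on the control plane node: bootstrap the cluster
$ sudo kubeadm init --pod-network-cidr=192.168.0.0/16
// install the Calico overlay network
$ kubectl apply -f https://projectcalico.docs.tigera.io/manifests/calico.yaml
// on each worker node: join the cluster using the token printed by "kubeadm init"
$ sudo kubeadm join <control-plane-ip>:6443 --token <token> --discovery-token-ca-cert-hash <hash>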

Kubernetes API

The Kubernetes API is the most important component of a Kubernetes cluster. Without it, communication with the cluster is not possible, every user and every component of the cluster itself needs the api-server.

Access Control Overview, retrieved from the Kubernetes documentation

Before a request is processed by Kubernetes, it has to go through three stages:

  • Authentication The requester needs to present a means of identity to authenticate against the API. Commonly done with a digitally signed certificate (X.509) or with an external identity management system. Kubernetes users are always externally managed. Service Accounts can be used to authenticate technical users.

  • Authorization It is decided what the requester is allowed to do. In Kubernetes this can be done with Role Based Access Control (RBAC).

  • Admission Control In the last step, admission controllers can be used to modify or validate the request. For example, if a user tries to use a container image from an untrustworthy registry, an admission controller could block this request. Tools like the Open Policy Agent can be used to manage admission control externally.
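
To illustrate the Authorization step above, here is a hedged sketch of an RBAC Role that allows reading Pods, bound to a hypothetical user "jane":

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: default
  name: read-pods
subjects:
  - kind: User
    name: jane
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io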

Like many other APIs, the Kubernetes API is implemented as a RESTful interface that is exposed over HTTPS. Through the API, a user or service can create, modify, delete or retrieve resources that reside in Kubernetes.

// To get the full dump config of a service account
$ kubectl get serviceaccounts/default -o yaml 

Running containers in K8S

In Kubernetes, instead of starting containers directly, you define Pods as the smallest compute unit and Kubernetes translates that into a running container. We will learn more about Pods later, for now imagine it as a wrapper around a container.

In an effort to allow using other container runtimes than Docker, Kubernetes introduced the Container Runtime Interface (CRI) in 2016.

Container runtime:

  • containerd

  • CRI-O

  • Docker: The standard for a long time, but never really made for container orchestration. The usage of Docker as the runtime for Kubernetes has been deprecated and will be removed in Kubernetes 1.23. Kubernetes has a great blog article that answers all the questions on the matter.

Networking

Kubernetes distinguishes between four different networking problems that need to be solved:

  1. Container-to-Container communications This can be solved by the Pod concept as we'll learn later.

  2. Pod-to-Pod communications This can be solved with an overlay network.

  3. Pod-to-Service communications It is implemented by the kube-proxy and packet filter on the node.

  4. External-to-Service communications It is implemented by the kube-proxy and packet filter on the node.

There are different ways to implement networking in Kubernetes, but also three important requirements:

  • All pods can communicate with each other across nodes.

  • All nodes can communicate with all pods.

  • No Network Address Translation (NAT).

To implement networking, you can choose from a variety of network vendors like:

In Kubernetes, every Pod gets its own IP address, so there is no manual configuration involved. Moreover, most Kubernetes setups include a DNS server add-on called core-dns, which can provide service discovery and name resolution inside the cluster.

Scheduling

In its most basic form, scheduling is a sub-category of container orchestration and describes the process of automatically choosing the right (worker) node to run a containerized workload on. In a Kubernetes cluster, the kube-scheduler is the component that makes the scheduling decision, but is not responsible for actually starting the workload. The scheduling process in Kubernetes always starts when a new Pod object is created. Remember that Kubernetes is using a declarative approach, where the Pod is only described first, then the scheduler selects a node where the Pod actually will get started by the kubelet and the container runtime.

The scheduler will use that information to filter all nodes that fit these requirements. If multiple nodes fit the requirements equally, Kubernetes will schedule the Pod on the node with the least amount of Pods. This is also the default behavior if a user has not specified any further requirements.
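
A hedged sketch of the information the scheduler works with: a Pod that requests CPU and memory, so only nodes with enough free capacity are considered (the values are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: my-app
      image: nginx:1.20
      resources:
        # the scheduler filters nodes based on these requests
        requests:
          cpu: "500m"
          memory: "256Mi"
        limits:
          cpu: "1"
          memory: "512Mi"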

Additional Resources

Kubernetes history and the Borg Heritage

Kubernetes Architecture

RBAC

Container Runtime Interface

Kubernetes networking and CNI

Internals of Kubernetes Scheduling

Kubernetes Security Tools

Kubernetes Playground

Working with K8S

One of the core concepts of Kubernetes is providing a lot of mostly abstract resources, also called objects, that you can use to describe how your workload should be handled. Some of them are used to handle problems of container orchestration, like scheduling and self-healing, others are there to solve some inherent problems of containers.

Kubernetes objects can be distinguished between workload-oriented objects that are used for handling container workloads and infrastructure-oriented objects, that for example handle configuration, networking and security. Some of these objects can be put into a namespace, while others are available across the whole cluster.

As a user, we can describe these objects in the popular data-serialization language YAML and send them to the api-server, where they get validated before they are created.

// Some useful commands
// List available K8S objects in the cluster
$ kubectl api-resources
// kubectl has a built-in explanation function!
$ kubectl explain pod
// To learn more about the pod spec
$ kubectl explain pod.spec
// for basic command
$ kubectl --help
// To create an object in Kubernetes from a YAML file
$ kubectl create -f <your-file>.yaml
// kubectl comes with a config file
$ kubectl config view

Other tools for interaction with Kubernetes:

There are also advanced tools that allow the creation of templates and the packaging of Kubernetes objects. Probably the most frequently used tool in connection with Kubernetes today is Helm.

Helm is a package manager for Kubernetes, which allows easier updates and interaction with objects. Helm packages Kubernetes objects in so-called Charts, which can be shared with others via a registry. To get started with Kubernetes, you can search the ArtifactHub to find your favorite software packages, ready to deploy.

// Example Pod Creation
// Edit the empty file pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx:1.20
      ports:
        - containerPort: 80
        
// Create the pod on K8S cluster
$ kubectl create -f pod.yaml
//check that the pod is running on the cluster
$ kubectl get pod
// To delete the pod
$ kubectl delete pod nginx
// to have information on the syntax of a K8S object pod
$ kubectl explain pod
// to go deeper on a specific field of the pod object
$ kubectl explain pod.spec

POD

A pod describes a unit of one or more containers that share an isolation layer of namespaces and cgroups. It is the smallest deployable unit in Kubernetes, which also means that Kubernetes is not interacting with containers directly. The pod concept was introduced to allow running a combination of multiple processes that are interdependent. All containers inside a pod share an IP address and can share data via the filesystem.

You could add as many containers to your main application as you want. But be careful since you lose the ability to scale them individually! Using a second container that supports your main application is called a sidecar container.

All containers defined are started at the same time with no ordering, but you also have the ability to use initContainers to start containers before your main application starts.

Some examples of important settings that can be set for every container in a Pod are:

  • resources: Set a resource request and a maximum limit for CPU and Memory.

  • livenessProbe: Configure a health check that periodically checks if your application is still alive. Containers can be restarted if the check fails.

  • securityContext: Set user & group settings, as well as kernel capabilities.
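
A hedged sketch showing these settings in a container spec (the paths, ports and IDs are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: probed-app
spec:
  containers:
    - name: web
      image: nginx:1.20
      resources:
        requests:
          memory: "128Mi"
        limits:
          memory: "256Mi"
      livenessProbe:
        # restart the container if this HTTP check keeps failing
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 10
      securityContext:
        runAsUser: 1000
        allowPrivilegeEscalation: false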

For more detailed info on POD: https://kubernetes.io/docs/concepts/workloads/pods/

// Installing a POD
$ kubectl run nginx --image=nginx:1.19
// to have more info on the pod
$ kubectl describe pod nginx

Workload objects

Working just with Pods would not be flexible enough in a container orchestration platform. For example, if a Pod is lost because a node failed, it is gone forever. To make sure that a defined number of Pod copies runs all the time, we can use controller objects that manage the pod for us.

Kubernetes objects:

  • ReplicaSet: A controller object that ensures a desired number of pods is running at any given time. ReplicaSets can be used to scale out applications and improve their availability. They do this by starting multiple copies of a pod definition.

  • Deployment: The most feature-rich object in Kubernetes. A Deployment can be used to describe the complete application lifecycle, they do this by managing multiple ReplicaSets that get updated when the application is changed by providing a new container image, for example. Deployments are perfect to run stateless applications in Kubernetes.

  • StatefulSet: Considered a bad practice for a long time, StatefulSets can be used to run stateful applications like databases on Kubernetes. Stateful applications have special requirements that don't fit the ephemeral nature of pods and containers. In contrast to Deployments, StatefulSets try to retain IP addresses of pods and give them a stable name, persistent storage and more graceful handling of scaling and updates.

  • DaemonSet: Ensures that a copy of a Pod runs on all (or some) nodes of your cluster. DaemonSets are perfect to run infrastructure-related workload, for example monitoring or logging tools.

  • Job: Creates one or more Pods that execute a task and terminate afterwards. Job objects are perfect to run one-shot scripts like database migrations or administrative tasks.

  • CronJob: CronJobs add a time-based configuration to jobs. This allows running Jobs periodically, for example doing a backup job every night at 4am.
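
A hedged sketch of a Deployment manifest, as described above, that keeps three replicas of an nginx Pod running (the names and image are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.20
          ports:
            - containerPort: 80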

// Create a deployement
$ kubectl create deployment kubernetes-bootcamp --image=gcr.io/google-samples/kubernetes-bootcamp:v1
// see if deployment done
$ kubectl get deployment
// or the pod command
$ kubectl get pod
// The kubectl command can create a proxy that will forward communications into the cluster-wide, private network.
$ echo -e "\n\n\n\e[92mStarting Proxy. After starting it will not output a response. Please click the first Terminal Tab\n"; kubectl proxy
// Now we have a direct connection to kubectl
$ curl http://localhost:8001/version
// to get the list of K8S objects:
$ kubectl api-resources
// to have more info about an object
$ kubectl explain "object"

// To get details info on container running in pods
$ kubectl describe pods

// Anything that the application would normally send to STDOUT becomes logs for the container within the Pod. We can retrieve these logs using the kubectl logs command:
$ kubectl logs "podname

// We can execute commands directly on the container once the Pod is up and running. For this, we use the exec command and use the name of the Pod as a parameter. Let’s list the environment variables:
$ kubectl exec $POD_NAME -- env

// Next let’s start a bash session in the Pod’s container:
$ kubectl exec -ti $POD_NAME -- bash

// To see on which Worker node are running the pods
$ kubectl get pods -o wide

// To dynamically scale an existing ReplicaSet of pods
$ kubectl scale --replicas=3 rs/replicaset-name

// There are different ways to deploy Pods:
// Can be via
$ kubectl create -f file.yaml
// Can be via deployment, which is more advanced
$ kubectl create -f deployment.yaml

// Possible to set the image of the deployment
$ kubectl set image deployment/nginx nginx=nginx:1.20
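
// Hedged follow-up: watch the rolling update triggered by "set image" and undo it if needed
$ kubectl rollout status deployment/nginx
$ kubectl rollout undo deployment/nginx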

! By default, Pods are visible from other pods and services within the same Kubernetes cluster, but not from outside that network. The kubectl command can create a proxy that will forward communications into the cluster-wide, private network.

Networking objects

Since a lot of Pods would require a lot of manually network configuration, we can use Service and Ingress objects to define and abstract networking.

Services can be used to expose a set of pods as a network service. Type of Services:

  • ClusterIP: The most common service type. A ClusterIP is a virtual IP inside Kubernetes that can be used as a single endpoint for a set of pods. This service type can be used as a round-robin load balancer.

  • NodePort: The NodePort service type extends the ClusterIP by adding simple routing rules. It opens a port (default between 30000-32767) on every node in the cluster and maps it to the ClusterIP. This service type allows routing external traffic to the cluster.

  • LoadBalancer: The LoadBalancer service type extends the NodePort by deploying an external LoadBalancer instance. This will only work if you’re in an environment that has an API to configure a LoadBalancer instance, like GCP, AWS, Azure or even OpenStack.

  • ExternalName: A special service type that has no routing whatsoever. ExternalName uses the Kubernetes internal DNS server to create a DNS alias. You can use this to create a simple alias to resolve a rather complicated hostname like: my-cool-database-az1-uid123.cloud-provider-i-like.com. This is especially useful if you want to reach external resources from your Kubernetes cluster.
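
Of the types above, here is a hedged sketch of a NodePort Service selecting Pods by label (the labels and ports are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
      # optional; if omitted, Kubernetes picks a port in the 30000-32767 range
      nodePort: 30080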

Ingress provides a means to expose HTTP and HTTPS routes from outside of the cluster for a service within the cluster. It does this by configuring routing rules that a user can set and implement with an ingress controller. Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.

An Ingress does not expose arbitrary ports or protocols. Exposing services other than HTTP and HTTPS to the internet typically uses a service of type Service.Type=NodePort or Service.Type=LoadBalancer.

Ingress documentation: https://kubernetes.io/docs/concepts/services-networking/ingress/
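
A hedged sketch of an Ingress resource (the hostname, path and backend service are illustrative; an ingress controller must be installed for it to take effect):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  rules:
    - host: example.com
      http:
        paths:
          - path: /app
            pathType: Prefix
            backend:
              service:
                name: nginx-service
                port:
                  number: 80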

Standard features of ingress controllers may include:

  • LoadBalancing

  • TLS offloading/termination

  • Name-based virtual hosting

  • Path-based routing

A lot of ingress controllers even provide more features, like:

  • Redirects

  • Custom errors

  • Authentication

  • Session affinity

  • Monitoring

  • Logging

  • Weighted routing

  • Rate limiting.

Kubernetes also provides a cluster internal firewall with the NetworkPolicy concept. NetworkPolicies are a simple IP firewall (OSI Layer 3 or 4) that can control traffic based on rules. You can define rules for incoming (ingress) and outgoing traffic (egress). A typical use case for NetworkPolicies would be restricting the traffic between two different namespaces.
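
A hedged sketch of a NetworkPolicy that only allows ingress traffic to backend Pods from Pods labeled role=frontend (the labels are illustrative):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: frontend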

The set of Pods targeted by a Service is usually determined by a LabelSelector

Services match a set of Pods using labels and selectors, a grouping primitive that allows logical operation on objects in Kubernetes. Labels are key/value pairs attached to objects and can be used in any number of ways:

  • Designate objects for development, test, and production

  • Embed version tags

  • Classify an object using tags

// Some Services commands
// List services running on the cluster
$ kubectl get services
// Create a new service and expose it to external
$ kubectl expose deployment/kubernetes-bootcamp --type="NodePort" --port 8080
// To know what port is expose externally
$ kubectl describe services/kubernetes-bootcamp
// Create an env variable that has the value of the node port exposed externally
$ export NODE_PORT=$(kubectl get services/kubernetes-bootcamp -o go-template='{{(index .spec.ports 0).nodePort}}')
echo NODE_PORT=$NODE_PORT
// Test the app is exposed outside the cluster
$ curl $(minikube ip):$NODE_PORT

// To find the label of a deployment
$ kubectl describe deployment
// To query a pod by his label
$ kubectl get pods -l app=kubernetes-bootcamp
// Idem to list service by label
$ kubectl get services -l app=kubernetes-bootcamp

// To attach a new label to a pod:
$ kubectl label pods $POD_NAME version=v1

// To delete a service:
$ kubectl delete service -l app=kubernetes-bootcamp
$ kubectl get services

// To run a command inside the pod
$ kubectl exec -ti $POD_NAME -- curl localhost:8080
// Expose a Deployment via a Service
$ kubectl expose deployment test --port=80
 

Volume and storage object

Containers already had the concept of mounting volumes, but since we’re not working with containers directly, Kubernetes made volumes part of a Pod, just like containers are.

Ceph storage solution (via Rook): https://rook.io/docs/rook/v1.7/ceph-storage.html
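
A minimal hedged sketch of a volume defined at the Pod level and mounted into a container (an emptyDir volume that lives as long as the Pod):

apiVersion: v1
kind: Pod
metadata:
  name: volume-example
spec:
  containers:
    - name: app
      image: nginx:1.20
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir: {}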

Configuration object

In Kubernetes, this problem is solved by decoupling the configuration from the Pods with a ConfigMap. ConfigMaps can be used to store whole configuration files or variables as key-value pairs. There are two possible ways to use a ConfigMap:

  • Mount a ConfigMap as a volume in Pod

  • Map variables from a ConfigMap to environment variables of a Pod.
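
A hedged sketch of the second approach: a ConfigMap whose key is mapped to an environment variable of a Pod (the names and values are illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_COLOR: "blue"
---
apiVersion: v1
kind: Pod
metadata:
  name: configured-app
spec:
  containers:
    - name: app
      image: nginx:1.20
      env:
        - name: APP_COLOR
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: APP_COLOR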

Right from the beginning Kubernetes also provided an object to store sensitive information like passwords, keys or other credentials. These objects are called Secrets. Secrets are very much related to ConfigMaps and basically their only difference is that secrets are base64 encoded.

There is an ongoing debate about the risk of using Secrets, since they are, in contrast to their name, not considered secure. In cloud native environments, purpose-built secret management tools have emerged that integrate very well with Kubernetes. One example would be HashiCorp Vault.
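
A small hedged example of creating a Secret and inspecting it, which shows that the value is only base64 encoded:

// create a secret from a literal value
$ kubectl create secret generic db-credentials --from-literal=password=S3cr3t
// the stored value is base64 encoded, not encrypted
$ kubectl get secret db-credentials -o yaml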

AutoScaling Object

Auto scaling Mechanisms:

  • Horizontal Pod Autoscaler (HPA)

    • Horizontal Pod Autoscaler (HPA) is the most used autoscaler in Kubernetes. The HPA can watch Deployments or ReplicaSets and increase the number of replicas if a certain threshold is reached. Imagine your Pod can use 500MiB of memory and you configured a threshold of 80%: once actual usage exceeds 400MiB, an additional replica gets scheduled.

  • Cluster Autoscaler

    • Of course, there is no point in starting more and more Replicas of Pods, if the Cluster capacity is fixed. The Cluster Autoscaler can add new worker nodes to the cluster if the demand increases. The Cluster Autoscaler works great in tandem with the Horizontal Autoscaler.

  • Vertical Pod Autoscaler

    • The Vertical Pod Autoscaler is relatively new and allows Pods to increase their resource requests and limits dynamically. As we discussed earlier, vertical scaling is limited by the node capacity.

Unfortunately, (horizontal) autoscaling in Kubernetes is NOT available out of the box and requires installing an add-on called metrics-server.
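
A hedged example of creating an HPA for the bootcamp Deployment used in the labs below (requires the metrics-server add-on):

// scale between 2 and 10 replicas, targeting 80% average CPU utilization
$ kubectl autoscale deployment kubernetes-bootcamp --cpu-percent=80 --min=2 --max=10
// check the current state of the autoscaler
$ kubectl get hpa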

Other third party integration to manage scaling in Kubernetes with Pods metrics:

// Some Lab
$: kubectl get deployments
// See the number of ReplicaSets created by the deployment
$: kubectl get rs
//let’s scale the Deployment to 4 replicas. 
$: kubectl scale deployments/kubernetes-bootcamp --replicas=4
// check the number of deployed replicas
$: kubectl get deployments
$: kubectl get pods -o wide
// change was registered in the Deployment events log
$: kubectl describe deployments/kubernetes-bootcamp
// Check service is loadbalancing Traffic
$: kubectl describe services/kubernetes-bootcamp
// Create an env variable
$: export NODE_PORT=$(kubectl get services/kubernetes-bootcamp -o go-template='{{(index .spec.ports 0).nodePort}}')
echo NODE_PORT=$NODE_PORT
// We hit a different pod at each query
$: curl $(minikube ip):$NODE_PORT
// Scale down the number of replicas:
$: kubectl scale deployments/kubernetes-bootcamp --replicas=2

Additional Resources

Differences between Containers and Pods

kubectl tips & tricks

Storage and CSI in Kubernetes

Autoscaling in Kubernetes

Cloud Native Application Delivery

Application Delivery Fundamentals

In 2005 Linus Torvalds created Git, which is the standard version control system that almost everybody uses today. Git is a decentralized system that can be used to track changes in your source code. In essence, Git works with copies of the code, so-called branches or forks, where you can work before your changes get merged back into a main branch.

Git Doc: https://git-scm.com

If your target platform is Kubernetes, you can write a YAML file to deploy your application while your newly built container image can be pushed to a container registry, where Kubernetes will download it for you.

To make full use of cloud resources, the principle of Infrastructure as Code (IaC) became popular. Instead of installing infrastructure manually, you describe it in files and use the cloud vendors' API to set up your infrastructure. This allows developers to be more involved in the setup of the infrastructure.

CI-CD

Automation is the key to overcoming these barriers, and today we know and use the principles of Continuous Integration/Continuous Delivery (CI/CD), which describe the different steps in the deployment of an application, configuration or even infrastructure.

Continuous Integration is the first part of the process and describes the permanent building and testing of the written code. High automation and usage of version control allows multiple developers and teams to work on the same code base.

Continuous Delivery is the second part of the process and automates the deployment of the pre-built software. In cloud environments, you will often see that software is deployed to Development or Staging environments, before it gets released and delivered to a production system.

To automate the whole workflow, you can use a CI/CD pipeline, which is actually nothing more than the scripted form of all the steps involved, running on a server or even in a container. Pipelines should be integrated with a version control system that manages changes to the code base. Every time a new revision of your code is ready to be deployed, the pipeline starts to execute scripts that build your code, run tests, deploy them to servers and even perform security and compliance checks.
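
A hedged sketch of such a pipeline, written here in GitLab CI syntax as one example (the registry, image name and commands are illustrative):

stages:
  - build
  - test
  - deploy

build-image:
  stage: build
  script:
    # build and push an image tagged with the commit hash
    - docker build -t my-registry.com/my-app:$CI_COMMIT_SHA .
    - docker push my-registry.com/my-app:$CI_COMMIT_SHA

run-tests:
  stage: test
  script:
    - docker run my-registry.com/my-app:$CI_COMMIT_SHA ./run-tests.sh

deploy-to-staging:
  stage: deploy
  script:
    # roll the new image out to the staging cluster
    - kubectl set image deployment/my-app my-app=my-registry.com/my-app:$CI_COMMIT_SHA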

Popular CI/CD tools include:

Very nice free training, Introduction to DevOps and Site Reliability Engineering: https://training.linuxfoundation.org/training/introduction-to-devops-and-site-reliability-engineering-lfs162/

GitOps

Infrastructure as Code was a real revolution in increasing the quality and speed of providing infrastructure, and it works so well that today, configuration, network, policies, or security can be described as code, and often even live in the same repository as the software.

GitOps takes the idea of Git as the single source of truth a step further and integrates the provisioning and change process of infrastructure with version control operations.

There are two different approaches how a CI/CD pipeline can implement the changes you want to make:

  • Push-based The pipeline is started and runs tools that make the changes in the platform. Changes can be triggered by a commit or merge request.

  • Pull-based An agent watches the git repository for changes and compares the definition in the repository with the actual running state. If changes are detected, the agent applies the changes to the infrastructure.

Two examples of popular GitOps frameworks that use the pull-based approach are Flux and ArgoCD. ArgoCD is implemented as a Kubernetes controller, while Flux is built with the GitOps Toolkit, a set of APIs and controllers that can be used to extend Flux, or even build a custom delivery platform.
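
A hedged sketch of the pull-based approach with ArgoCD: an Application object that tells the controller which Git repository and path to keep in sync with a cluster namespace (the repository URL and names are illustrative):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/my-app-config.git
    targetRevision: main
    path: k8s
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    # apply changes automatically when the repository changes
    automated: {}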

ArgoCD Architecture, retrieved from the ArgoCD documentation

Very nice training on the subject: To learn more about GitOps in action and the usage of ArgoCD and Flux, consider enrolling for the free course Introduction To GitOps (LFS169).

Additional Resources

10 Deploys Per Day - Start of the DevOps movement at Flickr

Learn git in a playful way

Infrastructure as Code

Beginners guide to CI/CD

Cloud Native Observability

Conventional monitoring for servers may include collecting basic metrics of the system like CPU and memory resource usage and logging of processes and the operating system. A new challenge for a microservice architecture is monitoring requests that move through a distributed system. That discipline is called tracing and is especially useful when a lot of services are involved in answering a request.

We will learn how container infrastructure is still relying on collecting metrics and logs, but changes the requirements quite a bit. There is a lot more focus on network problems like latency, throughput, retrying of requests or application start time, while the sheer volume of metrics, logs, and traces in distributed systems calls for a different approach to managing these systems.

Observability

The higher goal of observability is to allow analysis of the collected data. This helps to get a better understanding of the system and react to error states. The term observability is closely related to control theory, which deals with the behavior of dynamic systems.

Telemetry

In container systems, each and every application should have tools built in that generate information data, which is then collected and transferred in a centralized system. The data can be divided into three categories:

  • Logs

    • These are messages that are emitted from an application when errors, warnings or debug information should be presented. A simple log entry could be the start and end of a specific task that the application performed.

  • Metrics

    • Metrics are quantitative measurements taken over time. This could be the number of requests or an error rate.

  • Traces

    • They track the progression of a request while it’s passing through the system. Traces are used in a distributed system that can provide information about when a request was processed by a service and how long it took.

Logging

Application frameworks and programming languages come with extensive logging tools built-in, which makes it very easy to log to a file with different log levels based on the severity of the log message.

Unix and Linux programs provide three standard I/O streams, of which two can be used to output logs from a container:

  • standard input (stdin): Input to a program e.g. via keyboard

  • standard output (stdout): The output a program writes on the screen

  • standard error (stderr): Errors that a program writes on the screen

If you want to learn more about I/O streams and how they originated, make sure to visit the stdin(3) - Linux manual page.

The documentation of the kubectl logs command provides some examples.

To ship the logs, different methods can be used:

  • Node-level logging The most efficient way to collect logs. An administrator configures a log shipping tool that collects logs and ships them to a central store.

  • Logging via sidecar container The application has a sidecar container that collects the logs and ships them to a central store.

  • Application-level logging The application pushes the logs directly to the central store. While this seems very convenient at first, it requires configuring the logging adapter in every application that runs in a cluster.

There are several tools to choose from to ship and store the logs. The first two methods can be done by tools like fluentd or filebeat.

Popular choices to store logs are OpenSearch or Grafana Loki. To find more datastores, you can visit the fluentd documentation on possible log targets.

To make logs easy to process and searchable make sure you log in a structured format like JSON instead of plaintext. The major cloud vendors provide good documentation on the importance of structured logging and how to implement it:

For more information on Elasticsearch -> https://www.elastic.co/what-is/elasticsearch

Prometheus

Prometheus is an open source monitoring system, originally developed at SoundCloud, which became the second CNCF hosted project in 2016. Over time, it became a very popular monitoring solution and is now a standard tool that integrates especially well in the Kubernetes and container ecosystem.

Prometheus can collect metrics that were emitted by applications and servers as time series data - these are very simple sets of data that include a timestamp, label and the measurement itself. The Prometheus data model provides four core metric types:

  • Counter: A value that increases, like a request or error count

  • Gauge: Values that increase or decrease, like memory size

  • Histogram: A sample of observations, like request duration or response size

  • Summary: Similar to a histogram, but also provides the total count of observations.
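
A hedged sketch of how such metrics look in the Prometheus text exposition format, as scraped from an application's /metrics endpoint (the names and values are illustrative):

# HELP http_requests_total Total number of HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="get", code="200"} 1027
# HELP process_resident_memory_bytes Resident memory size in bytes
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 45000000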

Monitoring only makes sense if you use the data collected. The most used companion for Prometheus is Grafana, which can be used to build dashboards from the collected metrics. You can use Grafana for many more data sources and not only Prometheus, although that is the most used one.

You can also use one of the many unofficial client libraries listed in the Prometheus documentation.

Here are some examples taken from the Prometheus documentation.

Another tool from the Prometheus ecosystem is the Alertmanager. The Prometheus server itself allows you to configure alerts when certain metrics reach or pass a threshold. When the alert is firing, Alertmanager can send a notification out to your favorite persistent chat tool, e-mail or specialized tools that are made for alerting and on-call management.

Tracing

Logging and Monitoring with the collection of metrics are not particularly new methods. The same thing cannot be said for (distributed) tracing. Metrics and logs are essential and can give a good overview of individual services, but to understand how a request is processed in a microservice architecture, traces can be of good use.

A trace describes the tracking of a request while it passes through the services. A trace consists of multiple units of work which represent the different events that occur while the request is passing the system. Each application can contribute a span to the trace, which can include information like start and finish time, name, tags or a log message.

These traces can be stored and analyzed in a tracing system like Jaeger.

While tracing was a new technology and method that was geared towards cloud native environments, there were again problems in the area of standardization. In 2019, the OpenTracing and OpenCensus projects merged to form the OpenTelemetry project, which is now also a CNCF project.

OpenTelemetry is a set of application programming interfaces (APIs), software development kits (SDKs) and tools that can be used to integrate telemetry such as metrics, protocols, but especially traces into applications and infrastructures. The OpenTelemetry clients can be used to export telemetry data in a standardized format to central platforms like Jaeger. Existing tools can be found in the OpenTelemetry documentation.

Cost Management

The following methods can be combined to be more cost-efficient; it is usually no problem to mix on-demand, reserved and spot instances:

  • Identify wasted and unused resources

    • With a good monitoring of your resource usage, it is very easy to find unused resources or servers that don’t have a lot of idle time. A lot of cloud vendors have cost explorers that can break down costs for individual services. Autoscaling helps to shut down instances that are not needed.

  • Right-Sizing

    • When you start out, it can be a good idea to choose servers and systems with a lot more power than actually needed. Again, good monitoring can give you indications over time of how much resources are actually needed for your application. This is an ongoing process where you should always adapt to the load you really need. Don’t buy powerful machines if you only need half of their capacity.

  • Reserved Instances

    • On-demand pricing models are great if you really need resources on-demand. Otherwise, you’re probably paying a lot for the "on-demand" service. A method to save a lot of money is to reserve resources and even pay for them upfront. This is a great pricing model if you have a good estimate about the resources you need, maybe even for years in advance.

  • Spot Instances

    • If you have a batch job or heavy load for a short amount of time, you can use spot instances to save money. The idea of spot instances is that you get unused resources that have been over-provisioned by the cloud vendor for very low prices. The "problem" is that these resources are not reserved for you, and can be terminated on short notice to be used by someone else paying "full price".

Additional Resources

Cloud Native Observability

Prometheus

Prometheus at scale

Logging for Containers

Right-Sizing and cost optimization
