Kubernetes (K8s)

What is Kubernetes?

  • Open Source Container Orchestration Tool

  • Originally developed by Google, now maintained by the Cloud Native Computing Foundation (CNCF)

  • Helps to manage containerized applications in different deployment environments

Why Kubernetes?

Container usage has grown, and with it the need to manage hundreds or thousands of containers. Doing that by hand or with scripts does not scale.

Advantages of using Kubernetes

  1. High availability (fault tolerance / no downtime)

  2. Scalability or high performance

  3. Disaster recovery - backup and restore

Basic Components in Kubernetes

  • Node: a physical server or virtual machine in the cluster that runs pods.

  • Pod:

    • A pod is an abstraction over a container. It is the smallest deployable unit in Kubernetes and provides the running environment for its containers.

    • Each pod can hold one or more containers (for example, a Docker container running an application image).

    • The usual practice is one main application container per pod, with extra containers only for tightly coupled helpers.

    • Each pod has its own IP address, which it uses to communicate with other pods. If a pod fails, a new pod is created in its place and receives a new IP address. Clients would then have to be reconfigured with the new IP after every restart, which is inconvenient, so Kubernetes provides the Service component.

  • Service:

    • Provides a stable, permanent IP address in front of a set of pods.

    • Even if a pod dies, the Service and its IP address stay alive.

    • It also acts as a load balancer when there are multiple replicas of a pod, which provides fault tolerance. It receives each request and forwards it to one of the available pods (by default the choice is spread across the pods rather than based on how busy each one is). Loosely analogous to a load-balanced FeignClient in Spring Boot.

  • Ingress: It provides routing rules that manage external users' access to the Services in a K8s cluster. For example, instead of connecting from a browser to an IP address and port such as https://192.128.0.1:8080, users connect to a domain name such as myapp.com, and Ingress forwards the request to the Service configured for that domain.
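The path a request takes through these components can be sketched in a few lines of Python. This is a toy model, not Kubernetes code: the Service and Ingress classes, the round-robin choice, and all names and addresses (myapp-service, 10.96.0.10, the pod IPs) are assumptions made up for the illustration. It shows the two ideas from the list above: a Service keeps one stable address while the pod IPs behind it change, and Ingress maps a host name such as myapp.com to a Service.

```python
from itertools import count


class Service:
    """Toy model of a Service: one stable address in front of changing pod IPs."""

    def __init__(self, name, cluster_ip):
        self.name = name
        self.cluster_ip = cluster_ip   # stays the same for the Service's lifetime
        self.pod_ips = []              # current backing pods (these come and go)
        self._next = count()           # request counter used for round-robin

    def replace_pod(self, old_ip, new_ip):
        """A pod died and was re-created with a new IP; the Service IP is untouched."""
        self.pod_ips[self.pod_ips.index(old_ip)] = new_ip

    def route(self):
        """Pick a backing pod in round-robin order (an assumption of this sketch)."""
        if not self.pod_ips:
            raise RuntimeError(f"service {self.name} has no ready pods")
        return self.pod_ips[next(self._next) % len(self.pod_ips)]


class Ingress:
    """Toy model of Ingress rules: map an external host name to a Service."""

    def __init__(self):
        self.rules = {}

    def add_rule(self, host, service):
        self.rules[host] = service

    def handle(self, host):
        service = self.rules[host]     # KeyError means no rule for this host
        return service.name, service.route()


# Hypothetical names and addresses, chosen only for the example.
my_service = Service("myapp-service", cluster_ip="10.96.0.10")
my_service.pod_ips = ["10.244.0.5", "10.244.1.7"]

ingress = Ingress()
ingress.add_rule("myapp.com", my_service)
```

Calling ingress.handle("myapp.com") repeatedly alternates between the two pod IPs. After my_service.replace_pod(...) swaps in a new pod IP, callers still reach the application through the same host name and the same Service IP.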

External Configurations:

  1. ConfigMap: External configuration of the application, for example the database URL that would otherwise live in a Spring Boot application.properties/yaml file. These settings are moved out of the application into a ConfigMap. It is an alternative to the config server in Spring Cloud.

    What happens if we put configuration properties (the application.properties file in a Spring Boot project) in a built image? Every configuration change then requires rebuilding the image and redeploying the application.

    Problem solved by ConfigMap: If we need to change a setting such as the database URL, we do not need to rebuild the application, because the configuration lives in a separate component. We only need to update the ConfigMap. NOTE: Do not put credentials in a ConfigMap; it stores plain text.

    A ConfigMap is connected to a pod, and the pod reads it either as environment variables or as a mounted properties file.

  2. Secret: It stores sensitive data (for example database credentials and certificates) in base64-encoded format. Base64 is an encoding, not encryption, so access to Secrets still has to be restricted. Like a ConfigMap, a Secret is connected to a pod, and the pod reads it either as environment variables or as a mounted file.

  3. Volumes: A volume is storage attached to a pod, typically a database pod. The storage can be local (on the same node where the pod is running) or remote (outside of the K8s cluster).

    Problem solved by Volumes: When a database pod or Docker container restarts, all the data stored inside it is lost. Since K8s does not manage data persistence by itself, Volumes are used to keep the data outside the pod's lifecycle.

  4. Deployment: It is an abstraction over pods, intended for stateless applications. The Kubernetes Deployment object lets you:

    • Deploy a single pod or a ReplicaSet of pods

    • Update pods and replica sets

    • Rollback to previous deployment versions

    • Scale a deployment: increase or decrease the number of pods

    • Pause or continue a deployment
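The Deployment operations listed above (update, rollback, scale) can be pictured with a small Python model. This is only a sketch: the Deployment class, its method names, and the image tags are hypothetical and do not mirror the Kubernetes API. It shows that a Deployment is mostly a desired replica count plus a history of revisions.

```python
class Deployment:
    """Toy model of a Deployment: desired replica count plus a revision history."""

    def __init__(self, name, image, replicas=1):
        self.name = name
        self.replicas = replicas
        self.history = [image]      # revision history; the last entry is current

    @property
    def image(self):
        return self.history[-1]

    def scale(self, replicas):
        """Increase or decrease the desired number of pods."""
        if replicas < 0:
            raise ValueError("replicas must be non-negative")
        self.replicas = replicas

    def update(self, image):
        """Roll out a new image; the previous one stays in the history."""
        self.history.append(image)

    def rollback(self):
        """Return to the previous revision, if there is one."""
        if len(self.history) < 2:
            raise RuntimeError("no earlier revision to roll back to")
        self.history.pop()

    def pods(self):
        """Images of the pods this Deployment wants: identical and interchangeable."""
        return [self.image] * self.replicas
```

For example, after d = Deployment("web", "web:1.0", replicas=2), calling d.update("web:1.1") and then d.rollback() leaves d.image at "web:1.0". Real Kubernetes records a rollback as a new revision; popping the history is a simplification of this sketch.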

NOTE: To replicate a stateful application such as a database, we do not use a Deployment. Instead, we use the StatefulSet component of K8s.

  5. StatefulSet: It plays the same role as a Deployment, but for stateful applications such as databases or Elasticsearch.

    Problem solved by StatefulSet: Replicas created by a Deployment are interchangeable, so nothing coordinates which pod reads from and which pod writes to the shared data, and that can lead to data inconsistency. A StatefulSet gives each replica a stable identity so that reads and writes can be coordinated.

    NOTE: It is tedious to maintain a stateful application inside the K8s cluster. So, it is common to host/deploy stateful applications outside of Kubernetes.
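What sets StatefulSet pods apart is that each replica keeps a fixed name of the form <name>-<ordinal>, and by default pods are removed highest ordinal first when scaling down. The Python sketch below only computes those names; the function names and the "mysql" example are assumptions for illustration. Which replica acts as writer or reader is decided by the application's own configuration, not by Kubernetes.

```python
def statefulset_pod_names(name, replicas):
    """Pod names a StatefulSet gives its replicas: <name>-<ordinal>, starting at 0."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


def scale_down_order(name, replicas, target):
    """Pods removed when scaling from `replicas` to `target`: highest ordinal first."""
    names = statefulset_pod_names(name, replicas)
    return list(reversed(names[target:]))
```

With three replicas of a StatefulSet named mysql, the pods are mysql-0, mysql-1 and mysql-2. Because the names survive restarts, a database setup can, for instance, always treat mysql-0 as the writer.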

Kubernetes Architecture

Types of nodes in K8s:

  1. Master (also called the control plane)

  2. Worker (called slave in older material)

Node Processes:

Worker node: has 3 processes

  1. Container runtime: runs the containers, for example Docker or containerd.

  2. Kubelet: It interacts with both the container runtime and the node. It starts the pod and assigns node resources such as CPU and RAM to its containers.

  3. Kube Proxy: It forwards requests addressed to a Service to one of that Service's pods, which is how the Service component's load balancing is implemented on each node. It can give priority to pods running on the same node, which reduces network overhead and the chance of network failure, although this depends on how the Service is configured.

Master Node:

Why Master Node?

  1. Schedule pods: On which node should a new pod run?

  2. Monitor: Detect when a replica dies and reschedule or restart it.

  3. Add a new server: How does it join the cluster to become another node and get pods and other components?

Master node has 4 processes:

  1. API Server:

    • It is the cluster gateway (API gateway) and acts as a gatekeeper for authentication.

    • It receives every update request and query sent to the cluster.

    • It is the single entry point into the cluster.

  2. Scheduler: analogous to a CPU scheduler

    • It decides on which node a new pod should be scheduled.

    • It looks for the node that is least busy, that is, the one with the most resources (memory, CPU, etc.) available.

    • The Scheduler only makes the decision. The kubelet on the chosen node receives the request and actually starts the pod there.

  3. Controller Manager:

    • It detects cluster state changes, for example when pods die.

    • To recover the cluster state, it requests replacements for the dead pods, which are handed to the Scheduler.

    • The Scheduler then places them as described in the Scheduler section above, and the cycle repeats.

  4. etcd:

    • This is the cluster brain: a key-value store that holds the cluster state.

    • Cluster changes, such as a pod dying or a new pod being scheduled, and the cluster's health are recorded in etcd.

    • It is what answers questions such as: How does the Scheduler know what resources are available on a worker node? How does the Controller Manager detect a cluster state change?
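How these master processes cooperate can be sketched as a single control loop in Python. Everything here is a simplified stand-in under stated assumptions: a plain dict plays the role of etcd, reconcile stands in for the Controller Manager, pick_node stands in for the Scheduler, and CPU is the only resource considered. The real components are separate processes that talk through the API Server.

```python
def pick_node(nodes, cpu_request):
    """Scheduler stand-in: choose the node with the most free CPU that still fits."""
    candidates = [n for n in nodes if n["free_cpu"] >= cpu_request]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n["free_cpu"])


def reconcile(state, cpu_request=1):
    """Controller stand-in: compare desired and actual replicas, then fix the gap.

    `state` plays the role of etcd here: a plain dict holding the cluster state.
    Returns the names of the nodes that received a new pod.
    """
    placed = []
    missing = state["desired_replicas"] - len(state["pods"])
    for _ in range(max(missing, 0)):
        node = pick_node(state["nodes"], cpu_request)
        if node is None:
            break                      # nothing fits; a real pod would stay Pending
        node["free_cpu"] -= cpu_request
        state["pods"].append({"node": node["name"]})
        placed.append(node["name"])
    return placed
```

If one of three desired pods dies, the next reconcile call sees the gap between desired and actual state, asks pick_node for the least loaded node, and records the new pod in the state. A second call with nothing missing places nothing.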

NOTE: In practice, K8s is usually made up of multiple master nodes.

Minikube:

Purpose: Test/Local Cluster Setup

A one-node cluster where the master processes and the worker processes both run on a single node/machine. This node has a Docker container runtime pre-installed, so we can run pods on it. Minikube runs on VirtualBox or another hypervisor (or on Docker) on your laptop, and the node runs inside it.

Kubectl

  • Command line tool for interacting with a Kubernetes cluster (create/delete/update pods, Secrets, Services, ConfigMaps, etc.). It works with a Minikube cluster or a cloud cluster by talking to the API Server on the master node.

  • The API Server enables interaction with the cluster. This can be done through the CLI (kubectl), the UI dashboard, or a Kubernetes API client. Kubectl is the most powerful of the 3 clients.
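All three clients end up sending HTTP requests to REST paths on the API Server. The Python helper below builds such paths to make that concrete. The function name api_path and its parameters are hypothetical, and the helper only covers the common path layout: core resources under /api/v1 and grouped resources under /apis/<group>/<version>.

```python
def api_path(resource, namespace=None, name=None, group=None, version="v1"):
    """Build the REST path the API server exposes for a resource.

    Core resources (pods, services, configmaps, secrets) live under /api/v1;
    resources in a named group (for example deployments in "apps") live under
    /apis/<group>/<version>.
    """
    base = f"/apis/{group}/{version}" if group else f"/api/{version}"
    parts = [base]
    if namespace:
        parts.append(f"namespaces/{namespace}")
    parts.append(resource)
    if name:
        parts.append(name)
    return "/".join(parts)
```

For instance, listing pods in the default namespace, which is what kubectl get pods does, corresponds to a GET on api_path("pods", namespace="default").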

Steps to Set up and configure Kubernetes on macOS

  1. Install a hypervisor or container driver (e.g. hyperkit or Docker): brew install hyperkit

  2. Install minikube: brew install minikube (it includes the kubernetes-cli dependency, i.e. the kubectl CLI)

  3. Start the hypervisor or Docker

  4. Start minikube: minikube start --driver=docker (using Docker here; older minikube versions spell this flag --vm-driver)

  5. Verify that minikube is running with the following commands:

    minikube status

    kubectl get nodes

  6. Check the Kubernetes version:

    kubectl version
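Before running the verification commands it can help to confirm that both tools are actually on the PATH. The short Python check below is a convenience sketch, not part of minikube or kubectl; the function name missing_tools is made up for this example.

```python
import shutil


def missing_tools(tools=("minikube", "kubectl")):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from missing_tools() means both binaries were found; any name it returns still needs to be installed (steps 1 and 2).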

Layer of Abstraction

Namespace:

When to use a namespace?

  1. Structure your components

  2. Avoid conflicts between teams

  3. Share services between different environments

  4. Set access and resource limits at the namespace level

NOTE:

  • ConfigMaps and Secrets cannot be shared between namespaces. Each namespace needs its own ConfigMap and Secret.

  • Volumes (PersistentVolumes) and nodes are cluster-wide and cannot be placed in a namespace.
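The two namespace rules that matter most in practice can be shown with a toy registry in Python. The Cluster class and the names team-a, team-b, db-config and mysql-service are assumptions for illustration only. The sketch shows that a ConfigMap is only visible inside its own namespace, while a Service in another namespace is reachable through a DNS name of the form <service>.<namespace>.svc.<cluster domain>, where cluster.local is the default cluster domain.

```python
def service_dns_name(service, namespace, cluster_domain="cluster.local"):
    """Fully qualified in-cluster DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


class Cluster:
    """Toy registry: namespaced objects are keyed by (namespace, kind, name)."""

    def __init__(self):
        self.objects = {}

    def create(self, namespace, kind, name, data):
        self.objects[(namespace, kind, name)] = data

    def get(self, namespace, kind, name):
        # A lookup only ever sees objects in the namespace it asks for.
        return self.objects.get((namespace, kind, name))
```

A ConfigMap created in team-a is not found by a lookup in team-b, so each namespace needs its own copy. The shared database Service, in contrast, can be referenced from any namespace by the name service_dns_name("mysql-service", "database").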

HELM:

  • Package manager for K8s.

  • It packages YAML files and distributes them through public and private repositories.

    Scenario: To deploy the Elastic Stack on K8s you need a StatefulSet, ConfigMap, Secret, Services, etc. Instead of creating these components manually, we use Helm to package them all as a single bundle, which is called a Helm chart.

Helm Charts:

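  1. A bundle of YAML files

  2. Create your own Helm charts with Helm

  3. Push them to a Helm repository to make them available to others (public charts were listed on Helm Hub, which has since been replaced by Artifact Hub)

    command example (Helm 3 syntax): helm install <releasename> <chartname>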

Templating Engine:
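A Helm chart can also act as a template: the YAML files contain placeholders, and the concrete values are supplied separately at install time. The Python sketch below imitates that idea with the standard library's string.Template. It is a stand-in only: Helm itself uses Go template syntax such as {{ .Values.name }}, and the manifest, the render function and the values shown here are made up for the example.

```python
from string import Template

# A manifest with placeholders. Helm itself uses Go template syntax such as
# {{ .Values.name }}; string.Template's $name is only a stand-in here.
MANIFEST_TEMPLATE = Template(
    "apiVersion: v1\n"
    "kind: Pod\n"
    "metadata:\n"
    "  name: $name\n"
    "spec:\n"
    "  containers:\n"
    "    - name: $name\n"
    "      image: $image\n"
)


def render(values):
    """Fill the placeholders; raises KeyError if a value is missing."""
    return MANIFEST_TEMPLATE.substitute(values)
```

Calling render({"name": "my-app", "image": "my-app:1.0"}) produces a complete Pod manifest. The same template can be reused for several microservices or environments by passing different values.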

Volume:

  1. Persistent Volume:

    1. used to store data

    2. is not namespaced

    3. available to the whole cluster, across all namespaces

  2. Persistent Volume Claim: This YAML file claims a volume. Whichever Persistent Volume matches the criteria defined in the PVC is used. The PVC must be in the same namespace as the pod that uses it.

  3. Storage Class: Provisions Persistent Volumes dynamically whenever a PersistentVolumeClaim requests one.
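The matching between a claim and a volume can be sketched in Python as follows. The criteria are deliberately simplified, and the dict fields (capacity_gi, access_modes, bound_to, request_gi) are names invented for this sketch: a volume qualifies if it is unbound, offers the requested access mode, and is large enough. Choosing the smallest volume that fits is an assumption of the sketch; real matching also takes the storage class and label selectors into account.

```python
def bind_claim(claim, volumes):
    """Pick a PersistentVolume for a claim, or None if nothing matches.

    Simplified criteria: the volume is unbound, offers the requested access
    mode, and is at least as large as the request.
    """
    candidates = [
        pv for pv in volumes
        if pv["bound_to"] is None
        and claim["access_mode"] in pv["access_modes"]
        and pv["capacity_gi"] >= claim["request_gi"]
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda pv: pv["capacity_gi"])  # smallest that fits
    best["bound_to"] = (claim["namespace"], claim["name"])
    return best["name"]
```

A return value of None corresponds to a claim that stays unbound until a suitable volume appears, which is the situation a Storage Class avoids by provisioning a volume on demand.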

Scaling Database Application/Stateful Application

Characteristics of stateful pod

Overview:

Ref: https://www.youtube.com/watch?v=X48VuDVv0do&t=228s&ab_channel=TechWorldwithNana