Getting Started With Kubernetes: Part 02

Jyoti
7 min read · Oct 20, 2023


In this blog we will look at the architecture of Kubernetes, its different components and what each one does in brief, and how communication happens within Kubernetes.

Architecture of Kubernetes

K8s is a system for automating the deployment, scaling and management of containerized applications. Modern apps are spread across clouds, VMs and servers, and managing and administering them by hand is no longer a viable option.

Kubernetes consists of two main parts: a control plane (master node) and a data plane (worker nodes).

Kubernetes architecture (source: cncf)

If you are running Kubernetes, you are running a cluster. A K8s cluster contains a control plane and one or more compute machines, or nodes. The control plane is responsible for maintaining the desired state of the cluster, such as which applications are running and which container images they use. Nodes actually run the applications and workloads.

To control the cluster you have a UI or the kubectl CLI. You give it a command and it passes the request to the API server. The API server is the heart of K8s. It authenticates and validates the request, stores the result, and the relevant system components pick it up from there.

So there are two parts in the Kubernetes architecture:

  1. Control Plane (master node)
  2. Data Plane (worker node)

A node is nothing but a machine, usually a virtual machine, though it can also be a physical server.

Let's look briefly at each of the Kubernetes components:

1. Control Plane

In the control plane we have the API server, the scheduler, the controller manager and etcd.

  1. API Server: The API server is the front end of the control plane and the only control plane component we interact with directly. Internal system components as well as external users all communicate through the same API server.
  2. etcd: It is a database, or more precisely a key-value store. Kubernetes uses it as the backing store for all cluster data: it holds the entire configuration and state of the cluster. The API server reads from it to retrieve the state of nodes, pods and containers. When you query configuration with kubectl, the request goes to the API server, which fetches the data from etcd. kubectl is the CLI tool for interacting with a K8s cluster.

E.g. suppose you run kubectl apply -f deployment.yml. The request is handled by the API server, and the spec in deployment.yml is compared with the state of that deployment already stored in etcd. If they match, nothing is redeployed. If there are changes, the new desired state is stored and a new rollout is started to meet it.
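To make the "matching or changed" idea concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the `store` dict stands in for etcd, the `apply` function and the key layout are hypothetical, and the manifest is heavily simplified compared to a real Deployment. The three return values mirror the words kubectl prints (created, unchanged, configured).

```python
import json

# Hypothetical in-memory stand-in for etcd: a plain key-value mapping.
store = {}

def apply(manifest):
    """Save the manifest's spec under a key and report what happened."""
    key = f"/{manifest['kind'].lower()}s/{manifest['metadata']['name']}"
    # Serialize with sorted keys so that equal specs compare equal.
    desired = json.dumps(manifest["spec"], sort_keys=True)
    current = store.get(key)
    if current is None:
        store[key] = desired
        return "created"
    if current == desired:
        return "unchanged"      # nothing to redeploy
    store[key] = desired
    return "configured"         # desired state changed, a rollout would follow

# Simplified, made-up manifest (a real Deployment spec has more fields).
deployment = {
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {"replicas": 2, "image": "nginx:1.25"},
}

first = apply(deployment)
second = apply(deployment)
deployment["spec"]["replicas"] = 3
third = apply(deployment)
print(first, second, third)  # created unchanged configured
```

Applying the same file twice is harmless, which is why kubectl apply is safe to run repeatedly.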

3. Controller manager: Most Kubernetes users do not create Pods directly; instead they create a Deployment, CronJob, StatefulSet, or other Controller which manages the Pods for them.

Controllers continuously watch the state of your cluster, then make or request changes where needed. Each controller tries to move the current cluster state closer to the desired state.
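This "watch, compare, correct" behaviour is often called a reconcile loop. Below is a small Python sketch of one reconcile step for a replica count. It is not how the real controllers are implemented; the `reconcile` function and the pod names are made up for illustration.

```python
import itertools

# Hypothetical counter used only to give new pods unique names.
_ids = itertools.count(1)

def reconcile(desired, pods):
    """Return a pod list that matches the desired replica count."""
    pods = list(pods)
    while len(pods) < desired:      # too few pods: create more
        pods.append(f"web-{next(_ids)}")
    while len(pods) > desired:      # too many pods: remove the extras
        pods.pop()
    return pods

pods = reconcile(3, [])
print(pods)                # ['web-1', 'web-2', 'web-3']

pods.remove("web-2")       # simulate a pod crashing
pods = reconcile(3, pods)
print(pods)                # ['web-1', 'web-3', 'web-4']

scaled_down = reconcile(1, pods)
print(scaled_down)         # ['web-1']
```

The controller never needs to know why the state drifted. It only compares the current state with the desired state and closes the gap.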

Controllers continuously talk to the kube-apiserver (API server), and the kube-apiserver receives information about the nodes from the kubelet (present on each worker node).

The controller manager runs many controllers. A few of them are the deployment controller (manages rollouts of application updates), the ReplicationController/ReplicaSet controller (ensures that the specified number of replica pods for an application is running at all times), the job controller, the namespace controller, the service controller and the endpoints controller. Note that an ingress controller is not one of them: it is installed separately in the cluster.

In simple terms, you can remember it like this: the controller manager makes sure all pods and the cluster are running in the desired state.

Node controller: notices and responds when a node goes down.

Replication controller: maintains the correct number of pods for an app.

Endpoints controller: populates Endpoints objects, i.e. it manages the endpoints for services, dynamically updating them as pods come and go.

4. Scheduler: As the name suggests, it is responsible for scheduling, i.e. deciding which node each pod of your application will run on. It watches the API server for newly created pods that have no node assigned yet and assigns them to healthy nodes. It filters out unsuitable nodes, ranks the rest, and places the pod on the best-suited node. If no suitable node is found, the pod stays in the Pending state.

In many clusters (for example those set up with kubeadm), each component of the master node itself runs inside a pod, and if it goes down it is restarted automatically. A pod is a logical concept, and it is the smallest deployable unit in Kubernetes.
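The filter-then-rank idea can be sketched in a few lines of Python. This is a toy version under my own assumptions: the `schedule` function, the node fields and the scoring rule (prefer the node with the most free CPU) are hypothetical, and the real scheduler uses many more filters and scoring plugins.

```python
def schedule(pod, nodes):
    """Pick a node name for the pod, or None if the pod must stay Pending."""
    # Filtering: keep only healthy nodes with enough free CPU and memory.
    feasible = [
        n for n in nodes
        if n["ready"] and n["free_cpu"] >= pod["cpu"] and n["free_mem"] >= pod["mem"]
    ]
    if not feasible:
        return None
    # Scoring (assumed rule): prefer the node with the most CPU left over.
    best = max(feasible, key=lambda n: n["free_cpu"] - pod["cpu"])
    return best["name"]

# Made-up nodes: CPU in cores, memory in MiB.
nodes = [
    {"name": "node-a", "ready": True, "free_cpu": 2.0, "free_mem": 4096},
    {"name": "node-b", "ready": True, "free_cpu": 4.0, "free_mem": 2048},
    {"name": "node-c", "ready": False, "free_cpu": 8.0, "free_mem": 8192},
]
small = {"name": "web", "cpu": 1.0, "mem": 1024}
huge = {"name": "batch", "cpu": 16.0, "mem": 1024}

print(schedule(small, nodes))  # node-b
print(schedule(huge, nodes))   # None
```

Here node-c has the most free resources but is skipped because it is not ready, and the second pod gets no node at all, which corresponds to the Pending state.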

2. Data Plane

Data Plane in kubernetes

The data plane consists of worker nodes. There can be one or multiple worker nodes. A node, as I mentioned earlier, is just a machine, most often a virtual machine (it can also be a physical server). On each worker node there can be one or more pods. A pod is a logical concept and the smallest deployable unit in Kubernetes. You cannot deploy a container directly; you can only deploy a pod. A pod can have multiple containers.

Worker node components:

  1. kubelet
  2. kube-proxy
  3. container runtime

1. Kubelet:

The kubelet is an agent that runs on each node in the cluster. It makes sure that the containers described in a pod are actually running. It takes the pod specifications and ensures that those containers are running and healthy. It only manages containers created by Kubernetes.

2. Kube-proxy:

It is a network proxy that runs on each node (VM) in your cluster. It maintains network rules on the node and handles networking inside the cluster. It implements Kubernetes Services by programming local iptables (or IPVS) rules that route traffic sent to a Service to one of its backing pods and load-balance across them. Note that it does not hand out IP addresses; pod IPs come from the cluster's network plugin (CNI).
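To picture what these routing and load balancing rules do, here is a small Python sketch. It is a simplification under my own assumptions: the `ServiceRules` class and all IP addresses are made up, and it picks backends in round-robin order, whereas the real kube-proxy programs kernel rules and, in iptables mode, chooses a backend at random.

```python
import itertools

class ServiceRules:
    """Toy model of the per-node table mapping a Service IP to pod IPs."""

    def __init__(self):
        self._backends = {}
        self._cursors = {}

    def sync(self, service_ip, pod_ips):
        # Called whenever the set of pods behind a Service changes.
        self._backends[service_ip] = list(pod_ips)
        self._cursors[service_ip] = itertools.cycle(self._backends[service_ip])

    def route(self, service_ip):
        # Return the pod IP that should receive the next connection.
        if not self._backends.get(service_ip):
            return None  # unknown Service or no pods behind it
        return next(self._cursors[service_ip])

rules = ServiceRules()
rules.sync("10.96.0.10", ["10.244.1.5", "10.244.2.7"])
picks = [rules.route("10.96.0.10") for _ in range(3)]
print(picks)  # ['10.244.1.5', '10.244.2.7', '10.244.1.5']

# One pod goes away: the rules are updated and traffic follows.
rules.sync("10.96.0.10", ["10.244.2.7"])
after = rules.route("10.96.0.10")
print(after)  # 10.244.2.7
```

The client only ever talks to the Service IP, so pods can come and go without the client noticing.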

control and data plane (src: google)

3. Container Runtime

It pulls images from a container image registry and starts and stops containers. containerd and CRI-O are container runtimes commonly used with Kubernetes. Docker Engine is also a container runtime, but since Kubernetes 1.24 it needs the cri-dockerd adapter to be used by the kubelet.

There are two more things on worker nodes besides the components described above, namely pods and containers:

  1. Pods: A pod is the smallest schedulable unit in Kubernetes. It is a wrapper around one or more containers. A pod can have more than one container, but all containers in a pod share the same IP address, so each container has to listen on a different port. We deploy an app as a pod with the help of kubectl and the API server.

kubectl -> some command -> API server -> deploy on node

2. Container: A container in Kubernetes (K8s) is a lightweight, standalone, executable software package that includes an application and its dependencies, packaged for efficient deployment and scaling within Kubernetes clusters. The container lives inside the pod, and the app runs inside the container.

Based on the availability of nodes, the scheduler in the control plane assigns the pod to a specific node, and the kubelet on that node coordinates with the container runtime to launch the containers.
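Because the containers of a pod share one IP address, two of them cannot listen on the same port. The Python sketch below checks a pod description for such clashes. The `find_port_conflicts` function and the dictionary layout are my own simplified stand-ins, not the real pod schema.

```python
def find_port_conflicts(pod):
    """Return (port, first_container, second_container) for each clash."""
    seen = {}        # port -> name of the container that claimed it first
    conflicts = []
    for container in pod["containers"]:
        for port in container["ports"]:
            if port in seen:
                conflicts.append((port, seen[port], container["name"]))
            else:
                seen[port] = container["name"]
    return conflicts

# Made-up pods: an app with a sidecar container.
good_pod = {"containers": [
    {"name": "app", "ports": [8080]},
    {"name": "sidecar", "ports": [9090]},
]}
bad_pod = {"containers": [
    {"name": "app", "ports": [8080]},
    {"name": "sidecar", "ports": [8080]},
]}

print(find_port_conflicts(good_pod))  # []
print(find_port_conflicts(bad_pod))   # [(8080, 'app', 'sidecar')]
```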

Flow of communication within kubernetes:

flow of communication within kubernetes (src: google)

When a Kubernetes administrator runs the command kubectl apply -f mydeployment.yaml to submit the contents of a manifest file (mydeployment.yaml) to the API Server through kubectl, several internal processes occur:

  1. Validation: The Kubernetes API Server first validates the manifest file to ensure it follows the correct schema and doesn’t contain any syntax errors or invalid configurations. If there are any issues with the manifest, the API Server will reject it and return an error message to the administrator.
  2. API Server Processing: Assuming the manifest is valid, the API Server processes the request. It updates the desired state of the cluster with the contents of the manifest file in etcd. This includes creating or updating Kubernetes resources like Deployments, Pods, Services etc., as specified in the manifest.
  3. etcd Interaction: The API Server communicates with the etcd datastore, which serves as the cluster’s source of truth. It stores and retrieves the current and desired states of all resources. The updated desired state from the manifest is stored in etcd.
  4. Controller Managers and Schedulers: If the manifest file contains objects like Deployments or ReplicaSets, the Controller Manager is notified of the changes in the desired state. The Controller Manager ensures that the actual state of the cluster matches the desired state. If there are discrepancies (e.g., not enough replicas running), it takes corrective actions, such as creating new Pods. The Scheduler, which watches the API Server for pods that have no node assigned yet, finds nodes that meet the pods' requirements. It picks a host node for each pod and sends that decision back to the API Server.
  5. Kubelet on Worker Nodes: If the manifest file specifies Pods that need to be created or updated, the Kubelet on each worker node watches the API Server and learns about the pods assigned to its node. It then starts or updates the containers as per the Pod's definition.
  6. Container runtime: The kubelet asks the container runtime to pull the images and create the containers.
  7. Kube Proxy: If there are changes to Services in the manifest, Kube Proxy on the worker nodes updates the network rules and routes to ensure proper network communication. NetworkPolicies, if any, are enforced by the cluster's network plugin rather than by Kube Proxy.
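The steps above can be sketched as a tiny in-memory simulation in Python. This is only my own toy model to show who does what: every class and function name here is hypothetical, a dict stands in for etcd, the scheduler just spreads pods in turn, and the real flow has extra pieces (for example, a Deployment creates a ReplicaSet, which in turn creates the Pods).

```python
class ApiServer:
    """Validates objects and keeps them in a dict standing in for etcd."""

    def __init__(self):
        self.etcd = {}

    def apply(self, kind, name, spec):
        # Step 1: validation. Steps 2 and 3: store the desired state.
        if kind == "Deployment" and spec.get("replicas", 0) < 0:
            raise ValueError("replicas must not be negative")
        self.etcd[(kind, name)] = dict(spec)

    def list_kind(self, kind):
        return {n: s for (k, n), s in self.etcd.items() if k == kind}


def controller_step(api):
    """Step 4a: create the Pod objects that each Deployment is missing."""
    for name, spec in api.list_kind("Deployment").items():
        existing = [p for p in api.list_kind("Pod") if p.startswith(name + "-")]
        for i in range(len(existing), spec["replicas"]):
            api.apply("Pod", f"{name}-{i}",
                      {"image": spec["image"], "node": None, "phase": "Pending"})


def scheduler_step(api, nodes):
    """Step 4b: bind every pod without a node to one of the nodes."""
    for i, (name, spec) in enumerate(sorted(api.list_kind("Pod").items())):
        if spec["node"] is None:
            spec["node"] = nodes[i % len(nodes)]
            api.apply("Pod", name, spec)


def kubelet_step(api, node):
    """Steps 5 and 6: start the pending pods bound to this node."""
    started = []
    for name, spec in sorted(api.list_kind("Pod").items()):
        if spec["node"] == node and spec["phase"] == "Pending":
            spec["phase"] = "Running"
            api.apply("Pod", name, spec)
            started.append(name)
    return started


api = ApiServer()
api.apply("Deployment", "web", {"replicas": 2, "image": "nginx:1.25"})
controller_step(api)
scheduler_step(api, ["node-a", "node-b"])
started_a = kubelet_step(api, "node-a")
started_b = kubelet_step(api, "node-b")
print(started_a, started_b)  # ['web-0'] ['web-1']
```

Notice that no component calls another one directly. Each of them only reads from and writes to the API server, which is exactly why it is called the heart of K8s.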

Bonus point: As you can see, creating and managing a Kubernetes cluster is complex and hard. Nowadays cloud providers solve this problem with managed services. Azure provides AKS (Azure Kubernetes Service) and AWS provides EKS (Amazon Elastic Kubernetes Service). The provider manages and takes care of the master node. You just need to take care of the worker nodes.

In the next article we will see how you can set up a Kubernetes cluster with the help of AKS, and we will practically deploy one application on a Kubernetes pod. Thanks for reading till here. I wrote this article on the basis of my understanding and learning of Kubernetes. Please let me know in the comments if you have any feedback for me. See you in the next blog. Till then keep reading and keep upskilling.
