Kubernetes 101: An overview of StatefulSets and Deployments



StatefulSets are Kubernetes objects intended for "stateful" applications, such as databases or any application that stores data about its configuration, state, and so on.

StatefulSets and Deployments:

Stateless applications are deployed using the Kubernetes Deployment object.


Stateful applications are deployed using StatefulSet objects.

Below is a YAML file of a Deployment object:
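As a rough sketch of what such a manifest can look like (not necessarily the exact file used in this post): the name "deploy-1" comes from the pod-name example in the text, while the "app: web" label, the replica count and the nginx image are placeholders chosen for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-1              # matches the "deploy-1-..." pod names mentioned in the text
spec:
  replicas: 3                 # assumed replica count
  selector:
    matchLabels:
      app: web                # hypothetical label, must match the pod template below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25   # placeholder stateless workload
          ports:
            - containerPort: 80
```

Nothing in this spec distinguishes one replica from another, which is why the pods can be treated as interchangeable.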


The pods in our Deployment are interchangeable: they are created in no particular order, and their names are built from the Deployment's name plus a generated hash, for example "deploy-1-5b9fb7b7d9" (followed by a short random suffix for each pod).

The Service that load-balances requests can send a request to any of the pods. Likewise, when we scale down, the pods are not deleted in any fixed order.

Below is a YAML file of a StatefulSet:
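As a rough sketch of what such a manifest can look like (not necessarily the exact file used in this post): the names "stateful" and "service-db" come from the pod and DNS names used in the text, while the "app: db" label, the MySQL image, the port and the 1Gi volume size are assumptions made for illustration. This sketch only gives each pod its own volume; it does not configure any master/worker replication.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: stateful                # pods will be named stateful-0, stateful-1, ...
spec:
  serviceName: service-db       # governing Service used for the per-pod DNS names
  replicas: 3
  selector:
    matchLabels:
      app: db                   # hypothetical label
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: mysql:8.0      # placeholder stateful workload
          env:
            - name: MYSQL_ALLOW_EMPTY_PASSWORD   # demo only, not for production
              value: "1"
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:         # one PersistentVolumeClaim is created per pod
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi        # assumed size
```

The "volumeClaimTemplates" section is what ties each pod to its own persistent volume claim, so a recreated pod with the same name finds the same data again.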


Stateful pods can't all be deployed at the same time, and a request can't go randomly to any of them, because the "replicas" of a stateful application are not identical.

Each stateful pod has its own "identity" that persists across any re-deployment, for example after a crash.

Remark:

The persistent volume survives even if we delete the stateful pod.
When the stateful pod is recreated, its "substitute" pod can re-attach to the persistent volume and retrieve data about its state (master, worker, ...) from it.

Stateful pods overview:


One pod, the "master", is allowed to both read and write data. We can't allow two pods to write independently to their own separate storage, or we will end up with data inconsistencies.

The "master" is the only pod allowed to write data to its storage.

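The "master" is created first.
When we create a "worker" pod, data is copied from the "master" to the "worker". We also need continuous replication from the "master" to the "worker" pod so that the "worker" stays up to date. Note that Kubernetes does not do this copying itself: it has to be set up by the application or its configuration.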

Adding a "worker" pod:

When we add another "worker" pod, data is copied from the most recently added "worker" pod, "stateful-2" in our case.


Remark: 

We avoid cloning data directly from the "master" so as not to impact its performance, especially its network resources.

Stateful pods naming:

Stateful pods get fixed, ordered names of the form:

"statefulset-name-number"

The "number" starts from zero and is incremented by one for every newly added pod, as we can see below:
  • stateful-0  (master)
  • stateful-1  (worker)
  • stateful-2  (worker)

Data synchronization:

Each newly deployed stateful worker pod synchronizes its data with the previous stateful pod.

For example, if we deploy a new stateful pod "stateful-3", it will synchronize its data with the stateful pod "stateful-2".

Stateful pods are created one after the other, each time waiting for the previous pod to be "Running" and ready before starting the next one.
Stateful pods are deleted in reverse order, starting from the highest number.
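This one-at-a-time behaviour corresponds to the StatefulSet's "podManagementPolicy" field, whose default value is "OrderedReady". The fragment below is only a sketch of where that field sits; it is meant to be merged into a full StatefulSet spec, not applied on its own.

```yaml
# Fragment of a StatefulSet manifest (other required fields omitted)
spec:
  podManagementPolicy: OrderedReady   # default: create 0, 1, 2, ... in order, delete in reverse
  # podManagementPolicy: Parallel     # alternative: start and stop pods without waiting
```

For a master/worker setup like the one described here, the default ordered policy is the one that fits, since each worker depends on an earlier pod already being up.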

Stateful pods and DNS names:

Each stateful pod gets its own DNS name, which persists even if its IP address changes, for example when the pod is restarted after a crash.

Below are examples of DNS names for the stateful pods, in the form "podname.servicename", where "servicename" is the one defined in the StatefulSet YAML file above:
  • stateful-0.service-db
  • stateful-1.service-db
  • stateful-2.service-db
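
For these per-pod names to resolve, the StatefulSet relies on a headless Service (a Service with "clusterIP: None") whose name matches the StatefulSet's "serviceName". A minimal sketch, assuming the pods carry a hypothetical "app: db" label and listen on port 3306:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: service-db      # must match spec.serviceName in the StatefulSet
spec:
  clusterIP: None       # headless: DNS returns the individual pod addresses
  selector:
    app: db             # hypothetical label, must match the StatefulSet's pod labels
  ports:
    - name: db
      port: 3306        # assumed database port
```

Within the same namespace, a client could then reach the master at "stateful-0.service-db"; the fully qualified form also includes the namespace and the cluster domain.
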
Remark:

The storage of the stateful pods needs to be remote rather than local to a node. Otherwise, if a stateful pod gets re-scheduled onto a different machine than the one where its persistent storage lives, it can no longer reach its data.
