Kubernetes & Container Orchestration
Kubernetes Architecture Deep Dive
Understanding K8s architecture is essential for any DevOps/SRE interview. Let's explore every component.
Control Plane Components
┌─────────────────── Control Plane ────────────────────┐
│                                                      │
│  ┌──────────────────┐    ┌──────────────────┐        │
│  │  kube-apiserver  │    │       etcd       │        │
│  └──────────────────┘    └──────────────────┘        │
│                                                      │
│  ┌──────────────────┐    ┌──────────────────┐        │
│  │  kube-scheduler  │    │ kube-controller  │        │
│  │                  │    │     -manager     │        │
│  └──────────────────┘    └──────────────────┘        │
│                                                      │
└──────────────────────────────────────────────────────┘
kube-apiserver
The front door to Kubernetes:
| Responsibility | Details |
|---|---|
| API endpoint | All K8s communication goes through it |
| Authentication | Validates user/service identity |
| Authorization | RBAC policy enforcement |
| Admission control | Mutating and validating webhooks |
| etcd communication | Only component that talks to etcd |
etcd
Distributed key-value store for all cluster data:
# etcd stores everything:
# - Pod definitions
# - Service configurations
# - Secrets
# - ConfigMaps
# - Cluster state
# Check etcd health (if you have access)
# On TLS-secured clusters, also pass --endpoints, --cacert, --cert, and --key
etcdctl endpoint health
etcdctl endpoint status
Interview question: "What happens if etcd goes down?"
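Answer: The control plane can no longer persist changes. Writes through the API server fail, and reads may fail or return stale cached data, so nothing new can be created, scheduled, or rescheduled. Workloads that are already running continue, because the kubelet and container runtime keep running the containers they have already started. This is why etcd should be highly available, with an odd number of members (3, 5, or 7) so that a majority (quorum) survives individual member failures.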
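To see why odd member counts are recommended, here is a minimal sketch of the majority arithmetic behind Raft-style quorum. The function names `quorum` and `tolerated` are illustrative helpers, not etcd commands.

```shell
#!/bin/sh
# quorum N: smallest majority of an N-member cluster
quorum() { echo $(( $1 / 2 + 1 )); }

# tolerated N: members that can fail while a majority still survives
tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated "$n")"
done
# members=3 quorum=2 tolerated_failures=1
# members=4 quorum=3 tolerated_failures=1
# members=5 quorum=3 tolerated_failures=2
```

Going from 3 to 4 members raises the quorum size without tolerating any additional failure, which is why even-sized clusters add cost but no resilience.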
kube-scheduler
Decides which node runs each pod:
Scheduling steps:
- Filtering: Exclude nodes that can't run the pod
  - Insufficient resources
  - Node selectors don't match
  - Taints not tolerated
- Scoring: Rank the remaining nodes
  - Resource balance
  - Pod affinity/anti-affinity
  - Data locality (for example, the container image is already on the node)
- Binding: Assign the pod to the highest-scored node
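The filter, score, bind sequence can be sketched as a toy simulation. This is not how kube-scheduler is implemented: the node inventory, the `schedule` function, and the single scoring rule (prefer the node with the most CPU left after placement) are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical node inventory: name, free CPU (millicores), free memory (MiB), tainted
NODES='node-a 500 1024 no
node-b 2000 4096 no
node-c 4000 8192 yes
node-d 1500 512 no'

# schedule CPU_REQUEST MEM_REQUEST: print the chosen node, or nothing if no node fits
schedule() {
  printf '%s\n' "$NODES" | awk -v cpu="$1" -v mem="$2" '
    # Filtering: skip nodes with too few resources or an untolerated taint
    $2 < cpu || $3 < mem || $4 == "yes" { next }
    # Scoring: prefer the node with the most CPU left after placement
    { score = $2 - cpu; if (best == "" || score > max) { max = score; best = $1 } }
    # Binding: report the highest-scored node
    END { if (best != "") print best }'
}

schedule 1000 1024   # prints: node-b
```

In this sketch node-c has the most free capacity but is filtered out by its taint before scoring ever runs, which mirrors the ordering of the steps above.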
kube-controller-manager
Runs controller loops:
| Controller | What It Manages |
|---|---|
| Node Controller | Node health, evictions |
| Deployment Controller | ReplicaSets, rolling updates |
| ReplicaSet Controller | Pod count maintenance |
| Service Controller | LoadBalancer provisioning (runs in cloud-controller-manager on cloud providers) |
| Endpoints Controller | Service endpoint updates |
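Each of these controllers follows the same pattern: compare desired state with observed state and act on the difference. A minimal sketch of that comparison for a ReplicaSet-style controller follows; the `reconcile` function is a hypothetical illustration, not part of any Kubernetes tool.

```shell
#!/bin/sh
# reconcile DESIRED ACTUAL: print the action needed to converge on DESIRED
reconcile() {
  desired=$1
  actual=$2
  if [ "$actual" -lt "$desired" ]; then
    echo "create $((desired - actual))"
  elif [ "$actual" -gt "$desired" ]; then
    echo "delete $((actual - desired))"
  else
    echo "noop"
  fi
}

reconcile 3 1   # prints: create 2
reconcile 3 5   # prints: delete 2
reconcile 3 3   # prints: noop
```

A real controller runs this comparison repeatedly, triggered by watch events from the API server, so the cluster converges even if an earlier attempt failed.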
Node Components
┌──────────────────── Worker Node ─────────────────────┐
│                                                      │
│  ┌──────────────────┐    ┌──────────────────┐        │
│  │     kubelet      │    │    kube-proxy    │        │
│  └──────────────────┘    └──────────────────┘        │
│                                                      │
│  ┌──────────────────────────────────────────┐        │
│  │  Container Runtime (containerd)          │        │
│  │  ┌─────┐  ┌─────┐  ┌─────┐               │        │
│  │  │ Pod │  │ Pod │  │ Pod │               │        │
│  │  └─────┘  └─────┘  └─────┘               │        │
│  └──────────────────────────────────────────┘        │
│                                                      │
└──────────────────────────────────────────────────────┘
kubelet
The node agent:
# kubelet responsibilities:
# - Registers node with cluster
# - Watches API for pod assignments
# - Starts/stops containers via CRI
# - Reports node and pod status
# - Executes liveness/readiness probes
# Check kubelet logs
journalctl -u kubelet -f
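As a rough illustration of the liveness probe responsibility, the sketch below models the consecutive-failure counting behind a probe's `failureThreshold` field. The `should_restart` function and the "ok"/"fail" result strings are assumptions for this example; the real kubelet also applies periods, timeouts, and the pod's restart policy.

```shell
#!/bin/sh
# should_restart THRESHOLD RESULT...: succeed (exit 0) once THRESHOLD
# consecutive probe results are "fail"; any other result resets the counter.
should_restart() {
  threshold=$1
  shift
  failures=0
  for result in "$@"; do
    if [ "$result" = "fail" ]; then
      failures=$((failures + 1))
      if [ "$failures" -ge "$threshold" ]; then
        return 0
      fi
    else
      failures=0
    fi
  done
  return 1
}

should_restart 3 ok fail fail fail && echo "restart container"
should_restart 3 fail fail ok fail || echo "keep running"
```

The second call never reaches three failures in a row because the intervening success resets the count.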
kube-proxy
Network proxy on each node:
| Mode | How It Works | Use Case |
|---|---|---|
| iptables | Creates iptables rules | Default, most clusters |
| IPVS | Uses kernel IPVS | High-scale clusters |
| userspace | Legacy userspace proxy | Deprecated; removed in Kubernetes 1.26 |
# Check kube-proxy mode (kubeadm-style clusters; an empty value means the platform default)
kubectl get configmap kube-proxy -n kube-system -o yaml | grep mode
API Request Flow
kubectl apply → API Server → Authentication → Authorization
↓
Admission Controllers (mutating → validating)
↓
etcd
↓
Controller notices change
↓
Scheduler assigns to node
↓
kubelet on node starts pod
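The ordering of the first stages matters: a request rejected early never reaches etcd. The following sketch simulates that short-circuit behavior. Every rule in it (only `admin` may create, replicas default to 1, at most 10 replicas) is a made-up policy for illustration, and `handle_request` is a hypothetical function, not an API server interface.

```shell
#!/bin/sh
# handle_request USER VERB REPLICAS: walk a request through the stages in order
# and print either the first rejection or the object that would be persisted.
handle_request() {
  user=$1
  verb=$2
  replicas=$3
  # 1. Authentication: who is calling?
  if [ -z "$user" ]; then echo "401 Unauthorized"; return 1; fi
  # 2. Authorization: may this user perform this verb? (hypothetical rule)
  if [ "$verb" = "create" ] && [ "$user" != "admin" ]; then
    echo "403 Forbidden"; return 1
  fi
  # 3. Mutating admission: fill in defaults (hypothetical default of 1)
  : "${replicas:=1}"
  # 4. Validating admission: check the final object (hypothetical limit of 10)
  if [ "$replicas" -gt 10 ]; then echo "rejected by admission"; return 1; fi
  # 5. Persist: only now would the object be written to etcd
  echo "stored in etcd: replicas=$replicas"
}

handle_request admin create ""        # prints: stored in etcd: replicas=1
handle_request dev create 3 || true   # prints: 403 Forbidden
```

Mutation runs before validation so that validating checks see the object in its final, defaulted form.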
Interview Questions
Q: "Walk me through what happens when you run kubectl create deployment nginx --image=nginx"
- kubectl sends POST to /apis/apps/v1/namespaces/default/deployments
- API server authenticates (kubeconfig) and authorizes (RBAC)
- Admission controllers process (mutate, validate)
- Deployment object stored in etcd
- Deployment controller sees new deployment, creates ReplicaSet
- ReplicaSet controller sees new RS, creates Pod objects
- Scheduler sees unscheduled pods, assigns to nodes
- kubelet on target node sees assigned pod
- kubelet tells containerd to pull image and start container
- kubelet reports pod status back to API server
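The URL in the first step follows a regular pattern: resources in a named API group such as `apps` live under `/apis/<group>/<version>`, while core-group resources such as Pods live under `/api/<version>`. A small sketch of that rule, using a hypothetical `api_path` helper:

```shell
#!/bin/sh
# api_path GROUP VERSION NAMESPACE RESOURCE
# An empty GROUP means the core group, which is served under /api instead of /apis.
api_path() {
  if [ -z "$1" ]; then
    printf '/api/%s/namespaces/%s/%s\n' "$2" "$3" "$4"
  else
    printf '/apis/%s/%s/namespaces/%s/%s\n' "$1" "$2" "$3" "$4"
  fi
}

api_path apps v1 default deployments   # /apis/apps/v1/namespaces/default/deployments
api_path "" v1 default pods            # /api/v1/namespaces/default/pods
```

Being able to derive these paths is handy when reading API server audit logs or RBAC rules, which refer to groups and resources rather than kubectl commands.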
Q: "How does Kubernetes achieve high availability?"
| Component | HA Strategy |
|---|---|
| etcd | Odd number cluster (3, 5, 7), Raft consensus |
| API server | Multiple replicas behind load balancer |
| Controllers | Leader election (only one active) |
| Scheduler | Leader election (only one active) |
| Nodes | Multiple, workloads spread across |
Q: "What's the difference between a Deployment and a StatefulSet?"
| Feature | Deployment | StatefulSet |
|---|---|---|
| Pod names | Random suffix | Ordered (pod-0, pod-1) |
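| Scaling | Parallel | Sequential by default (OrderedReady) |
| Storage | Shared or none | Per-pod PVC (volumeClaimTemplates) |
| Network identity | Ephemeral (changes on reschedule) | Stable DNS names (with a headless Service) |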
| Use case | Stateless apps | Databases, Kafka |
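To make the "stable DNS names" row concrete, the sketch below prints the per-pod DNS names a StatefulSet would get behind a headless Service. The `statefulset_dns` helper and the example names (`web`, `nginx`) are hypothetical, and the sketch assumes the default cluster domain `cluster.local`.

```shell
#!/bin/sh
# statefulset_dns NAME SERVICE NAMESPACE REPLICAS
# Prints one DNS name per pod: <name>-<ordinal>.<service>.<namespace>.svc.cluster.local
statefulset_dns() {
  i=0
  while [ "$i" -lt "$4" ]; do
    printf '%s-%d.%s.%s.svc.cluster.local\n' "$1" "$i" "$2" "$3"
    i=$((i + 1))
  done
}

statefulset_dns web nginx default 3
# web-0.nginx.default.svc.cluster.local
# web-1.nginx.default.svc.cluster.local
# web-2.nginx.default.svc.cluster.local
```

Because the names depend only on the ordinal, a replacement for `web-1` comes back as `web-1`, which is what lets databases and brokers keep a fixed peer list.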
Next, we'll cover workloads and networking in Kubernetes.