Kubernetes & Container Orchestration
Kubernetes Workloads and Networking
Master K8s workloads and networking concepts for your DevOps/SRE interviews.
Pod Lifecycle
```
Pending ──→ Running ──→ Succeeded / Failed
               │
        (CrashLoopBackOff)
```

Note that CrashLoopBackOff is not a pod phase: it is a container state reason, shown while a pod (still in the Running phase) restarts a crashing container with growing backoff delays.
Pod Phases
| Phase | Meaning |
|---|---|
| Pending | Accepted but not yet scheduled, or images still pulling |
| Running | Bound to a node; at least one container running or starting |
| Succeeded | All containers terminated successfully (exit 0) |
| Failed | All containers terminated, at least one with a non-zero exit |
| Unknown | Node communication lost |
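A quick way to check the phase directly (the pod name and the `app=myapp` label here are placeholders):

```bash
# Print only the phase of a pod
kubectl get pod <pod-name> -o jsonpath='{.status.phase}'

# Watch phase transitions live
kubectl get pods -l app=myapp -w
```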
Probes
```yaml
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: app
      image: myapp
      # Liveness: restart the container if it fails
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 10
        failureThreshold: 3
      # Readiness: remove the pod from Service endpoints if it fails
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
      # Startup: hold off the other probes while the app boots
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30   # up to 30 × 10s = 300s to start
        periodSeconds: 10
```
| Probe | Purpose | On Failure |
|---|---|---|
| Liveness | Is the container alive? | Restart container |
| Readiness | Can it serve traffic? | Remove pod from Service endpoints |
| Startup | Has the app finished starting? | Hold other probes; restart after failureThreshold |
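Probe failures are visible as events and restart counts; a couple of ways to spot them (pod name and label are placeholders):

```bash
# Failed probes show up as "Unhealthy" events on the pod
kubectl get events --field-selector involvedObject.name=<pod-name>

# A climbing RESTARTS count usually means liveness failures
kubectl get pods -l app=myapp
```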
Resource Management
```yaml
resources:
  requests:   # scheduling guarantee: what the scheduler reserves on the node
    memory: "256Mi"
    cpu: "250m"       # 0.25 CPU cores
  limits:     # hard ceiling
    memory: "512Mi"   # exceeding this gets the container OOMKilled
    cpu: "500m"       # exceeding this throttles; it does not kill
```
QoS Classes
| Class | Condition | Priority |
|---|---|---|
| Guaranteed | requests = limits for every container and resource | Highest (evicted last) |
| Burstable | At least one request or limit set, but not Guaranteed | Medium |
| BestEffort | No requests or limits anywhere | Lowest (evicted first) |
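Kubernetes records the computed class on the pod status, so you can verify what you ended up with:

```bash
# Show the QoS class assigned to a pod
kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'
```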
Kubernetes Networking
Service Types
```yaml
# ClusterIP (default) - internal only
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: ClusterIP
  selector:
    app: web          # routes to pods labeled app=web
  ports:
    - port: 80          # Service port
      targetPort: 8080  # container port
```
| Type | Access | Use Case |
|---|---|---|
| ClusterIP | Internal only | Inter-service communication |
| NodePort | `<NodeIP>:<NodePort>` | Development, direct access |
| LoadBalancer | External load balancer | Production cloud exposure |
| ExternalName | DNS CNAME | External service aliasing |
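One way to exercise a ClusterIP service from inside the cluster is a throwaway debug pod; a sketch using the my-service example above (the busybox image choice is an assumption):

```bash
# Temporary pod, deleted on exit
kubectl run -it --rm debug --image=busybox --restart=Never -- sh
# Inside the pod:
nslookup my-service           # resolves to the ClusterIP via cluster DNS
wget -qO- http://my-service   # port 80 → targetPort 8080 on a backing pod
```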
Ingress
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service
                port:
                  number: 80
  tls:
    - hosts:
        - app.example.com
      secretName: app-tls-secret
```
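The secretName above must point at an existing TLS secret in the same namespace; one way to create it (the cert/key file paths are placeholders):

```bash
# Create the TLS secret referenced by the Ingress
kubectl create secret tls app-tls-secret \
  --cert=tls.crt --key=tls.key
```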
Network Policies
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-network-policy
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web
      ports:
        - port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: database
      ports:
        - port: 5432
```
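Note that policies are additive: a pod is wide open until some NetworkPolicy selects it. A common companion is a namespace-wide default-deny, sketched here:

```yaml
# Deny all ingress to every pod in this namespace;
# explicit policies (like the one above) then re-allow traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}   # empty selector = all pods in the namespace
  policyTypes:
    - Ingress
```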
Scheduling and Affinity
Node Affinity
```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard requirement
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-type
                operator: In
                values:
                  - high-memory
      preferredDuringSchedulingIgnoredDuringExecution:  # soft preference
        - weight: 1
          preference:
            matchExpressions:
              - key: zone
                operator: In
                values:
                  - us-east-1a
```
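For the required rule to ever match, nodes need the corresponding label; for example (the node name is a placeholder):

```bash
# Label a node so the affinity rule can select it
kubectl label nodes node1 node-type=high-memory

# Verify which nodes carry the label
kubectl get nodes -l node-type=high-memory
```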
Pod Anti-Affinity
```yaml
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname
```
Use case: Spread replicas across nodes for HA
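To confirm the spread, `-o wide` shows which node each replica landed on:

```bash
# Each app=web replica should be on a distinct node
kubectl get pods -l app=web -o wide
```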
Taints and Tolerations
```bash
# Taint a node
kubectl taint nodes node1 dedicated=gpu:NoSchedule
```

A pod must carry a matching toleration to be scheduled there:

```yaml
spec:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "gpu"
      effect: "NoSchedule"
```
| Effect | Behavior |
|---|---|
| NoSchedule | Won't schedule new pods |
| PreferNoSchedule | Try to avoid scheduling |
| NoExecute | Evict existing pods too |
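Removing a taint uses the same command with a trailing dash:

```bash
# Remove the taint added above (note the trailing "-")
kubectl taint nodes node1 dedicated=gpu:NoSchedule-
```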
Interview Questions
Q: "A deployment has 3 replicas but only 2 pods are running. How do you troubleshoot?"
```bash
# Check deployment status
kubectl get deployment myapp
kubectl describe deployment myapp

# Check the ReplicaSet
kubectl get rs -l app=myapp
kubectl describe rs <rs-name>

# Check pods
kubectl get pods -l app=myapp
kubectl describe pod <pending-pod>

# Check events, most recent last
kubectl get events --sort-by='.lastTimestamp'

# Common causes:
# - Insufficient resources (check node capacity)
# - Image pull errors (check image name/registry credentials)
# - Node affinity or taints that can't be satisfied
# - PVC can't be bound
```
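If the events point at insufficient resources, it helps to compare what is already requested on each node against its allocatable capacity; one way, as a sketch:

```bash
# Per-node summary of requests vs. allocatable capacity
kubectl describe nodes | grep -A 8 "Allocated resources"

# Allocatable CPU/memory per node, machine-readable
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.cpu}{"\t"}{.status.allocatable.memory}{"\n"}{end}'
```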
Q: "How do you ensure pods are evenly distributed across availability zones?"
```yaml
spec:
  topologySpreadConstraints:
    - maxSkew: 1                                # max pod-count difference between zones
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule          # hard; ScheduleAnyway would make it soft
      labelSelector:
        matchLabels:
          app: web
```
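To verify the result, map pods to nodes and nodes to zones:

```bash
# Which node each replica landed on
kubectl get pods -l app=web -o wide

# Which zone each node belongs to (-L adds the label as a column)
kubectl get nodes -L topology.kubernetes.io/zone
```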
Next, we'll master Kubernetes troubleshooting, the skill that defines senior engineers.