Kubernetes & Container Orchestration

Kubernetes Workloads and Networking


Master K8s workloads and networking concepts for your DevOps/SRE interviews.

Pod Lifecycle

Pending → Running → Succeeded/Failed
     (CrashLoopBackOff: a container restart loop surfaced in the pod's status, not a separate phase)

Pod Phases

| Phase | Meaning |
|---|---|
| Pending | Accepted by the cluster but not yet running: waiting for scheduling or image pull |
| Running | Bound to a node, with at least one container running or restarting |
| Succeeded | All containers exited successfully (exit code 0) |
| Failed | All containers terminated and at least one exited with a non-zero code |
| Unknown | Node communication lost |
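
A quick way to confirm what phase a pod is in (the pod name below is a placeholder):

# High-level status for all pods (the STATUS column also shows reasons like CrashLoopBackOff)
kubectl get pods

# Print only the phase of one pod
kubectl get pod my-pod -o jsonpath='{.status.phase}'

# Watch status transitions live
kubectl get pods --watch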

Probes

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: myapp
    # Liveness: restart if fails
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 10
      failureThreshold: 3

    # Readiness: remove from service if fails
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5

    # Startup: delay other probes during startup
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 30
      periodSeconds: 10

| Probe | Purpose | On Failure |
|---|---|---|
| Liveness | Is the container alive? | Restart the container |
| Readiness | Can the pod serve traffic? | Remove the pod from Service endpoints |
| Startup | Has the app finished starting? | Keep other probes disabled; restart the container once failureThreshold is exhausted |
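
httpGet is not the only probe handler: a probe can also exec a command inside the container or simply open a TCP connection. A minimal sketch, where the command and port are placeholders for whatever your app exposes:

livenessProbe:
  exec:                    # Healthy as long as the command exits 0
    command: ["cat", "/tmp/healthy"]
  periodSeconds: 10

readinessProbe:
  tcpSocket:               # Ready once the port accepts connections
    port: 5432
  initialDelaySeconds: 5
  periodSeconds: 10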

Resource Management

resources:
  requests:    # What the scheduler reserves when placing the pod
    memory: "256Mi"
    cpu: "250m"    # 0.25 CPU cores
  limits:      # Hard ceiling: CPU above this is throttled, memory above this gets the container OOM-killed
    memory: "512Mi"
    cpu: "500m"

QoS Classes

| Class | Condition | Priority |
|---|---|---|
| Guaranteed | requests = limits for every container and resource | Highest (evicted last) |
| Burstable | Requests set but not equal to limits (e.g. requests < limits) | Medium |
| BestEffort | No requests or limits set | Lowest (evicted first) |
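
The QoS class is derived automatically from the requests and limits you set (to get Guaranteed, make them equal for every container) and is recorded in the pod's status. To check it, with a placeholder pod name:

kubectl get pod my-pod -o jsonpath='{.status.qosClass}'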

Kubernetes Networking

Service Types

# ClusterIP (default) - Internal only
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: ClusterIP
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080

| Type | Access | Use Case |
|---|---|---|
| ClusterIP | Internal only | Inter-service communication |
| NodePort | <NodeIP>:<NodePort> | Development, direct node access |
| LoadBalancer | External load balancer | Production exposure on cloud providers |
| ExternalName | DNS CNAME | Aliasing an external service |
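
For contrast with the ClusterIP example above, a sketch of exposing the same pods via NodePort and LoadBalancer (the Service names and nodePort value are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080      # Optional; must fall in the node port range (default 30000-32767)
---
apiVersion: v1
kind: Service
metadata:
  name: web-lb
spec:
  type: LoadBalancer     # Cloud provider provisions the external load balancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080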

Ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls-secret
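
The tls block above references a Secret named app-tls-secret. One way to create it from an existing certificate and key (file paths are placeholders):

kubectl create secret tls app-tls-secret \
  --cert=path/to/tls.crt \
  --key=path/to/tls.key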

Network Policies

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-network-policy
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: web
    ports:
    - port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - port: 5432
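
A common companion to a policy like this is the default-deny pattern: an empty podSelector matches every pod in the namespace, and declaring Ingress with no rules blocks all inbound traffic unless another policy allows it. A minimal sketch (the policy name is illustrative):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}      # Empty selector = all pods in the namespace
  policyTypes:
  - Ingress            # No ingress rules listed, so all inbound traffic is denied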

Scheduling and Affinity

Node Affinity

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-type
            operator: In
            values:
            - high-memory
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - us-east-1a
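
The node-type and zone keys above are ordinary node labels: cloud providers set the well-known topology.kubernetes.io/zone label automatically, while custom labels like node-type have to be applied by you (the node name below is illustrative):

# Label a node so the required affinity above can match it
kubectl label nodes node1 node-type=high-memory

# Verify which nodes carry the label
kubectl get nodes -l node-type=high-memory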

Pod Anti-Affinity

spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        topologyKey: kubernetes.io/hostname

Use case: spread replicas of the same app across nodes for high availability. A hard rule that can't be satisfied (e.g. more replicas than nodes) leaves pods Pending; the softer preferred form sketched below avoids that.
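
A minimal sketch of that preferred (soft) variant, which the scheduler tries to honour but will violate rather than leave pods unschedulable:

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname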

Taints and Tolerations

# Taint a node
kubectl taint nodes node1 dedicated=gpu:NoSchedule

# Pod must tolerate to be scheduled
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"

| Effect | Behavior |
|---|---|
| NoSchedule | New pods without a matching toleration are not scheduled |
| PreferNoSchedule | Scheduler tries to avoid the node but may still place pods there |
| NoExecute | New pods are not scheduled and existing pods without a toleration are evicted |
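
Two related commands worth knowing (the node name matches the example above):

# Remove the taint again - note the trailing "-"
kubectl taint nodes node1 dedicated=gpu:NoSchedule-

# List taints per node
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints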

Interview Questions

Q: "A deployment has 3 replicas but only 2 pods are running. How do you troubleshoot?"

# Check deployment status
kubectl get deployment myapp
kubectl describe deployment myapp

# Check replicaset
kubectl get rs -l app=myapp
kubectl describe rs <rs-name>

# Check pods
kubectl get pods -l app=myapp
kubectl describe pod <pending-pod>

# Check events
kubectl get events --sort-by='.lastTimestamp'

# Common issues:
# - Insufficient resources (check node capacity)
# - Image pull errors (check image name/registry)
# - Node affinity can't be satisfied
# - PVC can't be bound
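
If the missing pod is stuck in Pending, the scheduler usually explains why in a FailedScheduling event; filtering events and checking node capacity narrows it down quickly:

# Only show scheduling failures
kubectl get events --field-selector reason=FailedScheduling

# See how much of each node is already requested
kubectl describe nodes | grep -A 8 "Allocated resources"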

Q: "How do you ensure pods are evenly distributed across availability zones?"

spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: web
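
To sanity-check the result, list where the replicas actually landed and which zone each node belongs to (zones come from the node label used as the topologyKey):

# Node each replica landed on
kubectl get pods -l app=web -o wide

# Zone label per node
kubectl get nodes -L topology.kubernetes.io/zone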

Next, we'll master Kubernetes troubleshooting, the skill that defines senior engineers.
