Progressive Delivery

Progressive delivery extends continuous delivery with controlled rollout strategies. Using Argo Rollouts, platform teams can provide safer deployment patterns that automatically detect and roll back problematic releases.

Why Progressive Delivery?

Traditional deployments are all-or-nothing:

Traditional Deployment:
┌─────────────────────────────────────────────────────────┐
│                                                         │
│   v1 (100%) ───────────────────► v2 (100%)             │
│                                                         │
│   Problem: If v2 has issues, 100% of users affected    │
│                                                         │
└─────────────────────────────────────────────────────────┘

Progressive Delivery:
┌─────────────────────────────────────────────────────────┐
│                                                         │
│   v1 (100%) → v1 (90%) → v1 (50%) → v1 (0%)            │
│              → v2 (10%) → v2 (50%) → v2 (100%)         │
│                    │                                    │
│              Check metrics, auto-rollback if needed     │
│                                                         │
└─────────────────────────────────────────────────────────┘

Installing Argo Rollouts

# Install Argo Rollouts controller
kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml

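# Install kubectl plugin (Homebrew)
brew install argoproj/tap/kubectl-argo-rollouts
# Or, on Linux without Homebrew, download the binary and put it on your PATH:
# curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
# chmod +x kubectl-argo-rollouts-linux-amd64
# sudo mv kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts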

# Verify installation
kubectl argo rollouts version

# Open dashboard
kubectl argo rollouts dashboard
# Visit http://localhost:3100

Canary Deployment

Gradually shift traffic to the new version:

# rollout-canary.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: user-service
  namespace: production
spec:
  replicas: 10
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
        - name: user-service
          image: user-service:v2.0.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"

  strategy:
    canary:
      # Services that select the canary and stable pods (both must already exist)
      canaryService: user-service-canary
      stableService: user-service-stable

      # Weighted traffic splitting through the NGINX ingress controller;
      # stableIngress must route to the stable Service
      trafficRouting:
        nginx:
          stableIngress: user-service-ingress

      steps:
        # Step 1: 10% traffic to canary
        - setWeight: 10
        - pause: { duration: 5m }

        # Step 2: Run analysis; a failed run aborts the rollout
        - analysis:
            templates:
              - templateName: success-rate
            args:
              - name: service-name
                value: user-service-canary

        # Step 3: 30% traffic
        - setWeight: 30
        - pause: { duration: 5m }

        # Step 4: 50% traffic
        - setWeight: 50
        - pause: { duration: 10m }

        # Step 5: Full rollout
        - setWeight: 100

Blue-Green Deployment

Switch all traffic at once after verification:

# rollout-bluegreen.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: order-service
  namespace: production
spec:
  replicas: 5
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
        - name: order-service
          image: order-service:v3.0.0
          ports:
            - containerPort: 8080

  strategy:
    blueGreen:
      # Service pointing to current production
      activeService: order-service-active

      # Service pointing to new version (for testing)
      previewService: order-service-preview

      # Auto-promote after successful analysis
      autoPromotionEnabled: true

      # Wait for analysis before promotion
      prePromotionAnalysis:
        templates:
          - templateName: smoke-tests
        args:
          - name: service-url
            value: http://order-service-preview.production.svc

      # Keep the old version running for 5 minutes after promotion (fast rollback), then scale it down
      scaleDownDelaySeconds: 300
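
The `smoke-tests` template referenced above is not defined in this lesson. One way it could be written is with the Job metric provider, which runs a Kubernetes Job and treats a successful Job as a passing measurement. The sketch below is an assumption rather than the course's actual template: the `curlimages/curl` image tag and the `/healthz` path are hypothetical and should be replaced with whatever the service really exposes.

```yaml
# analysis-template-smoke-tests.yaml (illustrative sketch)
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: smoke-tests
  namespace: production
spec:
  args:
    - name: service-url
  metrics:
    - name: smoke-test
      provider:
        job:
          spec:
            backoffLimit: 0            # do not retry the pod; one failure fails the Job
            template:
              spec:
                restartPolicy: Never
                containers:
                  - name: smoke
                    image: curlimages/curl:8.8.0   # assumed image and tag
                    command: ["sh", "-c"]
                    args:
                      # -f makes curl exit non-zero on HTTP error responses;
                      # /healthz is a hypothetical endpoint
                      - 'curl -fsS "{{args.service-url}}/healthz"'
```

Because the preview Service only routes to the new version, a check like this exercises v3.0.0 before any production traffic reaches it.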

Analysis Templates

Define metrics for automatic rollout decisions:

# analysis-template-success-rate.yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
  namespace: production
spec:
  args:
    - name: service-name
  metrics:
    - name: success-rate
      # Prometheus query for success rate
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            sum(rate(http_requests_total{
              service="{{args.service-name}}",
              status=~"2.."
            }[5m])) /
            sum(rate(http_requests_total{
              service="{{args.service-name}}"
            }[5m])) * 100
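      # Success criteria: each measurement must be at least 99%
      successCondition: result[0] >= 99
      # Take 5 measurements 60s apart; the metric fails once more than 3 of them fail
      failureLimit: 3
      interval: 60s
      count: 5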

    - name: error-rate
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            sum(rate(http_requests_total{
              service="{{args.service-name}}",
              status=~"5.."
            }[5m])) /
            sum(rate(http_requests_total{
              service="{{args.service-name}}"
            }[5m])) * 100
      successCondition: result[0] <= 1
      failureLimit: 3
      interval: 60s
      count: 5

    - name: latency-p99
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            histogram_quantile(0.99,
              sum(rate(http_request_duration_seconds_bucket{
                service="{{args.service-name}}"
              }[5m])) by (le)
            )
      # p99 latency must stay at or below 0.5 seconds
      successCondition: result[0] <= 0.5
      failureLimit: 3
      interval: 60s
      count: 5
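
One caveat with ratio queries like the ones above: when the canary has received no traffic yet, Prometheus can return an empty vector or `NaN`, and `result[0]` then yields an error or a failed measurement rather than a meaningful verdict. The Argo Rollouts condition expressions provide `len()` and `isNaN()` for this case. The fragment below is a sketch of a more defensive `success-rate` metric. Treating "no data" as a pass is an assumption that may suit low-traffic services and be the wrong policy elsewhere.

```yaml
# Fragment of spec.metrics in the success-rate AnalysisTemplate (illustrative)
- name: success-rate
  interval: 60s
  count: 5
  failureLimit: 3
  # Pass when there is no data yet; otherwise require at least 99% success
  successCondition: "len(result) == 0 || isNaN(result[0]) || result[0] >= 99"
  provider:
    prometheus:
      address: http://prometheus.monitoring:9090
      query: |
        sum(rate(http_requests_total{
          service="{{args.service-name}}",
          status=~"2.."
        }[5m])) /
        sum(rate(http_requests_total{
          service="{{args.service-name}}"
        }[5m])) * 100
```

If silence from the canary should block the rollout instead, the stricter `result[0] >= 99` condition can stay as it is.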

Rollout Management

# Watch rollout status
kubectl argo rollouts get rollout user-service -w

# Promote canary to next step
kubectl argo rollouts promote user-service

# Skip the current step (for example, a pause or analysis step)
kubectl argo rollouts promote user-service --skip-current-step

# Abort rollout (shifts traffic back to the stable version)
kubectl argo rollouts abort user-service

# Retry failed rollout
kubectl argo rollouts retry rollout user-service

# View revisions (the plugin has no `history` subcommand; `get rollout` lists each revision)
kubectl argo rollouts get rollout user-service

# Rollback to previous revision
kubectl argo rollouts undo user-service

Integration with ArgoCD

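ArgoCD syncs Rollout manifests from Git like any other resource and understands their health status. With selfHeal enabled, roll back by reverting the commit in Git, because ArgoCD overwrites manual spec changes such as `kubectl argo rollouts undo` on the next sync: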

# application-with-rollout.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: user-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/acme/gitops-repo.git
    path: apps/user-service
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

# ArgoCD detects Rollout resources and shows
# progressive delivery status in the UI

Rollout Strategies Comparison

# Strategy Selection Guide
strategies:

  canary:
    when_to_use:
      - "Gradual traffic shifting needed"
      - "Want to test with subset of users"
      - "Complex analysis requirements"
    traffic_pattern: "Incremental (10% → 30% → 50% → 100%)"
    rollback_speed: "Immediate (shift traffic back)"

  bluegreen:
    when_to_use:
      - "Need full environment testing before switch"
      - "Database schema changes"
      - "Simpler traffic management"
    traffic_pattern: "All-or-nothing switch"
    rollback_speed: "Immediate (switch to blue)"

  recreate:
    when_to_use:
      - "Development environments"
      - "Non-critical workloads"
      - "Breaking changes that can't run side-by-side"
    traffic_pattern: "Down, then up"
    rollback_speed: "Full redeploy"
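
Note that `canary` and `blueGreen` are the two strategies a Rollout resource supports; `recreate` in the guide above refers to the standard Kubernetes Deployment strategy. For completeness, here is a minimal sketch of that option. The `dev` namespace and the `report-worker` name and image are hypothetical.

```yaml
# deployment-recreate.yaml (illustrative; plain Deployment, not a Rollout)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: report-worker        # hypothetical workload
  namespace: dev             # hypothetical namespace
spec:
  replicas: 2
  strategy:
    type: Recreate           # terminate all old pods before creating new ones
  selector:
    matchLabels:
      app: report-worker
  template:
    metadata:
      labels:
        app: report-worker
    spec:
      containers:
        - name: report-worker
          image: report-worker:v1.0.0   # hypothetical image
```

The gap between the old pods stopping and the new ones becoming ready is the downtime implied by the "Down, then up" traffic pattern.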

In the next lesson, we'll integrate GitOps with Crossplane and Backstage for a complete IDP workflow.