GitOps & Continuous Delivery
Progressive Delivery
Progressive delivery extends continuous delivery with controlled rollout strategies. With Argo Rollouts, platform teams can offer safer deployment patterns that detect problematic releases and roll them back automatically.
Why Progressive Delivery?
Traditional deployments are all-or-nothing:
Traditional Deployment:
┌─────────────────────────────────────────────────────────┐
│                                                         │
│  v1 (100%) ───────────────────► v2 (100%)               │
│                                                         │
│  Problem: If v2 has issues, 100% of users affected      │
│                                                         │
└─────────────────────────────────────────────────────────┘
Progressive Delivery:
┌─────────────────────────────────────────────────────────┐
│                                                         │
│  v1 (100%) → v1 (90%) → v1 (50%) → v1 (0%)              │
│            → v2 (10%) → v2 (50%) → v2 (100%)            │
│                            │                            │
│         Check metrics, auto-rollback if needed          │
│                                                         │
└─────────────────────────────────────────────────────────┘
Installing Argo Rollouts
# Install Argo Rollouts controller
kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml
# Install kubectl plugin
brew install argoproj/tap/kubectl-argo-rollouts
# Or: curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
# Verify installation
kubectl argo rollouts version
# Open dashboard
kubectl argo rollouts dashboard
# Visit http://localhost:3100
Canary Deployment
Gradually shift traffic to the new version:
# rollout-canary.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: user-service
  namespace: production
spec:
  replicas: 10
  selector:
    matchLabels:
      app: user-service
  template:
    metadata:
      labels:
        app: user-service
    spec:
      containers:
        - name: user-service
          image: user-service:v2.0.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
  strategy:
    canary:
      # Traffic routing via service mesh or ingress
      canaryService: user-service-canary
      stableService: user-service-stable
      # Traffic splitting (requires traffic management)
      trafficRouting:
        nginx:
          stableIngress: user-service-ingress
      steps:
        # Step 1: 10% traffic to canary
        - setWeight: 10
        - pause: { duration: 5m }
        # Step 2: Run analysis
        - analysis:
            templates:
              - templateName: success-rate
            args:
              - name: service-name
                value: user-service-canary
        # Step 3: 30% traffic
        - setWeight: 30
        - pause: { duration: 5m }
        # Step 4: 50% traffic
        - setWeight: 50
        - pause: { duration: 10m }
        # Step 5: Full rollout
        - setWeight: 100
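The Rollout above refers to two Services and an Ingress that must already exist. A minimal sketch of those objects might look like the following. The file name, the service port (80), the ingress class, and the host `user-service.example.com` are assumptions for illustration; only the object names, namespace, label, and container port come from the Rollout.

```yaml
# services-and-ingress.yaml (hypothetical file name)
apiVersion: v1
kind: Service
metadata:
  name: user-service-stable
  namespace: production
spec:
  selector:
    app: user-service      # Argo Rollouts adds a pod-template-hash selector at runtime
  ports:
    - port: 80             # assumed service port
      targetPort: 8080     # containerPort from the Rollout
---
apiVersion: v1
kind: Service
metadata:
  name: user-service-canary
  namespace: production
spec:
  selector:
    app: user-service      # same base selector; the controller narrows it to canary pods
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: user-service-ingress
  namespace: production
spec:
  ingressClassName: nginx  # assumes the NGINX ingress controller is installed
  rules:
    - host: user-service.example.com   # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: user-service-stable   # must point at the stable Service
                port:
                  number: 80
```

Both Services start with the same selector. During a rollout the controller rewrites each selector so that the stable Service targets the old ReplicaSet and the canary Service targets the new one.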
Blue-Green Deployment
Switch all traffic at once after verification:
# rollout-bluegreen.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: order-service
  namespace: production
spec:
  replicas: 5
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
        - name: order-service
          image: order-service:v3.0.0
          ports:
            - containerPort: 8080
  strategy:
    blueGreen:
      # Service pointing to current production
      activeService: order-service-active
      # Service pointing to new version (for testing)
      previewService: order-service-preview
      # Auto-promote after successful analysis
      autoPromotionEnabled: true
      # Wait for analysis before promotion
      prePromotionAnalysis:
        templates:
          - templateName: smoke-tests
        args:
          - name: service-url
            value: http://order-service-preview.production.svc
      # Scale down old version after promotion
      scaleDownDelaySeconds: 300
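The blue-green Rollout references a `smoke-tests` template that this lesson does not define. One possible shape, sketched here with the Job metric provider, is a single HTTP check against the preview Service. The container image, its tag, and the `/healthz` path are assumptions; substitute whatever health endpoint and test image your service actually uses.

```yaml
# analysis-template-smoke-tests.yaml (hypothetical file name)
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: smoke-tests
  namespace: production
spec:
  args:
    - name: service-url
  metrics:
    - name: smoke-test
      provider:
        job:
          spec:
            backoffLimit: 0              # fail the metric on the first failed pod
            template:
              spec:
                restartPolicy: Never
                containers:
                  - name: smoke-test
                    image: curlimages/curl:8.8.0   # assumed image and tag
                    command:
                      - curl
                      - --fail                     # non-zero exit on HTTP 4xx/5xx
                      - --silent
                      - --show-error
                      - "{{args.service-url}}/healthz"   # assumed health endpoint
```

With the Job provider, the measurement succeeds when the Job completes successfully, so no `successCondition` is needed in this sketch.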
Analysis Templates
Define metrics for automatic rollout decisions:
# analysis-template-success-rate.yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
  namespace: production
spec:
  args:
    - name: service-name
  metrics:
    - name: success-rate
      # Prometheus query for success rate
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            sum(rate(http_requests_total{
              service="{{args.service-name}}",
              status=~"2.."
            }[5m])) /
            sum(rate(http_requests_total{
              service="{{args.service-name}}"
            }[5m])) * 100
      # Success criteria
      successCondition: result[0] >= 99
      failureLimit: 3
      interval: 60s
      count: 5
    - name: error-rate
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            sum(rate(http_requests_total{
              service="{{args.service-name}}",
              status=~"5.."
            }[5m])) /
            sum(rate(http_requests_total{
              service="{{args.service-name}}"
            }[5m])) * 100
      successCondition: result[0] <= 1
      failureLimit: 3
      interval: 60s
      count: 5
    - name: latency-p99
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            histogram_quantile(0.99,
              sum(rate(http_request_duration_seconds_bucket{
                service="{{args.service-name}}"
              }[5m])) by (le)
            )
      successCondition: result[0] <= 0.5
      failureLimit: 3
      interval: 60s
      count: 5
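The canary example earlier runs this template as a blocking step. The same template can also run as background analysis alongside the steps, so a failing metric aborts the rollout at any point. The fragment below is a sketch of that variant for the `user-service` Rollout; the choice of `startingStep: 2` is an assumption, picked so that analysis begins only after the first weight and pause steps.

```yaml
# Fragment of spec.strategy in rollout-canary.yaml (background analysis variant)
strategy:
  canary:
    canaryService: user-service-canary
    stableService: user-service-stable
    analysis:
      templates:
        - templateName: success-rate
      startingStep: 2          # assumed: start after the first setWeight and pause (0-indexed steps)
      args:
        - name: service-name
          value: user-service-canary
    steps:
      - setWeight: 10
      - pause: { duration: 5m }
      - setWeight: 30
      - pause: { duration: 5m }
```

Note that the template's metrics set `count: 5`, so each analysis run ends after five measurements. For analysis that should last the whole rollout, one option is to drop `count` in a dedicated background template.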
Rollout Management
# Watch rollout status
kubectl argo rollouts get rollout user-service -n production -w
# Promote canary to next step
kubectl argo rollouts promote user-service -n production
# Skip the current step (for example, a pause or analysis step)
kubectl argo rollouts promote user-service -n production --skip-current-step
# Abort rollout (shifts traffic back to the stable version)
kubectl argo rollouts abort user-service -n production
# Retry an aborted rollout
kubectl argo rollouts retry rollout user-service -n production
# View revisions (the plugin has no history subcommand; get rollout lists them)
kubectl argo rollouts get rollout user-service -n production
# Roll back to the previous revision
kubectl argo rollouts undo user-service -n production
Integration with ArgoCD
ArgoCD syncs Rollout resources from Git like any other manifest and reports their health:
# application-with-rollout.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: user-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/acme/gitops-repo.git
    path: apps/user-service
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
# ArgoCD detects Rollout resources and shows
# progressive delivery status in the UI
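The Application above points at `apps/user-service` in the Git repository. One way to lay out that directory, assuming Kustomize is used, is a `kustomization.yaml` that lists the Rollout and its supporting manifests. The Rollout and AnalysisTemplate file names come from the examples in this lesson; `services-and-ingress.yaml` is a hypothetical file holding the Services and Ingress that the Rollout references.

```yaml
# apps/user-service/kustomization.yaml (assumed layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - rollout-canary.yaml
  - analysis-template-success-rate.yaml
  - services-and-ingress.yaml   # hypothetical: stable/canary Services and the Ingress
```

With this layout, changing the image tag in `rollout-canary.yaml` and merging to `main` is what starts a new canary: ArgoCD applies the updated Rollout, and the Argo Rollouts controller walks through the steps.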
Rollout Strategies Comparison
# Strategy Selection Guide
strategies:
  canary:
    when_to_use:
      - "Gradual traffic shifting needed"
      - "Want to test with subset of users"
      - "Complex analysis requirements"
    traffic_pattern: "Incremental (10% → 30% → 50% → 100%)"
    rollback_speed: "Immediate (shift traffic back)"
  bluegreen:
    when_to_use:
      - "Need full environment testing before switch"
      - "Database schema changes"
      - "Simpler traffic management"
    traffic_pattern: "All-or-nothing switch"
    rollback_speed: "Immediate (switch to blue)"
  # Note: Recreate is a Kubernetes Deployment strategy;
  # Argo Rollouts itself offers only canary and blueGreen
  recreate:
    when_to_use:
      - "Development environments"
      - "Non-critical workloads"
      - "Breaking changes that can't run side-by-side"
    traffic_pattern: "Down, then up"
    rollback_speed: "Full redeploy"
In the next lesson, we'll integrate GitOps with Crossplane and Backstage for a complete IDP workflow.