Production Deployment Workflows
Safe Rollout Strategies
Why Gradual Rollouts Matter
Even when AI-generated code passes its test suite, it can behave differently in production:
- Edge cases not covered by tests
- Issues that only appear at production scale
- Integration with real production services and data
- Real user behavior patterns
Gradual rollouts minimize the blast radius of a bad change and make recovery fast.
Canary Deployments
Setting Up Canary Releases
# kubernetes/canary-deployment.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api-rollout
spec:
  replicas: 10
  strategy:
    canary:
      steps:
        - setWeight: 5     # 5% traffic
        - pause: {duration: 5m}
        - setWeight: 25    # 25% traffic
        - pause: {duration: 10m}
        - setWeight: 50    # 50% traffic
        - pause: {duration: 15m}
        - setWeight: 100   # Full rollout
      analysis:
        templates:
          - templateName: success-rate
        startingStep: 1    # Begin background analysis once the first canary weight is set
Automated Canary Analysis
# kubernetes/analysis-template.yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  metrics:
    - name: success-rate
      interval: 1m
      successCondition: result[0] >= 0.99
      failureLimit: 3
      provider:
        prometheus:
          address: http://prometheus:9090
          query: |
            sum(rate(http_requests_total{status=~"2.*"}[5m])) /
            sum(rate(http_requests_total[5m]))
    - name: latency-p99
      interval: 1m
      successCondition: result[0] <= 500
      failureLimit: 3
      provider:
        prometheus:
          address: http://prometheus:9090
          query: |
            histogram_quantile(0.99,
              rate(http_request_duration_seconds_bucket[5m]))
            * 1000
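Before trusting these thresholds in an automated rollout, the same PromQL can be sanity-checked directly against the Prometheus HTTP API. A minimal sketch, assuming the http://prometheus:9090 address used above and Node 18+ for the built-in fetch; the script name is illustrative:
// scripts/check-canary-queries.ts
// Runs the same PromQL used by the AnalysisTemplate and prints whether the
// current value would satisfy each successCondition.
const PROMETHEUS = 'http://prometheus:9090'; // assumed address, same as the template

async function instantQuery(query: string): Promise<number> {
  const url = `${PROMETHEUS}/api/v1/query?query=${encodeURIComponent(query)}`;
  const res = await fetch(url);
  const body = await res.json();
  // Instant queries return a vector; take the first sample's value
  return parseFloat(body.data.result[0]?.value[1] ?? 'NaN');
}

async function main() {
  const successRate = await instantQuery(
    'sum(rate(http_requests_total{status=~"2.*"}[5m])) / sum(rate(http_requests_total[5m]))'
  );
  const p99Ms = await instantQuery(
    'histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])) * 1000'
  );
  console.log(`success-rate: ${successRate} (pass: ${successRate >= 0.99})`);
  console.log(`latency-p99:  ${p99Ms}ms (pass: ${p99Ms <= 500})`);
}

main().catch(console.error);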
Feature Flags
Implementing Feature Flags
// lib/feature-flags.ts
import * as LaunchDarkly from 'launchdarkly-node-server-sdk';

const ldClient = LaunchDarkly.init(process.env.LD_SDK_KEY!);

export async function isFeatureEnabled(
  flagKey: string,
  user: User,
  defaultValue: boolean = false
): Promise<boolean> {
  // Make sure the SDK has finished initializing before evaluating flags
  await ldClient.waitForInitialization();

  const ldUser = {
    key: user.id,
    email: user.email,
    custom: {
      plan: user.plan,
      createdAt: user.createdAt.toISOString()
    }
  };

  return ldClient.variation(flagKey, ldUser, defaultValue);
}

// Usage
async function processOrder(order: Order, user: User) {
  const useNewPaymentFlow = await isFeatureEnabled(
    'new-payment-flow',
    user
  );

  if (useNewPaymentFlow) {
    return newPaymentProcessor.process(order);
  }
  return legacyPaymentProcessor.process(order);
}
Gradual Rollout with Flags
claude "Create a feature flag rollout plan for our new
checkout flow. Include:
1. Internal testing phase
2. Beta users phase
3. Percentage-based rollout
4. Full release
5. Rollback triggers"
## New Checkout Flow Rollout Plan
### Phase 1: Internal Testing (Day 1-3)
- Flag: `new-checkout-flow`
- Target: Internal employees only
- Metrics: Error rate, completion rate
- Success criteria: Error rate < 0.1%
### Phase 2: Beta Users (Day 4-7)
- Target: Users with `beta_tester: true`
- Expected users: ~500
- Metrics: Conversion rate, support tickets
- Success criteria: No regression in conversion
### Phase 3: Percentage Rollout (Day 8-14)
- Day 8: 5% of users
- Day 10: 25% of users
- Day 12: 50% of users
- Day 14: 75% of users
- Monitor: All KPIs vs control group
### Phase 4: Full Release (Day 15)
- 100% of users
- Remove flag after 2 weeks stable
### Rollback Triggers (Automatic)
- Error rate > 1%
- p99 latency > 2000ms
- Conversion rate drops > 5%
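The percentage phases in the plan above map naturally onto deterministic bucketing, so a given user stays in the same cohort as the rollout widens. LaunchDarkly can do this natively with percentage rollouts; the sketch below only illustrates the idea if you need to implement it yourself (the hashing scheme and module name are assumptions, not part of the plan):
// lib/percentage-rollout.ts
import { createHash } from 'crypto';

// Map a user id to a stable bucket in [0, 100) for a given flag
function bucketFor(userId: string, flagKey: string): number {
  const digest = createHash('sha256').update(`${flagKey}:${userId}`).digest();
  return digest.readUInt32BE(0) % 100;
}

// True when the user falls inside the current rollout percentage,
// e.g. 5, then 25, then 50, then 75 as the phases progress.
export function inRollout(userId: string, flagKey: string, rolloutPercent: number): boolean {
  return bucketFor(userId, flagKey) < rolloutPercent;
}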
Blue-Green Deployments
Infrastructure Setup
# kubernetes/blue-green.yaml
apiVersion: v1
kind: Service
metadata:
  name: app-production
spec:
  selector:
    app: myapp
    version: blue   # Switch to 'green' to cut traffic over
  ports:
    - port: 80
      targetPort: 3000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-blue
spec:
  replicas: 5
  selector:
    matchLabels:
      app: myapp
      version: blue
  template:
    metadata:
      labels:
        app: myapp
        version: blue
    spec:
      containers:
        - name: app
          image: myapp:v1.0.0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-green
spec:
  replicas: 5
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green
    spec:
      containers:
        - name: app
          image: myapp:v1.1.0   # New version
Deployment Script
#!/bin/bash
# scripts/blue-green-deploy.sh
NEW_VERSION=$1
CURRENT=$(kubectl get svc app-production -o jsonpath='{.spec.selector.version}')
TARGET=$([ "$CURRENT" == "blue" ] && echo "green" || echo "blue")

echo "Deploying $NEW_VERSION to $TARGET environment..."

# Update target deployment
kubectl set image deployment/app-$TARGET app=myapp:$NEW_VERSION

# Wait for rollout
kubectl rollout status deployment/app-$TARGET

# Run smoke tests
./scripts/smoke-tests.sh app-$TARGET
if [ $? -ne 0 ]; then
  echo "Smoke tests failed, aborting deployment"
  exit 1
fi

# Switch traffic
kubectl patch svc app-production \
  -p "{\"spec\":{\"selector\":{\"version\":\"$TARGET\"}}}"

echo "Deployment complete. Traffic now routing to $TARGET"
echo "Rollback command: kubectl patch svc app-production -p '{\"spec\":{\"selector\":{\"version\":\"$CURRENT\"}}}'"
Progressive Delivery with AI Monitoring
claude "Monitor this deployment and alert if:
1. Error rate exceeds baseline by 50%
2. Latency p95 exceeds 500ms
3. Memory usage grows abnormally
4. New error types appear in logs
Current baseline:
- Error rate: 0.1%
- p95 latency: 250ms
- Memory: 512MB stable"
// monitoring/deployment-watcher.ts
// getMetrics, triggerAlert, and difference are assumed to live elsewhere
// in the codebase (see the sketch after this block).
import { getMetrics, triggerAlert, difference } from './helpers';

export interface DeploymentMetrics {
  errorRate: number;        // fraction of requests that fail, e.g. 0.001
  p95Latency: number;       // milliseconds
  memoryUsage: number;      // MB
  errorTypes: Set<string>;
}

async function monitorDeployment(
  deploymentId: string,
  baseline: DeploymentMetrics
): Promise<void> {
  const alertThresholds = {
    errorRateIncrease: 0.5,   // 50% above baseline
    p95LatencyMax: 500,       // ms
    memoryGrowthRate: 0.1,    // 10% per hour
  };

  // Poll every minute; clear the interval once the rollout is declared stable.
  const interval = setInterval(async () => {
    const current = await getMetrics(deploymentId);

    // Check error rate against baseline
    if (current.errorRate > baseline.errorRate * (1 + alertThresholds.errorRateIncrease)) {
      await triggerAlert({
        severity: 'high',
        message: `Error rate ${(current.errorRate * 100).toFixed(2)}% exceeds threshold`,
        action: 'Consider rollback'
      });
    }

    // Check absolute latency ceiling
    if (current.p95Latency > alertThresholds.p95LatencyMax) {
      await triggerAlert({
        severity: 'medium',
        message: `p95 latency ${current.p95Latency}ms exceeds ${alertThresholds.p95LatencyMax}ms`,
        action: 'Investigate slow endpoints'
      });
    }

    // Check for error types not seen in the baseline
    const newErrors = difference(current.errorTypes, baseline.errorTypes);
    if (newErrors.size > 0) {
      await triggerAlert({
        severity: 'high',
        message: `New error types detected: ${[...newErrors].join(', ')}`,
        action: 'Review error logs'
      });
    }
  }, 60000); // Check every minute
}
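The watcher assumes getMetrics, triggerAlert, and difference exist elsewhere. One possible shape for them, with the metrics backend stubbed out and alerts posted to a webhook whose URL comes from an assumed ALERT_WEBHOOK_URL environment variable:
// monitoring/helpers.ts
// Hypothetical implementations of the helpers used by monitorDeployment.
import type { DeploymentMetrics } from './deployment-watcher';

// Set difference: items in `a` that are not in `b`
export function difference<T>(a: Set<T>, b: Set<T>): Set<T> {
  return new Set([...a].filter((item) => !b.has(item)));
}

interface Alert {
  severity: 'low' | 'medium' | 'high';
  message: string;
  action: string;
}

// Post alerts to an incoming webhook; the URL is an assumption
export async function triggerAlert(alert: Alert): Promise<void> {
  await fetch(process.env.ALERT_WEBHOOK_URL!, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: `[${alert.severity}] ${alert.message} (${alert.action})` }),
  });
}

// Fetch current metrics for the deployment; shown here as a stub. In practice
// this would query Prometheus or an APM API for the given deploymentId.
export async function getMetrics(deploymentId: string): Promise<DeploymentMetrics> {
  return { errorRate: 0, p95Latency: 0, memoryUsage: 0, errorTypes: new Set() };
}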
Rollback Automation
# .github/workflows/auto-rollback.yml
name: Auto Rollback

on:
  workflow_dispatch:
    inputs:
      reason:
        description: 'Rollback reason'
        required: true

jobs:
  rollback:
    runs-on: ubuntu-latest
    steps:
      - name: Get Previous Version
        id: prev
        run: |
          # Assumes the deploy pipeline records the prior image tag in a
          # 'previous-version' annotation on the Deployment
          PREV=$(kubectl get deployment/app -o jsonpath='{.metadata.annotations.previous-version}')
          echo "version=$PREV" >> $GITHUB_OUTPUT

      - name: Execute Rollback
        run: |
          kubectl rollout undo deployment/app
          kubectl rollout status deployment/app

      - name: Notify Team
        uses: slackapi/slack-github-action@v1
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
        with:
          payload: |
            {
              "text": "🔄 Rollback executed",
              "blocks": [
                {
                  "type": "section",
                  "text": {
                    "type": "mrkdwn",
                    "text": "*Rollback Complete*\nReason: ${{ inputs.reason }}\nRolled back to: ${{ steps.prev.outputs.version }}"
                  }
                }
              ]
            }
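To make the rollback genuinely automatic, the deployment watcher can dispatch this workflow through GitHub's workflow_dispatch REST endpoint when one of the rollback triggers fires. A sketch; the OWNER/REPO path, branch, and token handling are assumptions:
// monitoring/trigger-rollback.ts
// Dispatches auto-rollback.yml via the GitHub REST API when an alert fires.
export async function triggerRollback(reason: string): Promise<void> {
  const res = await fetch(
    'https://api.github.com/repos/OWNER/REPO/actions/workflows/auto-rollback.yml/dispatches',
    {
      method: 'POST',
      headers: {
        Accept: 'application/vnd.github+json',
        Authorization: `Bearer ${process.env.GITHUB_TOKEN!}`, // token with workflow scope assumed
      },
      body: JSON.stringify({ ref: 'main', inputs: { reason } }),
    }
  );
  // The dispatches endpoint returns 204 No Content on success
  if (res.status !== 204) {
    throw new Error(`Failed to dispatch rollback workflow: ${res.status}`);
  }
}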
Next Lesson
We'll cover production monitoring and incident response for AI-generated code.