Platform Observability & Reliability

Developer Experience Metrics

Developer Experience (DevEx) metrics measure how effectively your platform serves its users—the developers. These metrics help platform teams understand adoption, satisfaction, and areas for improvement.

DevEx Framework

┌─────────────────────────────────────────────────────────┐
│                DEVELOPER EXPERIENCE (DevEx)             │
├─────────────────────────────────────────────────────────┤
│                                                         │
│   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐   │
│   │  COGNITIVE  │  │   FLOW      │  │  FEEDBACK   │   │
│   │    LOAD     │  │   STATE     │  │    LOOPS    │   │
│   └─────────────┘  └─────────────┘  └─────────────┘   │
│                                                         │
│   "How complex is  "Can developers  "How fast do      │
│    the platform?"  stay in flow?"   devs get feedback?"│
│                                                         │
├─────────────────────────────────────────────────────────┤
│                                                         │
│   Key Questions:                                        │
│   • Can a new developer deploy in their first week?    │
│   • How often do developers need to ask for help?      │
│   • What percentage use self-service vs. tickets?      │
│   • How long do developers wait for infrastructure?    │
│                                                         │
└─────────────────────────────────────────────────────────┘

Core DevEx Metrics

# Developer Experience Metrics
devex_metrics:

  onboarding:
    - name: "Time to first merged PR"
      description: "Time from joining to first merged PR"
      target: "<2 days"
      measurement: |
        first_merged_pr_date - employee_start_date

    - name: "Time to first deployment"
      description: "Time from joining to first production deploy"
      target: "<1 week"
      measurement: |
        first_deployment_date - employee_start_date

    - name: "Onboarding documentation usage"
      description: "% of new devs completing onboarding docs"
      target: ">90%"

  productivity:
    - name: "Build time"
      description: "Average CI pipeline duration"
      target: "<10 minutes"
      prometheus: |
        avg(ci_pipeline_duration_seconds{status="success"}) / 60

    - name: "Deploy time"
      description: "Time from merge to production"
      target: "<30 minutes"

    - name: "Environment spin-up time"
      description: "Time to create new dev environment"
      target: "<15 minutes"

  friction:
    - name: "Support ticket rate"
      description: "Platform tickets per developer per month"
      target: "<0.5"
      prometheus: |
        count(tickets{category="platform"}) /
        count(active_developers)

    - name: "Self-service adoption"
      description: "% of tasks completed without tickets"
      target: ">85%"

    - name: "Documentation search-to-answer time"
      description: "Time to find answer in TechDocs"
      target: "<5 minutes"
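
The `measurement` expressions for the onboarding metrics are date subtractions. A minimal sketch of that calculation follows, assuming the start and first-deployment dates have already been exported from HR and deployment records. The sample records and the helper name `days_between` are hypothetical and only illustrate the arithmetic.

```python
from datetime import date
from statistics import median


def days_between(start: date, event: date) -> int:
    """Whole days from a developer's start date to an onboarding milestone."""
    return (event - start).days


# Hypothetical records: (employee_start_date, first_deployment_date).
onboarding_records = [
    (date(2025, 3, 3), date(2025, 3, 6)),
    (date(2025, 3, 10), date(2025, 3, 19)),
    (date(2025, 4, 1), date(2025, 4, 4)),
]

# first_deployment_date - employee_start_date, per developer.
days_to_first_deploy = [days_between(s, d) for s, d in onboarding_records]

# The median is less sensitive than the mean to one slow onboarding.
median_days = median(days_to_first_deploy)

# Share of new developers who met the "<1 week" target.
share_within_week = sum(d < 7 for d in days_to_first_deploy) / len(days_to_first_deploy)

print(days_to_first_deploy, median_days, round(share_within_week, 2))
```

Reporting the share of developers who met the target alongside the median shows whether a miss is one outlier or a general pattern.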

Measuring Platform Adoption

Track how developers use your platform:

# Platform Adoption Metrics
adoption_metrics:

  backstage_usage:
    - name: "Daily active users"
      prometheus: |
        count(
          count by (user) (
            count_over_time(backstage_request_user[24h])
          )
        )

    - name: "Template usage per week"
      prometheus: |
        sum(increase(
          backstage_scaffolder_task_count_total{status="completed"}[7d]
        ))

    - name: "Catalog coverage"
      description: "% of services registered in catalog"
      prometheus: |
        100 * count(backstage_catalog_entities{kind="Component"}) /
        count(kubernetes_services)

  crossplane_usage:
    - name: "Self-service infrastructure requests"
      prometheus: |
        sum(increase(crossplane_claim_created_total[30d]))

    - name: "Most requested resource types"
      prometheus: |
        topk(5, sum by (kind) (crossplane_claim_created_total))

  argocd_usage:
    - name: "GitOps-managed applications"
      prometheus: |
        count(argocd_app_info)

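    - name: "Manual vs automated syncs"
      description: "Share of syncs that were triggered automatically"
      # Assumes the sync counter carries a `trigger` label. Check that your
      # Argo CD metrics expose one before relying on this query.
      prometheus: |
        sum(argocd_app_sync_total{trigger="automated"}) /
        sum(argocd_app_sync_total)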
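
If request logs are easier to reach than Prometheus, daily active users can also be counted offline. The sketch below assumes each log entry reduces to a `(user, timestamp)` pair. The event list and the function name `daily_active_users` are hypothetical, not part of Backstage.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def daily_active_users(
    events: Iterable[Tuple[str, datetime]],
    now: datetime,
    window: timedelta = timedelta(hours=24),
) -> int:
    """Count distinct users with at least one event in the trailing window."""
    cutoff = now - window
    return len({user for user, ts in events if cutoff < ts <= now})


# Hypothetical request log entries.
events = [
    ("alice", datetime(2025, 6, 2, 9, 0)),
    ("bob", datetime(2025, 6, 2, 10, 30)),
    ("alice", datetime(2025, 6, 2, 15, 0)),   # repeat visit, counted once
    ("carol", datetime(2025, 5, 31, 12, 0)),  # outside the 24h window
]

dau = daily_active_users(events, now=datetime(2025, 6, 2, 18, 0))
print(dau)
```

Collecting users into a set is what makes a repeat visit count once, which is the same role the inner aggregation by user plays in a PromQL version.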

Developer Surveys

Complement quantitative metrics with qualitative feedback:

# Developer Survey Questions
survey:

  satisfaction:
    - question: "How satisfied are you with the platform? (1-10)"
      type: "scale"
      benchmark: ">7"

    - question: "How likely are you to recommend the platform to a colleague? (0-10)"
      type: "nps"
      benchmark: "NPS > 30"

  productivity:
    - question: "The platform helps me be more productive"
      type: "likert"
      options: ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]

    - question: "I can deploy my code without waiting for help"
      type: "likert"

  friction_points:
    - question: "What's the biggest blocker in your workflow?"
      type: "open_text"

    - question: "Which platform feature would you most like to see improved?"
      type: "multiple_choice"
      options:
        - "Documentation"
        - "Deployment speed"
        - "Self-service options"
        - "Observability"
        - "Other"

  frequency: "quarterly"
  anonymous: true
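
The `nps` benchmark above is easier to interpret with the scoring rule spelled out. Under the standard definition, respondents answering 9 or 10 are promoters, 0 to 6 are detractors, and the score is the percentage of promoters minus the percentage of detractors. The ratings below are made-up sample data.

```python
from typing import Sequence


def net_promoter_score(ratings: Sequence[int]) -> float:
    """Return NPS in the range -100..100 from 0-10 recommendation ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    # Passives (7 and 8) count toward the total but toward neither group.
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical quarterly survey responses.
ratings = [10, 9, 9, 8, 7, 9, 6, 4, 10, 9]

nps = net_promoter_score(ratings)
meets_benchmark = nps > 30

print(nps, meets_benchmark)
```

With small teams a single response can move the score by several points, so it helps to report the response count next to the NPS.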

Developer Productivity Dashboard

# grafana-dashboard-devex.yaml
dashboard:
  title: "Developer Experience Dashboard"

  rows:
    - title: "Developer Productivity"
      panels:
        - title: "Average Build Time (minutes)"
          type: "stat"
          query: "avg(ci_pipeline_duration_seconds{status=\"success\"}) / 60"
          thresholds:
            - color: "green"
              value: 0
            - color: "yellow"
              value: 10
            - color: "red"
              value: 20

        - title: "Deployments per Developer (daily)"
          type: "stat"
          query: |
            sum(increase(deployments_total[24h])) /
            count(count by (developer) (deployments_total))

    - title: "Platform Adoption"
      panels:
        - title: "Backstage DAU"
          type: "graph"
          query: "backstage_daily_active_users"

        - title: "Self-Service vs Tickets"
          type: "piechart"
          queries:
            - expr: "sum(self_service_requests)"
              legend: "Self-Service"
            - expr: "sum(ticket_requests)"
              legend: "Tickets"

    - title: "Onboarding"
      panels:
        - title: "Time to First Deploy (days)"
          type: "stat"
          query: "avg(time_to_first_deploy_days)"

        - title: "New Developer Deployments (30d)"
          type: "table"
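          # Assumes developer_tenure_days is a gauge with a developer label;
          # a label matcher cannot express a numeric comparison.
          query: |
            topk(10,
              sum by (developer) (increase(deployments_total[30d]))
              and on (developer) (developer_tenure_days < 30)
            )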
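
The `thresholds` on the build-time panel read as lower bounds: each color applies from its `value` upward until the next step begins. A small sketch of that lookup, assuming this lower-bound interpretation, is shown below. The function name `threshold_color` is hypothetical and not a Grafana API.

```python
from typing import List, Tuple


def threshold_color(value: float, steps: List[Tuple[float, str]]) -> str:
    """Return the color of the highest step whose lower bound is <= value."""
    ordered = sorted(steps)
    color = ordered[0][1]  # fall back to the base step for very low values
    for lower_bound, step_color in ordered:
        if value >= lower_bound:
            color = step_color
    return color


# Steps copied from the "Average Build Time (minutes)" panel.
BUILD_TIME_STEPS = [(0, "green"), (10, "yellow"), (20, "red")]

for minutes in (8, 10, 14.5, 25):
    print(minutes, threshold_color(minutes, BUILD_TIME_STEPS))
```

The same lookup can drive alert routing or a weekly report, so the dashboard and the notifications agree on what counts as a slow build.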

Tracking Improvement Over Time

# DevEx Improvement Tracking
improvement_tracking:

  baseline:
    date: "2025-Q1"
    metrics:
      time_to_first_deploy: "5 days"
      build_time_avg: "15 minutes"
      self_service_rate: "60%"
      developer_satisfaction: "6.5/10"

  current:
    date: "2025-Q4"
    metrics:
      time_to_first_deploy: "2 days"
      build_time_avg: "8 minutes"
      self_service_rate: "85%"
      developer_satisfaction: "8.2/10"

  goals:
    date: "2026-Q2"
    metrics:
      time_to_first_deploy: "1 day"
      build_time_avg: "5 minutes"
      self_service_rate: "95%"
      developer_satisfaction: "9.0/10"
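
One way to read the baseline, current, and goal values together is as the fraction of the planned improvement already achieved. The sketch below uses the figures from the tracking block. The helper `progress_toward_goal` and the dictionary layout are illustrative assumptions.

```python
def progress_toward_goal(baseline: float, current: float, goal: float) -> float:
    """Fraction of the baseline-to-goal distance covered so far.

    Works for metrics where lower is better (build time) and where higher
    is better (satisfaction), because the sign cancels in the ratio.
    """
    if goal == baseline:
        raise ValueError("goal must differ from baseline")
    return (current - baseline) / (goal - baseline)


# (baseline 2025-Q1, current 2025-Q4, goal 2026-Q2)
tracking = {
    "time_to_first_deploy_days": (5, 2, 1),
    "build_time_avg_minutes": (15, 8, 5),
    "self_service_rate_percent": (60, 85, 95),
    "developer_satisfaction": (6.5, 8.2, 9.0),
}

progress = {
    name: progress_toward_goal(*values) for name, values in tracking.items()
}

for name, fraction in progress.items():
    print(f"{name}: {fraction:.0%} of the way to goal")
```

Normalizing each metric this way makes it possible to compare progress across metrics with different units.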

SPACE Framework

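Apply the SPACE framework (Satisfaction and well-being, Performance, Activity, Communication and collaboration, Efficiency and flow) so that no single dimension, such as raw activity counts, stands in for developer experience as a whole: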

# SPACE Framework
space:

  satisfaction:
    description: "How developers feel about their work"
    metrics:
      - "Developer satisfaction score"
      - "Net Promoter Score (NPS)"
      - "Would recommend platform"

  performance:
    description: "Outcome of developer activities"
    metrics:
      - "Code review throughput"
      - "Story points completed"
      - "Customer impact metrics"

  activity:
    description: "Count of developer actions"
    metrics:
      - "Number of commits"
      - "PRs opened/merged"
      - "Deployments per day"

  communication:
    description: "How developers communicate and collaborate"
    metrics:
      - "PR review response time"
      - "Documentation contributions"
      - "Knowledge sharing sessions"

  efficiency:
    description: "Ability to work with minimal interruption and waste"
    metrics:
      - "Build time"
      - "Time waiting for reviews"
      - "Meeting load"

In the next lesson, we'll explore cost management and FinOps practices for platform teams.
