Senior SRE DevOps Engineer (Remote from United States)
Remotica
About the position
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior SRE DevOps Engineer in the United States.
This is a high-impact role at the intersection of software engineering and cloud operations, focused on building and maintaining resilient, large-scale infrastructure for real-time communication systems. You will design, automate, and optimize cloud-native environments that support mission-critical connectivity under strict latency and reliability constraints. The position combines hands-on coding with deep operational ownership, empowering you to shape infrastructure strategy while improving developer productivity. Working in a remote-first, highly technical environment, you’ll collaborate across engineering teams to ensure scalability, security, and performance. If you thrive on solving distributed systems challenges and building production-grade reliability tooling, this role offers both ownership and influence.
Responsibilities
• Designing and implementing SLI/SLO frameworks with error budgets to guide reliability and performance decisions.
• Building and maintaining AWS-based production infrastructure using Infrastructure as Code (Terraform, CloudFormation), including ECS, EKS/Kubernetes, and microservices orchestration.
• Developing internal tools, automation frameworks, and reliability services in TypeScript, Python, or similar languages to enhance operational efficiency.
• Leading incident response processes, conducting root cause analyses, and creating automated runbooks to reduce MTTR.
• Architecting and maintaining CI/CD pipelines for backend services, mobile applications, and IoT firmware across cloud and on-prem environments.
• Implementing comprehensive observability using OpenTelemetry, distributed tracing, metrics exporters, and alerting systems.
• Managing data services such as PostgreSQL (RDS), Redis/ElastiCache, SQS, and networking components (ALB/NLB, VPC, IAM).
• Enforcing strong security standards, including IAM policies, encryption, secrets management, vulnerability management, and compliance auditing.
Requirements
• 7+ years of experience in SRE, DevOps, or Platform Engineering roles with daily hands-on coding responsibilities.
• Proficiency in at least one backend language (TypeScript/Node.js, Python, or Go) for developing automation tools, internal services, and reliability frameworks.
• Deep expertise in AWS services (ECS, EKS, RDS, ElastiCache, SQS, VPC, IAM, CloudWatch).
• Strong experience with Infrastructure as Code tools (Terraform, CloudFormation, or Pulumi), including modular design and state management.
• Proven experience designing and maintaining CI/CD pipelines in both cloud and on-prem environments.
• Solid understanding of container orchestration (Docker, Kubernetes, Helm) and distributed systems patterns such as circuit breakers, retries, and graceful degradation.
• Experience operating production databases (PostgreSQL, Redis) and message queues.
• Strong security knowledge covering network segmentation, encryption, secrets management, and incident response.
Nice-to-haves
• Experience with real-time communication infrastructure (SIP, RTP, WebRTC), telecom systems, IoT pipelines, or satellite and other low-bandwidth environments.
Benefits
• Competitive compensation package
• Flexible remote work environment with autonomy and ownership
• Opportunity to build and scale critical communication infrastructure
• Exposure to cutting-edge technologies across cloud, IoT, telecom, and distributed systems
• High-impact role with direct influence on reliability and platform architecture
• Collaborative, technically advanced engineering culture