Site Reliability Engineer – Remote Device Management

Locus Robotics

Remote · USD 140k – 180k · Full time · 4 days ago

Job Description:
• Fleet Management at Scale: Design, implement, and maintain robust and secure device management strategies for remote devices using Unified Endpoint Management (UEM), MDM solutions, and orchestration tools.
• Reliability & Monitoring: Develop and manage observability pipelines to track device health, connectivity, and performance metrics across diverse warehouse environments.
• OTA & Lifecycle Management: Own the end-to-end lifecycle of device software, including secure Over-the-Air (OTA) firmware updates, rollback strategies, and OS hardening.
• Incident Response: Participate in on-call rotations to troubleshoot complex system failures, performing root cause analysis (RCA) to drive long-term reliability improvements.
• Self-Healing Infrastructure: Develop automated remediation scripts that detect and fix common edge issues, such as hung scanning processes or display driver freezes, without manual intervention.
• Zero-Touch Scalability: Architect and maintain remote provisioning and management workflows for a global fleet of Linux devices, iPads, and Android devices using secure remote management strategies.
• Secure Remote Access: Implement and manage secure remote access protocols such as SSH, VPNs, and private APNs to enable out-of-band troubleshooting and real-time device control without physical site visits.
• SLO/SLI Frameworks: Define and enforce Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for device availability, connectivity, and peripheral performance.
• Error Budget Management: Use error budgets to balance the pace of innovation with fleet reliability, ensuring data-driven decisions for feature releases versus stability fixes.
• Security Governance: Align fleet operations with industry standards such as the NIST Cybersecurity Framework (CSF), ISO/IEC 27001, and CIS Controls.
• Vulnerability Management: Drive continuous monitoring and automated patching schedules to mitigate risks and ensure regulatory compliance across all managed device platforms.

Requirements:
• Master’s degree in Computer Science, Software Engineering, Systems Engineering, Robotics, or equivalent experience.
• 7+ years of experience: Proven track record in SRE, DevOps, or Systems Engineering with a focus on IoT, remote devices, or distributed edge hardware.
• Deep proficiency in Linux/Unix systems (Debian/Ubuntu preferred), including kernel tuning, scripting (Python, Bash), and networking protocols (TCP/IP, MQTT, CoAP, HTTPS/REST, DNS).
• Knowledge of security best practices for IoT and remote devices, including secure boot, encryption at rest/in transit, and certificate management.
• Expert proficiency in Python, Rust, or Go, and in configuration management tooling (Ansible/Terraform) for fleet-wide deployments.
• Strong understanding of SRE principles, including SLIs/SLOs, error budgets, and automation over manual "toil."
• Experience with enterprise MDM or Unified Endpoint Management (UEM) platforms (such as Jamf Pro, Microsoft Intune, FleetDM, Mosyle, Esper, 42Gears SureMDM, SOTI MobiControl, VMware Workspace ONE, or Headwind MDM).
• Experience with open-source device management solutions (such as FleetDM, Mender.io, Balena, MicroMDM, Memfault, or RAUC) is a plus.
• Experience with building Linux images and containers (with tools such as Yocto, PTXdist, ubuntu-image, Packer, Debian live-build, debootstrap).
• Experience with Linux packaging formats (such as deb, snap, Flatpak, Nix).
• Hands-on experience troubleshooting hardware interfaces, specifically USB/Bluetooth barcode scanners and industrial touchscreen displays.
• Experience configuring and locking down browsers or native apps into dedicated kiosk environments on both Linux and mobile OSs.
• Hands-on experience with cloud infrastructure (AWS or Azure) and containerization technologies like Docker and Kubernetes.
• Experience with CI/CD pipelines tailored for edge device deployment.
• Experience with ROS (Robot Operating System) or managing hardware-in-the-loop systems is a plus.
• Background in warehouse automation, logistics, or industrial IoT.

Benefits

Via JSearch