Job description
Senior Platform / DevOps Engineer (Real-time Media, WebRTC, Edge + Cloud)

Job title: Platform / DevOps Engineer (WebRTC, Edge + Cloud)
Location: Bengaluru (Hybrid/Office)
Employment type: Full-time
Experience: 5–12+ years (flexible for strong fit)

About the role
We’re building and operating a LiveKit-like real-time communications platform (WebRTC) that must scale to millions of calls, with edge PoPs for ultra-low latency and multi-region cloud reliability.
This is a hands-on, high-ownership role focused on production systems, performance, and resilience.
We’re especially interested in engineers who’ve seen scale in real-time/streaming infra.

What you’ll do
Own reliability and performance of signaling, SFU/media nodes, TURN, routing, failover, and capacity planning
Build and run multi-region Kubernetes platforms with secure networking and zero-downtime deployments
Design edge + cloud architecture: PoPs, global routing, failover, autoscaling, DR
Implement SLOs/SLIs, incident response, postmortems, and operational excellence
Create strong observability: metrics, logs, tracing, and real-time QoE/latency metrics
Ship Infrastructure-as-Code and automation: Terraform, Helm, GitOps, CI/CD

Required skills
Strong production experience with Kubernetes at scale (multi-cluster/multi-region)
Strong Linux + networking fundamentals (UDP/TCP, NAT, conntrack, DNS, load balancing)
Experience with IaC + delivery: Terraform, Helm, GitOps (ArgoCD/Flux), CI/CD
Proven on-call ownership for high-availability systems

Nice to have
WebRTC/RTC operations: ICE, STUN/TURN, SFU scaling, packet loss/jitter tuning
Edge/PoP and traffic management experience (global routing, Anycast/DNS strategies)
Cost optimization for bandwidth-heavy workloads
Experience operating real-time/streaming systems at very high concurrency

What success looks like
You can keep a real-time system stable through traffic spikes, packet loss, ISP variability, and zone/region failures
You think in terms of latency budgets, concurrency, bandwidth, and packets/sec, not just pods and nodes
You build platforms that are observable, automatable, and easy to operate

This job post has been translated by AI and may contain minor differences or errors.