https://bayt.page.link/eVrfSx1tpYbWkUXs5


Lead Site Reliability Engineer

- Qualys, Inc
- India

30+ days ago 2026/10/29

Other Business Support Services

Job Description

Come work at a place where innovation and teamwork come together to support the most exciting missions in the world!

The Lead Site Reliability Engineer is the technical leadership tier of the Qualys SRE ladder — the level where operational excellence, reliability engineering, and technical leadership converge. While Senior SREs focus on improving individual services and platforms, the Lead SRE drives reliability strategy across multiple systems, teams, and operational domains.

You will lead initiatives that improve scalability, resilience, automation, and operational maturity across Qualys cloud platforms. This role requires strong technical judgment, deep expertise in distributed systems, cloud-native architectures, observability, automation, and incident management, along with the ability to mentor engineers and influence cross-functional teams.

WHERE THIS ROLE SITS

Dimension | Lead SRE Expectation
Ambiguity | Moderate to High: translates broad reliability objectives into scalable operational solutions.
Scope | Multiple applications, platforms, and infrastructure domains across the organization.
Ownership | Technical leadership for reliability, observability, automation, and operational excellence initiatives.
Judgment | Balances reliability, security, scalability, performance, and cost trade-offs.
Business Impact | Significant: improves customer experience through higher availability, faster recovery, and reduced operational risk.

WHAT YOU WILL DO

Reliability Engineering and Platform Leadership

Lead the design, implementation, and maintenance of highly scalable, reliable, and secure cybersecurity cloud platforms.
Define and implement reliability engineering best practices, standards, and operational processes.
Participate in capacity planning, disaster recovery planning, and business continuity initiatives.
Drive operational excellence programs that improve platform resilience and reduce operational risk.

Automation and Operational Excellence

Drive automation initiatives for infrastructure provisioning, deployments, monitoring, and incident management.
Architect and optimize CI/CD pipelines and deployment strategies.
Reduce operational toil through automation and self-service capabilities.
Evaluate and adopt new technologies that improve operational efficiency and reliability.

Incident Management and Observability

Lead incident response, root cause analysis (RCA), and postmortem reviews for critical production issues.
Establish and manage SLIs, SLOs, and SLAs across services and platforms.
Evaluate and implement monitoring, logging, tracing, and observability solutions.
Establish reliability metrics and operational KPIs to drive continuous improvement.

Technical Leadership and Cross-Team Influence

Collaborate with software engineering teams to improve application reliability, performance, and scalability.
Mentor and guide SRE, DevOps, and platform engineering team members.
Partner with security teams to ensure infrastructure compliance and operational security.
Participate in leadership discussions regarding infrastructure strategy and technology adoption.
Drive cross-functional collaboration between engineering, security, and operations teams.

WHAT GOOD LOOKS LIKE

Leads reliability initiatives that measurably improve availability, performance, and operational efficiency.
Anticipates system-level risks and drives preventive engineering solutions.
Builds automation that significantly reduces operational overhead and incident frequency.
Leads major incident investigations and drives long-term corrective actions.
Influences engineering teams to adopt reliability-first and automation-first practices.
Develops engineers through mentorship, coaching, and technical leadership.
Serves as a trusted technical leader during critical operational and architectural decisions.

DISTINGUISHING EXPECTATION

The Lead SRE is not measured solely by operational execution or incident response effectiveness. This role is measured by its ability to elevate the reliability maturity of systems, improve engineering practices, and create scalable operational solutions that benefit multiple teams. A successful Lead SRE creates lasting impact through technical leadership, automation, and operational strategy.

REQUIRED QUALIFICATIONS

Bachelor’s degree in Computer Science, Engineering, Information Technology, or equivalent practical experience.
10+ years of experience in Site Reliability Engineering.
2+ years of experience leading technical teams, projects, or major initiatives.
Extensive experience with AWS, GCP, OCI, Azure, or similar cloud platforms.
Strong programming and scripting skills in Python, Go, Java, Bash, or similar languages.
Deep experience with Kubernetes, Docker, and container orchestration technologies.
Expertise with Infrastructure as Code tools such as Terraform or CloudFormation.
Strong understanding of distributed systems, networking, load balancing, and microservices architecture.
Experience implementing CI/CD pipelines using Jenkins, GitHub, Bitbucket, or similar tools.
Hands-on experience with Prometheus, Grafana, Datadog, ELK Stack, AppDynamics, Splunk, or similar observability platforms.
Experience with incident management, on-call operations, and production support.
2+ years of experience in application development using any programming language.

PREFERRED QUALIFICATIONS

Experience in DevOps, Infrastructure Engineering, or related roles.
Experience operating large-scale, high-traffic production systems.
Strong knowledge of cloud-native architecture and reliability engineering principles.
Experience with chaos engineering and resilience testing.
Relevant cloud certifications.
Strong communication, stakeholder management, and leadership skills.
Experience leading technical architecture discussions and operational reviews.

WHAT THIS ROLE IS NOT

Not a purely operational support role focused on ticket resolution.
Not limited to maintaining existing systems without driving improvement.
Not a people-management role with primary responsibility for performance management.
Not a Staff or Principal-level architecture role responsible for long-term engineering strategy.

ABOUT QUALYS

Qualys, Inc. is a pioneer and leading provider of cloud-based IT, security, and compliance solutions with more than 10,000 customers in over 130 countries. Qualys helps organizations simplify security operations, reduce risk, and achieve compliance through innovative cloud-native platforms and services.

This job posting was translated by artificial intelligence and may contain minor differences or errors.
