Lead Site Reliability Engineer

30+ days ago 2026/10/29
Other Business Support Services

Job Description

Come work at a place where innovation and teamwork come together to support the most exciting missions in the world!


The Lead Site Reliability Engineer is the technical leadership tier of the Qualys SRE ladder — the level where operational excellence, reliability engineering, and technical leadership converge. While Senior SREs focus on improving individual services and platforms, the Lead SRE drives reliability strategy across multiple systems, teams, and operational domains.


You will lead initiatives that improve scalability, resilience, automation, and operational maturity across Qualys cloud platforms. This role requires strong technical judgment, deep expertise in distributed systems, cloud-native architectures, observability, automation, and incident management, along with the ability to mentor engineers and influence cross-functional teams.


WHERE THIS ROLE SITS

Dimension: Lead SRE Expectation
  • Ambiguity: Moderate to High. Translates broad reliability objectives into scalable operational solutions.
  • Scope: Multiple applications, platforms, and infrastructure domains across the organization.
  • Ownership: Technical leadership for reliability, observability, automation, and operational excellence initiatives.
  • Judgment: Balances reliability, security, scalability, performance, and cost trade-offs.
  • Business Impact: Significant. Improves customer experience through higher availability, faster recovery, and reduced operational risk.

WHAT YOU WILL DO
Reliability Engineering and Platform Leadership
  • Lead the design, implementation, and maintenance of highly scalable, reliable, and secure cybersecurity cloud platforms.
  • Define and implement reliability engineering best practices, standards, and operational processes.
  • Participate in capacity planning, disaster recovery planning, and business continuity initiatives.
  • Drive operational excellence programs that improve platform resilience and reduce operational risk.
Automation and Operational Excellence
  • Drive automation initiatives for infrastructure provisioning, deployments, monitoring, and incident management.
  • Architect and optimize CI/CD pipelines and deployment strategies.
  • Reduce operational toil through automation and self-service capabilities.
  • Evaluate and adopt new technologies that improve operational efficiency and reliability.
Incident Management and Observability
  • Lead incident response, root cause analysis (RCA), and postmortem reviews for critical production issues.
  • Establish and manage SLIs, SLOs, and SLAs across services and platforms.
  • Evaluate and implement monitoring, logging, tracing, and observability solutions.
  • Establish reliability metrics and operational KPIs to drive continuous improvement.
Technical Leadership and Cross-Team Influence
  • Collaborate with software engineering teams to improve application reliability, performance, and scalability.
  • Mentor and guide SRE, DevOps, and platform engineering team members.
  • Partner with security teams to ensure infrastructure compliance and operational security.
  • Participate in leadership discussions regarding infrastructure strategy and technology adoption.
  • Drive cross-functional collaboration between engineering, security, and operations teams.
WHAT GOOD LOOKS LIKE
  • Leads reliability initiatives that measurably improve availability, performance, and operational efficiency.
  • Anticipates system-level risks and drives preventive engineering solutions.
  • Builds automation that significantly reduces operational overhead and incident frequency.
  • Leads major incident investigations and drives long-term corrective actions.
  • Influences engineering teams to adopt reliability-first and automation-first practices.
  • Develops engineers through mentorship, coaching, and technical leadership.
  • Serves as a trusted technical leader during critical operational and architectural decisions.
DISTINGUISHING EXPECTATION

The Lead SRE is not measured solely by operational execution or incident response effectiveness. This role is measured by its ability to elevate the reliability maturity of systems, improve engineering practices, and create scalable operational solutions that benefit multiple teams. A successful Lead SRE creates lasting impact through technical leadership, automation, and operational strategy.


REQUIRED QUALIFICATIONS
  • Bachelor’s degree in Computer Science, Engineering, Information Technology, or equivalent practical experience.
  • 10+ years of experience in Site Reliability Engineering.
  • 2+ years of experience leading technical teams, projects, or major initiatives.
  • Extensive experience with AWS, GCP, OCI, Azure, or similar cloud platforms.
  • Strong programming and scripting skills in Python, Go, Java, Bash, or similar languages.
  • Deep experience with Kubernetes, Docker, and container orchestration technologies.
  • Expertise with Infrastructure as Code tools such as Terraform or CloudFormation.
  • Strong understanding of distributed systems, networking, load balancing, and microservices architecture.
  • Experience implementing CI/CD pipelines using Jenkins, GitHub, Bitbucket, or similar tools.
  • Hands-on experience with Prometheus, Grafana, Datadog, ELK Stack, AppDynamics, Splunk, or similar observability platforms.
  • Experience with incident management, on-call operations, and production support.
  • 2+ years of experience in application development using any programming language.
PREFERRED QUALIFICATIONS
  • Experience in DevOps, Infrastructure Engineering, or related roles.
  • Experience operating large-scale, high-traffic production systems.
  • Strong knowledge of cloud-native architecture and reliability engineering principles.
  • Experience with chaos engineering and resilience testing.
  • Relevant cloud certifications.
  • Strong communication, stakeholder management, and leadership skills.
  • Experience leading technical architecture discussions and operational reviews.
WHAT THIS ROLE IS NOT
  • Not a purely operational support role focused on ticket resolution.
  • Not limited to maintaining existing systems without driving improvement.
  • Not a people-management role with primary responsibility for performance management.
  • Not a Staff or Principal-level architecture role responsible for long-term engineering strategy.
ABOUT QUALYS

Qualys, Inc. is a pioneer and leading provider of cloud-based IT, security, and compliance solutions with more than 10,000 customers in over 130 countries. Qualys helps organizations simplify security operations, reduce risk, and achieve compliance through innovative cloud-native platforms and services.

