Principal Site Reliability Developer (SRE/SRD)

22 days ago 2026/10/25
Other Business Support Services

Job description

At Oracle Cloud Infrastructure (OCI), we are building a more intelligent future for the cloud. OCI EMEA Operations is a team of smart, motivated, and diverse people who are focused on bringing the world's most important work to OCI. We build and operate our commercial and sovereign cloud regions to be reliable and high-performing. Our customers and their mission are at the centre of what we do. We strive to deepen our understanding of the challenges our customers face, use that understanding to enhance our cloud capabilities, and work together to deliver their mission.


As a Site Reliability Engineer, you will be responsible for operating production environments, including systems and databases, that support critical business operations in a commercial and sovereign cloud environment. You will focus on automating and optimizing operations across multiple production environments, and you will recommend novel solutions to improve availability, performance, and supportability. This is an opportunity to combine deep technical knowledge with administration and analysis knowledge of Oracle's Cloud Infrastructure, providing escalation support for a wide range of complex production problems related to immense growth, scaling, leveraging the cloud, extremely high performance, and high availability requirements. You will also guide junior engineers in solving complex problems, take part in large-scale incident bridges, and help build and optimize processes and procedures.



Only Oracle brings together the data, infrastructure, applications, and expertise to power everything from industry innovations to life-saving care. And with AI embedded across our products and services, we help customers turn that promise into a better future for all. Discover your potential at a company leading the way in AI and cloud solutions that impact billions of lives.


True innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing a workforce that promotes opportunities for all with competitive benefits that support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.


We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing [email protected] or by calling 1-888-404-2494 in the United States.


Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.



Responsibilities:
  • Develop automation and optimizations focused on operational excellence.
  • Deep-dive into systemic issues, identify their root causes, and resolve them.
  • Enhance Operations quality outcomes through scalable automations.
  • Install, monitor, maintain, support, and optimize all production server hardware and software.
  • Provide escalated technical support for complex technical issues which may include leading problem management cases and providing management status.
  • Coordinate escalated support cases, leading the appropriate internal technical resources and/or third-party vendors to resolution, and coordinate a storage infrastructure of Oracle system and database appliances.
  • Take responsibility for Oracle production environments: assist with server operating system and application upgrades, bug fixes, and patching, and work on standardization projects for both hardware and software under the Oracle technology stack, while providing the consistent system uptime expected in a Cloud environment.
  • Lead communications with key partners in solving complex technical problems.
  • Provide technical guidance and leadership to junior members to enable them to grow in their careers.

Requirements:


  • Permanently resident in Romania.
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
  • 6+ years of experience in systems engineering, software development, cloud operations, or site reliability engineering roles.
  • Strong proficiency in at least one programming or scripting language (e.g., Python, Go, Java, Bash).
  • Solid understanding of Linux/Unix systems, networking (TCP/IP, DNS, load balancing), and storage technologies.
  • Experience with monitoring, observability, and operational analytics platforms.
  • Understanding of cloud-native technologies such as Docker, Kubernetes, and orchestration frameworks.
  • Experience participating in and leading large-scale incident response and operational bridges.
  • Experience developing automation solutions focused on operational excellence, reliability, and scalability.
  • Familiarity with AI-assisted engineering tools such as Codex, GitHub Copilot, Cursor, Claude Code, or similar technologies to improve engineering productivity.
  • Understanding of Large Language Models (LLMs) and their application in troubleshooting, automation, incident management, operational workflows, and knowledge management.
  • Familiarity with agentic workflows, AI agents, and intelligent automation frameworks to streamline operations and improve service reliability.
  • Strong operational mindset with a focus on ownership, customer impact, continuous improvement, automation, and operational excellence.
  • Experience leveraging data-driven insights, observability platforms, and automation to proactively identify, investigate, and resolve reliability and performance issues.
  • Customer focus, with a passion for delivering reliable and scalable cloud services.
  • Experience in SRE, cloud technical support, cloud operations, large-scale event management, or similar environments.
  • Demonstrated ability to quickly learn new technical disciplines and effectively mentor and train others.

Qualifications:

Career Level - IC4


This job post has been translated by AI and may contain minor differences or errors.