Key Responsibilities:
* Own end-to-end incident management: Lead detection, triage, impact assessment, prioritization, and resolution of production incidents within agreed SLAs and OLAs.
* Coordinate major incident handling: Act as the primary point of contact during high-severity incidents, driving technical bridges/war rooms and ensuring timely stakeholder communication.
* ITIL/ITSM process execution: Apply ITIL-aligned practices for incident, problem, and change management, ensuring adherence to organizational standards and governance.
* Root cause analysis: Perform thorough post-incident reviews, document root causes, and define corrective and preventive actions to avoid recurrence.
* Service stability and continuous improvement: Identify recurring issues and operational gaps, propose process and tooling improvements, and contribute to reliability and performance enhancements.
* Collaboration with engineering and operations: Work closely with development, infrastructure, and QA teams to understand system behavior, dependencies, and release impacts on production.
* Monitoring and alert optimization: Review alerts, refine thresholds, and help optimize monitoring dashboards to reduce noise and improve early detection of issues.
* Knowledge management: Create and maintain runbooks, standard operating procedures, and knowledge base articles to improve first-time-right resolutions and reduce MTTR.
* Risk and change assessment: Participate in change advisory processes, assess production risks, and ensure appropriate validations and rollback plans are in place.
* Mentoring and guidance: Support junior team members in incident handling best practices, communication, and adherence to ITSM processes.
Minimum Qualifications:
* Education: Bachelor's degree in Engineering, preferably B.Tech or equivalent, in Computer Science, Information Technology, or a related field.
* Experience: 8-15 years of hands-on experience in production support and incident management in enterprise or large-scale environments.
Good to have skills:
* ServiceNow, BMC Remedy, Problem Management, Change Management, Monitoring and Alerting Tools
* Project Management fundamentals
* Project lifecycles for development and maintenance projects, estimation methodologies, and quality processes.
* Knowledge of one or more programming languages, architecture frameworks, and design principles; ability to comprehend and manage technology and performance engineering.
* Domain: Basic domain knowledge sufficient to understand business requirements and functionality.
* Ability to perform project planning and scheduling, manage tasks and coordinate project resources to meet objectives and timelines
* Ability to work with business and technology subject matter experts to assess requirements, define scope, create estimates, and produce project charters
* Good understanding of SDLC and agile methodologies is a prerequisite
* Awareness of latest technologies and trends
* Logical thinking and problem-solving skills, along with an ability to collaborate