Senior Software Engineer - SRE Focused

30+ days ago 2026/09/03
Other Business Support Services

Job description

We are a B2B WealthTech startup based in Abu Dhabi and backed by BNY Mellon (America’s oldest bank and first company to list on NYSE) and Lunate (a new $50B AUM alternative asset management firm based in Abu Dhabi, UAE).
The company has raised $300M to build a state-of-the-art wealth technology platform.
Our mission is to power and grow our clients’ Wealth franchises through differentiated experiences, financial solutions, and insights.
Our digital wealth management platform will enable banks and other financial institutions in the Middle East to grow and further penetrate the affluent, HNW, and UHNW investor segments.
While still leveraging the capabilities and knowledge of large organizations, our fintech is a startup with truly cross-functional and agile teams.
For more information, please visit www.alpheya.com

Role

We're building a team that owns production incident response, deep debugging, and permanent fixes across application, data, and deployment layers.
This is not a tickets-only ops role.
You will write code, ship fixes safely, and harden the platform so issues don't repeat.
Note: This is a software engineering role with real production ownership.
You’ll combine engineering and operations to own outcomes end-to-end: investigate incidents, ship code fixes, and prevent repeat issues through tests, observability, and hardening.

- Lead and execute production incident response: triage, mitigation, stakeholder communication, and coordination across teams
- Debug and fix issues across Go services (mandatory) and the broader stack (Node.js services where relevant)
- Work across service boundaries: GraphQL/RPC, distributed tracing, dependency failures, performance bottlenecks, and safe degradation patterns
- Troubleshoot Kubernetes workloads and deployments
- Diagnose PostgreSQL/CNPG issues
- Handle production bugs that span application + data pipelines (ETL/Snowflake mappings), including backfills/replays and data-quality validation
- Build prevention: add regression tests, improve observability, and maintain runbooks/service passports
- Drive reliability improvements: SLOs/SLIs, alert quality, release readiness checks, and operational standards across teams
- Execute and automate post-deployment validation (smoke/regression) using Playwright etc., writing test cases for every fix to prevent regressions


- 7+ years in SRE / Production Engineering / Platform Engineering (reliability-focused)
- Strong Go (mandatory): ability to read, debug, and ship production fixes in Go codebases
- Proven experience debugging distributed systems in production (latency, error rates, timeouts, retries, cascading failures)
- Strong hands-on experience with Kubernetes in production environments
- Experience with Helm and GitOps workflows (FluxCD preferred; ArgoCD acceptable)
- Solid PostgreSQL troubleshooting experience (performance, incident patterns, migrations)
- Observability experience (metrics/logging/tracing; Datadog/Grafana/Tempo/Loki experience is a plus)
- Strong incident leadership: calm under pressure, clear communication, structured problem-solving
- Engineering hygiene: PR discipline, reviews, testing mindset, safe rollouts/rollbacks
- Comfortable with IAM/security fundamentals in real production systems: OAuth2/OIDC basics, RBAC/least privilege, and safe secrets handling

Good to Have

- Node.js backend experience in production
- Experience in FinTech / regulated environments / high-availability systems (auditability, change control, incident rigor)
- Data reliability experience: ETL monitoring, reconciliation, Snowflake operations, schema/mapping drift handling
- Reliability patterns common to trading/fintech platforms: correctness and data integrity mindset (idempotency, reconciliation), resilient partner integrations, and strong observability for critical user journeys

This job post has been translated by AI and may contain minor differences or errors.
