
Cloud Platform Engineer - Data Reliability & Backing Services

2026/10/29
Remote
Other Business Support Services

Job description

About Mozn


MOZN is a leading Enterprise AI company enabling organizations to make informed decisions in two critical domains: Financial Crime Prevention and Enterprise Knowledge Intelligence.
We’re a diverse, collaborative team of innovators united by a shared purpose: to build AI that delivers tangible business value, builds trust, and empowers people and organizations with augmented intelligence. Our culture is built on the relentless pursuit of excellence and meaningful impact.
If you’re passionate about working alongside exceptional talent on world-class AI, and you want the autonomy and runway to do the best work of your career, join us in shaping the future of intelligent enterprises.


About the role


We are looking for a highly motivated Cloud Platform Engineer III to join our Cloud Engineering team. The ideal candidate is passionate about data reliability, performance, and scalable backing services.


This role focuses on the reliability, performance, operation, automation, and continuous improvement of critical backing services such as MySQL, PostgreSQL, MongoDB, Elasticsearch/OpenSearch, Kafka, analytical databases (e.g., StarRocks, ClickHouse), and other database and messaging technologies across cloud-native and hybrid environments.


This is not a traditional DBA role. The ideal candidate understands distributed systems, Kubernetes, cloud platforms, automation, IaC, observability, and AI-assisted workflows, and can help product engineering teams use backing services safely and effectively.


What you'll do


Data Reliability & Backing Services Operations


  • Own reliability, performance, scalability, and operational health of MySQL, PostgreSQL, MongoDB, Elasticsearch/OpenSearch, Kafka, StarRocks, ClickHouse, and similar platforms.
  • Define best practices for how product engineering teams use transactional, document, search, messaging, and analytical platforms.
  • Design and maintain highly available, scalable, and resilient platform services, including replication, backup, recovery, failover, and disaster recovery capabilities.
  • Perform capacity planning, performance tuning, workload reviews, upgrades, patching, and lifecycle management for platform services.
  • Identify and resolve risks such as slow queries, hot partitions, consumer lag, replication lag, index growth, retention issues, and storage saturation.
  • Troubleshoot and resolve complex production issues related to databases, messaging systems, search platforms, and distributed data platforms.

Kubernetes & Cloud Platform Engineering


  • Deploy, operate, and troubleshoot stateful workloads in Kubernetes-based environments.
  • Apply a strong understanding of Kubernetes fundamentals, including networking, storage, workload lifecycle management, scalability, and reliability concepts.
  • Enable and support Kubernetes-based deployments of database, messaging, search, and analytical platforms using cloud-native patterns and operational best practices.

Automation & Platform Enablement


  • Use automation, Infrastructure as Code, GitOps, and CI/CD to make backing services repeatable, reliable, and easier to operate.
  • Contribute to self-service platform capabilities, guardrails, dashboards, alerts, runbooks, and production readiness checks.
  • Use AI-assisted workflows where appropriate for incident triage, root cause analysis, query analysis, capacity forecasting, documentation, and developer support.
  • Collaborate with Product Engineering, SRE, Security, Data Engineering, and Cloud Platform teams to improve reliability, performance, availability, and security posture.

Qualifications


  • 4-7 years of experience in Platform Engineering, SRE, Database Reliability Engineering, Data Platform Engineering, DevOps, or related roles.
  • Strong hands-on experience with MySQL, PostgreSQL, Kafka, and at least one of MongoDB or Elasticsearch/OpenSearch in production environments.
  • Experience with analytical or distributed data platforms such as StarRocks, ClickHouse, Apache Doris, Druid, Pinot, or similar OLAP systems is highly desirable.
  • Hands-on experience operating stateful workloads in Kubernetes-based environments.
  • Good understanding of high availability, replication, backup and recovery, disaster recovery, capacity planning, and performance tuning concepts.
  • Familiarity with distributed systems concepts including sharding, replication, partitioning, consistency, compaction, backpressure, consumer lag, and query optimization.
  • Experience with at least one major cloud platform (AWS, GCP, or OCI).
  • Experience automating provisioning, deployment, configuration, monitoring, and lifecycle management using tools such as Terraform, Helm, Ansible, GitOps, or similar automation frameworks.
  • Strong scripting or programming skills in Python, Bash, Go, or similar.
  • Experience with observability platforms such as the LGTM stack (Loki, Grafana, Tempo, Mimir), Prometheus/Grafana, ELK/OpenSearch, Datadog, or equivalent.
  • Strong troubleshooting, problem-solving, and debugging skills across distributed systems.
  • Excellent communication, collaboration, and documentation skills.
  • Demonstrated curiosity, ownership mindset, adaptability, and ability to guide product engineering teams.

Preferred Qualifications


  • Experience designing, operating, or optimizing large-scale distributed database, messaging, search, or analytics platforms.
  • Experience operating or optimizing analytical databases and OLAP systems such as StarRocks, ClickHouse, Apache Doris, Druid, or Pinot.
  • Experience with streaming and real-time data platforms leveraging technologies such as Kafka, Flink, Spark, CDC, or similar ecosystems.
  • Exposure to Analytics, Data Engineering, AI/ML platforms, LLM-based applications, or AI infrastructure projects.
  • Experience using AI-assisted tooling or agents to improve operations, troubleshooting, documentation, or developer self-service.
This job post has been translated by AI and may contain minor differences or errors.