

Senior Data Engineer

Posted: 2026/09/03
Other Business Support Services

Job description

We are seeking a highly skilled and experienced Senior Data Engineer to join our growing team in Bangalore, India. We operate a large-scale private cloud infrastructure spanning thousands of servers across multiple data centers, built on OpenStack, Kubernetes, Ceph, and VMware. In this role, you will design, build, and maintain scalable data pipelines that collect, process, and deliver data from across this infrastructure to power analytics, capacity planning, cost optimization, and AI/ML initiatives. You will collaborate closely with data scientists, platform engineers, SREs, and product teams to deliver robust real-time and batch data solutions.



Responsibilities:
  • Design, develop, and maintain scalable data pipelines for ingestion, transformation, and delivery of structured and unstructured data
  • Build and optimize real-time streaming architectures using Apache Kafka and related ecosystem tools
  • Develop and manage ETL/ELT workflows using dbt (dbt Labs) to support analytics, reporting, and AI/ML model training
  • Implement data collection strategies from diverse infrastructure sources including OpenStack, Kubernetes, Ceph, VMware, and ServiceNow (Snow), as well as APIs, databases, and log files
  • Collaborate with AI/ML teams to build feature stores and prepare training datasets at scale
  • Ensure data quality, integrity, and governance through monitoring, validation, automated testing frameworks, and metadata management using DataHub
  • Implement and maintain data quality validation across pipelines (e.g. Great Expectations) to ensure correctness, completeness, consistency, and freshness of data at every stage
  • Optimize data storage and processing solutions within a private cloud environment (OpenStack, Ceph, Kubernetes)
  • Build and manage observability and monitoring solutions with strong emphasis on the ELK stack (Elasticsearch, Logstash, Kibana) and Prometheus as core platforms, complemented by OpenTelemetry for distributed tracing and telemetry collection
  • Mentor junior engineers and contribute to engineering best practices and technical documentation



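The data-quality duties above are typically handled by a framework such as Great Expectations. As a rough, framework-free illustration of what "correctness, completeness, consistency, and freshness" checks mean in practice (field names and thresholds here are hypothetical, not taken from this posting):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical record shape for an infrastructure-metrics pipeline.
REQUIRED_FIELDS = {"host", "metric", "value", "ts"}

def check_completeness(batch):
    """Every record carries all required fields, none of them null."""
    return all(
        REQUIRED_FIELDS <= rec.keys()
        and all(rec[f] is not None for f in REQUIRED_FIELDS)
        for rec in batch
    )

def check_correctness(batch):
    """Values fall in a plausible range (e.g. a utilisation percentage)."""
    return all(0.0 <= rec["value"] <= 100.0 for rec in batch)

def check_freshness(batch, max_age=timedelta(minutes=15)):
    """The newest record in the batch is no older than max_age."""
    now = datetime.now(timezone.utc)
    newest = max(rec["ts"] for rec in batch)
    return now - newest <= max_age

def validate(batch):
    """Run all checks; a failing batch would be quarantined, not loaded."""
    return {
        "completeness": check_completeness(batch),
        "correctness": check_correctness(batch),
        "freshness": check_freshness(batch),
    }
```

In a production pipeline, checks like these run at every stage (ingest, transform, load), with failures routed to alerting rather than silently dropped.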
Qualifications:

You have:
  • Bachelor's or master's degree in Computer Science, Data Engineering, or a related field, with 12+ years of professional experience and 6+ years of experience in data engineering or a closely related discipline; strong expertise in data pipeline design, data modelling, and data manipulation at scale
  • Strong hands-on experience with the ELK stack (Elasticsearch, Logstash, Kibana) and Prometheus — these are essential to the role
  • Deep experience with SQL and NoSQL databases (PostgreSQL, MongoDB, Cassandra, etc.)
  • Hands-on experience with Apache Kafka (or equivalent streaming platforms such as Apache Pulsar)
  • Experience with dbt (dbt Labs) for data transformation, modelling, and testing
  • Experience with data quality frameworks (e.g. Great Expectations) and pipeline validation practices such as data contracts, automated testing, and anomaly detection
  • Solid knowledge of big data technologies such as Apache Spark, Hadoop, or Flink
  • Experience with open table formats, particularly Apache Iceberg, for large-scale data lakehouse architectures
  • Familiarity with private cloud platforms (OpenStack, VMware) and containerization (Docker, Kubernetes)
  • Experience with OpenTelemetry for instrumentation, distributed tracing, and telemetry data collection


Nice to have:
  • Proficiency in Python, Scala, or Java for data processing and automation
  • Experience building data infrastructure to support AI/ML workflows and model serving
  • Familiarity with LLM tooling, vector databases (e.g. Milvus), and AI data pipelines
  • Knowledge of data governance frameworks, compliance standards, and metadata platforms such as DataHub
  • Experience with orchestration tools such as Apache Airflow or Prefect
  • Experience collecting and processing data from Ceph storage clusters, OpenStack APIs, or VMware vCenter
  • Familiarity with ServiceNow (Snow) for CMDB, ITSM data extraction, and asset management reporting
  • Contributions to open-source data engineering projects
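The streaming requirements above (Kafka, Spark/Flink) largely come down to consuming an event stream and maintaining windowed aggregates. A broker-free sketch of that pattern, with hypothetical event shapes — a real deployment would read from Kafka via a client such as confluent-kafka and run the aggregation continuously:

```python
from collections import defaultdict

def aggregate_by_window(events, window_seconds=60):
    """Group (ts, host, value) events into fixed time windows and compute
    a per-window, per-host average -- the same shape of computation a
    Kafka Streams or Flink job would perform on a live stream."""
    sums = defaultdict(lambda: [0.0, 0])  # (window_start, host) -> [sum, count]
    for ts, host, value in events:
        window = int(ts) // window_seconds * window_seconds
        bucket = sums[(window, host)]
        bucket[0] += value
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

For example, events at t=0 and t=30 for the same host land in the same 60-second window and are averaged together, while an event at t=70 opens the next window.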


This job post has been translated by AI and may contain minor differences or errors.
