Job description
We are looking for a Senior Data Engineer with 6+ years of experience to design, build, and scale cloud‑native data and AI platforms on Azure using Databricks. The role requires strong hands‑on expertise in data engineering, lakehouse architecture, and AI/ML data pipelines to support advanced analytics, machine learning, and business intelligence use cases.
The ideal candidate will lead complex data initiatives, collaborate closely with data scientists and ML engineers, and play a key role in shaping the organization’s data and AI strategy.
Honeywell helps organizations solve the world's most complex challenges in automation, the future of aviation and energy transition. As a trusted partner, we provide actionable solutions and innovation through our Aerospace Technologies, Building Automation, Energy and Sustainability Solutions, and Industrial Automation business segments – powered by our Honeywell Forge software – that help make the world smarter, safer and more sustainable.

Responsibilities:
- Architect and develop end‑to‑end data pipelines on Azure using Databricks (Spark / PySpark)
- Design and maintain lakehouse architectures using Azure Data Lake + Delta Lake
- Build and optimize batch and streaming pipelines for large‑scale datasets
- Create and manage feature pipelines and curated datasets for AI/ML model training and inference
- Collaborate with data scientists and ML engineers to enable scalable ML workflows
- Support MLOps pipelines, including data versioning, feature stores, and model deployment readiness
- Optimize Databricks workloads for performance, scalability, and cost efficiency
- Implement data quality, validation, monitoring, and observability frameworks
- Ensure data security, governance, and compliance using Azure and Databricks best practices
- Review code, define standards, and mentor junior and mid‑level data engineers
- Lead architectural decisions and contribute to data platform roadmap planning
Required Skills & Qualifications
- 6+ years of hands‑on experience in Data Engineering or Data Platform roles
- Strong proficiency in Python, PySpark, and Spark SQL
- Extensive experience with Databricks (jobs, notebooks, workflows, Delta Live Tables)
- Strong experience with Azure Cloud services, including:
  - Azure Data Lake Storage (ADLS Gen2)
  - Azure Databricks
  - Azure Data Factory / Synapse Pipelines
- Solid understanding of Delta Lake, including optimization and ACID guarantees
- Advanced SQL skills for analytical data modeling
- Experience designing AI/ML data pipelines (training, validation, inference datasets)
- Knowledge of data warehousing, lakehouse, and dimensional modeling concepts
- Familiarity with CI/CD, Git, and DevOps practices
- Strong troubleshooting, performance tuning, and problem‑solving skills
- Experience developing, orchestrating, and maintaining scalable data pipelines using DAG-based workflows to ensure reliable and efficient data processing
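For candidates less familiar with the DAG-based orchestration mentioned above, a minimal plain-Python sketch of the idea follows. The task names are hypothetical, and a production pipeline would use an orchestrator such as Databricks Workflows rather than hand-rolled scheduling; this only illustrates dependency-ordered execution.

```python
# Minimal sketch of DAG-based task orchestration: each task declares its
# dependencies, and tasks run only after all of their dependencies complete.
# Task names are illustrative placeholders.
from graphlib import TopologicalSorter

# Each key is a task; the set lists the tasks it depends on.
dag = {
    "ingest": set(),
    "cleanse": {"ingest"},
    "features": {"cleanse"},
    "train_dataset": {"features"},
    "report": {"cleanse"},
}

def run_pipeline(dag):
    """Execute tasks in dependency order; return the execution log."""
    order = list(TopologicalSorter(dag).static_order())
    executed = []
    for task in order:
        # A real task would launch a Spark job or notebook here.
        executed.append(task)
    return executed

log = run_pipeline(dag)
```

In `log`, every task appears after all of its declared dependencies, which is the reliability guarantee DAG-based workflow engines provide.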
Preferred / Nice to Have Skills
- Experience with ML platforms such as Azure Machine Learning or Databricks ML
- Hands‑on experience with Feature Store, MLflow, or experiment tracking
- Streaming data experience (Kafka, Event Hubs, Spark Structured Streaming)
- Experience with dbt, Unity Catalog, or data governance tools
- Knowledge of BI and visualization tools (Power BI preferred)
- Exposure to MLOps best practices and production ML systems
- Prior experience as a technical lead or mentor
- Knowledge of LangChain, AI agents, and agent architectures