Data Engineer

- Thermo Fisher Scientific, Inc
- India

16 days ago · 2026/10/21

Other Business Support Services

Job description

Work Schedule: Other

Environmental Conditions: Office

Summarized Purpose:

We are seeking a Mid-Level Data Engineer to design, build, test, tune, and support production data pipelines using PySpark, Python, advanced SQL, and AWS data services, applying secure data handling practices and AI-assisted data engineering capabilities.

Education/Experience:

Bachelor's degree or equivalent in Computer Science, Information Technology, Data Engineering, or related field
3-5 years of experience in data engineering, ETL development, SQL, AWS data platforms, or production data pipeline support

Major Job Responsibilities:

Develop, test, tune, and maintain ETL and data pipelines using PySpark, Python, SQL, and AWS services
Support ingestion and transformation of flat files, relational databases, APIs, data warehouses, and enterprise data sources
Collaborate with business analysts, data architects, QA, DevOps, and senior engineers to implement source-to-target mappings and data solutions
Implement CDC, incremental load design, idempotent pipeline processing, and data reconciliation patterns for reliable data movement
Maintain technical documentation, mapping specifications, data catalog updates, runbooks, automated tests, and release support materials

Knowledge, Skills, and Abilities:

Hands-on experience with PySpark, Python, advanced SQL, ETL best practices, data modeling, and large-scale data processing
Deep knowledge of Redshift performance tuning including distribution keys, sort keys, compression encoding, Spectrum, materialized views, WLM, vacuum, and analyze
Strong knowledge of Athena optimization including partition pruning, file formats, compression, schema evolution, and cost-efficient query design
Strong understanding of DynamoDB data modeling, access-pattern-based design, capacity planning, GSIs/LSIs, TTL, Streams, and performance tuning
Exposure to secure PHI/PII handling including encryption, access controls, auditability, retention, masking, and de-identification where applicable
Strong analytical, troubleshooting, documentation, communication, and cross-functional collaboration skills

Must Have Skills:

PySpark, Python, advanced SQL, ETL development, and data pipeline implementation experience
Experience with AWS data services including S3, Glue, Lambda, Step Functions, ECS, DynamoDB, Redshift, and Athena integration, along with PostgreSQL and SQL Server
Flat-file ingestion, source-to-target mapping, transformation logic, CDC, incremental loads, idempotent processing, reconciliation, and data quality checks
CI/CD, GitHub workflows, automated testing, and release management for data pipelines and database changes
Problem-solving, production support, debugging, documentation, and Agile delivery skills

Good to Have Skills:

Exposure to AI-assisted mapping automation and use of LLMs for data cleaning, data quality checks, transformation logic, or documentation
Familiarity with RAG patterns, embeddings, vector databases, semantic search, or AI-enabled data discovery solutions
Understanding of healthcare data standards such as HL7, FHIR, CCD, claims data, EMR extracts, clinical trial data, and patient de-identification
Familiarity with infrastructure as code such as Terraform or CloudFormation, plus Databricks, Snowflake, streaming, observability, or DevOps practices

Working Hours:

India: 05:30 PM to 02:30 AM IST
Philippines: 08:00 PM to 05:00 AM PHT

This job post has been translated by AI and may contain minor differences or errors.
