PySpark Developer

11 hours ago 2026/09/30
Other Business Support Services

Job description

Roles and Responsibilities

Key Responsibilities
Develop and maintain data pipelines using PySpark
Process and analyze large-scale datasets in distributed environments
Design and implement ETL/ELT workflows
Optimize Spark jobs for performance and scalability
Work with data stored in HDFS, Hive, or cloud storage (S3, ADLS)
Collaborate with data engineers, analysts, and business teams
Ensure data quality, integrity, and governance
Debug and troubleshoot data processing issues
Automate workflows using scheduling tools (Airflow, Oozie, etc.)
Write clean, scalable, and efficient code
Required Skills & Qualifications
Technical Skills
Strong proficiency in Python and PySpark
Solid experience with Apache Spark (RDDs, DataFrames, Spark SQL)
Knowledge of Hadoop ecosystem (HDFS, Hive)
Experience in ETL pipeline development
Familiarity with SQL and database concepts
Experience with data formats (Parquet, ORC, JSON, CSV)
Basic understanding of distributed computing concepts
Exposure to version control tools (Git)



Additional Responsibilities

Preferred Skills (Nice-to-Have)
Experience with cloud platforms (AWS, Azure, GCP)
Knowledge of Databricks or EMR environments
Familiarity with workflow orchestration tools (Airflow)
Exposure to Kafka or real-time data streaming
Understanding of Delta Lake / Lakehouse architecture
Experience with NoSQL databases (MongoDB, Cassandra)
Knowledge of CI/CD pipelines and DevOps practices
Basic understanding of machine learning workflows



Technical Requirements

* Primary skills: PySpark


This job post has been translated by AI and may contain minor differences or errors.