* Design, develop, and maintain scalable batch/stream data pipelines using Python and PySpark in distributed environments.
* Implement efficient transformations, aggregations, and joins on large datasets while ensuring performance and cost optimization.
* Write optimized SQL for data extraction, validation, and reconciliation across multiple sources.
* Build reusable, testable modules and follow engineering best practices (code reviews, unit testing, documentation).
* Troubleshoot production issues, perform root-cause analysis, and implement long-term fixes and monitoring improvements.
* Collaborate with stakeholders to translate requirements into technical designs, delivery plans, and measurable outcomes.
* Ensure data quality through validation checks, anomaly detection patterns, and consistent schema management.
* Contribute to continuous improvement of development standards, performance benchmarks, and pipeline reliability.
* Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
* 5-9 years of hands-on experience in software development and/or data engineering roles.
* Strong proficiency in Python with experience building production-grade applications or data workflows.
* Strong proficiency in PySpark, including DataFrame APIs, optimization techniques, and distributed processing concepts.
* Working knowledge of SQL for complex queries, data analysis, and validation.
* Experience delivering reliable solutions with attention to performance, scalability, and maintainability.
Technology -> Analytics - Packages -> Python - Big Data; Technology -> Big Data - Data Processing -> PySpark
Build and scale data-driven solutions that power smarter decisions. In this role, you'll design and deliver high-performance data processing pipelines using Python and PySpark, working closely with data engineers, analysts, and product teams to turn raw data into reliable, actionable insights. You'll contribute to a collaborative environment where clean code, thoughtful design, and continuous improvement are valued. If you enjoy solving complex data challenges, optimizing distributed workloads, and delivering production-ready systems that make a real impact, this is a great opportunity to grow your expertise while helping teams move faster with trustworthy data.