Job description
Project Role : Data Engineer
Project Role Description : Design, develop and maintain data solutions for data generation, collection, and processing. Create data pipelines, ensure data quality, and implement ETL (extract, transform and load) processes to migrate and deploy data across systems.
Must have skills : PySpark
Good to have skills : NA
Minimum 5 years of experience is required
Educational Qualification : 15 years of full-time education
Summary:
As a Data Engineer, a typical day involves designing, developing, and maintaining data solutions that support the generation, collection, and processing of data. The role requires building efficient data pipelines and ensuring data integrity and quality throughout the data lifecycle. It also involves implementing extract, transform, and load (ETL) processes so that data can be migrated and deployed seamlessly across systems. Collaborating with other teams to align data strategies and optimize workflows is an integral part of daily activities, ensuring the data infrastructure supports organizational needs effectively and reliably.
Roles & Responsibilities:
- Act as a subject matter expert (SME); collaborate with and manage the team to deliver.
- Responsible for team decisions.
- Engage with multiple teams and contribute to key decisions.
- Provide solutions to problems for their immediate team and across multiple teams.
- Lead the development and optimization of data pipelines to improve performance and scalability.
- Mentor junior team members to enhance their technical skills and understanding of data engineering best practices.
- Coordinate with stakeholders to gather requirements and translate them into technical specifications.
- Ensure adherence to data governance and compliance standards within the team and projects.
Professional & Technical Skills:
- Must-have skills: proficiency in PySpark.
- Strong experience in building and managing scalable ETL pipelines.
- In-depth knowledge of data processing frameworks and distributed computing.
- Ability to optimize data workflows for performance and reliability.
- Familiarity with cloud-based data storage and processing solutions.
- Experience in troubleshooting and resolving complex data issues.
Additional Information:
- The candidate should have a minimum of 5 years of experience in PySpark.
- This position is based at our Chennai office.
- 15 years of full-time education is required.