Riskinsight Consulting
Python PySpark Developer
Job Description
Responsibilities
- Design, develop, and deploy scalable data processing applications using Python and PySpark.
- Collaborate with data scientists and analysts to understand requirements and translate them into technical solutions.
- Write efficient code to process and analyze large volumes of data.
- Implement data ingestion processes from various data sources to the data processing platform.
- Create and maintain data pipelines and workflows for data processing and analytics.
- Perform data quality checks and ensure data integrity throughout the system.
- Troubleshoot and debug production issues, identifying root causes and resolving them.
- Stay updated with the latest technologies and tools in data processing to drive innovation and improve performance.
- Collaborate with cross-functional teams to ensure seamless integration of data processing applications with other systems.
Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 4-6 years of experience in Python development with specific experience in PySpark.
- Strong proficiency in the Python programming language.
- Experience with distributed computing frameworks such as Apache Spark, including its Python API, PySpark.
- Knowledge of data processing and analytics techniques.
- Experience with data integration and ETL processes.
- Familiarity with data storage and query systems, such as SQL and NoSQL databases.
- Understanding of data structures, algorithms, and distributed systems.
- Excellent problem-solving and analytical skills.
- Strong communication and interpersonal skills.
- Ability to work independently and in a team environment.
- Proactive attitude towards learning and professional development.
Skills: Python, PySpark, Apache Spark, SQL, NoSQL