Expleo
Data Engineer – BFSI Domain
Job Description
Overview
Expleo is hiring a Data Engineer for the BFSI (banking, financial services and insurance) domain to design, build and operate ETL/ELT pipelines using Apache Spark, Apache Iceberg, Airflow, NiFi and Google Cloud.
Key Responsibilities
- Develop and optimize ETL/ELT pipelines using Apache Spark, Apache Iceberg and Python (a minimal sketch follows this list)
- Orchestrate workflows with Apache Airflow and NiFi
- Model, ingest and curate structured and semi-structured data in data lakes and warehouses
- Implement and maintain data versioning, schema evolution and partition strategies in Iceberg
- Build and maintain Google Cloud data solutions (BigQuery, Dataflow, Pub/Sub, Cloud Storage)
- Monitor job performance, troubleshoot data quality issues and tune cluster resources
- Collaborate on CI/CD for data infrastructure (Terraform, Kubernetes, Docker)
- Document pipelines, schemas and operational runbooks
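
To make the first and fourth bullets concrete, here is a minimal PySpark-on-Iceberg sketch. It is illustrative only, not an Expleo codebase: the catalog name (demo), warehouse path, namespace and table schema are assumptions, and running it requires the matching iceberg-spark-runtime jar on the Spark classpath.

```python
# Minimal PySpark + Iceberg sketch. Catalog name, warehouse path and the
# table schema are illustrative assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-etl-sketch")
    # Register Iceberg's SQL extensions and a Hadoop-backed catalog named "demo".
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.demo.type", "hadoop")
    .config("spark.sql.catalog.demo.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)

spark.sql("CREATE NAMESPACE IF NOT EXISTS demo.bfsi")

# Hidden partitioning: days(event_ts) keeps partition logic out of the data model,
# so readers and writers never handle partition columns directly.
spark.sql("""
    CREATE TABLE IF NOT EXISTS demo.bfsi.transactions (
        txn_id     BIGINT,
        account_id STRING,
        amount     DECIMAL(18, 2),
        event_ts   TIMESTAMP
    )
    USING iceberg
    PARTITIONED BY (days(event_ts))
""")

# Schema evolution is a metadata-only operation; no data files are rewritten.
spark.sql("ALTER TABLE demo.bfsi.transactions ADD COLUMN channel STRING")

# Every commit creates a snapshot, which is what enables versioning and time travel.
spark.sql("SELECT snapshot_id, committed_at "
          "FROM demo.bfsi.transactions.snapshots").show()
```

Because Iceberg tracks each commit as a snapshot and handles column changes in table metadata, schema evolution and versioning on large BFSI tables stay cheap; this is the mechanism the fourth bullet refers to.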
Required Skills & Experience
- 5+ years in a Data Engineering role
- Hands-on with Apache Spark, Apache Iceberg, Python and SQL
- Experience authoring DAGs in Airflow and streaming/batch flows in NiFi (a toy DAG sketch follows this list)
- Solid understanding of data modeling, partitioning and indexing strategies
- Familiarity with Linux, Git and RESTful APIs
- Strong debugging, performance-tuning and troubleshooting skills
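
For the Airflow expectation, the sketch below shows the shape of a typical DAG, assuming Airflow 2.4+ (for the schedule argument); the dag_id, task names and spark-submit command are hypothetical.

```python
# Minimal Airflow DAG sketch (Airflow 2.4+ for the `schedule` argument).
# The dag_id, task names and spark-submit command are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_transactions_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # one run per day
    catchup=False,      # do not backfill runs before deployment
) as dag:
    extract = BashOperator(
        task_id="extract",
        bash_command="echo 'pull raw files from the landing zone'",
    )
    transform = BashOperator(
        task_id="transform",
        bash_command="spark-submit transform_transactions.py",  # hypothetical job
    )
    load = BashOperator(
        task_id="load",
        bash_command="echo 'publish curated table'",
    )

    # Linear dependency chain: Airflow schedules, retries and monitors each hop.
    extract >> transform >> load
```

The `extract >> transform >> load` line declares the dependency graph; Airflow derives scheduling, retry and monitoring behavior from it, which is what "authoring DAGs" means in practice.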