leandro.vidigal

About Candidate

Education

Specialization in Data Science and Big Data: Data-Driven Decision Making 2025
Massachusetts Institute of Technology – MIT
Postgraduate in Data Engineering 2024
Faculdade XP Educação - IGTI
Specialization in Data Science 2024
Escola Britânica de Artes Criativas e Tecnologia
Technologist in Systems Analysis and Development 2022
Centro Universitário Joaquim Nabuco de Recife
Bachelor's in Control and Automation Engineering 2013
Universidade do Estado do Amazonas - UEA

Experience

Senior Data Engineer - Business Intelligence and Implementation Team (CEMIG) Jan 2025
PowerOfData (Brazil)

• Designed and implemented distributed ETL/ELT pipelines in GCP using Python and PySpark, leveraging Dataproc clusters, Cloud Storage, and Kubernetes to process diverse legal and investigative data sources.
• Built and maintained a scalable, secure Data Lake architecture for structured and semi-structured data, enabling cross-team access, traceability, and advanced analytics.
• Developed end-to-end analytical workflows for building ABTs (Analytical Base Tables), supporting downstream applications such as clustering, sentiment analysis, and pattern detection.
• Applied LLM-based techniques alongside regex parsing and semantic mapping to extract insights from unstructured judicial data and correlate relevant information.
• Designed and operationalized decision-making pipelines that integrate model outputs, NLP insights, and business rules into streamlined APIs for analytics and legal teams.
• Participated in architectural design reviews and implemented best practices to ensure scalability, cost-effectiveness, and alignment with governance and security policies.
• Built custom validation frameworks and monitoring dashboards in Looker Studio and Power BI for continuous data quality assurance and transparency.

Senior Data Engineer - Credit Team (BV Bank) Oct 2024 - Jan 2025
PowerOfData (Brazil)

• Mapped and analyzed data flows as part of a full migration from Azure (Data Factory, Synapse) to GCP (BigQuery, Cloud Storage).
• Rebuilt pipelines using dbt for SQL transformations and Apache Airflow for orchestration.
• Integrated CI/CD processes with Jenkins and Bitbucket to automate deployment and testing.
• Validated and implemented business rules in collaboration with multidisciplinary teams, enforcing data governance and security standards through modular dbt models and automated tests.
• Improved data quality, lineage tracking, and transformation efficiency post-migration through modular modeling and robust documentation.
• Applied performance tuning and redundancy elimination to optimize costs and processing time across BigQuery, dbt, Airflow, and Cloud Storage.

Senior Data Engineer - Strategy and Implementation Team (Health Project) Oct 2024 - Apr 2025
ACT-AI (London, UK)

• Led analysis of multisource health data (national and international) using Azure cloud services and semantic tools such as Protégé, LogMap, and Neo4j.
• Designed data architecture and pipelines with Azure Data Factory, Synapse, SQL, and Python to support AI-driven medical insights and ontology alignment.
• Built and managed centralized Data Lakes with standardized schema design, unifying disparate data sources and ensuring governance and accessibility across teams.
• Developed and deployed AI agents using state-of-the-art LLM techniques, semantic parsing, and advanced regex for medical knowledge extraction.
• Collaborated with ML and data science teams to structure ABTs, enable inference workflows, and optimize model outputs.
• Applied data quality frameworks and event-driven workflows using Azure-native tools such as Event Grid, DLP, and Data Catalog.
• Supported real-time analytics using Dataflow and BigQuery.

Automation & Data Engineer Apr 2020 - Oct 2024
Samsung Electronics

• Implemented real-time data pipelines on Google Cloud Platform (GCP) to monitor and optimize robotic automation systems, ensuring operational efficiency and reduced downtime.
• Developed and deployed predictive models using Python and SQL in BigQuery to detect anomalies in quality inspection processes and suggest proactive adjustments.
• Structured and ingested production data from robotic systems into a centralized Data Lake on GCP, enabling cross-functional analytics and traceability.
• Created automated ETL workflows to transform data from industrial sensors, integrating it with business KPIs for real-time monitoring dashboards in Looker Studio and Power BI.
• Collaborated with multidisciplinary teams to design and deliver scalable analytics solutions, driving continuous improvement and accelerating time-to-market for new product launches.
• Transformed operational data into actionable insights, achieving a 20% improvement in product quality through machine learning pipelines and statistical modeling.

Skills

ETL/ELT: 100%
Airflow: 90%
dbt: 90%
PostgreSQL: 100%
Snowflake: 95%
Feature Engineering: 100%
Governance: 100%
Python: 99%
SQL: 98%
Power BI: 100%
