Visa
Sr Data Engineer (ML)
Job Description
Company Description
As the world’s leader in digital payments technology, Visa’s mission is to connect the world through the most creative, reliable and secure payment network – enabling individuals, businesses, and economies to thrive. Our advanced global processing network, VisaNet, provides secure and reliable payments around the world, and is capable of handling more than 65,000 transaction messages a second. The company’s dedication to innovation drives the rapid growth of connected commerce on any device, and fuels the dream of a cashless future for everyone, everywhere. As the world moves from analog to digital, Visa is applying our brand, products, people, network and scale to reshape the future of commerce.
At Visa, your individuality fits right in. Working here gives you an opportunity to impact the world, invest in your career growth, and be part of an inclusive and diverse workplace. We are a global team of disruptors, trailblazers, innovators and risk-takers who are helping drive economic growth in even the most remote parts of the world, creatively moving the industry forward, and doing meaningful work that brings financial literacy and digital commerce to millions of unbanked and underserved consumers.
You’re an Individual. We’re the team for you. Together, let’s transform the way the world pays.
Job Description
As a Sr ML/Data Engineer, you will be part of the Global Data function, building Data Science models at scale and managing the MLOps pipelines for global AI/ML models. The Sr ML/Data Engineer is responsible for building the blocks required for seamless processing of Data Science models, including frameworks for feature selection, data preparation, model re-training, model performance and scoring optimization, all at scale. The position is based at Visa’s offices in Bangalore, India.
Essential Functions
- Determine and refine machine learning objectives.
- Ensure that algorithms generate recommendations as expected by training and testing the ML models, developing approaches/functions to analyze huge volumes of historical data.
- Build MLOps pipelines to support development, experimentation, continuous integration, continuous delivery, verification/validation, and monitoring of AI/ML models.
- Run tests, perform statistical analysis, and interpret the test results from executing the ML models.
- Perform model re-training, performance evaluation and score optimization for existing ML models.
- Create the necessary validation and documentation to support the Model approval process with the Model Risk Management group, so that models are production-ready.
- Automate the end-to-end deployment and training steps as part of productionizing the ML models.
- Create required documentation for the Ops team to take the ML models to production.
- Solve complex problems with big data sets and optimize existing machine learning libraries and frameworks.
- Provide quality data solutions in a timely manner and be responsible for data governance and integrity while meeting objectives and maintaining SLAs.
- Demonstrate proficiency in operationalizing some or all of the following techniques: Linear & Logistic Regression, Decision Trees, Random Forests, K-Nearest Neighbors, Markov Chain, Monte Carlo, Gibbs Sampling, Support Vector Machines, Deep Learning techniques.
Qualifications
- 4+ years of analytics expertise in building Data Science and ML pipelines.
- 4+ years of work experience with a Bachelor’s Degree, or 3+ years of work experience with a Master’s or Advanced Degree, with specialization in Computer Science, Information Science, Statistics, Data Engineering and Analytics, or a relevant area.
- Good understanding of the Payments and Banking Industry, including aspects such as consumer credit, consumer debit, prepaid, small business, commercial, co-branded and merchant.
Technical Expertise:
- Experience in building robust data pipelines and writing ETL/ELT code (PySpark, Hive).
- Experience working with scheduling tools (Airflow, Oozie) or building data processing orchestration workflows.
- Hands-on experience working with large-scale data ingestion, processing, and storage in the Hadoop ecosystem.
- Experience in writing and optimizing SQL queries in a big data environment.
- Familiarity with both common computing environments (e.g. Linux, Shell Scripting) and commonly used IDEs (e.g. Jupyter Notebooks).
- Ability to build ML/data pipelines (e.g. ETL, data preparation, feature selection, data aggregation and analysis) using PySpark.
- Experience creating/supporting production software/systems and a proven track record of identifying and resolving performance bottlenecks for production systems.
- Experience building and integrating code in a defined CI/CD framework using Git.
- Experience with visualization tools such as Tableau, Power BI and D3 is preferred.
- Exposure to machine learning models based on unstructured, structured, and streaming datasets.
Business Skills:
- Ability to translate data and technical concepts into requirements documents, business cases and user stories.
- Good understanding of agile working practices and related program management skills.
- Good communication and presentation skills, with the ability to interact with cross-functional team members at varying levels.
- Ability to learn new tools and paradigms as data science continues to evolve at Visa and elsewhere.
- Demonstrated intellectual and analytical rigor, with a team-oriented, energetic, collaborative, diplomatic, and flexible style.
Additional Information
This is a hybrid position. Hybrid employees can alternate time between remote and office work. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.