University of Rochester
Data Scientist AI intern
Job Description
Roche fosters diversity, equity and inclusion, representing the communities we serve. When dealing with healthcare on a global scale, diversity is an essential ingredient to success. We believe that inclusion is key to understanding people’s varied healthcare needs. Together, we embrace individuality and share a passion for exceptional care. Join Roche, where every voice matters.
The Position
Developing Automated Agents for Dataset Cleaning and Generation with Generative AI and LLMs
Supervisors: Ercan Suekuer, Tatyana Doktorova
We are seeking an intern to support the development of a fully automated agent-based system for dataset cleaning and generation using generative AI and large language models (LLMs). The goal is to build intelligent agents capable of automatically identifying and correcting data inconsistencies, handling missing data, and generating high-quality datasets. These agents will leverage advanced AI models, including LLMs, to streamline data preparation processes.
As part of this role, you’ll also collaborate with Microsoft Azure engineers, who will provide support in deploying and optimizing these systems on the Azure platform.
In this position you will:
- Collaborate on building an agent-based approach for automating dataset cleaning, preparation, and generation using LLMs and generative AI.
- Design workflows that utilize AI models to identify, handle, and resolve data quality issues.
- Explore machine learning techniques for data synthesis and augmentation to improve the availability and quality of datasets.
- Integrate Langchain and LlamaParser into data processing workflows as part of the automation pipeline.
- Work closely with team members to understand project needs and integrate the automated agents into existing workflows.
- Support testing, validation, and optimization of the developed agents.
Qualifications Required:
- You have completed your studies (Bachelor's or Master's) within the past 12 months, or you are currently pursuing a Master's or PhD degree in computer science, bioinformatics, computational sciences, or a related field.
- Strong programming skills, particularly in Python and R.
- Familiarity with AI/ML frameworks (e.g., TensorFlow, PyTorch) and experience with generative AI and LLMs are highly desirable.
- Experience with Langchain and LlamaParser is a plus.
- Experience in data handling, cleaning, and preprocessing is essential.
- Nice-to-have: Knowledge of clinical trials and CDISC data standards, and familiarity with Retrieval-Augmented Generation (RAG) approaches.
- Strong problem-solving and communication skills, with the ability to work both independently and as part of a team.
You have very good interpersonal and communication skills, are able to build good working relationships, and are an outstanding teammate. Your experience and investigative attitude allow you to work independently, to design, perform, and interpret experiments, and to embark on new scientific methodologies.
Start: between January and March 2025
Duration: 6-9 Months
Workload: 100%
Due to regulations, non-EU/EFTA citizens must be enrolled at a university and must include in their application documents a certificate from the university stating that an internship is mandatory.
Who we are
At Roche, more than 100,000 people across 100 countries are pushing back the frontiers of healthcare. Working together, we’ve become one of the world’s leading research-focused healthcare groups. Our success is built on innovation, curiosity and diversity.
Basel is the headquarters of the Roche Group and one of its most important centres of pharmaceutical research. Over 10,700 employees from over 100 countries come together at our Basel/Kaiseraugst site, which is one of Roche's largest sites.
Besides extensive development and training opportunities, we offer flexible working options, 18 weeks of maternity leave and 10 weeks of gender-independent partnership leave. Our employees also benefit from multiple on-site services such as child-care facilities, medical services, restaurants and cafeterias, as well as various employee events.
We believe in the power of diversity and inclusion, and strive to identify and create opportunities that enable all people to bring their unique selves to Roche.
Roche is an Equal Opportunity Employer.