CVS Health
Data Scientist (GenAI)
Job Description
At CVS Health, we’re building a world of health around every consumer and surrounding ourselves with dedicated colleagues who are passionate about transforming health care. As the nation’s leading health solutions company, we reach millions of Americans through our local presence, digital channels and more than 300,000 purpose-driven colleagues – caring for people where, when and how they choose in a way that is uniquely more connected, more convenient and more compassionate. And we do it all with heart, each and every day.
Position Summary
Join CVS Health’s Digital Workplace AI team and help define the future of how 300,000+ colleagues find information, collaborate, and work smarter. As a Data Scientist on the Lumina Agentic team, you’ll design and deploy next-generation Agentic AI and RAG systems that transform CVS’s internal knowledge ecosystem. This role blends data science (70%) and software engineering (30%) to bring cutting-edge GenAI applications into production at enterprise scale. You’ll collaborate with a multidisciplinary team of data scientists, engineers, and product managers to build intelligent systems that improve colleague productivity and reimagine the digital workplace.
If you’re passionate about building real-world GenAI systems that make work better for thousands of users, this is your opportunity to make a lasting impact at one of the world’s most trusted healthcare companies.
What You’ll Do
Design, train, and deploy retrieval-augmented generation (RAG) systems for knowledge discovery and assistance.
Develop Agentic AI applications that reason, take actions, and orchestrate complex workflows.
Build and maintain knowledge ingestion pipelines and connect them to scalable APIs and services.
Evaluate agentic systems for accuracy, bias, and quality.
Write and deploy microservices to Azure Kubernetes Service (AKS) using GitHub Actions CI/CD pipelines.
Collaborate with cross-functional partners to define goals, experiment rapidly, and ship solutions that matter.
Build data connectors and parsers to optimize data and knowledge extraction from documents and systems.
What You’ll Bring
Soft Skills
A self-starter with strong ownership, comfortable working with ambiguity and defining your own roadmap.
Excellent communicator who collaborates effectively across teams.
Thrives in an agile environment and enjoys rapid iteration.
Data Science & GenAI
Experience with RAG system design and optimization (prompt engineering, chunking, retrieval tuning).
Strong understanding of NLP, regex, parsing, and data cleaning.
Hands-on experience fine-tuning and deploying LLMs using frameworks such as LangChain or LangGraph.
Familiarity with evaluation metrics for GenAI performance.
Software Engineering
Solid Python skills and experience developing API-based services.
Proficiency with cloud platforms (Azure preferred) and containerized deployment (Docker, Kubernetes).
Understanding of CI/CD pipelines and DevOps best practices (GitHub Actions, testing automation).
Required Qualifications
4+ years of programming experience in Python.
1+ year of experience building RAG systems with LangChain, LangGraph, or similar tools.
2+ years of experience working with LLMs and vector databases (MongoDB Atlas, Pinecone, or similar).
2+ years of experience with cloud computing (Azure preferred).
2+ years of experience using Git and version control.
Preferred Qualifications
2+ years of NLP experience (spaCy, NLTK, BeautifulSoup).
Experience deploying LLM systems at scale (10K+ documents or large enterprise workloads).