Indium Software
Data Scientist
Job Description
Data Science: Strong in NLP (topic modeling); working knowledge of LLMs for summarization, topic synthesis, and annotation
We are looking for a Data Scientist with a strong background in natural language processing (NLP) to join our team. The ideal candidate will have expertise in topic modeling and a working knowledge of large language models (LLMs) for tasks like summarization, topic synthesis, and annotation.
Key Responsibilities:
NLP Model Development:
Develop and refine NLP models, including topic modeling algorithms, to extract meaningful insights from text data.
Experiment with and evaluate different NLP techniques to optimize model performance.
Stay up to date with the latest advancements in NLP and incorporate them into our projects.
LLM Applications:
Leverage LLMs to perform tasks such as summarization, topic synthesis, and annotation.
Explore and experiment with different LLM architectures and fine-tuning techniques.
Integrate LLMs into our data pipelines and applications to enhance their capabilities.
Data Analysis and Visualization:
Analyze and interpret the results of NLP models to identify patterns, trends, and actionable insights.
Communicate findings effectively through clear and concise visualizations.
Collaboration:
Work closely with data engineers, data analysts, and other team members to develop and deploy data-driven solutions.
Contribute to the development of data strategies and best practices.
Qualifications and Skills:
Bachelor’s or Master’s degree in Computer Science, Data Science, Statistics, or a related field.
Strong foundation in NLP concepts and techniques, including topic modeling.
Experience with NLP libraries and frameworks like NLTK, spaCy, or Gensim.
Familiarity with LLMs, such as GPT-3 or BERT, and their applications.
Proficiency in Python programming, including data manipulation and analysis libraries.
Excellent problem-solving, analytical, and communication skills.
Ability to work independently and as part of a team.
Preferred Qualifications:
Experience with cloud platforms like AWS, GCP, or Azure.
Knowledge of machine learning frameworks like TensorFlow or PyTorch.
Experience with data visualization tools like Tableau or Power BI.