IDEMIA
Internship – Vision and LLM for pedestrian detection
Job Description
Since our founding, IDEMIA has been on a mission to unlock the world and make it safer through our cutting-edge identity technologies. Our technology leadership makes us the partner of choice for hundreds of governments and thousands of enterprises in over 180 countries, including some of the biggest and most influential brands in the world. By applying our unique expertise in biometrics and cryptography, we enable our clients to unlock simpler and safer ways to pay, connect, access, identify, travel and protect public places – at scale and in total security.
Our teams work from 5 continents and speak 100+ different languages. We strongly believe that our diversity is a key driver of innovation and performance.
Purpose
The objective of the internship is to investigate the Multi-Modal Composite Image Retrieval task. This task involves the intelligent fusion of natural language input (typically processed by a large language model) and vision features (provided by IDEMIA’s expert models) to build interactive models that expand the capabilities of standalone vision models.
The primary application will focus on pedestrian detection and re-identification, where a user interacts with the system in natural language to refine a detection model. For example, a pedestrian detection model could be extended to detect “pedestrians with umbrellas” from a simple request such as “find people with umbrellas.”
This research-focused internship will begin with a literature review and replication of state-of-the-art results. The specific research directions will be further refined with input from the intern, based on emerging findings.
Key Missions
- Reviewing the literature on Multi-Modal Composite Image Retrieval and the main existing methods
- Implementing these methods with IDEMIA’s pedestrian detector to make it multi-modal
- Conducting experiments to evaluate the performance of the extended pedestrian detector
- Collaborating with researchers to refine methodologies and contribute to the overall project goals
- Documenting findings and presenting results to stakeholders
Profile & Other Information
- Engineering student in the final year of the engineering cycle, or a Master’s student (M2) specializing in AI, computer vision, image processing, or Deep Learning
- Strong knowledge of computer vision and Deep Learning
- Proficiency in Python and PyTorch (or similar frameworks)
- Proficient with Linux environments
- Solid training in data analysis and software development
- Proficient in English, both spoken and written (e.g., reading scientific articles, presenting work)
- Curious, proactive, and autonomous
- Clear and persuasive communication skills
By choosing to work at IDEMIA, you will join a unique tech company, offering a wide range of growth opportunities. You will contribute to a safer world, collaborating with an international and global community. We value the diversity of our teams and welcome people from all walks of life, regardless of how they look, where they come from, who they love, or what they think.
We deliver cutting-edge, future-proof innovations that reach the highest technological standards, and we are transforming fast to stay a leader in a world that is changing fast, too.
At IDEMIA, people can develop their expertise and feel a sense of ownership and empowerment, in a global environment, as part of a company with the ambition and the ability to change the world.
Visit our website to learn more about the leader in identity technologies.