INTERNSHIP: Conditional generation of humans in context

Contract type : Internship

Level of qualifications required : Master's or equivalent

Function : Internship Research

About the research centre or Inria department

The Centre Inria de l’Université Grenoble Alpes groups together almost 600 people in 22 research teams and 7 research support departments.

Staff is present on three campuses in Grenoble, in close collaboration with other research and higher education institutions (Université Grenoble Alpes, CNRS, CEA, INRAE, …), but also with key economic players in the area.

The Centre Inria de l’Université Grenoble Alpes is active in the fields of high-performance computing, verification and embedded systems, modeling of the environment at multiple levels, and data science and artificial intelligence. The centre is a top-level scientific institute with an extensive network of international collaborations in Europe and the rest of the world.

Main activities

Context

Text-to-video models have demonstrated remarkable diversity in scene generation, enabling applications from entertainment to simulation. However, these models often lack the capability to condition generation on specific input details, such as clothing in virtual try-on tasks. Meanwhile, virtual try-on models excel at fine-grained conditioning but lack the flexibility for broader scene control. This project aims to bridge these two approaches, combining the diversity and flexibility of text-to-video models with the precision of virtual try-on conditioning. This work is particularly relevant for generating realistic humans who interact with objects and other people in complex environments.

Project Objectives

The student will develop methods that integrate text-based scene control with fine-grained detail preservation. This includes enabling seamless interaction between humans and their surroundings, such as handling objects or engaging with other individuals in the scene. The ultimate goal is to generate highly realistic virtual try-on images that adapt garments to the target person’s morphology and pose in a specific visual environment.

 

Potential extension as a PhD position or engineering contract.

 

Skills

  • Expertise in Python, PyTorch, and computer vision.
  • Familiarity with image and video generation techniques, including diffusion models.
  • Strong background in machine learning and deep learning.
  • Experience with software development tools like GitHub or GitLab.
  • Effective communication and organizational abilities.

Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave
  • Possibility of teleworking and flexible organization of working hours

Remuneration

  • €4.35 per hour of actual presence (rate as of 1 January 2024)
  • About €590 gross per month (internship allowance)