2022-05194 - Postdoctoral Researcher (F/M): Improving the early detection of failures in medical transducer manufacturing with machine learning

Contract type : Fixed-term contract

Level of qualifications required : PhD or equivalent

Position : Post-Doctoral Research Visit

Host corps : Research Scientist (Chargé de Recherche, CR)

About the research centre or Inria department

The Inria Grenoble - Rhône-Alpes research center groups together almost 600 people in 22 research teams and 7 research support departments.

Staff is present on three campuses in Grenoble, in close collaboration with other research and higher education institutions (Université Grenoble Alpes, CNRS, CEA, INRAE, …), but also with key economic players in the area.

Inria Grenoble - Rhône-Alpes is active in the fields of high-performance computing, verification and embedded systems, modeling of the environment at multiple levels, and data science and artificial intelligence. The center is a top-level scientific institute with an extensive network of international collaborations in Europe and the rest of the world.

The STATIFY team specializes in the statistical modeling of systems involving data with a complex structure. Faced with the new problems posed by data science and deep learning methods, the team aims to develop mathematically well-founded statistical methods and to propose models that capture the variability of the systems under consideration, scale to high-dimensional data, and offer guaranteed levels of accuracy and precision.

Parallel Design, an entity of General Electric Healthcare (GEHC), is an R&D center specializing in the design and manufacturing of transducers for medical ultrasound.

Context

As part of a partnership:

  • With General Electric Healthcare in Nice
  • And Inria in Grenoble (STATIFY team)

For several years, GEHC has been carrying out a major digitalization project of its site located in Sophia Antipolis to improve the quality of the transducers it produces, to anticipate any performance drifts, to better understand their origin, and to increase the overall efficiency of its supply chain.

To do so, GEHC collects data throughout its supply chain: information about the materials and processes used during transducer manufacturing, intermediate impedance measurements that detect manufacturing defects serious enough to justify stopping fabrication, and a final acoustic performance measurement that checks the quality of the transducer before it is delivered to the customer.

For years, these analysis and measurement operations were done manually. Recently, GEHC has developed a digital architecture to transfer the data collected from its supply chain to the Microsoft Azure cloud, as well as a set of tools to process and analyze this data.

Assignment

The objective of this collaboration is to set up a global methodology for the clustering/classification of manufacturing defects in GEHC transducers, as well as the prediction of their final acoustic performance. This methodology includes the definition of numerical simulation plans, the choice of relevant data transformations, and adequate training schemes for machine learning algorithms. The methodology will be evaluated on numerical and experimental data provided by GEHC from its simulation models and supply chain.

The collaboration will begin in October 2022 and will last for 2 years.

The recruited candidate will work 80% of their time at GEHC in Nice, 1090 route des Crêtes – WTC bât.R, 06560 Sophia Antipolis, and 20% of their time at Inria Grenoble, 655 Av. de l'Europe, 38330 Montbonnot-Saint-Martin.

Main activities

The data analysis performed at the control stages is currently based on supervised and unsupervised machine learning methods trained with measurements from the supply chain. Such learning may seem “late” insofar as it occurs well after the design phase of the transducer, once the supply chain has been set up. This is why GEHC would like to bring the training of its statistical models forward by using the numerous numerical simulations carried out during the design phase of the transducer. Such an approach raises a couple of scientific challenges, the main one being undoubtedly to train machine learning models with data from numerical simulations and then apply them to measurements from real transducers. Indeed, the comparison between the two types of data (simulation vs measurements) shows a "limited" degree of similarity, which does not allow the use of "raw" data (i.e. untransformed and/or unnormalized) for training with standard algorithms.
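
As a minimal illustration of why "raw" data from the two sources cannot be compared directly, the sketch below standardizes each source separately. It assumes the mismatch is only an offset and a gain, which is a deliberate simplification; the numbers and the `standardize` helper are hypothetical and are not GEHC data or tooling.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale a list of numbers to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mu) / sigma for v in values]


# Hypothetical impedance magnitudes in arbitrary units (illustrative only).
simulated = [10.0, 12.0, 14.0, 16.0]
measured = [51.0, 57.0, 63.0, 69.0]  # same shape, different offset and gain

# Raw values differ widely between the two sources...
raw_gap = max(abs(a - b) for a, b in zip(simulated, measured))

# ...but after per-source standardization the two curves coincide.
sim_z = standardize(simulated)
meas_z = standardize(measured)
max_gap = max(abs(a - b) for a, b in zip(sim_z, meas_z))
```

Real simulation-to-measurement discrepancies are unlikely to reduce to an affine change, so a transformation of this kind would only be a first step before the choice of summary statistics.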

This project will be based on a collaboration between GEHC and Inria so as to combine physical (transducer), numerical (simulation), and algorithmic (artificial intelligence) knowledge in the analysis of supply chain data. First, we will aim to design a new version of the defect detection procedure that can be carried out before the supply chain is set up, using only simulations. This procedure will adopt regression models recently developed in the simulation-based inference literature [1]. We will explore two main research directions for learning the relation between the physical properties of the transducers and the statistics of the simulated data: mixtures of experts [3] and invertible neural networks, such as normalizing flows [2]. Having a dataset of numerical simulations will allow us to detect failures very early in the manufacturing flow, at the risk of inaccurate results on real data because the statistical properties of the two sources of information differ.
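
For reference, the normalizing-flow direction rests on the change-of-variables formula. The notation below is an assumption introduced for illustration and does not come from the project description: \theta denotes the physical properties of a transducer, x the statistics of the simulated data, f_\phi an invertible map (in \theta) with parameters \phi, and p_Z a simple base density such as a standard Gaussian.

```latex
% Conditional density defined by an invertible map f_\phi(\,\cdot\,; x)
\log q_\phi(\theta \mid x)
  = \log p_Z\bigl(f_\phi(\theta; x)\bigr)
  + \log \left| \det \frac{\partial f_\phi(\theta; x)}{\partial \theta} \right|

% Training on N simulated pairs (\theta_i, x_i) by maximum likelihood
\hat{\phi} = \arg\max_{\phi} \; \frac{1}{N} \sum_{i=1}^{N} \log q_\phi(\theta_i \mid x_i)
```

Under these assumptions, only simulated pairs are needed to fit \phi, which is what makes training possible before any real transducer has been measured.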

To address this issue, we propose to initialize the statistical models using the simulated data and to improve them through continual learning with the samples collected from real experiments. We will also explore different choices of summary statistics to make the correspondence between simulated and real data as close as possible. We will develop online versions of the learning algorithms so as to integrate real data as soon as they become available and to make the models adaptive to changes in the production line that may affect the statistics of the observations. We will use stochastic approximation principles to design these online versions.
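
The idea of initializing on simulations and then updating online can be sketched on the simplest possible model, a running Gaussian estimate updated by a Robbins-Monro recursion with step size 1/n. This is only an illustrative sketch: the `OnlineGaussian` class, the `n_prior` pseudo-count, and the sample values are hypothetical, and the actual models in the project would be far richer.

```python
class OnlineGaussian:
    """Running mean and (population) variance, updated one sample at a time."""

    def __init__(self, mean, var, n_prior):
        self.mean = mean
        self.var = var
        self.n = n_prior  # pseudo-count: weight carried by the simulations

    def update(self, x):
        """Stochastic-approximation step with decreasing step size 1/n."""
        self.n += 1
        step = 1.0 / self.n
        delta = x - self.mean            # deviation from the old mean
        self.mean += step * delta
        # Welford-style recursion written as a stochastic-approximation update.
        self.var += step * (delta * (x - self.mean) - self.var)


# Initialize from hypothetical simulated values [1, 2, 3]:
# mean 2, population variance 2/3, three pseudo-observations.
model = OnlineGaussian(mean=2.0, var=2.0 / 3.0, n_prior=3)

# Integrate hypothetical real measurements as soon as they arrive.
for x in [4.0, 5.0]:
    model.update(x)
```

With the 1/n step the recursion reproduces the batch estimates over all five values. Replacing it with a small constant step would instead favor recent samples, which is one way to track drifts in the production line.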

Once detected, the defects and their corresponding measurements will be gathered and used, possibly in an online manner too, to learn a clustering model that identifies different classes of defect. We will consider Bayesian nonparametric versions of the clustering algorithm so as to allow continuous improvement of the detections without committing to an arbitrary number of classes a priori [4]. Such a strategy will allow the number of classes to change online, as new types of defects may be discovered during production. This refined classification scheme will help uncover the causes of defects in the transducers by means of statistical feature selection techniques, as well as identify sequences of actions occurring during the production of a transducer that may lead to a defect.
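
To illustrate how the number of classes can grow online, the sketch below uses a single-pass variant of DP-means, a hard-assignment simplification inspired by Dirichlet process mixtures. It is not the variational method of [4]; the `online_dp_means` function, the threshold `lam`, and the points are hypothetical and chosen only for illustration.

```python
def online_dp_means(points, lam):
    """Single-pass clustering that opens a new cluster when a point is
    farther than `lam` (squared Euclidean distance) from every centroid."""
    centers = []  # centroid coordinates
    counts = []   # number of points assigned to each centroid
    labels = []   # cluster index of each point, in arrival order
    for p in points:
        best, best_d2 = -1, float("inf")
        for k, c in enumerate(centers):
            d2 = sum((a - b) ** 2 for a, b in zip(p, c))
            if d2 < best_d2:
                best, best_d2 = k, d2
        if best_d2 > lam:
            # No existing class explains this point: create a new one.
            centers.append(list(p))
            counts.append(1)
            labels.append(len(centers) - 1)
        else:
            # Move the winning centroid toward the point (running mean).
            counts[best] += 1
            step = 1.0 / counts[best]
            centers[best] = [c + step * (a - c) for a, c in zip(p, centers[best])]
            labels.append(best)
    return centers, labels


# Hypothetical 2-D defect signatures arriving one by one.
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.0), (10.0, 0.0)]
centers, labels = online_dp_means(points, lam=4.0)
```

In this toy run the last point is far from both existing centroids, so a third class appears without the number of classes having been fixed in advance. The result of such a one-pass scheme depends on the arrival order and on `lam`, which a full Bayesian nonparametric model would handle through its prior.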

References:

[1] Cranmer et al. “The frontier of simulation-based inference” (2020)
[2] Papamakarios et al. “Normalizing Flows for Probabilistic Modeling and Inference” (2021)
[3] Deleforge et al. “High-dimensional regression with Gaussian mixtures and partially-latent response variables” (2015)
[4] Blei and Jordan “Variational inference for Dirichlet process mixtures” (2006)

Skills

Technical skills and level required:

  • Knowledge of Bayesian inference methods (e.g. MCMC, ABC)
  • Scientific programming in Python, with knowledge of standard machine learning libraries such as numpy, scipy, sklearn, pandas
  • (Desirable) Familiarity with methods for solving inverse problems
  • (Desirable) Knowledge of generative neural networks (e.g. normalizing flows, GANs)
  • Languages: English (reading, writing, speaking) and French

Interpersonal skills:

  • enjoy working in a team
  • be able to communicate about results and progress
  • be open to different ideas.

Additional skills appreciated:

  • be able to work with people located at different sites.

Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 days of RTT (reduced working time, full-time basis) + possibility of exceptional leave (e.g. sick children, moving home)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities (Association de gestion des œuvres sociales d'Inria)
  • Access to vocational training
  • Social security coverage

Remuneration

Gross salary: 2653 € per month