Analogical reasoning for KG management tasks: Entity Alignment and GraphRAG

Contract type : Internship

Level of qualifications required : Graduate degree or equivalent

Function : Internship Research

Context

Analogical reasoning, expressed with analogical quadruples of the form “a is to b as c is to d” (e.g. Paris is to France as Berlin is to Germany), is a natural way for human beings to reason about new situations based on knowledge gained from similar ones. Its value has been demonstrated in various human cognitive tasks, such as natural language learning and problem-solving, and more recently in machine learning through analogy-based classifiers [9] and retrievers [5].
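
As an illustration, a quadruple “a is to b as c is to d” can be read geometrically: if entities are represented as vectors, d should lie close to c + (b - a). The sketch below shows this vector-offset reading only; it is not the analogy-based classifier of [9] or [6]. The two-dimensional embeddings and the function name solve_analogy are hypothetical and hand-picked for the example.

```python
import math

# Hypothetical 2-D embeddings, hand-picked for illustration only.
# First coordinate: which country; second coordinate: capital (3.0) vs country (1.0).
EMBEDDINGS = {
    "Paris": (1.0, 3.0),
    "France": (1.0, 1.0),
    "Berlin": (4.0, 3.0),
    "Germany": (4.0, 1.0),
    "Rome": (7.0, 3.0),
    "Italy": (7.0, 1.0),
}


def solve_analogy(a, b, c, embeddings=EMBEDDINGS):
    """Return the entity d that best completes "a is to b as c is to d".

    Uses the vector-offset reading: d should be close to c + (b - a).
    """
    va, vb, vc = embeddings[a], embeddings[b], embeddings[c]
    target = tuple(ci + bi - ai for ai, bi, ci in zip(va, vb, vc))
    # Exclude the three query entities from the candidate answers.
    candidates = [e for e in embeddings if e not in (a, b, c)]
    return min(candidates, key=lambda e: math.dist(embeddings[e], target))


if __name__ == "__main__":
    print(solve_analogy("Paris", "France", "Berlin"))
```

With real embeddings the offset rarely matches exactly, which is one reason learned analogy classifiers are studied instead of this plain arithmetic.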

Previous work [6] has demonstrated that analogy-based classifiers can be applied to knowledge graph (KG) management tasks, reporting strong results for domain-specific KG bootstrapping. This paves the way for testing the same analogy-based classifier on other KG management tasks. Among those, we aim to study:

  • Entity alignment, where entities that refer to the same real-world object in different knowledge graphs should be detected [7,8], e.g. two entities representing the same city in two different graphs.
  • GraphRAG, where facts from KGs are used to ground LLMs [1,2,3,4], e.g. a graph describing the products of a company is used to answer questions about these products.
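
To make the two tasks concrete, here is a minimal sketch of naive baselines for each: entity alignment by label string similarity, and GraphRAG-style grounding by keyword lookup of triples. Neither baseline uses analogical reasoning, and neither is the method of the cited works; the internship would study where analogy can replace or enrich such components. All graphs, identifiers and function names below are hypothetical, and no LLM is called.

```python
from difflib import SequenceMatcher

# Hypothetical toy KGs, as (subject, predicate, object) triples.
KG_A = [("a:Nice", "label", "Nice"), ("a:Paris", "label", "Paris")]
KG_B = [("b:city_1", "label", "Nice, FR"), ("b:city_2", "label", "Paris (city)")]
PRODUCT_KG = [
    ("WidgetX", "madeBy", "AcmeCorp"),
    ("WidgetX", "price", "20 EUR"),
    ("GadgetY", "madeBy", "AcmeCorp"),
]


def labels(kg):
    """Map each entity to its label."""
    return {s: o for s, p, o in kg if p == "label"}


def label_similarity(x, y):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def align_entities(kg_a, kg_b, threshold=0.5):
    """Pair each entity of kg_a with the kg_b entity whose label is most similar."""
    labels_a, labels_b = labels(kg_a), labels(kg_b)
    pairs = []
    for ent_a, lab_a in labels_a.items():
        best = max(labels_b, key=lambda ent_b: label_similarity(lab_a, labels_b[ent_b]))
        # Keep the pair only if the best match is similar enough.
        if label_similarity(lab_a, labels_b[best]) >= threshold:
            pairs.append((ent_a, best))
    return pairs


def retrieve_facts(question, kg):
    """Keep triples whose subject or object is mentioned in the question."""
    q = question.lower()
    return [t for t in kg if t[0].lower() in q or t[2].lower() in q]


def build_grounded_prompt(question, kg):
    """Prepend the retrieved facts to the question, ready to send to an LLM."""
    facts = "\n".join(f"{s} {p} {o}." for s, p, o in retrieve_facts(question, kg))
    return f"Facts:\n{facts}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(align_entities(KG_A, KG_B))
    print(build_grounded_prompt("What is the price of WidgetX?", PRODUCT_KG))
```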

Assignment

In this internship, we propose to study the application of analogical reasoning to two KG management tasks: entity alignment and GraphRAG.

This internship will take place on the premises of the Wimmics team in Sophia Antipolis, under the supervision of:

 

Wimmics (Web-Instrumented huMan-Machine Interactions, Communities and Semantics) is a joint research team at Université Côte d’Azur, Inria, CNRS, I3S, whose research lies at the intersection of artificial intelligence and the Web. Wimmics members work on methods to extract, control, query, validate, infer, explain and interact with knowledge.

Main activities

The internship will include the following tasks:

  1. Understanding key concepts of KG, entity alignment, GraphRAG, and analogical reasoning through a literature review.
  2. Applying analogical reasoning to entity alignment
    1. Identifying benchmark datasets (with a potential extension to the Ontology Matching task)
    2. Designing the experimental pipeline
    3. Implementation, experimentation and evaluation
  3. Applying analogical reasoning to RAG/GraphRAG
    1. Identifying RAG/GraphRAG components that could be replaced / enriched with analogical reasoning
    2. Identifying benchmark datasets
    3. Designing the experimental pipeline for RAG and GraphRAG
    4. Implementation, experimentation and evaluation

Skills

You are in the second year of a Master's program or the final year of engineering school, specializing in computer science or applied mathematics. Expected skills:

  • Python programming
  • Machine Learning / Deep Learning, especially with frameworks like PyTorch or TensorFlow
  • Knowledge of LLMs, frameworks like LangChain, and (Graph)RAG would be appreciated
  • Knowledge of the Semantic Web (RDF, RDFS, OWL, SPARQL, knowledge graphs, and ontologies) would be appreciated
  • Ability to read and write in English

You are curious, eager to learn, and willing to face challenges, experiment, and discover by yourself.

 

Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage