PhD Position F/M Towards a programmable autonomic platform for decentralized learning
Contract type : Fixed-term contract
Level of qualifications required : Graduate degree or equivalent
Function : PhD Position
Level of experience : Recently graduated
About the research centre or Inria department
The Inria Rennes - Bretagne Atlantique Centre is one of Inria's eight centres and has more than thirty research teams. The Inria Centre is a major and recognized player in the field of digital sciences. It is at the heart of a rich R&D and innovation ecosystem: highly innovative SMEs, large industrial groups, competitiveness clusters, research and higher education players, laboratories of excellence, technological research institutes, etc.
Context
The Internet of Things, with its myriad devices constantly producing data at the Edge of the Internet, has become a huge source of data to be collected, processed and analysed in near real time. In particular, AI applications are natural consumers of these data [1]. With the emergence of Edge computing [2], moving the learning process behind these applications closer to the Edge, where data are produced, is appealing, but it raises multiple challenges.
One promising approach is Federated Learning (FL) [3], where each device learns from locally produced data, creates a model and then sends it to a centralized server in charge of merging the locally obtained models into a single global model. This global model is then sent back to the devices, so that each of them can resume learning from a model built on data collected everywhere. Compared to a purely centralized learning approach, Federated Learning does not move local data to the server, thus enabling privacy by design.
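As an illustration of the merging step described above, here is a minimal sketch of how a server might combine local models. It assumes that a model is a flat list of weights and that each device reports how many local samples it trained on; the function name `federated_average` and the weighting by sample count are illustrative assumptions in the spirit of [3], not a description of any particular FL implementation.

```python
from typing import List, Sequence


def federated_average(
    local_models: Sequence[Sequence[float]],
    sample_counts: Sequence[int],
) -> List[float]:
    """Merge local models into one global model.

    Assumption: each model is a flat list of weights, and each device's
    contribution is weighted by the number of samples it trained on.
    """
    if not local_models or len(local_models) != len(sample_counts):
        raise ValueError("need one sample count per local model")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(local_models[0])
    global_model = [0.0] * dim
    for weights, count in zip(local_models, sample_counts):
        if len(weights) != dim:
            raise ValueError("all local models must have the same size")
        for i, w in enumerate(weights):
            # Weighted contribution of this device to coordinate i.
            global_model[i] += w * count / total
    return global_model


# Two devices: the second one holds three times more data than the first.
print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))  # [2.5, 3.5]
```

Only the model weights and a sample count reach the server in this sketch; the raw data never leave the devices, which is the property the paragraph above refers to.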
Yet, FL still relies on a centralized server as coordinator and merger of the models. Without it, no aggregation is possible. Moreover, it suffers from the traditional limitations of centralized approaches: limited scalability and resilience. While it has been shown that adapting learning to more decentralized platforms is promising [4], learning efficiently over decentralized platforms remains a largely open problem.
[1] R. Singh and S. S. Gill, “Edge AI: a survey”, Internet of Things and Cyber-Physical Systems, vol. 3 (2023), pp. 71–92, doi:10.1016/j.iotcps.2023.02.004
[2] Weisong Shi et al., "Edge computing: Vision and challenges", in: IEEE internet of things journal 3.5 (2016), pp. 637–646.
[3] H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Agüera y Arcas. 2016. Communication-Efficient Learning of Deep Networks from Decentralized Data. https://doi.org/10.48550/arXiv.1602.05629
[4] Hegedűs, I., Danner, G., & Jelasity, M. Gossip learning as a decentralized alternative to federated learning. In Distributed Applications and Interoperable Systems: 19th IFIP WG 6.1 International Conference, DAIS 2019, Kongens Lyngby, Denmark, June 17–21, 2019, Proceedings 19 (pp. 74-90). Springer International Publishing.
Assignment
Unlike the FL approach, where the centralized component is vital for the mere functioning of the system, we plan to adopt the recently proposed alternative in which devices learn locally and opportunistically exchange their models so as to build a more complete model. Finding the right trade-off between the accuracy of the models within an area (composed of one or several interrelated nodes) and the network traffic generated will be the main driver for the solution.
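To make the idea of opportunistic model exchange more concrete, the following is a minimal sketch of one possible gossip-style round. It rests on several assumptions that are not part of the project description: models are flat lists of weights, each node contacts one uniformly random peer per round, and merging is a plain coordinate-wise average. The names `merge` and `gossip_round` are hypothetical. The function returns the number of models sent, so that accuracy can be weighed against network traffic.

```python
import random
from typing import List


def merge(a: List[float], b: List[float]) -> List[float]:
    """Assumed merge rule: coordinate-wise average of two models."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def gossip_round(models: List[List[float]], rng: random.Random) -> int:
    """Run one round in which each node exchanges models with a random peer.

    Both nodes keep the merged model. Returns the number of models sent
    over the network during the round (two per exchange).
    """
    n = len(models)
    if n < 2:
        raise ValueError("gossip needs at least two nodes")
    messages = 0
    for node in range(n):
        # Pick a peer uniformly among the other nodes.
        peer = rng.randrange(n - 1)
        if peer >= node:
            peer += 1
        merged = merge(models[node], models[peer])
        models[node] = merged
        models[peer] = list(merged)
        messages += 2
    return messages


# Four nodes, each starting from a different one-parameter local model.
models = [[0.0], [4.0], [8.0], [12.0]]
rng = random.Random(0)
traffic = sum(gossip_round(models, rng) for _ in range(100))
print(traffic, models)
```

With this symmetric merge rule, the average of all models is preserved and the nodes drift towards it without any central server. Running fewer rounds or exchanging less often reduces traffic at the cost of models that agree less, which is the trade-off mentioned above.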
Once deployed, the runtime system must face the changing conditions of the platform, such as device disconnections and network bottlenecks. More generally, adaptation is needed to adjust the parameters, hyperparameters and methods used to combine local models, so as to ensure the best trade-off between the accuracy of the model and resource usage.
A related problem is the ease with which a programmer can deploy, monitor and adapt the platform and its parameters at runtime. This relates to the notion of programmability. Developing abstractions for the easy deployment and control of AI programs over decentralized platforms is an important step towards a wider adoption of the approach. This will constitute the third aspect studied during the thesis.
The work will be experimentally validated on a real large-scale platform such as Grid'5000 [5]. To ease such a deployment, the E2Clab framework [6,7] will be used. E2Clab implements a rigorous methodology that provides guidelines for mapping real-life application workflows to representative settings of the underlying physical infrastructure, in order to accurately reproduce the relevant behaviors of the application and thereby understand its end-to-end performance.
[5] Daniel Balouek et al. Adding Virtualization Capabilities to the Grid’5000 Testbed, in: Cloud Computing and Services Science, ed. by Ivan I. Ivanov et al., vol. 367, Communications in Computer and Information Science, Springer International Publishing, 2013, pp. 3–20, isbn: 978-3-319-04518-4.
[6] Daniel Rosendo, Pedro Silva, et al. E2Clab: Exploring the Computing Continuum through Repeatable, Replicable and Reproducible Edge-to-Cloud Experiments. Cluster 2020 - IEEE International Conference on Cluster Computing, Sep 2020, Kobe, Japan.
[7] The E2Clab project: https://team.inria.fr/kerdata/e2clab/
Main activities
The activity of the PhD student recruited will include:
- Analysis and synthesis of the state of the art
- Design of distributed algorithms
- Proposal of programming abstractions
- Development and deployment at large scale of a runtime proof of concept
- Report and scientific article writing
Skills
Qualifications:
- Good communication and writing skills
- Strong programming / scripting skills
- Knowledge and/or experience in one or more of the following areas: distributed systems, adaptive systems, Cloud, Edge, Stream Processing, decentralized learning
Benefits package
- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage
Remuneration
Monthly gross salary: 2200 euros
General Information
- Theme/Domain : Distributed Systems and middleware
  Scientific computing (BAP E)
- Town/city : Rennes
- Inria Center : Centre Inria de l'Université de Rennes
- Starting date : 2025-09-01
- Duration of contract : 3 years
- Deadline to apply : 2025-05-12
Warning : you must enter your e-mail address in order to save your application to Inria. Applications must be submitted online on the Inria website. Processing of applications sent from other channels is not guaranteed.
Instructions to apply
Defence Security :
This position is likely to be situated in a restricted area (ZRR), as defined in Decree No. 2011-1425 relating to the protection of national scientific and technical potential (PPST). Authorisation to enter such an area is granted by the director of the unit, following a favourable Ministerial decision, as defined in the decree of 3 July 2012 relating to the PPST. An unfavourable Ministerial decision in respect of a position situated in a ZRR would result in the cancellation of the appointment.
Recruitment Policy :
As part of its diversity policy, all Inria positions are accessible to people with disabilities.
Contacts
- Inria Team : MAGELLAN
- PhD Supervisor : Tedeschi Cedric / Cedric.Tedeschi@irisa.fr
About Inria
Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 200 agile project teams, generally run jointly with academic partners, include more than 3,500 scientists and engineers working to meet the challenges of digital technology, often at the interface with other disciplines. The Institute also employs numerous talents in over forty different professions. 900 research support staff contribute to the preparation and development of scientific and entrepreneurial projects that have a worldwide impact.