Users' trust in collaborative writing in Wikipedia

Contract type : Internship agreement

Level of qualifications required : Master's or equivalent

Function : Internship Research

Context

The internship will be hosted by the COAST team at the Inria Centre at Université de Lorraine.

The internship will be co-supervised by Claudia Ignat, Research Director at Inria, and Leo Joubert, Associate Professor (MCF) at Université de Rouen Normandie.

The internship is in the context of the PEPR eNSEMBLE (https://pepr-ensemble.fr/).

Assignment

Large-scale collaborative systems, in which many users work together on a shared task, are attracting considerable attention from industry and academia. CSCW studies [1,2] have shown that awareness of other team members' behavior is an important way to compensate for the lack of direct communication. When each member is aware of what the others are doing, trust can be built within the team [3]. Trust is defined as an individual’s willingness to become vulnerable to the actions of others with the expectation that others will follow through on their commitments [4]. Trust is even more crucial in open collaborative systems such as Wikipedia, in which members usually do not know each other personally. However, it is difficult for end users to manually assess the level of trust in each partner, that is, the credibility value that a user can attribute to another user based on their past interactions. This internship aims to study the problem of trust evaluation and to design a computational trust model dedicated to collaborative systems.

We are particularly interested in the case of Wikipedia, a collaborative online encyclopedia, because it provides us with a huge database produced by a large number of contributors.

On the platform, users can submit revisions of articles to improve their content. The objective of Wikipedia is to ensure the quality and neutrality of the platform's documents.

We have already studied how the collaborative interaction of one user affects the trust assessed by other users in the trust game [5] and in contract-based multi-synchronous collaboration [6]. In the trust game [7], the interaction consists of a money transaction between two users, while in contract-based multi-synchronous collaboration the computation of trust is based on the adherence to or violation of contracts shared between two users. In the context of the trust game we also showed (i) that presenting a trust score to users encourages collaboration between them in a meaningful way, at a level similar to displaying participants' nicknames; and (ii) that users conform to the trust score in their decision-making regarding monetary exchange [8]. These results suggest that a trust model can be deployed in collaborative systems to assist users. However, in Wikipedia, users do not interact directly but through the articles to which they contribute, so it is difficult to determine how one user’s edits influence another user’s edits.

The scientific literature usually assesses the quality of a contribution by its lifetime on a page: the longer the content of the contribution remains present, the higher its quality is deemed to be. The problem with this measure is that it leaves two factors out of the quality judgment: the mutual trust that contributors may have in each other, and the fact that the Wikipedia rules justifying the deletion of contributions may be applied differently from one page to another.

To address this issue, we want to compute a Wikipedia user's trust level from their past contributions, such that this trust level can predict the quality of the user's future contributions. The trust metric proposed in [5, 6], which predicts user behavior from past interactions and takes fluctuations in that behavior into account, could be applied by treating users' contributions to revisions of Wikipedia articles as the interactions between users. The main challenge is to define the quality of a user's contributions. To this end, we plan to study existing metrics based on the length of contributions (for example, the number of characters added) and on the longevity of contributions (edit longevity, for example how long a contribution persists in the article).
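As a rough illustration of the two families of metrics mentioned above, the following Python sketch computes the length of a contribution (characters added, taken from a word-level diff) and a simple edit longevity (the average share of the added words still present in later revisions). The function names, the word-level tokenization, and the averaging rule are assumptions made for this example only; they are not the metrics of [5, 6], nor a definition taken from the existing literature.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def tokenize(text: str) -> list[str]:
    """Split a revision into words (hypothetical, deliberately naive tokenizer)."""
    return re.findall(r"\w+", text)


def added_words(old: str, new: str) -> list[str]:
    """Words that a word-level diff marks as inserted or replaced in `new`."""
    old_words, new_words = tokenize(old), tokenize(new)
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(new_words[j1:j2])
    return added


def contribution_length(old: str, new: str) -> int:
    """Length metric: number of characters in the added words."""
    return sum(len(word) for word in added_words(old, new))


def edit_longevity(revisions: list[str], index: int) -> float:
    """Longevity metric (assumed form): average fraction of the words added by
    revision `index` that are still present in each later revision."""
    previous = revisions[index - 1] if index > 0 else ""
    added = Counter(added_words(previous, revisions[index]))
    later = revisions[index + 1:]
    total = sum(added.values())
    if total == 0 or not later:
        return 0.0
    shares = []
    for text in later:
        present = Counter(tokenize(text))
        surviving = sum(min(n, present[word]) for word, n in added.items())
        shares.append(surviving / total)
    return sum(shares) / len(shares)


# Toy revision history of one article (invented data, for illustration only).
revisions = [
    "Nancy is a city.",
    "Nancy is a city in France.",
    "Nancy is a city in Lorraine.",
    "Nancy is a city in Lorraine, France.",
]
print(contribution_length(revisions[0], revisions[1]))  # 8
print(edit_longevity(revisions, 1))                     # 0.75
```

In this toy history, the second revision adds "in France" (8 characters); half of those words survive the third revision and all of them are present in the fourth, giving a longevity of 0.75. A real study would also have to attribute revisions to users and handle reverts, which this sketch ignores.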

Our approach relies on the use of a distance (for example the Levenshtein distance) between the different versions of the document. We would like to compute a measure of longevity based on a semantic distance, using BERT [9, 11] and SMART [10] models, and compare it with existing measures. Wikipedia provides a dataset containing articles whose quality has been manually assessed by experts [12][13]. We therefore intend to validate our algorithms for measuring the quality of user contributions on this data.
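To make the syntactic distance concrete, here is a minimal Python sketch of the Levenshtein (edit) distance between two versions of a text, together with a length-normalized variant. The normalization by the longer string and the function names are assumptions chosen for this example. A semantic distance based on BERT or SMART embeddings is not shown, since it depends on model choices that are not fixed here.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `a` into `b` (classic dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the length of the longer string (assumed
    normalization), so that the result lies between 0.0 and 1.0."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


print(levenshtein("kitten", "sitting"))  # 3
print(normalized_distance("Nancy is a city", "Nancy is a city in France"))  # 0.4
```

Character-level distances of this kind are sensitive to rewording that preserves meaning, which is one motivation for comparing them with a semantic distance.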

Bibliography:

[1] Jeremy P. Birnholtz and Steven Ibara. Tracking changes in collaborative writing: edits, visibility and group maintenance. In CSCW 2012. ACM, 809–818.

[2] Chyng-Yang Jang, Charles Steinfield, and Ben Pfaff. Virtual team awareness and groupware support: an evaluation of the TeamSCOPE system. Int. J. Hum.-Comput. Stud. 56, 1 (2002), 109–126.

[3] C. Brad Crisp and Sirkka L. Jarvenpaa. Swift trust in global virtual teams. Journal of Personnel Psychology (2013).

[4] Roger C. Mayer and Mark B. Gavin. 2005. Trust in management and performance: Who minds the shop while the employees watch the boss? Acad Manage J 48, 5: 874–888.

[5] Quang-Vinh Dang and Claudia-Lavinia Ignat. Computational trust model for repeated trust games. In Proceedings of the IEEE Trustcom/BigDataSE/ISPA, Tianjin, China, pages 34–41, August 2016.

[6] Claudia-Lavinia Ignat and Quang-Vinh Dang. “Users trust assessment based on their past behavior in large scale collaboration”. In: The IEEE International Conference on Intelligent Computer Communication and Processing (ICCP 2021). Cluj-Napoca, Romania, Oct. 2021, 19:1–19:8. doi: 10.1109/ICCP53602.2021.9733490. hal: hal-03469344.

[7] Joyce Berg, John Dickhaut, and Kevin McCabe. Trust, reciprocity, and social history. Games and Economic Behavior, 10(1):122–142, 1995.

[8] Claudia-Lavinia Ignat, Quang-Vinh Dang, and Valerie L. Shalin. The influence of trust score on cooperative behavior. ACM Transactions on Internet Technology, 19(4), 22 pages, November 2019.

[9] Liu Zhuang, Lin Wayne, Shi Ya, and Zhao Jun. “A Robustly Optimized BERT Pre-training Approach with Post-training”. In: Proceedings of the 20th Chinese National Conference on Computational Linguistics. CCL 2021. Huhhot, China: Springer, Aug. 2021, pp. 471–484. doi: 10.1007/978-3-030-84186-7_31.

[10] Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, and Tuo Zhao. “SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization”. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, July 2020, pp. 2177–2190. doi: 10.18653/v1/2020.acl-main.197.

[11] Wei Wang et al. “StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding”. In: Proceedings of the 8th International Conference on Learning Representations. ICLR 2020. Addis Ababa, Ethiopia: OpenReview.net, Apr. 2020. url: https://openreview.net/forum?id=BJgQ4lSFPH

[12] Morten Warncke-Wang, Dan Cosley, and John Riedl. Tell me more: an actionable quality model for Wikipedia. In Proceedings of OpenSym, 10 pages, August 2013.

[13] Morten Warncke-Wang, English Wikipedia Quality Assessment Dataset. Figshare, Dataset. https://doi.org/10.6084/m9.figshare.1375406.v2

Main activities

Study the existing trust metrics in collaborative systems
Study existing work on article quality in Wikipedia
Propose a metric for the quality of user contributions based on the length and longevity of contributions (using both syntactic and semantic distances)
Adapt the trust metric proposed in [5] to Wikipedia, treating users' contributions to article revisions as the counterpart of user interactions in the trust game
Perform measurements using the Wikipedia dataset

Skills

Engineering and/or Master 2 degree in Computer science / Applied mathematics / Cognitive science
Theoretical expertise: collaborative systems
Good collaborative and networking skills, excellent written and oral communication in English
Good programming skills
Strong analytical skills

Benefits package

Subsidized meals
Partial reimbursement of public transport costs
Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
Professional equipment available (videoconferencing, loan of computer equipment, etc.)
Social, cultural and sports events and activities
Access to vocational training
Social security coverage

Remuneration

4.35 €/hour

General Information

Theme/Domain : Distributed Systems and middleware
Statistics (Big data) (BAP E)
Town/city : Villers lès Nancy
Inria Center : Centre Inria de l'Université de Lorraine
Starting date : 2024-03-01
Duration of contract : 6 months
Deadline to apply : 2024-12-12

Warning : you must enter your e-mail address in order to save your application to Inria. Applications must be submitted online on the Inria website. Processing of applications sent from other channels is not guaranteed.

Instruction to apply

Defence Security :
This position is likely to be situated in a restricted area (ZRR), as defined in Decree No. 2011-1425 relating to the protection of national scientific and technical potential (PPST). Authorisation to enter an area is granted by the director of the unit, following a favourable Ministerial decision, as defined in the decree of 3 July 2012 relating to the PPST. An unfavourable Ministerial decision in respect of a position situated in a ZRR would result in the cancellation of the appointment.

Recruitment Policy :
As part of its diversity policy, all Inria positions are accessible to people with disabilities.

Contacts

Inria Team : COAST
Recruiter :
Ignat Claudia-Lavinia / claudia.ignat@inria.fr

About Inria

Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 200 agile project teams, generally run jointly with academic partners, include more than 3,500 scientists and engineers working to meet the challenges of digital technology, often at the interface with other disciplines. The Institute also employs numerous talents in over forty different professions. 900 research support staff contribute to the preparation and development of scientific and entrepreneurial projects that have a worldwide impact.