Submissions/Wikipedia Ranking of World Universities: an example of influence measurement from statistical analysis of Wikipedia network


This is an Open submission for Wikimania 2017 that has not yet been reviewed by a member of the Programme Committee.

Submission no. 6009 Subject - E3
Title of the submission
Wikipedia Ranking of World Universities: an example of influence measurement from statistical analysis of Wikipedia network
Type of submission (lecture, panel, tutorial/workshop, roundtable discussion, lightning talk, poster, birds of a feather discussion)
Author of the submission
José Lages (presenter)
Dima Shepelyansky
Language of presentation
E-mail address
Country of origin
Affiliation, if any (organisation, company etc.)
Institut UTINAM, Université de Bourgogne-Franche-Comté, CNRS
Quantware group, laboratoire de physique théorique de Toulouse, Université de Toulouse, CNRS
Personal homepage or blog
Abstract (up to 300 words to describe your proposal)
Wikipedia currently contains about 300 language editions, which represent complementary cultural views on human knowledge. Extracting the nontrivial, hidden information encoded in this huge amount of encyclopedic data remains a great challenge. With this aim, we applied Markov chain and Google matrix methods to analyze the directed networks of hyperlinked Wikipedia articles for different language editions. We used several algorithms (PageRank, CheiRank, and 2DRank) to rank articles within a given language edition. Using automatic extraction of the ranked articles devoted to universities and colleges, we obtained the Wikipedia Ranking of World Universities (WRWU) by analyzing the networks of 24 language editions (about 17 million articles), representing about 60 percent of the world population and about 60 percent of all Wikipedia articles. Strikingly, the WRWU measures the academic excellence of universities as effectively as, e.g., the Academic Ranking of World Universities (ARWU) of Shanghai Jiao Tong University. We showed that, besides academic excellence, the WRWU also measures the societal, historical, and regional importance of universities. In contrast to the composite nature of the ARWU (counting of Nobel Prizes, number of articles in Nature, ...), the WRWU is based on statistical grounds and treats all cultural points of view and all academic disciplines on an equal footing. The WRWU, which has attracted some worldwide interest, can be considered complementary to existing college and university rankings. In fact, information about these rankings is automatically encoded in the WRWU, since their periodic releases influence the Wikipedia editors who update articles related to colleges and universities. Ultimately, the WRWU gives a global picture of the most influential universities in the world.
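As an illustration of the Google matrix approach mentioned in the abstract, the following is a minimal sketch of PageRank and CheiRank on a toy directed network. It is not the authors' implementation: the four-article graph, the damping factor `alpha = 0.85`, and the helper names (`google_matrix`, `stationary_vector`, `pagerank`, `cheirank`) are assumptions chosen for illustration. CheiRank is computed here as the PageRank of the network with all link directions inverted. 2DRank, which combines both rankings, is not shown.

```python
import numpy as np

def google_matrix(adj, alpha=0.85):
    """Column-stochastic Google matrix for adj[i, j] = 1 when article j links to article i.

    alpha is the damping factor (0.85 is an assumed, commonly used value).
    """
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    S = np.empty((n, n), dtype=float)
    for j in range(n):
        if col_sums[j] > 0:
            S[:, j] = adj[:, j] / col_sums[j]   # follow one of j's out-links at random
        else:
            S[:, j] = 1.0 / n                   # dangling article: jump anywhere
    return alpha * S + (1.0 - alpha) / n

def stationary_vector(G, tol=1e-12, max_iter=1000):
    """Power iteration for the stationary probability vector of the Markov chain G."""
    n = G.shape[0]
    p = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        p_next = G @ p
        p_next /= p_next.sum()
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

def pagerank(adj, alpha=0.85):
    return stationary_vector(google_matrix(adj, alpha))

def cheirank(adj, alpha=0.85):
    # Same computation on the network with inverted link directions.
    return stationary_vector(google_matrix(adj.T, alpha))

# Hypothetical toy network of 4 articles, given as (source, target) hyperlinks.
links = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
adj = np.zeros((4, 4))
for src, dst in links:
    adj[dst, src] = 1.0

pr = pagerank(adj)
ch = cheirank(adj)
pagerank_order = np.argsort(-pr)   # articles sorted by decreasing PageRank
cheirank_order = np.argsort(-ch)   # articles sorted by decreasing CheiRank
```

In this sketch, PageRank favors articles that receive many incoming links, while CheiRank favors articles with many outgoing links.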

What will attendees take away from this session?
Attendees will learn about:
  • extraction of hidden information from Wikipedia
  • measuring influence through Wikipedia editions
  • measuring cultural entanglement/interactions through Wikipedia editions
  • Google matrix treatment of Wikipedia networks
  • Wikipedia Ranking of World Universities
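The abstract does not state how the per-edition rankings are combined into a single world ranking. As a hedged sketch only, one simple possibility is a rank-sum score in which a university earns more points the higher it appears in each edition's top list. The function name `aggregate_rankings`, the scoring rule, the edition codes, and the placeholder university names below are all assumptions for illustration, not the method of the submission.

```python
from collections import defaultdict

def aggregate_rankings(top_lists, list_size=100):
    """Combine per-edition top lists into one global ranking (assumed scoring rule).

    A university at position r (1-based) in an edition's list earns
    list_size + 1 - r points; points are summed over all editions.
    """
    scores = defaultdict(int)
    for ranked in top_lists.values():
        for position, university in enumerate(ranked[:list_size], start=1):
            scores[university] += list_size + 1 - position
    # Sort by decreasing score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical top-3 lists for three language editions.
top_lists = {
    "en": ["Univ A", "Univ B", "Univ C"],
    "fr": ["Univ B", "Univ A", "Univ D"],
    "de": ["Univ B", "Univ D", "Univ A"],
}
ranking = aggregate_rankings(top_lists, list_size=3)
```

Under this assumed rule every edition contributes the same maximum number of points, so no single language edition dominates the combined ranking.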
Theme of presentation
  • WikiCulture & Community
  • Education
For workshops and discussions, what level is the intended audience?
Any level
Length of session (if other than 25 minutes, specify how long)
25 minutes is fine, but 40 minutes would be better.
Will you attend Wikimania if your submission is not accepted?
Probably not, since my travel allowance is conditional on presenting a lecture.
Slides or further information (optional)
Special requests
Is this Submission a Draft or Final?

This is a Completed submission for Wikimania 2017 ready to be reviewed by a member of the Programme Committee.

Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest. Sign with a hash and four tildes. (# ~~~~).

  1. Daniel Mietchen (talk) 01:48, 10 April 2017 (UTC)
  2. ...