Submissions/Expanding Wikipedia Articles Across Languages via Recommendations

Submission no. 3046 - T3
Title of the submission

Expanding Wikipedia Articles Across Languages via Recommendations

Type of submission (lecture, panel, tutorial/workshop, roundtable discussion, lightning talk, poster, birds of a feather discussion)


Author of the submission

Tiziano Piccardi

Language of presentation


E-mail address


Tiziano Piccardi, Bob West, Michele Catasta, Leila Zia (authors of the submission, real names)

Country of origin

Switzerland, USA

Affiliation, if any (organisation, company etc.)

EPFL, Stanford, Wikimedia Foundation

Personal homepage or blog
Abstract (up to 300 words to describe your proposal)

In the English Wikipedia alone, more than 2 million articles (37%) are marked as stubs. This substantial share of low-quality articles is consistent across languages and can limit the use of Wikipedia as a complete source of information. Despite the size of the problem, Wikipedia's editing platform currently provides no support for dealing with such articles, and editors have to put in extra effort to work out how to improve the content.

Fortunately, the majority of articles are not stubs, and they provide a valuable source of information about what a good article should look like. In particular, we can exploit the full articles in Wikipedia to i) extract structural patterns and ii) recommend content (for example, section titles, infoboxes, and images), drawing on both the same article in other Wikipedia languages and the set of articles of the same type (i.e., category-based filtering).
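To illustrate the category-based idea, here is a minimal sketch, not the project's actual pipeline. It assumes that section titles have already been extracted for a stub and for the full articles in the same category; the function name `recommend_sections`, the `top_k` parameter, and the sample data are hypothetical. Titles are ranked by the fraction of same-category articles that contain them.

```python
from collections import Counter


def recommend_sections(stub_sections, category_articles, top_k=3):
    """Suggest section titles that are common among same-category
    articles but missing from the stub (hypothetical helper)."""
    existing = {title.lower() for title in stub_sections}
    counts = Counter()
    for sections in category_articles:
        # Count each title at most once per article, ignoring case.
        for title in {s.lower() for s in sections}:
            if title not in existing:
                counts[title] += 1
    total = len(category_articles)
    # Most frequent first; ties broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(title, count / total) for title, count in ranked[:top_k]]


# Made-up sample data: section titles of three full biography articles.
category_articles = [
    ["Early life", "Career", "Awards", "References"],
    ["Early life", "Career", "References"],
    ["Career", "Personal life", "References"],
]
stub_sections = ["References"]

suggestions = recommend_sections(stub_sections, category_articles, top_k=2)
for title, share in suggestions:
    print(f"{title}: present in {share:.0%} of same-category articles")
```

The same counting scheme could in principle be applied to other kinds of content, or to the section lists of the same article in other language editions, with a title-alignment step that is not shown here.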

Our key contribution is a recommendation pipeline, covering a variety of Wikipedia languages and integrated into the editing tools, that supports both the creation and the “destubbing” of Wikipedia articles. We currently assume that the recommendations will be presented in a sidebar, where editors can read and import suggestions for missing sections, images, references, and infobox content.

What will attendees take away from this session?

You will gain insight into the pressing issue of stub articles in Wikipedia, which currently account for a sizable share of all articles across languages.

Furthermore, you will have the opportunity to provide feedback and contribute to the design of our tool. The project is user-centric, and we need the expertise of the Wikipedia editors to develop a solution that can have a positive impact on the quality of the articles.

Theme of presentation

Technology, Interface & Infrastructure

For workshops and discussions, what level is the intended audience?
Length of session (if other than 25 minutes, specify how long)
25 minutes
Will you attend Wikimania if your submission is not accepted?
Slides or further information (optional)

This is ongoing research. Some documentation about it is available at

Special requests

It would be great if the talk could be scheduled for the first day of the conference. This would give us enough time to talk with Wikipedia editors about this subject throughout the rest of the conference. Of course, this is a nice-to-have. :)

Is this Submission a Draft or Final?

This is a Completed submission for Wikimania 2017 ready to be reviewed by a member of the Programme Committee.

Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers to decide which sessions are of high interest. Sign with a hash and four tildes. (# ~~~~).

  1. Amir É. Aharoni (talk) 04:42, 10 April 2017 (UTC)
  2. RachelWex (talk) 21:40, 11 April 2017 (UTC)