User:Jason.nlw/Wicipop Project

Wicipop Project. Final Report

Foreword

The aim of the Wicipop project was to significantly increase the Welsh-language Wicipedia’s coverage of the Welsh pop scene, and of popular music and culture more broadly. The need for this type of content was recognised following a short study of Wicipedia content, which highlighted the lack of articles on the subject compared to other Wikipedias and the high volume of traffic to the articles that did exist.

An application for a £20,000 grant from the Welsh Government to run the three-month project was accepted in early January 2017. The project ran until 31 March 2017.

The quality and quantity of content were increased through a combination of techniques: community engagement, automatic content creation from data, and lobbying content creators to release material on an open licence for reuse on Wikipedia.


Targets

These are the agreed targets to be achieved during the course of the project.

  1. An increase of 500 Welsh language articles on Wikipedia.
  2. A report into the automatic creation of Wikipedia content using openly licensed text, data and media.
  3. Train volunteers to edit Welsh Wicipedia articles by holding 3 Edit-a-thons in 3 different regions of Wales.


Results

A total of 783 articles were created as part of the project. The target of creating 500 articles was achieved and surpassed using three main techniques:

Community outreach

Radio 1 DJ Huw Stephens editing Wikipedia at the Aberystwyth Wicipop Edit-a-thon in 2017

Three edit-a-thon events were held as part of the project, fulfilling one of the three core objectives: one in Aberystwyth in partnership with the Welsh music journal Y Selar, one in Bangor in partnership with Bangor University, and one in Cardiff. A total of 25 people attended these events, creating or significantly improving 44 articles.

Two members of the National Library of Wales volunteer team have also been creating articles.

Experienced Wicipedia editors have also been engaged through the creation of an online Wicipop project page on Wicipedia, where editors can discuss the project, share resources and highlight articles which need creating. Nine editors have signed up to the project, and 39 new articles have been recorded via the project page. The page will continue to exist after the project is finished, hopefully leading to a sustainable increase in new articles being created.


Open access content

The release of existing content by content creators on an open licence was crucial to the success of the project and the quality of the Wicipedia content being produced. The quality and quantity of content released by third parties surpassed expectations. It is also notable that the release of media in particular motivated Wicipedia editors to produce related content. Here is an overview of content released on an open licence as a direct result of the Wicipop project.

  • 256 photographs of Welsh bands were shared by a member of the public via Flickr.
  • 56 professional quality promotional images of Welsh bands were shared by Ochr 1/Antena.
  • 50 album reviews were released by the magazine Y Selar.
  • BBC Cymru have shared the text for about 100 articles on Welsh bands.
  • Text for 14 articles was released jointly by Bangor University and Coleg Cymraeg Cenedlaethol.
  • Sain Records have released c.700 album covers and c.7000 sound clips on a free licence for use on Wikipedia and beyond. Work to upload this content to Wikimedia Commons is ongoing.

The release of sound clips and album artwork by Sain Records is extremely notable because a wholesale release of such valuable material by a major record label is rare, if not unprecedented. The release will significantly enrich online coverage of Welsh music, making it freely accessible to a global audience.

A number of articles from Coleg Cymraeg Cenedlaethol’s ‘Esboniadur’ were also used as the basis for Wicipedia articles, as they are already available on a compatible licence.


Automatic content creation

The final target for the project was to produce a report into the possibilities of using AutoWikiBrowser (AWB) and Pywikibot (a Python-based toolset for coders) to automate or semi-automate article creation on Wicipedia using open text, images and data.

AutoWikiBrowser

AWB is a semi-automated MediaWiki editor which enables the user to make mass edits to the Wikipedia main space in any language. The tool has advanced find-and-replace functionality for correcting and improving articles. For Wicipop, however, AWB was successfully used to create brand new articles.

The toolset allows the user to create a template for a set of articles, into which data can be input from a spreadsheet.

An example of an article created automatically using this method.
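The template-and-spreadsheet idea can be sketched outside AWB. The Python below is not AWB and is not the project's actual template: the column names (`name`, `genre`, `place`, `year`), the English placeholder wording and the sample row are all invented for illustration. It only shows one article body being generated per spreadsheet row.

```python
import csv
import io
from string import Template

# Hypothetical article template. string.Template is used because its "$field"
# placeholders do not clash with the braces and brackets of wiki markup.
ARTICLE_TEMPLATE = Template(
    "'''$name''' is a $genre act from $place, formed in $year.\n\n"
    "[[Category:$genre]]\n"
)


def build_articles(csv_text: str) -> dict[str, str]:
    """Return {article title: article wikitext} for every row of the CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # substitute() raises KeyError if a row lacks a column the template needs,
    # which stops a half-filled article from being produced silently.
    return {row["name"]: ARTICLE_TEMPLATE.substitute(row) for row in reader}


if __name__ == "__main__":
    # Invented sample data, standing in for a spreadsheet export.
    sample = "name,genre,place,year\nExample Band,rock,Exampletown,1999\n"
    for title, wikitext in build_articles(sample).items():
        print(title)
        print(wikitext)
```

A real run would use Welsh template text and would still need a human or a bot framework to save each page.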


AWB was used to create 600 Welsh language articles about Welsh and international pop singers and groups using open data from Wikidata, but the tool could be used to create an unlimited number of articles using open data from any source.

Databases could also be compiled from a number of sources for the sole purpose of creating Wicipedia articles.

Wikidata Lists

Lists generated from Wikidata, such as the list of albums in the above example, can be added to the articles and formatted using AWB. These lists rely on live data from Wikidata, so as new information is added to the database the list of albums will automatically be updated. This provides an element of future-proofing for the articles, increasing the likelihood that they will be accurate and relevant in the longer term.
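The report does not say which mechanism generated these lists. As an assumption, a live list of this kind is typically driven by a SPARQL query against Wikidata, and the sketch below builds such a query for one performer. The identifiers are Wikidata's own (P31 "instance of", P175 "performer", P577 "publication date", Q482994 "album"); the function name `albums_query` is hypothetical.

```python
import re


def albums_query(performer_qid: str) -> str:
    """Build a SPARQL query listing albums by the given Wikidata performer."""
    # Accept only well-formed item IDs such as "Q42", so that arbitrary text
    # cannot be spliced into the query.
    if not re.fullmatch(r"Q[1-9]\d*", performer_qid):
        raise ValueError(f"not a Wikidata item ID: {performer_qid!r}")
    return (
        "SELECT ?album ?albumLabel ?date WHERE {\n"
        "  ?album wdt:P31 wd:Q482994 ;\n"
        f"         wdt:P175 wd:{performer_qid} .\n"
        "  OPTIONAL { ?album wdt:P577 ?date . }\n"
        # Prefer Welsh labels and fall back to English.
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "cy,en" . }\n'
        "}\n"
        "ORDER BY ?date"
    )


if __name__ == "__main__":
    # "Q42" is only a placeholder ID for demonstration.
    print(albums_query("Q42"))
```

Because the query is evaluated each time the list is refreshed, albums added to Wikidata later appear without the article text being edited by hand.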


Pywikibot

Pywikibot is a collection of tools for coders developing scripts to make mass edits to Wikipedia and to access the Wikipedia API.

We used this toolset to explore the possibility of semi-automating the wikification of open licence text. This would be particularly useful when creating a large number of Wikipedia articles from openly licensed texts.

For example, we were able to analyse a piece of text and identify the names of people and places which might already have Wikipedia articles. We were then able to search Wikipedia using the API to check if the articles existed. Where they did, links to those articles could automatically be added to the text. This process was also used to identify and link to existing articles within text from spreadsheets prepared for AWB uploads.

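“Astudiodd i fod yn weinidog yr efengyl yng Ngholeg y Bedyddwyr, Bangor, ac ym 1972 ffurfiodd Ac Eraill gyda Cleif Harpwood, Iestyn Garlick a Phil Edwards. Rhyddhawyd tair record fer i Sain cyn chwalu ym 1975.”

(English: “He studied to become a minister of the gospel at the Baptist College, Bangor, and in 1972 formed Ac Eraill with Cleif Harpwood, Iestyn Garlick and Phil Edwards. Three EPs were released on Sain before the group broke up in 1975.”)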
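The report does not give the project's scripts, so the following is only a minimal sketch of the name-spotting step, under the assumption that a run of capitalised words is a good enough candidate for a person or place name. The function names are hypothetical. The existence check is passed in as a callable; in a live run it could wrap Pywikibot's `pywikibot.Page(site, title).exists()`, while here a plain set of invented "existing" titles stands in so the sketch runs offline.

```python
import re
from typing import Callable

SAMPLE = (
    "Astudiodd i fod yn weinidog yr efengyl yng Ngholeg y Bedyddwyr, Bangor, "
    "ac ym 1972 ffurfiodd Ac Eraill gyda Cleif Harpwood, Iestyn Garlick a "
    "Phil Edwards. Rhyddhawyd tair record fer i Sain cyn chwalu ym 1975."
)


def candidate_names(text: str) -> list[str]:
    """Return runs of consecutive capitalised words, in order of appearance."""
    candidates: list[str] = []
    run: list[str] = []
    # Words and punctuation marks become separate tokens, so a comma or a
    # lower-case word ends the current run.
    for token in re.findall(r"\w+|[^\w\s]", text):
        if token[0].isupper():
            run.append(token)
        else:
            if run:
                candidates.append(" ".join(run))
            run = []
    if run:
        candidates.append(" ".join(run))
    return candidates


def existing_titles(names: list[str], exists: Callable[[str], bool]) -> list[str]:
    """Keep only the candidates for which the supplied check reports a page."""
    return [name for name in names if exists(name)]


if __name__ == "__main__":
    # Invented stand-in for the wiki: pretend only these pages exist.
    known_pages = {"Bangor", "Ac Eraill", "Cleif Harpwood"}
    names = candidate_names(SAMPLE)
    print(existing_titles(names, known_pages.__contains__))
```

Sentence-initial words such as "Astudiodd" and mutated forms such as "Ngholeg" are also picked up as candidates; the existence check is what filters them out.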

Once links to articles were established we were able to automatically add the relevant Wiki markup (code) to the text, to create a link to the article. The toolset would also allow us to automatically format references and links within a piece of text for inclusion on Wikipedia.
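Adding the markup itself can be sketched as follows. This is an illustration, not the project's code: `add_wikilinks` is a hypothetical helper that wraps the first occurrence of each confirmed title in `[[...]]`, assuming the list of titles has already been checked against the wiki.

```python
import re


def add_wikilinks(text: str, titles: list[str]) -> str:
    """Wrap the first occurrence of each title in [[...]] wiki markup."""
    # Longest titles first, so a longer title wins over a shorter one that
    # starts at the same position.
    ordered = sorted(set(titles), key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(title) for title in ordered) + r")\b"
    )
    linked: set[str] = set()

    def replace(match: re.Match) -> str:
        title = match.group(1)
        if title in linked:
            # Later mentions stay as plain text.
            return title
        linked.add(title)
        return f"[[{title}]]"

    # A single pass over the text means already-inserted links are never
    # rescanned, so links cannot end up nested.
    return pattern.sub(replace, text)


if __name__ == "__main__":
    sentence = "ffurfiodd Ac Eraill gyda Cleif Harpwood, Iestyn Garlick a Phil Edwards"
    print(add_wikilinks(sentence, ["Ac Eraill", "Cleif Harpwood"]))
```

Linking only the first mention is a design choice of this sketch; a script could equally link every occurrence.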

Conclusions

The targets for holding public events and producing a report into automatic article creation were achieved. The target for the number of articles to be created (500) was surpassed (783).

Automatic article creation using data and Welsh language templates has been proven to be a viable option for mass producing new Wicipedia content. Articles created in this way should align with user demand, providing access to the Welsh language content that users want to see. This demand might be measured, for example, by studying Wikipedia analytics or by cooperating with the education sector.

Tests carried out by technical staff also demonstrated how plain text can be automatically formatted as a Wikipedia article. This could be very beneficial when transferring a large amount of openly licensed text to Wicipedia. Although the tests were a success, they did highlight the need for further work on recognising consonant mutations (treigladau) within a piece of text.
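To make the mutation problem concrete: a name can appear in running text with a changed initial consonant, so a plain title lookup misses it. The report does not describe how this was or would be handled. The sketch below is one possible approach under stated assumptions: it reverses a simplified table of the soft, nasal and aspirate mutations to propose unmutated spellings, which would then be checked against existing article titles. The table is not exhaustive and `radical_candidates` is a hypothetical name.

```python
# Mutated initial -> possible original (radical) initials. Simplified table.
REVERSE_MUTATIONS = {
    "ngh": ["c"], "mh": ["p"], "nh": ["t"],            # nasal
    "ng": ["g"], "m": ["b"], "n": ["d"],               # nasal
    "ph": ["p"], "th": ["t"], "ch": ["c"],             # aspirate
    "b": ["p"], "d": ["t"], "g": ["c"],                # soft
    "f": ["b", "m"], "dd": ["d"], "l": ["ll"], "r": ["rh"],  # soft
    "": ["g"],                                         # soft: initial g dropped
}


def radical_candidates(word: str) -> list[str]:
    """Return the word itself followed by its possible unmutated spellings."""
    if not word:
        return []
    capitalised = word[0].isupper()
    base = word[0].lower() + word[1:]
    candidates = [word]
    for mutated, radicals in REVERSE_MUTATIONS.items():
        if not base.startswith(mutated):
            continue
        for radical in radicals:
            restored = radical + base[len(mutated):]
            if capitalised:
                restored = restored[0].upper() + restored[1:]
            if restored not in candidates:
                candidates.append(restored)
    return candidates


if __name__ == "__main__":
    for form in ["Ngholeg", "Fangor", "Gaerdydd", "Wynedd"]:
        print(form, radical_candidates(form))
```

Most of the proposed spellings are not real words; the approach relies on the title lookup to discard them, so it trades extra lookups for not needing a Welsh lexicon.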

The project saw a number of businesses and academic institutions, as well as members of the public, release content on an open licence. This included photographs, articles and sound clips. This demonstrates the readiness of organisations to free up Welsh language content for reuse given the right conditions, even when the content has potential commercial value.

Community engagement was an important element of this project. It helped to engage existing Wicipedia editors and to encourage new editors to edit for the first time. Public events also served as a promotional tool, providing an opportunity to highlight the work of the project. They attracted local media attention and helped to forge partnerships with Y Selar, Ochr 1 and Bangor University, all of whom also agreed to help promote the project and to share content openly for its benefit.

Holding three edit-a-thons in quick succession on the same subject stretched resources, and securing fresh editors for each event proved difficult. In hindsight, holding fewer, higher-profile events may have served the project better.

All things considered, the project has demonstrated a model of engagement with Wicipedia which could be applied to any number of subjects, and which could be scaled up to produce large quantities of quality content while engaging communities and teaching new skills.