User:Moudy83/updated PR articles


Below is an incomplete list of academic conference presentations, peer-reviewed papers and other types of academic writing which focus on Wikipedia as their subject. Works that mention Wikipedia only in passing are unlikely to be listed.

Unpublished works of presumably academic quality are listed in a dedicated section. For non-academic research, as well as tools that may be useful in researching Wikipedia, see Wikipedia:Researching Wikipedia. For a WikiProject focused on doing research on Wikipedia, see Wikipedia:WikiProject Wikidemia.

For academic papers using Wikipedia as a source, see Wikipedia:Wikipedia as an academic source, and the bibliography links listed at the bottom of this page. For teaching with Wikipedia, see Wikipedia:School and university projects. For researching with Wikipedia, see Wikipedia:Researching with Wikipedia. For non-academic works focused on Wikipedia, see Wikipedia:Wikipedia in the media.

Over time

Growth of academic interest in Wikipedia: number of publications by year, from creation of Wikipedia to end of 2007. Source: based on mid-May 2008 revision of this page.

Peer reviewed

Conference presentations and papers

See also: Wikimania and WikiSym conference series
Authors Title Conference / published in Year Online Notes Abstract Keywords
Amir Hossein Jadidinejad, Fariborz Mahmoudi Cross-Language Information Retrieval Using Meta-language Index Construction and Structural Queries Proceeding of the Multilingual Information Access Evaluation I. Text Retrieval Experiments, Lecture Notes in Computer Science, Volume 6241/2011, pp. 70-77 2011 [1] CLEF2009
Structural Query Language allows expert users to richly represent their information needs, but unfortunately the complexity of SQLs makes them impractical in Web search engines. Automatically detecting the concepts in an unstructured user’s information need and generating a richly structured, multilingual equivalent query is an ideal solution. We utilize Wikipedia as a great concept repository and also some state-of-the-art algorithms for extracting Wikipedia’s concepts from the user’s information need. This process is called “Query Wikification”. Our experiments on the TEL corpus at CLEF2009 achieve +23% and +17% improvements in Mean Average Precision and Recall against the baseline. Our approach is unique in that it improves both precision and recall, two measures for which improving one often hurts the other.
Wikipedia-Mining, Indri Structural Query Language, CLEF
Darren Hardy Geospatial signatures of anonymous Wikipedia authorship AAG Annual Meeting, Washington, DC 2010 [2]
We've seen a rapid rise of volunteered geographic information on websites and Google Earth, some of which is produced en masse by global virtual communities. In this talk, I discuss whether the first law of geography applies to Wikipedia. My recent study of geographic effects in Wikipedia authorship includes data from 7 years of contributions to a million geotagged articles in 21 languages. My methodology defines a proximity metric between authors and articles, and uses IP geolocation on 2.8 million anonymous authors. I use this metric to test my hypothesis that anonymous Wikipedia authors write about nearby places more than distant ones. My results provide empirical evidence of geographic effects in an online authorship community.
distance decay, geotagging, VGI
Darren Hardy The Wikification of Geospatial Metadata Workshop on the Role of Volunteered Geographic Information in Advancing Science (GIScience) 2010 [3]
For decades, metadata has been the ever-present, cure-all solution to heterogeneous data integration and use. Yet, high-quality, ubiquitous metadata is extremely rare in practice. Current volunteered geographic information systems may provide insights on how the scientific community can produce and manage metadata for geospatial data infrastructures.
Geospatial data interoperability, scientific knowledge generation, VGI
Markus Fuchs Aufbau eines linguistischen Korpus aus den Daten der englischen Wikipedia Proceedings of the Conference on Natural Language Processing 2010 (KONVENS 10) 2010 [4] German corpus, database, wikipedia
Sérgio Nunes, Cristina Ribeiro, Gabriel David Term Frequency Dynamics in Collaborative Articles Proceedings of the 10th ACM Symposium on Document Engineering (DocEng'10) 2010 [5]
Documents on the World Wide Web are dynamic entities. Mainstream information retrieval systems and techniques are primarily focused on the latest version of a document, generally ignoring its evolution over time. In this work, we study the term frequency dynamics in web documents over their lifespan. We use Wikipedia as a document collection because it is a broad and public resource and, more important, because it provides access to the complete revision history of each document. We investigate the progression of similarity values over two projection variables, namely revision order and revision date. Based on this investigation we find that term frequency in encyclopedic documents - i.e. comprehensive and focused on a single topic - exhibits a rapid and steady progression towards the document's current version. The content in early versions quickly becomes very similar to the present version of the document.
document dynamics, term frequency, wikipedia
Roberto Navigli and Paola Velardi Learning Word-Class Lattices for Definition and Hypernym Extraction Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden 2010 [6]

Definition extraction is the task of automatically identifying definitional sentences within texts. The task has proven useful in many research areas including ontology learning, relation extraction and question answering. However, current approaches – mostly focused on lexico-syntactic patterns – suffer from both low recall and precision, as definitional sentences occur in highly variable syntactic structures. In this paper, we propose Word-Class Lattices (WCLs), a generalization of word lattices that we use to model textual definitions. Lattices are learned from a dataset of definitions from Wikipedia. Our method is applied to the task of definition and hypernym extraction and compares favorably to other pattern generalization methods proposed in the literature.
wikipedia, definition identification, hypernym extraction
Roberto Navigli and Simone Ponzetto BabelNet: Building a Very Large Multilingual Semantic Network Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden 2010 [7]
BabelNet, a very large, wide-coverage multilingual semantic network, is automatically constructed by means of a methodology that integrates lexicographic and encyclopedic knowledge from WordNet and Wikipedia. In addition Machine Translation is also applied to enrich the resource with lexical information for all languages. We conduct experiments on new and existing gold-standard datasets to show the high quality and coverage of the resource.
wikipedia, knowledge acquisition, semantic networks
Simone Ponzetto and Roberto Navigli Knowledge-rich Word Sense Disambiguation Rivaling Supervised Systems Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden 2010 [8]

One of the main obstacles to high-performance Word Sense Disambiguation (WSD) is the knowledge acquisition bottleneck. In this paper, we present a methodology to automatically extend WordNet with large amounts of semantic relations from an encyclopedic resource, namely Wikipedia. We show that, when provided with a vast amount of high-quality semantic relations, simple knowledge-lean disambiguation algorithms compete with state-of-the-art supervised WSD systems in a coarse-grained all-words setting and outperform them on gold-standard domain-specific datasets.
wikipedia, word sense disambiguation, knowledge acquisition
Peter Kin-Fong Fong and Robert P. Biuk-Aghai What Did They Do? Deriving High-Level Edit Histories in Wikis Proceedings of the 6th International Symposium on Wikis and Open Collaboration (WikiSym 2010), Gdansk, Poland 2010 [9] Derives summaries of the kinds of edits performed (e.g. spelling correction, wikify, disambiguation, etc.) and proposes the calculation of an edit significance metric.
Wikis have become a popular online collaboration platform. Their open nature can, and indeed does, lead to a large number of editors of their articles, who create a large number of revisions. These editors make various types of edits on an article, from minor ones such as spelling correction and text formatting, to major revisions such as new content introduction, whole article re-structuring, etc. Given the enormous number of revisions, it is difficult to identify the type of contributions made in these revisions through human observation alone. Moreover, different types of edits imply different edit significance. A revision that introduces new content is arguably more significant than a revision making a few spelling corrections. By taking edit types into account, better measurements of edit significance can be produced. This paper proposes a method for categorizing and presenting edits in an intuitive way and with a flexible measure of significance of each individual editor’s contributions.
wikipedia, revision history, text differencing, edit categorization, edit significance
Cheong-Iao Pang and Robert P. Biuk-Aghai A Method for Category Similarity Calculation in Wikis Proceedings of the 6th International Symposium on Wikis and Open Collaboration (WikiSym 2010), Gdansk, Poland 2010 [10] Degree of similarity between categories is calculated based on the co-assignment of articles to categories. This has application in visualization and other areas.
Wikis, such as Wikipedia, allow their authors to assign categories to articles in order to better organize related content. This paper presents a method to calculate similarities between categories, illustrated by a calculation for the top-level categories in the Simple English version of Wikipedia.
wiki, category similarity
Robert P. Biuk-Aghai and Keng Hong Lei Chatting in the Wiki: Synchronous-Asynchronous Integration Proceedings of the 6th International Symposium on Wikis and Open Collaboration (WikiSym 2010), Gdansk, Poland 2010 [11] Collaborative work often consists of a mix of synchronous and asynchronous activity. This work extends the currently purely asynchronous wiki systems through a deep integration of instant messaging facilities in the wiki system.
Wikis have become popular platforms for collaborative writing. The traditional production mode has been remote asynchronous and supported by wiki systems geared toward both asynchronous writing and asynchronous communication. However, many people have come to rely on synchronous communication in their daily work. This paper first discusses aspects of synchronous and asynchronous activity and communication and then proposes an integration of synchronous communication facilities in wikis. A prototype system developed by the authors is briefly presented.
instant messaging, wiki, communication, synchronous, asynchronous
Teun Lucassen and Jan Maarten Schraagen Trust in Wikipedia: How Users Trust Information from an Unknown Source 4th Workshop on Information Credibility on the Web, Raleigh, North Carolina USA 2010 [12]
The use of Wikipedia as an information source is becoming increasingly popular. Several studies have shown that its information quality is high. Normally, when considering information trust, the source of information is an important factor. However, because of the open-source nature of Wikipedia articles, their sources remain mostly unknown. This means that other features need to be used to assess the trustworthiness of the articles. We describe article features - such as images and references - which lay Wikipedia readers use to estimate trustworthiness. The quality and the topics of the articles are manipulated in an experiment to reproduce the varying quality on Wikipedia and the familiarity of the readers with the topics. We show that the three most important features are textual features, references and images.
wikipedia, trust, credibility, think aloud protocol
Daniel Hasan Dalip, Marcos André Gonçalves, Marco Cristo and Pável Calado Automatic quality assessment of content created collaboratively by web communities: a case study of Wikipedia 9th ACM/IEEE-CS joint conference on Digital libraries, Austin, Texas USA 2009 [13]
The old dream of a universal repository containing all the human knowledge and culture is becoming possible through the Internet and the Web. Moreover, this is happening with the direct, collaborative participation of people. Wikipedia is a great example. It is an enormous repository of information with free access and editing, created by the community in a collaborative manner. However, this large amount of information, made available democratically and virtually without any control, raises questions about its relative quality. In this work we explore a significant number of quality indicators, some of them proposed by us and used here for the first time, and study their capability to assess the quality of Wikipedia articles. Furthermore, we explore machine learning techniques to combine these quality indicators into one single assessment judgment. Through experiments, we show that the most important quality indicators are the easiest ones to extract, namely, textual features related to length, structure and style. We were also able to determine which indicators did not contribute significantly to the quality assessment. These were, coincidentally, the most complex features, such as those based on link analysis. Finally, we compare our combination method with a state-of-the-art solution and show significant improvements in terms of effective quality prediction.
wikipedia, SVM, machine learning, quality assessment
Krishnan Ramanathan and Komal Kapoor Creating user profiles using Wikipedia The 28th international conference on conceptual modeling (ER 2009), Gramado Brazil, Springer LNCS 5829 2009 [14]
Creating user profiles is an important step in personalization. Many methods for user profile creation have been developed to date using different representations such as term vectors and concepts from an ontology like DMOZ. In this paper we propose and evaluate different methods for creating user profiles using Wikipedia as the representation. The key idea in our approach is to map documents to Wikipedia concepts at different levels of resolution: words, key phrases, sentences, paragraphs, the document summary and the entire document itself. We suggest a method for evaluating recall by pooling the relevant results from the different methods and evaluate our results for both precision and recall. We also suggest a novel method for profile evaluation by assessing the recall over a known ontological profile drawn from DMOZ.
wikipedia, User profiles, User modeling, DMOZ
Andrea Prato and Marco Ronchetti Using Wikipedia as a reference for extracting semantic information from a text. The Third International Conference on Advances in Semantic Processing SEMAPRO 2009, Malta 2009
In this paper we present an algorithm that, using Wikipedia as a reference, extracts semantic information from an arbitrary text. Our algorithm refines a procedure proposed by others, which mines all the text contained in the whole Wikipedia. Our refinement, based on a clustering approach, exploits the semantic information contained in certain types of Wikipedia hyperlinks, and also introduces an analysis based on multi-words. Our algorithm outperforms current methods in that the output contains far fewer false positives. We were also able to understand which (structural) part of the texts provides most of the semantic information extracted by the algorithm.
wikipedia, Semantic Relatedness, Semantic Analysis
Simone P. Ponzetto and Roberto Navigli Large-Scale Taxonomy Mapping for Restructuring and Integrating Wikipedia IJCAI 2009: 21st International Joint Conference on Artificial Intelligence, Pasadena, California 2009 [15]
We present a knowledge-rich methodology for disambiguating Wikipedia categories with WordNet synsets and using this semantic information to restructure a taxonomy automatically generated from the Wikipedia system of categories. We evaluate against a manual gold standard and show that both category disambiguation and taxonomy restructuring perform with high accuracy. Besides, we assess these methods on automatically generated datasets and show that we are able to effectively enrich WordNet with a large number of instances from Wikipedia. Our approach produces an integrated resource, thus bringing together the fine-grained classification of instances in Wikipedia and a well-structured top-level taxonomy from WordNet.
wikipedia, knowledge acquisition, taxonomy learning, category disambiguation, word sense disambiguation
Aaron Halfaker, Aniket Kittur, Robert Kraut and John Riedl A Jury of your Peers: Quality, Experience and Ownership in Wikipedia WikiSym2009: Symposium on Wikis and Open Collaboration 2009 [16]
Wikipedia is a highly successful example of what mass collaboration in an informal peer review system can accomplish. In this paper, we examine the role that the quality of the contributions, the experience of the contributors and the ownership of the content play in the decisions over which contributions become part of Wikipedia and which ones are rejected by the community. We introduce and justify a versatile metric for automatically measuring the quality of a contribution. We find little evidence that experience helps contributors avoid rejection. In fact, as they gain experience, contributors are even more likely to have their work rejected. We also find strong evidence of ownership behaviors in practice despite the fact that ownership of content is discouraged within Wikipedia.
wikipedia, peer, peer review, wikiwork, experience, ownership, quality
Myshkin Ingawale, Amitava Dutta, Rahul Roy, Priya Seetharaman The Small Worlds of Wikipedia: Implications for Growth, Quality and Sustainability of Collaborative Knowledge Networks AMCIS 2009: Americas Conference on Information Systems 2009 [17]
This work is a longitudinal network analysis of the interaction networks of Wikipedia, a free, user-led, collaboratively generated online encyclopedia. Making a case for representing Wikipedia as a knowledge network, and using the lens of contemporary graph theory, we attempt to unravel its knowledge creation process and growth dynamics over time. Typical small-world characteristics of short path-length and high clustering have important theoretical implications for knowledge networks. We show Wikipedia’s small-world nature to be increasing over time, while also uncovering power laws and assortative mixing. Investigating the process by which an apparently un-coordinated, diversely motivated swarm of assorted contributors create and maintain remarkably high quality content, we find an association between Quality and Structural Holes. We find that a few key high degree, cluster spanning nodes - ‘hubs’ - hold the growing network together, and discuss implications for the networks’ growth and emergent quality.
knowledge networks, interaction networks, small-worlds
Raphael Hoffmann, Saleema Amershi, Kayur Patel, Fei Wu, James Fogarty, Daniel S. Weld Amplifying Community Content Creation Using Mixed-Initiative Information Extraction CHI2009: Conference on Computer Human Interaction 2009 [18]
Although existing work has explored both information extraction and community content creation, most research has focused on them in isolation. In contrast, we see the greatest leverage in the synergistic pairing of these methods as two interlocking feedback cycles. This paper explores the potential synergy promised if these cycles can be made to accelerate each other by exploiting the same edits to advance both community content creation and learning-based information extraction. We examine our proposed synergy in the context of Wikipedia infoboxes and the Kylin information extraction system. After developing and refining a set of interfaces to present the verification of Kylin extractions as a non primary task in the context of Wikipedia articles, we develop an innovative use of Web search advertising services to study people engaged in some other primary task. We demonstrate our proposed synergy by analyzing our deployment from two complementary perspectives: (1) we show we accelerate community content creation by using Kylin's information extraction to significantly increase the likelihood that a person visiting a Wikipedia article as a part of some other primary task will spontaneously choose to help improve the article's infobox, and (2) we show we accelerate information extraction by using contributions collected from people interacting with our designs to significantly improve Kylin's extraction performance.
computer-supported collaboration, user interface, information extraction
Maria Grineva, Maxim Grinev and Dmitry Lizorkin Extracting Key Terms From Noisy and Multitheme Documents WWW2009: 18th International World Wide Web Conference 2009 [19]

We present a novel method for key term extraction from text documents. In our method, a document is modeled as a graph of semantic relationships between terms of that document. We exploit the following remarkable feature of the graph: the terms related to the main topics of the document tend to bunch up into densely interconnected subgraphs or communities, while non-important terms fall into weakly interconnected communities, or even become isolated vertices. We apply graph community detection techniques to partition the graph into thematically cohesive groups of terms. We introduce a criterion function to select groups that contain key terms, discarding groups with unimportant terms. To weight terms and determine semantic relatedness between them we exploit information extracted from Wikipedia. Using such an approach gives us the following two advantages. First, it allows effectively processing multi-theme documents. Second, it is good at filtering out noise information in the document, such as, for example, navigational bars or headers in web pages. Evaluations of the method show that it outperforms existing methods, producing key terms with higher precision and recall. Additional experiments on web pages prove that our method appears to be substantially more effective on noisy and multi-theme documents than existing methods.
semantic relatedness, contextual advertising, information retrieval
Andrew Krizhanovsky and Feiyu Lin Related terms search based on WordNet / Wiktionary and its application in Ontology Matching RCDL 2009 [20] Wikokit
A set of ontology matching algorithms (for finding correspondences between concepts) is based on a thesaurus that provides the source data for the semantic distance calculations. In this wiki era, new resources may spring up and improve this kind of semantic search. In this paper, a solution to this task based on Russian Wiktionary is compared to WordNet-based algorithms. Metrics are estimated using the test collection, containing 353 English word pairs with a relatedness score assigned by human evaluators. The experiment shows that the proposed method is capable in principle of calculating a semantic distance between a pair of words in any language presented in Russian Wiktionary. The calculation of the Wiktionary-based metric required the development of open-source Wiktionary parser software.
Wiktionary, semantic relatedness, information retrieval
Shane Greenstein and Michelle Devereaux Wikipedia in the Spotlight, Kellogg Case Number: 5-306-507; HBS Case Number: KEL253. Case Collection at the Kellogg School of Management. [21] 2009 [22]
By 2009 Wikipedia had achieved the type of success that only a handful of young organizations could ever dream of reaching. It had grown from almost nothing in 2001 to become one of the consistently highest ranked and most visited sites on the Internet. This success brought new problems and at a scale that no organization of this type had ever before faced. The case exposes students to Wikipedia’s brief history, the causes of its success, and the issues it faced going forward. Two topics form the focus of the case. The first concerns the rules and norms for submission and editing, which raise questions about the ambiguity of Wikipedia’s authority and the virtual cycle that keeps the site going. The second lesson concerns the need to alter its practices as it gains in popularity, raising questions about what any wiki site, profit-oriented or open source, must do to scale to large numbers of participants and entries. These issues arise as part of a discussion about the site’s priorities going forward.
Open Source Organizations, Strategy at Wikipedia, Managing Internet Media, Wiki, Jimbo Wales, Business of Encyclopedias
Shane Greenstein and Rebecca Frazzano and Evan Meagher Triumph of the Commons: Wikia and the Commercialization of Open Source Communities in 2009, Kellogg Case Number: 5-309-509; HBS Case Number. Case Collection at the Kellogg School of Management. [23] 2009 [24]
In 2009 Wikia was the Internet’s largest for-profit provider of hosted open-source wikis, with over a million daily users. After five years of existence, the organization had supported a wide range of exploratory activities, experiencing both success and failure. With approximately $3 million of cash on hand, Wikia turned cash flow positive in 2009, with revenues of approximately $4.5 million, affording it time and flexibility to try new things. Some of the company’s employees and investors suggested that Wikia should attempt to expand and market itself more aggressively, but which strategic direction should receive priority? The case presents many of the issues and tradeoffs facing CEO Gil Penchina as he formulates these priorities.
Open Source Organizations, Commercialization, Managing Internet Media, Wiki, The Business of Wikis, Learning from Wikipedia
Robert P. Biuk-Aghai, Libby Veng-Sam Tang, Simon Fong and Yain-Whar Si Wikis as Digital Ecosystems: An Analysis Based on Authorship Third IEEE International Conference on Digital Ecosystems and Technologies (DEST 2009), Istanbul, Turkey, 31 May - 3 June 2009 2009 [25] Using Wikipedia as an example, shows that large volunteer-contributed wikis feature characteristics of digital ecosystems.
Wikis, best represented by the popular and highly successful Wikipedia system, have established themselves as important components of a collaboration infrastructure. We suggest that the complex network of user-contributors in volunteer-contributed wikis constitutes a digital ecosystem that bears all the characteristics typical of such systems. This paper presents an analysis supporting this notion based on significance of authorship within the wiki. Our findings confirm the hypothesis that large volunteer-contributed wikis are digital ecosystems, and thus that the findings from the digital ecosystems research stream are applicable to this type of system.
analysis, co-authorship, collaborative writing, digital ecosystem, wiki, Wikipedia
Brent Hecht and Darren Gergle Measuring Self-Focus Bias in Community-Maintained Knowledge Repositories Communities and Technologies 2009 [26] Uses a "hyperlingual approach" to demonstrate that each language of Wikipedia contains a massive amount of self-focus in its represented world knowledge.
Self-focus is a novel way of understanding a type of bias in community-maintained Web 2.0 graph structures. It goes beyond previous measures of topical coverage bias by encapsulating both node- and edge-hosted biases in a single holistic measure of an entire community-maintained graph. We outline two methods to quantify self-focus, one of which is very computationally inexpensive, and present empirical evidence for the existence of self-focus using a “hyperlingual” approach that examines 15 different language editions of Wikipedia. We suggest applications of our methods and discuss the risks of ignoring self-focus bias in technological applications.
hyperlingual, multi-lingual, self-focus
Michael D. Lieberman and Jimmy Lin You Are Where You Edit: Locating Wikipedia Users Through Edit Histories 3rd International Conference on Weblogs and Social Media (ICWSM) 2009 [27]
Whether knowingly or otherwise, Wikipedia contributors reveal their interests and expertise through their contribution patterns. An analysis of Wikipedia edit histories shows that it is often possible to associate contributors with relatively small geographic regions, usually corresponding to where they were born or where they presently live. For many contributors, the geographic coordinates of pages they have edited are tightly clustered. Results suggest that a wealth of information about contributors can be gleaned from edit histories. This illustrates the efficacy of data mining on large, publicly-available datasets and raises potential privacy concerns.
Panciera, K.; Halfaker, A.; Terveen, L. Wikipedians are born, not made: a study of power editors on Wikipedia ACM 2009 International Conference on Group Work 2009 [28] Shows that the amount of work done by Wikipedians and non-Wikipedians differs significantly from their very first day.
We show that the amount of work done by Wikipedians and non-Wikipedians differs significantly from their very first day. Our results suggest a design opportunity: customizing the initial user experience to improve retention and channel new users’ intense energy.
Myshkin Ingawale, Rahul Roy, Priya Seetharaman Persistence of Cultural Norms in Online Communities: The Curious Case of WikiLove PACIS 2009: Pacific Asia Conference on Information Systems 2009 [29]
Tremendous progress in information and communication technologies in the last two decades has enabled the phenomenon of Internet-based groups and collectives, generally referred to as online communities. Many online communities have developed distinct cultures of their own, with accompanying norms. A particular research puzzle is the persistence and stability of such norms in online communities, even in the face of often exponential growth rates in uninitiated new users. We propose a network-theoretic approach to explain this persistence. Our approach consists of modelling the online community as a network of interactions, and representing cultural norms as transmissible ideas (or ‘memes’) propagating through this network. We argue that persistence of a norm over time depends, amongst other things, on the structure of the network through which it propagates. Using previous results from Network Science and Epidemiology, we show that certain structures are better than others to ensure persistence: namely, structures which have scale-free degree distributions and assortative mixing. We illustrate this theory using the case of the community of contributors at Wikipedia, a collaboratively generated online encyclopaedia.
online communities, persistence of norms, epidemiology, network science
Lam, S.K.; Riedl, J. Is Wikipedia Growing a Longer Tail? ACM 2009 International Conference on Group Work 2009 [30]
Wikipedia has millions of articles, many of which receive little attention. One group of Wikipedians believes these obscure entries should be removed because they are uninteresting and neglected; these are the deletionists. Other Wikipedians disagree, arguing that this long tail of articles is precisely Wikipedia’s advantage over other encyclopedias; these are the inclusionists. This paper looks at two overarching questions on the debate between deletionists and inclusionists: (1) What are the implications to the long tail of the evolving standards for article birth and death? (2) How is viewership affected by the decreasing notability of articles in the long tail? The answers to five detailed research questions that are inspired by these overarching questions should help better frame this debate and provide insight into how Wikipedia is evolving.
Aniket Kittur, Ed H. Chi, Bongwon Suh What’s in Wikipedia? Mapping Topics and Conflict Using Socially Annotated Category Structure CHI 2009 2009 [31] blog summary, Signpost summary
Wikipedia is an online encyclopedia which has undergone tremendous growth. However, this same growth has made it difficult to characterize its content and coverage. In this paper we develop measures to map Wikipedia using its socially annotated, hierarchical category structure. We introduce a mapping technique that takes advantage of socially-annotated hierarchical categories while dealing with the inconsistencies and noise inherent in the distributed way that they are generated. The technique is demonstrated through two applications: mapping the distribution of topics in Wikipedia and how they have changed over time; and mapping the degree of conflict found in each topic area. We also discuss the utility of the approach for other applications and datasets involving collaboratively annotated category hierarchies.
Wikipedia, wiki, visualization, mapping, annotation, social computing, distributed collaboration, conflict.
Medelyan, O. and Milne, D. Augmenting domain-specific thesauri with knowledge from Wikipedia Proceedings of the NZ Computer Science Research Student Conference (NZCSRSC 2008), Christchurch, New Zealand. 2008 [32]
Medelyan, O. and Legg, C. Integrating Cyc and Wikipedia: Folksonomy meets rigorously defined common-sense Proceedings of the first AAAI Workshop on Wikipedia and Artificial Intelligence (WIKIAI'08), Chicago, IL 2008 [33]
Medelyan, O., Witten, I.H., and Milne, D. Topic Indexing with Wikipedia. Proceedings of the first AAAI Workshop on Wikipedia and Artificial Intelligence (WIKIAI'08), Chicago, IL 2008 [34]
Milne, David and Witten, Ian H. Learning to link with Wikipedia. Proceedings of the first AAAI Workshop on Wikipedia and Artificial Intelligence (WIKIAI'08), Chicago, IL 2008 [35]
Milne, David and Witten, Ian H. An effective, low-cost measure of semantic relatedness obtained from Wikipedia links. Proceedings of the first AAAI Workshop on Wikipedia and Artificial Intelligence (WIKIAI'08), Chicago, IL 2008 [36]
This paper describes a new technique for obtaining measures of semantic relatedness. Like other recent approaches, it uses Wikipedia to provide structured world knowledge about the terms of interest. Our approach is unique in that it does so using the hyperlink structure of Wikipedia rather than its category hierarchy or textual content. Evaluation with manually defined measures of semantic relatedness reveals this to be an effective compromise between the ease of computation of the former approach and the accuracy of the latter.
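The abstract above says only that relatedness is derived from Wikipedia's hyperlink structure. As a rough illustration of how a link-based measure could work, the sketch below scores two articles by the overlap of the sets of pages that link to them, using a formula modelled on the normalized Google distance. The function name, the inputs, and the choice of formula are assumptions made for illustration; they are not taken from the listing.

```python
import math


def link_relatedness(links_a, links_b, total_articles):
    """Score two articles in [0, 1] from the pages that link to each of them.

    links_a, links_b: iterables of ids of articles linking to A and to B.
    total_articles:   number of articles in the (hypothetical) link graph.
    """
    a, b = set(links_a), set(links_b)
    common = a & b
    if not common:
        return 0.0  # no shared in-links: treat the pair as unrelated
    larger = max(len(a), len(b))
    smaller = min(len(a), len(b))
    if total_articles <= smaller:
        raise ValueError("total_articles must exceed the size of both link sets")
    # Distance in the style of the normalized Google distance.
    distance = (math.log(larger) - math.log(len(common))) / (
        math.log(total_articles) - math.log(smaller)
    )
    # Convert distance to similarity and clamp at zero.
    return max(0.0, 1.0 - distance)
```

With this definition, two articles that share all of their in-links score 1.0, and the score falls as the shared fraction shrinks relative to the size of the collection.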
Anuradha Jambunathan and Marco Ronchetti Exploiting the collective intelligence contained in Wikipedia to automatically describe the content of a document Proceedings of the Workshop on Collective Intelligence at the Third Asian Semantic Web Conference, in The Semantic Web: a view on data integration, reasoning, human factors, collective intelligence and technology adoption 2008 [37]
The Wikipedia phenomenon is very interesting from the point of view of the collective, social effort to produce a large, strongly interlinked body of knowledge. It also offers, for the first time in history, a general source of information coded in electronic form and freely available to anyone. As such, it can be used as a reference for tools aiming at mining semantic meaning from generic documents. In this paper, we propose a clustering-based method that exploits some of the implicit knowledge built into Wikipedia to refine and ameliorate existing approaches.
Semantic Relatedness, Semantic Analysis
Bongwon Suh, Ed H. Chi, Aniket Kittur, Bryan A. Pendleton Lifting the veil: improving accountability and social transparency in Wikipedia with wikidashboard Conference on Human Factors in Computing Systems, Proceeding of the twenty-sixth annual SIGCHI conference on Human factors in computing systems 2008 [38]
Wikis are collaborative systems in which virtually anyone can edit anything. Although wikis have become highly popular in many domains, their mutable nature often leads them to be distrusted as a reliable source of information. Here we describe a social dynamic analysis tool called WikiDashboard which aims to improve social transparency and accountability on Wikipedia articles. Early reactions from users suggest that the increased transparency afforded by the tool can improve the interpretation, communication, and trustworthiness of Wikipedia articles.
accountability, collaboration, social transparency, trust, visualization, wiki, wikidashboard, wikipedia
Marcin Miłkowski Automated Building of Error Corpora of Polish Corpus Linguistics, Computer Tools, and Applications – State of the Art. PALC 2007, Peter Lang. Internationaler Verlag der Wissenschaften 2008, 631-639 2008 [39]

The paper shows how to automatically develop error corpora out of revision history of documents. The idea is based on a hypothesis that minor edits in documents represent correction of typos, slips of the tongue, grammar, usage and style mistakes. This hypothesis has been confirmed by frequency analysis of revision history of articles in the Polish Wikipedia. Resources such as revision history in Wikipedia, Wikia, and other collaborative editing systems, can be turned into corpora of errors, just by extracting the minor edits. The most theoretically interesting aspect is that the corrections will represent the average speaker's intuitions about usage, and this seems to be a promising way of researching normativity in claims about proper or improper Polish.

By processing the revision history, one can gain pairs of segments in the corpus: first representing the error, and the other representing the correction. Moreover, it is relatively easy to tag parts of speech, compare subsequent versions, and prepare a text file containing the resulting corpus.
error corpora, normativity, revision history, corpora building
Christopher Thomas, Pankaj Mehra, Roger Brooks, Amit Sheth Growing Fields of Interest - Using an Expand and Reduce Strategy for Domain Model Extraction IEEE/WIC International Conference on Web Intelligence, Sydney, Australia 2008 [40] pdf
Domain hierarchies are widely used as models underlying information retrieval tasks. Formal ontologies and taxonomies enrich such hierarchies further with properties and relationships associated with concepts and categories but require manual effort; therefore they are costly to maintain, and often stale. Folksonomies and vocabularies lack rich category structure and are almost entirely devoid of properties and relationships. Classification and extraction require the coverage of vocabularies and the alterability of folksonomies and can largely benefit from category relationships and other properties. With Doozer, a program for building conceptual models of information domains, we want to bridge the gap between the vocabularies and Folksonomies on the one side and the rich, expert-designed ontologies and taxonomies on the other. Doozer mines Wikipedia to produce tight domain hierarchies, starting with simple domain descriptions. It also adds relevancy scores for use in automated classification of information. The output model is described as a hierarchy of domain terms that can be used immediately for classifiers and IR systems or as a basis for manual or semi-automatic creation of formal ontologies.
Wikipedia mining, Model creation
Benjamin K. Johnson Incentives to Contribute in Online Collaboration: Wikipedia as Collective Action International Communication Association, 58th Annual Conference, Montreal, Quebec 2008 [41] pdf
Wikipedia is an online encyclopedia created by volunteers, and is an example of how developments in software platforms and the low cost of sharing and coordinating on the Internet are leading to a new paradigm of creative collaboration on a massive scale. The research presented here addresses the questions of why individuals choose to give away their time and effort and how the challenges associated with collective action are addressed by Wikipedia’s technologies, organization, and community. Interviews with editors of the encyclopedia were used to identify what personal gains and other motivations compel contributors, what challenges to collaboration exist, and what technological and social structures aid their ability to create a freely available repository of human knowledge. The paper suggests that the free encyclopedia is at once both a traditional instance of collective action requiring coordination and strong incentives and an instance of networked public goods that result through boundary crossing made possible through extremely low barriers to sharing.
collective action, motivation, coordination, incentives
Libby Veng-Sam Tang, Robert P. Biuk-Aghai and Simon Fong A Method for Measuring Co-authorship Relationships in MediaWiki Proceedings of the 2008 International Symposium on Wikis (WikiSym 2008), Porto, Portugal, 8-10 September 2008 2008 [42] Defines a metric for measuring the strength of the co-authorship relation of a pair of wiki authors. Presents an expert finder as an application using this metric, and applies it to Wikipedia data.
Collaborative writing through wikis has become increasingly popular in recent years. When users contribute to a wiki article they implicitly establish a co-authorship relationship. Discovering these relationships can be of value, for example in finding experts on a given topic. However, it is not trivial to determine the main co-authors for a given author among the potentially thousands who have contributed to a given author’s edit history. We have developed a method and algorithm for calculating a co-authorship degree for a given pair of authors. We have implemented this method as an extension for the MediaWiki system and demonstrate its performance which is satisfactory in the majority of cases. This paper also presents a method of determining an expertise group for a chosen topic.
wiki, co-authorship, analysis, metric
Sérgio Nunes, Cristina Ribeiro, Gabriel David WikiChanges - Exposing Wikipedia Revision Activity Proceedings of the 2008 International Symposium on Wikis (WikiSym '08) 2008 [43] (pdf)
Wikis are popular tools commonly used to support distributed collaborative work. Wikis can be seen as virtual scrapbooks that anyone can edit without having any specific technical know-how. The Wikipedia is a flagship example of a real-world application of wikis. Due to the large scale of Wikipedia it's difficult to easily grasp much of the information that is stored in this wiki. We address one particular aspect of this issue by looking at the revision history of each article. Plotting the revision activity in a timeline we expose the complete article's history in an easily understandable format. We present WikiChanges, a web-based application designed to plot an article's revision timeline in real time. It also includes a web browser extension that incorporates activity sparklines in the real Wikipedia. Finally, we introduce a revisions summarization task that addresses the need to understand what occurred during a given set of revisions.
visualization, revision history
Travis Kriplean, Ivan Beschastnikh, David W. McDonald Articulations of wikiwork: uncovering valued work in wikipedia through barnstars Proceedings of the ACM 2008 conference on Computer supported cooperative work (CSCW '08) 2008 [44] CSCW 2008 Best paper honorable mention (pdf)
Successful online communities have complex cooperative arrangements, articulations of work, and integration practices. They require technical infrastructure to support a broad division of labor. Yet the research literature lacks empirical studies that detail which types of work are valued by participants in an online community. A content analysis of Wikipedia barnstars -- personalized tokens of appreciation given to participants -- reveals a wide range of valued work extending far beyond simple editing to include social support, administrative actions, and types of articulation work. Our analysis develops a theoretical lens for understanding how wiki software supports the creation of articulations of work. We give implications of our results for communities engaged in large-scale collaborations.
articulation work, barnstars, commons-based peer production, online community
Moira Burke, Robert Kraut Mopping up: modeling wikipedia promotion decisions Proceedings of the ACM 2008 conference on Computer supported cooperative work (CSCW '08) 2008 [45] pdf
This paper presents a model of the behavior of candidates for promotion to administrator status in Wikipedia. It uses a policy capture framework to highlight similarities and differences in the community's stated criteria for promotion decisions to those criteria actually correlated with promotion success. As promotions are determined by the consensus of dozens of voters with conflicting opinions and unwritten expectations, the results highlight the degree to which consensus is truly reached. The model is fast and easily computable on the fly, and thus could be applied as a self-evaluation tool for editors considering becoming administrators, as a dashboard for voters to view a nominee's relevant statistics, or as a tool to automatically search for likely future administrators. Implications for distributed consensus-building in online communities are discussed.
administrators, collaboration, management, organizational behavior, policy capture, promotion
Aniket Kittur, Robert Kraut Harnessing the wisdom of crowds in wikipedia: quality through coordination Proceedings of the ACM 2008 conference on Computer supported cooperative work (CSCW '08) 2008 [46] CSCW 2008 Best paper honorable mention
Wikipedia's success is often attributed to the large numbers of contributors who improve the accuracy, completeness and clarity of articles while reducing bias. However, because of the coordination needed to write an article collaboratively, adding contributors is costly. We examined how the number of editors in Wikipedia and the coordination methods they use affect article quality. We distinguish between explicit coordination, in which editors plan the article through communication, and implicit coordination, in which a subset of editors structure the work by doing the majority of it. Adding more editors to an article improved article quality only when they used appropriate coordination techniques and was harmful when they did not. Implicit coordination through concentrating the work was more helpful when many editors contributed, but explicit coordination through communication was not. Both types of coordination improved quality more when an article was in a formative stage. These results demonstrate the critical importance of coordination in effectively harnessing the "wisdom of the crowd" in online production environments.
collaboration, collective intelligence, coordination, distributed cognition, social computing
Aniket Kittur, Bongwon Suh, Ed Chi Can you ever trust a wiki?: impacting perceived trustworthiness in wikipedia Proceedings of the ACM 2008 conference on Computer supported cooperative work (CSCW '08) 2008 [47] CSCW 2008 Best short paper award
Wikipedia has become one of the most important information resources on the Web by promoting peer collaboration and enabling virtually anyone to edit anything. However, this mutability also leads many to distrust it as a reliable source of information. Although there have been many attempts at developing metrics to help users judge the trustworthiness of content, it is unknown how much impact such measures can have on a system that is perceived as inherently unstable. Here we examine whether a visualization that exposes hidden article information can impact readers' perceptions of trustworthiness in a wiki environment. Our results suggest that surfacing information relevant to the stability of the article and the patterns of editor behavior can have a significant impact on users' trust across a variety of page types.
collaboration, social computing, stability, trust, visualization
Masahiro Ito, Kotaro Nakayama, Takahiro Hara, Shojiro Nishio Association Thesaurus Construction Methods based on Link Co-occurrence Analysis for Wikipedia Conference on Information and Knowledge Management (CIKM 2008) 2008 [48] Wikipedia-Lab

CIKM 2008

Wikipedia, a huge scale Web based encyclopedia, attracts great attention as an invaluable corpus for knowledge extraction because it has various impressive characteristics such as a huge number of articles, live updates, a dense link structure, brief anchor texts and URL identification for concepts. We have already proved that we can use Wikipedia to construct a huge scale accurate association thesaurus. The association thesaurus we constructed covers almost 1.3 million concepts and its accuracy is proved in detailed experiments. However, we still need scalable methods to analyze the huge number of Web pages and hyperlinks among articles in the Web based encyclopedia.

In this paper, we propose a scalable method for constructing an association thesaurus from Wikipedia based on link co-occurrences. Link co-occurrence analysis is more scalable than link structure analysis because it is a one-pass process. We also propose an integration method of tfidf and link co-occurrence analysis. Experimental results show that both our proposed methods are more accurate and scalable than conventional methods. Furthermore, the integration of tfidf achieved higher accuracy than using only link co-occurrences.
Wikipedia Mining, Association Thesaurus, Link Co-occurrence, Semantic Relatedness
Amitava Dutta, Rahul Roy and Priya Seetharaman Wikipedia Usage Patterns: The Dynamics of Growth International Conference on Information Systems (ICIS 2008) 2008 [49]
Wikis have attracted attention as a powerful technological platform on which to harness the potential benefits of collective knowledge. Current literature identifies different behavioral factors that modulate the interaction between contributors and wikis. Some inhibit growth while others enhance it. However, while these individual factors have been identified in the literature, their collective effects have not yet been identified. In this paper, we use the system dynamics methodology, and a survey of Wikipedia users, to propose a holistic model of the interaction among different factors and their collective impact on Wikipedia growth. The model is simulated to examine its ability to replicate observed growth patterns of Wikipedia metrics. Results indicate that the model is a reasonable starting point for understanding observed Wiki growth patterns. To the best of our knowledge, this is the first attempt in the literature to synthesize a holistic model of the forces underlying Wiki growth.
Wikipedia, behavioral factors, system dynamics, simulation, survey data
Wan Muhammad Salehuddin Wan Hassan and Khairulmizam Samsudin Delta-encoding for document revision control system of Wikipedia Sixth IEEE Student Conference on Research and Development (SCOReD 2008) 2008 [50]
A revision control system keeps track of changes for multiple versions of the same unit of information. It is often used in engineering and software development to manage storing, retrieval, logging, identification and merging of source files and electronic documents. Changes to these documents are noted by incrementing an associated number or letter code and associated historically with the person making the change. A revision control system is an important component of a collaborative software platform that allows several members of a development team to work concurrently on an electronic document. Wikipedia, a free content encyclopedia, is an example of a successful application of collaborative technology. A poorly implemented document revision control system will affect the performance and cause difficulty in managing Wikipedia's huge amount of electronic data. In this work, the efficiency of the current revision control system of Wikipedia will be evaluated. The feasibility of delta-encoding to address the current limitations of Wikipedia's document revision control system will be presented.
revision control, document revision, delta-encoding
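To make the idea of delta-encoding concrete, here is a minimal sketch that stores a new revision as a list of copy and add operations against the previous one. It uses Python's standard `difflib.SequenceMatcher`; the function names `make_delta` and `apply_delta` and the operation format are hypothetical and do not describe the scheme evaluated in the paper above.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Encode `new` as operations relative to `old`.

    ("copy", start, end) reuses old[start:end];
    ("add", text) carries text that is not present at that point in `old`.
    """
    ops = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            ops.append(("add", new[j1:j2]))
        # "delete" needs no operation: the old span is simply not copied.
    return ops


def apply_delta(old, ops):
    """Rebuild the new revision from the old text and a delta."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(old[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)
```

For revisions that differ by a small edit, the delta holds mostly short copy tuples rather than the full text, which is where the storage saving would come from.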
Joel Nothman, James R. Curran and Tara Murphy Transforming Wikipedia into Named Entity Training Data Australian Language Technology Workshop 2008 [51]
Statistical named entity recognisers require costly hand-labelled training data and, as a result, most existing corpora are small. We exploit Wikipedia to create a massive corpus of named entity annotated text. We transform Wikipedia’s links into named entity annotations by classifying the target articles into common entity types (e.g. person, organisation and location). Comparing to MUC, CONLL and BBN corpora, Wikipedia generally performs better than other cross-corpus train/test pairs.
named-entities, training corpora
Johannes Schoning, Brent Hecht, Martin Raubal, Antonio Kruger, Meri Marsh, and Michael Rohs Improving Interaction with Virtual Globes through Spatial Thinking: Helping Users Ask "Why?" Intelligent User Interfaces (IUI) 2008 [52] virtual globes, spatial thinking, multi-touch interaction, wall-size interfaces, artificial intelligence, wikipedia, semantic relatedness
Brent Hecht and Johannes Schoning Mapping the Zeitgeist Fifth International Conference on Geographic Information Science (GIScience) 2008 [53] zeitgeist, semantic relatedness, spatialization, spatial wikipedia
Brent Hecht and Martin Raubal Geographically explore semantic relations in world knowledge 11th AGILE International Conference on Geographic Information Science 2008 [54]
Methods to determine the semantic relatedness (SR) value between two lexically expressed entities abound in the field of natural language processing (NLP). The goal of such efforts is to identify a single measure that summarizes the number and strength of the relationships between the two entities. In this paper, we present GeoSR, the first adaptation of SR methods to the context of geographic data exploration. By combining the first use of a knowledge repository structure that is replete with non-classical relations, a new means of explaining those relations to users, and the novel application of SR measures to a geographic reference system, GeoSR allows users to geographically navigate and investigate the world knowledge encoded in Wikipedia. There are numerous visualization and interaction paradigms possible with GeoSR; we present one implementation as a proof-of-concept and discuss others. Although Wikipedia is used as the knowledge repository for our implementation, GeoSR will also work with any knowledge repository having a similar set of properties.
semantic relatedness, network analysis, non-classical relations, geography, wikipedia
Darren Hardy Discovering behavioral patterns in collective authorship of place-based information Internet Research 9.0: Rethinking Community, Rethinking Place 2008 [55]

While current GIS research has focused on technological issues of visualization and data organization, the emergence of new forms of collective authorship suggest we need new information frameworks and behaviors. How do individuals contribute place-based information to a digital commons? What are the authorship dynamics of such collective effort? For my research, I will use spatial data mining methods to characterize authorship behavior on a corpus of 1 million geotagged articles across 20 languages from Wikipedia.
geotagging, peer production, Wikipedia, bots
Andrew Krizhanovsky Index wiki database: design and experiments FLINS'08, Corpus Linguistics'08, AIS/CAD'08 2008 [56] Synarcher
With the fantastic growth of Internet usage, information search in documents of a special type called a "wiki page" that is written using a simple markup language, has become an important problem. This paper describes the software architectural model for indexing wiki texts in three languages (Russian, English, and German) and the interaction between the software components (GATE, Lemmatizer, and Synarcher). The inverted file index database was designed using the visual tool DBDesigner. The rules for parsing Wikipedia texts are illustrated by examples. Two index databases of Russian Wikipedia (RW) and Simple English Wikipedia (SEW) are built and compared. RW is an order of magnitude larger than SEW (number of words, lexemes), though the growth rate of number of pages in SEW was found to be 12% higher than in Russian, and the rate of acquisition of new words in SEW lexicon was 6% higher during a period of five months (from September 2007 to February 2008). Zipf's law was tested with both Russian and Simple Wikipedias. The entire source code of the indexing software and the generated index databases are freely available under GPL.
corpus linguistics, inverted index, Zipf's law, information retrieval
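As a rough illustration of what testing Zipf's law on a corpus can involve, the sketch below ranks words by frequency and fits a straight line to log frequency against log rank; a slope close to -1 is the classic Zipfian signature. The function name and the plain least-squares fit are assumptions for illustration and are not the procedure used in the paper above.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

In practice the tokens would come from a tokenized and possibly lemmatized dump, and the fit is usually inspected on a log-log plot as well, since the head and tail of real rank-frequency curves often deviate from a straight line.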
Torsten Zesch, Christof Muller and Iryna Gurevych Extracting Lexical Semantic Knowledge from Wikipedia and Wiktionary LREC'08 2008 [57]
Recently, collaboratively constructed resources such as Wikipedia and Wiktionary have been discovered as valuable lexical semantic knowledge bases with a high potential in diverse Natural Language Processing (NLP) tasks. Collaborative knowledge bases however significantly differ from traditional linguistic knowledge bases in various respects, and this constitutes both an asset and an impediment for research in NLP. This paper addresses one such major impediment, namely the lack of suitable programmatic access mechanisms to the knowledge stored in these large semantic knowledge bases. We present two application programming interfaces for Wikipedia and Wiktionary which are especially designed for mining the rich lexical semantic information dispersed in the knowledge bases, and provide efficient and structured access to the available knowledge. As we believe them to be of general interest to the NLP community, we have made them freely available for research purposes.
Lexical semantics, Wikipedia API, Wiktionary API
Michael Roth and Sabine Schulte im Walde Corpus Co-Occurrence, Dictionary and Wikipedia Entries as Resources for Semantic Relatedness Information LREC'08 2008 [58]
Distributional, corpus-based descriptions have frequently been applied to model aspects of word meaning. However, distributional models that use corpus data as their basis have one well-known disadvantage: even though the distributional features based on corpus co-occurrence were often successful in capturing meaning aspects of the words to be described, they generally fail to capture those meaning aspects that refer to world knowledge, because coherent texts tend not to provide redundant information that is presumably available knowledge. The question we ask in this paper is whether dictionary and encyclopaedic resources might complement the distributional information in corpus data, and provide world knowledge that is missing in corpora. As test case for meaning aspects, we rely on a collection of semantic associates to German verbs and nouns. Our results indicate that a combination of the knowledge resources should be helpful in work on distributional descriptions.
Laura Kassner, Vivi Nastase and Michael Strube Acquiring a Taxonomy from the German Wikipedia LREC'08 2008 [59]
This paper presents the process of acquiring a large, domain independent, taxonomy from the German Wikipedia. We build upon a previously implemented platform that extracts a semantic network and taxonomy from the English version of the Wikipedia. We describe two accomplishments of our work: the semantic network for the German language in which isa links are identified and annotated, and an expansion of the platform for easy adaptation for a new language. We identify the platform’s strengths and shortcomings, which stem from the scarcity of free processing resources for languages other than English. We show that the taxonomy induction process is highly reliable - evaluated against the German version of WordNet, GermaNet, the resource obtained shows an accuracy of 83.34%.
Jordi Atserias, Hugo Zaragoza, Massimiliano Ciaramita and Giuseppe Attardi Semantically Annotated Snapshot of the English Wikipedia LREC'08 2008 [60]
This paper describes SW1, the first version of a semantically annotated snapshot of the English Wikipedia. In recent years Wikipedia has become a valuable resource for both the Natural Language Processing (NLP) community and the Information Retrieval (IR) community. Although NLP technology for processing Wikipedia already exists, not all researchers and developers have the computational resources to process such a volume of information. Moreover, the use of different versions of Wikipedia processed differently might make it difficult to compare results. The aim of this work is to provide easy access to syntactic and semantic annotations for researchers of both NLP and IR communities by building a reference corpus to homogenize experiments and make results comparable. These resources, a semantically annotated corpus and an “entity containment” derived graph, are licensed under the GNU Free Documentation License and available from http://www.yr-bcn.es/semanticWikipedia
Adrian Iftene and Alexandra Balahur-Dobrescu Named Entity Relation Mining using Wikipedia LREC'08 2008 [61]
Discovering relations among Named Entities (NEs) from large corpora is both a challenging, as well as useful task in the domain of Natural Language Processing, with applications in Information Retrieval (IR), Summarization (SUM), Question Answering (QA) and Textual Entailment (TE). The work we present resulted from the attempt to solve practical issues we were confronted with while building systems for the tasks of Textual Entailment Recognition and Question Answering, respectively. The approach consists in applying grammar induced extraction patterns on a large corpus - Wikipedia - for the extraction of relations between a given Named Entity and other Named Entities. The results obtained are high in precision, determining a reliable and useful application of the built resource.
Gaoying Cui, Qin Lu, Wenjie Li and Yirong Chen Corpus Exploitation from Wikipedia for Ontology Construction LREC'08 2008 [62]
Ontology construction usually requires a domain-specific corpus for building corresponding concept hierarchy. The domain corpus must have a good coverage of domain knowledge. Wikipedia (Wiki), the world’s largest online encyclopaedic knowledge source, is open-content, collaboratively edited, and free of charge. It covers millions of articles and still keeps on expanding continuously. These characteristics make Wiki a good candidate as domain corpus resource in ontology construction. However, the selected article collection must have considerable quality and quantity. In this paper, a novel approach is proposed to identify articles in Wiki as domain-specific corpus by using available classification information in Wiki pages. The main idea is to generate a domain hierarchy from the hyperlinked pages of Wiki. Only articles strongly linked to this hierarchy are selected as the domain corpus. The proposed approach makes use of linked category information in Wiki pages to produce the hierarchy as a directed graph for obtaining a set of pages in the same connected branch. Ranking and filtering are then done on these pages based on the classification tree generated by the traversal algorithm. The experiment and evaluation results show that Wiki is a good resource for acquiring a relatively high-quality domain-specific corpus for ontology construction.
Alexander E. Richman, Patrick Schone Mining Wiki Resources for Multilingual Named Entity Recognition ACL-08: HLT, pp. 1-9 2008 [63]

In this paper, we describe a system by which the multilingual characteristics of Wikipedia can be utilized to annotate a large corpus of text with Named Entity Recognition (NER) tags requiring minimal human intervention and no linguistic expertise. This process, though of value in languages for which resources exist, is particularly useful for less commonly taught languages. We show how the Wikipedia format can be used to identify possible named entities and discuss in detail the process by which we use the Category structure inherent to Wikipedia to determine the named entity type of a proposed entity.

We further describe the methods by which English language data can be used to bootstrap the NER process in other languages. We demonstrate the system by using the generated corpus as training sets for a variant of BBN's Identifinder in French, Ukrainian, Spanish, Polish, Russian, and Portuguese, achieving overall F-scores as high as 84.7% on independent, human-annotated corpora, comparable to a system trained on up to 40,000 words of human-annotated newswire.
Michael Kaisser The QuALiM Question Answering Demo: Supplementing Answers with Paragraphs drawn from Wikipedia ACL-08: HLT Demo Session, pp. 32-35 2008 [64]
This paper describes the online demo of the QuALiM Question Answering system. While the system actually gets answers from the web by querying major search engines, during presentation answers are supplemented with relevant passages from Wikipedia. We believe that this additional information improves a user’s search experience.
Elif Yamangil, Rani Nelken Mining Wikipedia Revision Histories for Improving Sentence Compression ACL-08: HLT, Short Papers, pp. 137-140 2008 [65]
A well-recognized limitation of research on supervised sentence compression is the dearth of available training data. We propose a new and bountiful resource for such training data, which we obtain by mining the revision history of Wikipedia for sentence compressions and expansions. Using only a fraction of the available Wikipedia data, we have collected a training corpus of over 380,000 sentence pairs, two orders of magnitude larger than the standardly used Ziff-Davis corpus. Using this newfound data, we propose a novel lexicalized noisy channel model for sentence compression, achieving improved results in grammaticality and compression rate criteria with a slight decrease in importance.
Fadi Biadsy, Julia Hirschberg, Elena Filatova An Unsupervised Approach to Biography Production using Wikipedia ACL-08: HLT, pp. 807-815 2008 [66]
We describe an unsupervised approach to multi-document sentence-extraction based summarization for the task of producing biographies. We utilize Wikipedia to automatically construct a corpus of biographical sentences and TDT4 to construct a corpus of non-biographical sentences. We build a biographical-sentence classifier from these corpora and an SVM regression model for sentence ordering from the Wikipedia corpus. We evaluate our work on the DUC2004 evaluation data and with human judges. Overall, our system significantly outperforms all systems that participated in DUC2004, according to the ROUGE-L metric, and is preferred by human subjects.
Kai Wang, Chien-Liang Lin, Chun-Der Chen, and Shu-Chen Yang The adoption of Wikipedia: a community- and information quality-based view 12th Pacific Asia Conference on Information Systems (PACIS) 2008 [67] TAM, Wikipedia, Critical Mass, Community identification, Information quality
Carlo A. Curino, Hyun J. Moon, Letizia Tanca, Carlo Zaniolo Schema Evolution in Wikipedia: toward a Web Information System Benchmark International Conference on Enterprise Information System (ICEIS), 2008 [68] Panta Rhei Project

Evolving the database that is at the core of an Information System represents a difficult maintenance problem that has only been studied in the framework of traditional information systems. However, the problem is likely to be even more severe in web information systems, where open-source software is often developed through the contributions and collaboration of many groups and individuals. Therefore, in this paper, we present an in-depth analysis of the evolution history of the Wikipedia database and its schema; Wikipedia is the best-known example of a large family of web information systems built using the open-source software MediaWiki.

Our study is based on: (i) a set of Schema Modification Operators that provide a simple conceptual representation for complex schema changes, and (ii) simple software tools to automate the analysis. This framework allowed us to dissect and analyze the 4.5 years of Wikipedia history, which was short in time, but intense in terms of growth and evolution. Beyond confirming the initial hunch about the severity of the problem, our analysis suggests the need for developing better methods and tools to support graceful schema evolution. Therefore, we briefly discuss documentation and automation support systems for database evolution, and suggest that the Wikipedia case study can provide the kernel of a benchmark for testing and improving such systems.
Schema Evolution, Benchmark, Schema Versioning, Query Rewriting
Carlo A. Curino, Hyun J. Moon, Carlo Zaniolo Graceful Database Schema Evolution: the PRISM Workbench Very Large DataBases (VLDB), 2008 Panta Rhei Project

Supporting graceful schema evolution represents an unsolved problem for traditional information systems that is further exacerbated in web information systems, such as Wikipedia and public scientific databases: in these projects based on multiparty cooperation the frequency of database schema changes has increased while tolerance for downtimes has nearly disappeared. As of today, schema evolution remains an error-prone and time-consuming undertaking, because the DB Administrator (DBA) lacks the methods and tools needed to manage and automate this endeavor by (i) predicting and evaluating the effects of the proposed schema changes, (ii) rewriting queries and applications to operate on the new schema, and (iii) migrating the database.

Our PRISM system takes a big first step toward addressing this pressing need by providing: (i) a language of Schema Modification Operators to express concisely complex schema changes, (ii) tools that allow the DBA to evaluate the effects of such changes, (iii) optimized translation of old queries to work on the new schema version, (iv) automatic data migration, and (v) full documentation of intervened changes as needed to support data provenance, database flash back, and historical queries.

PRISM solves these problems by integrating recent theoretical advances on mapping composition and invertibility, into a design that also achieves usability and scalability. Wikipedia and its 170+ schema versions provided an invaluable testbed for validating tools and their ability to support legacy queries.
Schema Evolution, Graceful Evolution, Schema Versioning, Query Rewriting
Hyun J. Moon, Carlo A. Curino, Alin Deutsch, Chien-Yi Hou, Carlo Zaniolo Managing and Querying Transaction-time Databases under Schema Evolution Very Large DataBases (VLDB), 2008 Panta Rhei Project
The old problem of managing the history of database information is now made more urgent and complex by fast-spreading web information systems. Indeed, systems such as Wikipedia are faced with the challenge of managing the history of their databases in the face of intense database schema evolution. Our PRIMA system addresses this difficult problem by introducing two key pieces of new technology. The first is a method for publishing the history of a relational database in XML, whereby the evolution of the schema and its underlying database are given a unified representation. This temporally grouped representation makes it easy to formulate sophisticated historical queries on any given schema version using standard XQuery. The second key piece of technology provided by PRIMA is that schema evolution is transparent to the user: she writes queries against the current schema while retrieving the data from one or more schema versions. The system then performs the labor-intensive and error-prone task of rewriting such queries into equivalent ones for the appropriate versions of the schema. This feature is particularly relevant for historical queries spanning over potentially hundreds of different schema versions. The latter one is realized by (i) introducing Schema Modification Operators (SMOs) to represent the mappings between successive schema versions and (ii) an XML integrity constraint language (XIC) to efficiently rewrite the queries using the constraints established by the SMOs. The scalability of the approach has been tested against both synthetic data and real-world data from the Wikipedia DB schema evolution history.
Schema Evolution, Transaction Time DB, Query Rewriting
Fogarolli Angela and Ronchetti Marco Intelligent Mining and Indexing of Multi-Language e-Learning Material Proc. of 1st International Symposium on Intelligent Interactive Multimedia Systems and Services, KES IIMS 2008, 9-11 July 2008 Piraeus, Greece Studies in Computational Intelligence, Springer-Verlag (2008). 2008
In this paper we describe a method to automatically discover important concepts and their relationships in e-Lecture material. The discovered knowledge is used to display semantic aware categorizations and query suggestions for facilitating navigation inside an unstructured multimedia repository of e-Lectures. We report about an implemented approach for dealing with learning materials referring to the same event in different languages. The information acquired from the speech is combined with the documents such as presentation slides which are temporally synchronized with the video for creating new knowledge through a mapping with a taxonomy representation such as Wikipedia.
Content Retrieval, Content Filtering, Search over semi-structural Web sources, Multimedia, e-Learning
Fogarolli Angela and Ronchetti Marco Towards Bridging the Semantic-annotation-retrieval Gap in e-Learning Proc. of International Conference on e-Society, 9-12 April 2008 Algarve, Portugal. IADIS 2008
Semantic-based information retrieval is an area of ongoing work. In this paper we present a solution for giving semantic support to multimedia content information retrieval in an e-Learning environment where very often a large number of multimedia objects and information sources are used in combination. Semantic support is given through intelligent use of Wikipedia in combination with statistical Information Extraction techniques.
Content Retrieval, Content Filtering, Search over semi-structural Web sources, Multimedia, e-Learning
Tyers, F. and Pienaar, J. Extracting bilingual word pairs from Wikipedia SALTMIL workshop at Language Resources and Evaluation Conference (LREC) 2008 2008

A bilingual dictionary or word list is an important resource for many purposes, among them, machine translation. For many language pairs these are either non-existent, or very often unavailable owing to licensing restrictions. We describe a simple, fast and computationally inexpensive method for extracting bilingual dictionary entries from Wikipedia (using the interwiki link system) and assess the performance of this method with respect to four language pairs. Precision was found to be in the 69-92% region, but open to improvement.
Under-resourced languages, Machine translation, Language resources, Bilingual terminology, Interwiki links
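As a rough sketch of the interwiki-based extraction described in the abstract above, the snippet below pulls candidate word pairs out of article wikitext that uses the older inline interlanguage link syntax (`[[fr:Chat]]`). The regular expression, the function name `extract_pairs`, and the sample wikitext are assumptions for illustration; the paper's actual pipeline and filtering are not specified here.

```python
import re

# Matches inline interlanguage links such as [[fr:Chat]] or [[zh-min-nan:...]].
# Lowercase 2-3 letter prefixes only, so [[Category:...]] is not matched.
INTERWIKI = re.compile(r"\[\[([a-z]{2,3}(?:-[a-z]+)*):([^\]|]+)\]\]")

def extract_pairs(title, wikitext, target_lang):
    """Return (source title, target title) pairs for one target language."""
    return [
        (title, match.group(2).strip())
        for match in INTERWIKI.finditer(wikitext)
        if match.group(1) == target_lang
    ]

sample = "The cat is a small mammal. [[Category:Cats]] [[de:Hauskatze]] [[fr:Chat]]"
pairs_fr = extract_pairs("Cat", sample, "fr")
pairs_de = extract_pairs("Cat", sample, "de")
```

Article titles are often multi-word or carry disambiguation suffixes, so a real dictionary builder would need extra cleaning, which may be one reason the reported precision is described as open to improvement.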
Fei Wu, Daniel S. Weld Automatically Refining the Wikipedia Infobox Ontology 17th International World Wide Web Conference (www-08) 2008 [69] WWW '08: Best student paper honorable mention, The Intelligence in Wikipedia Project at University of Washington

Google tech talk
The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia infoboxes. Machine learning systems, such as Kylin, use these infoboxes as training data, accurately extracting even more semantic knowledge from natural language text. But in order to realize the full power of this information, it must be situated in a cleanly-structured ontology. This paper introduces KOG, an autonomous system for refining Wikipedia’s infobox-class ontology towards this end. We cast the problem of ontology refinement as a machine learning problem and solve it using both SVMs and a more powerful joint-inference approach expressed in Markov Logic Networks. We present experiments demonstrating the superiority of the joint-inference approach and evaluating other aspects of our system. Using these techniques, we build a rich ontology, integrating Wikipedia’s infobox-class schemata with WordNet. We demonstrate how the resulting ontology may be used to enhance Wikipedia with improved query processing and other features.
Semantic Web, Ontology, Wikipedia, Markov Logic Networks
Maike Erdmann, Kotaro Nakayama, Takahiro Hara, Sojiro Nishio An Approach for Extracting Bilingual Terminology from Wikipedia 13th International Conference on Database Systems for Advanced Applications (DASFAA) 2008 [70] Wikipedia-Lab work
With the demand of bilingual dictionaries covering domain-specific terminology, research in the field of automatic dictionary extraction has become popular. However, accuracy and coverage of dictionaries created based on bilingual text corpora are often not sufficient for domain-specific terms. Therefore, we present an approach to extracting bilingual dictionaries from the link structure of Wikipedia, a huge scale encyclopedia that contains a vast amount of links between articles in different languages. Our methods analyze not only these interlanguage links but extract even more translation candidates from redirect page and link text information. In an experiment, we proved the advantages of our methods compared to a traditional approach of extracting bilingual terminology from parallel corpora.
Wikipedia Mining, Bilingual Terminology, Link Structure Analysis
Kotaro Nakayama, Takahiro Hara, Sojiro Nishio A Search Engine for Browsing the Wikipedia Thesaurus 13th International Conference on Database Systems for Advanced Applications, Demo session (DASFAA) 2008 [71] Wikipedia-Lab work
Wikipedia has become a huge phenomenon on the WWW. As a corpus for knowledge extraction, it has various impressive characteristics such as a huge amount of articles, live updates, a dense link structure, brief link texts and URL identification for concepts. In our previous work, we proposed link structure mining algorithms to extract a huge scale and accurate association thesaurus from Wikipedia. The association thesaurus covers almost 1.3 million concepts and the significant accuracy is proved in detailed experiments. To prove its practicality, we implemented three features on the association thesaurus; a search engine for browsing Wikipedia Thesaurus, an XML Web service for the thesaurus and a Semantic Web support feature. We show these features in this demonstration.
Wikipedia Mining, Association Thesaurus, Link Structure Analysis, XML Web Services
Kotaro Nakayama, Masahiro Ito, Takahiro Hara, Sojiro Nishio Wikipedia Mining for Huge Scale Japanese Association Thesaurus Construction International Symposium on Mining And Web (IEEE MAW) conjunction with IEEE AINA 2008 [72] Wikipedia-Lab work Wikipedia Mining, Association Thesaurus, Link Structure Analysis
Minghua Pei, Kotaro Nakayama, Takahiro Hara, Sojiro Nishio Constructing a Global Ontology by Concept Mapping using Wikipedia Thesaurus International Symposium on Mining And Web (IEEE MAW) conjunction with IEEE AINA 2008 [73] Wikipedia-Lab work Wikipedia Mining, Association Thesaurus, Ontology Mapping, Global Ontology
Joachim Schroer, Guido Hertel Voluntary engagement in an open web-based encyclopedia: From reading to contributing 10th International General Online Research Conference, Hamburg, Germany 2008 [74]
wikipedia, contributors, motivation, instrumentality, intrinsic motivation
Martin Potthast, Benno Stein, Maik Anderka A Wikipedia-Based Multilingual Retrieval Model 30th European Conference on IR Research, ECIR 2008, Glasgow 2008 [75]

This paper introduces CL-ESA, a new multilingual retrieval model for the analysis of cross-language similarity. The retrieval model exploits the multilingual alignment of Wikipedia: given a document d written in language L we construct a concept vector d for d, where each dimension i in d quantifies the similarity of d with respect to a document d*i chosen from the "L-subset" of Wikipedia. Likewise, for a second document d′ written in language L′, L ≠ L′, we construct a concept vector d′, using from the L′-subset of the Wikipedia the topic-aligned counterparts d*i of our previously chosen documents.

Since the two concept vectors d and d′ are collection-relative representations of d and d′ they are language-independent. I.e., their similarity can directly be computed with the cosine similarity measure, for instance.

We present results of an extensive analysis that demonstrates the power of this new retrieval model: for a query document d the topically most similar documents from a corpus in another language are properly ranked. Salient property of the new retrieval model is its robustness with respect to both the size and the quality of the index document collection.
multilingual retrieval model, explicit semantic analysis, wikipedia
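The concept-vector construction in the abstract above can be illustrated with a toy sketch: each document is compared against index articles in its own language, and the resulting vectors are compared across languages with cosine similarity. Everything concrete here is an assumption for illustration: the bag-of-words similarity, the helper names, and the two-article "aligned index" stand in for the paper's retrieval model and Wikipedia subsets.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def bag(text):
    """Lowercased bag-of-words term counts (a deliberately crude text model)."""
    return Counter(text.lower().split())

def concept_vector(doc, index_docs):
    """Dimension i holds the similarity of `doc` to the i-th index document."""
    terms = bag(doc)
    return {i: cosine(terms, bag(ref)) for i, ref in enumerate(index_docs)}

# Toy topic-aligned index: position i describes the same topic in both languages.
index_en = ["cat animal pet", "car engine road"]
index_de = ["katze tier haustier", "auto motor strasse"]

query = concept_vector("my pet cat", index_en)
sim_same = cosine(query, concept_vector("die katze ist ein tier", index_de))
sim_diff = cosine(query, concept_vector("auto und motor", index_de))
```

Because both concept vectors are indexed by topic position rather than by words, the English and German documents become comparable even though they share no vocabulary.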
Martin Potthast, Benno Stein, Robert Gerling Automatic Vandalism Detection in Wikipedia 30th European Conference on IR Research, ECIR 2008, Glasgow 2008 [76] ECIR 2008: Best poster award
We present results of a new approach to detect destructive article revisions, so-called vandalism, in Wikipedia. Vandalism detection is a one-class classification problem, where vandalism edits are the target to be identified among all revisions. Interestingly, vandalism detection has not been addressed in the Information Retrieval literature by now. In this paper we discuss the characteristics of vandalism as humans recognize it and develop features to render vandalism detection as a machine learning task. We compiled a large number of vandalism edits in a corpus, which allows for the comparison of existing and new detection approaches. Using logistic regression we achieve 83% precision at 77% recall with our model. Compared to the rule-based methods that are currently applied in Wikipedia, our approach increases the F-Measure performance by 49% while being faster at the same time.
vandalism, machine learning, wikipedia
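For readers who want to relate the precision and recall figures in the abstract above to an F-measure, a minimal sketch follows. It assumes the balanced F1 variant (`beta=1.0`); the abstract does not say which weighting the authors used, so the resulting number is illustrative only.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1.0 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Figures quoted in the abstract: 83% precision at 77% recall.
f1 = f_measure(0.83, 0.77)
```

Under this assumption the reported operating point corresponds to an F1 of roughly 0.80.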
Ivan Beschastnikh, Travis Kriplean, David W. McDonald Wikipedian Self-Governance in Action: Motivating the Policy Lens Proceedings of the Second International Conference on Weblogs and Social Media, AAAI, March 31, 2008 (ICWSM '08) 2008 [77] ICWSM '08: Best paper award
While previous studies have used the Wikipedia dataset to provide an understanding of its growth, there have been few attempts to quantitatively analyze the establishment and evolution of the rich social practices that support this editing community. One such social practice is the enactment and creation of Wikipedian policies. We focus on the enactment of policies in discussions on the talk pages that accompany each article. These policy citations are a valuable micro-to-macro connection between everyday action, communal norms and the governance structure of Wikipedia. We find that policies are widely used by registered users and administrators, that their use is converging and stabilizing in and across these groups, and that their use illustrates the growing importance of certain classes of work, in particular source attribution. We also find that participation in Wikipedia's governance structure is inclusionary in practice.
policy use, governance, wikipedia
Andrea Forte, Amy Bruckman Scaling Consensus: Increasing Decentralization in Wikipedia Governance HICSS 2008, pp. 157-157. 2008 [78]
How does "self-governance" happen in Wikipedia? Through in-depth interviews with eleven individuals who have held a variety of responsibilities in the English Wikipedia, we obtained rich descriptions of how various forces produce and regulate social structures on the site. Our analysis describes Wikipedia as an organization with highly refined policies, norms, and a technological architecture that supports organizational ideals of consensus building and discussion. We describe how governance in the site is becoming increasingly decentralized as the community grows and how this is predicted by theories of commons-based governance developed in offline contexts. The trend of decentralization is noticeable with respect to both content-related decision making processes and social structures that regulate user behavior.
governance, wikipedia
Zareen Syed, Tim Finin, and Anupam Joshi Wikipedia as an Ontology for Describing Documents Proceedings of the Second International Conference on Weblogs and Social Media, AAAI, March 31, 2008 2008 [79]
Identifying topics and concepts associated with a set of documents is a task common to many applications. It can help in the annotation and categorization of documents and be used to model a person's current interests for improving search results, business intelligence or selecting appropriate advertisements. One approach is to associate a document with a set of topics selected from a fixed ontology or vocabulary of terms. We have investigated using Wikipedia's articles and associated pages as a topic ontology for this purpose. The benefits are that the ontology terms are developed through a social process, maintained and kept current by the Wikipedia community, represent a consensus view, and have meaning that can be understood simply by reading the associated Wikipedia page. We use Wikipedia articles and the category and article link graphs to predict concepts common to a set of documents. We describe several algorithms to aggregate and refine results, including the use of spreading activation to select the most appropriate terms. While the Wikipedia category graph can be used to predict generalized concepts, the article links graph helps by predicting more specific concepts and concepts not in the category hierarchy. Our experiments demonstrate the feasibility of extending the category system with new concepts identified as a union of pages from the page link graph.
ontology, wikipedia, information retrieval, text classification
Felipe Ortega, Jesus M. Gonzalez-Barahona and Gregorio Robles On the Inequality of Contributions to Wikipedia HICSS 2008 2008 [80] Application of the Gini coefficient to measure the level of inequality of the contributions to the top ten language editions of Wikipedia.
Wikipedia is one of the most successful examples of massive collaborative content development. However, many of the mechanisms and procedures that it uses are still unknown in detail. For instance, how equal (or unequal) are the contributions to it has been discussed in the last years, with no conclusive results. In this paper, we study exactly that aspect by using Lorenz curves and Gini coefficients, very well known instruments to economists. We analyze the trends in the inequality of distributions for the ten biggest language editions of Wikipedia, and their evolution over time. As a result, we have found large differences in the number of contributions by different authors (something also observed in free, open source software development), and a trend to stable patterns of inequality in the long run.
wikipedia
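The inequality measure named in the entry above can be sketched as follows: a Gini coefficient computed from per-author contribution counts. The function name `gini` and the sample counts are hypothetical and are not data from the paper.

```python
def gini(counts):
    """Gini coefficient of non-negative counts (0 = perfectly equal)."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Standard closed form on ascending-sorted values x_1..x_n:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

equal_edits = gini([5, 5, 5, 5])     # every author contributes the same amount
skewed_edits = gini([0, 0, 0, 10])   # one author makes all the edits
```

With four authors the maximum attainable value is 0.75 rather than 1, since the finite-sample Gini coefficient is bounded by (n - 1) / n.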
Anne-Marie Vercoustre, James A. Thom and Jovan Pehcevski Entity Ranking in Wikipedia SAC’08 March 16-20, 2008, Fortaleza, Ceara, Brazil 2008 [81]
The traditional entity extraction problem lies in the ability of extracting named entities from plain text using natural language processing techniques and intensive training from large document collections. Examples of named entities include organisations, people, locations, or dates. There are many research activities involving named entities; we are interested in entity ranking in the field of information retrieval. In this paper, we describe our approach to identifying and ranking entities from the INEX Wikipedia document collection. Wikipedia offers a number of interesting features for entity identification and ranking that we first introduce. We then describe the principles and the architecture of our entity ranking system, and introduce our methodology for evaluation. Our preliminary results show that the use of categories and the link structure of Wikipedia, together with entity examples, can significantly improve retrieval effectiveness.
Entity Ranking, XML Retrieval, Test collection
Robert P. Biuk-Aghai, Christopher Kelen and Hari Venkatesan Visualization of Interactions in Collaborative Writing Proceedings of the 2008 Second IEEE International Conference on Digital Ecosystems and Technologies (IEEE DEST 2008), Phitsanulok, Thailand, 26-29 February 2008 2008 [82] Presents an analysis and visualization tool to help assess process and outcome of collaborative writing.
Wikis have become an important component of a collaboration infrastructure, particularly in loosely-coupled and self-organizing settings such as those of digital ecosystems. We report on our use of wikis in the education domain to support collaborative creative writing, as well as collaborative translation. This paper presents an analysis and visualization tool that we have developed as an aid for assessing both the process and the outcome of these collaborative writing tasks.
information visualization, collaborative writing, assessment, digital ecosystems
Krishnan Ramanathan, Yogesh Sankarasubramaniam, Nidhi Mathur, Ajay Gupta Document Summarization using Wikipedia First IEEE international conference on Human computer interaction (IHCI) 2008 [83]
Although most of the developing world is likely to first access the Internet through mobile phones, mobile devices are constrained by screen space, bandwidth and limited attention span. Single document summarization techniques have the potential to simplify information consumption on mobile phones by presenting only the most relevant information contained in the document. In this paper we present a language independent single-document summarization method. We map document sentences to semantic concepts in Wikipedia and select sentences for the summary based on the frequency of the mapped-to concepts. Our evaluation on English documents using the ROUGE package indicates our summarization method is competitive with the state of the art in single document summarization.
Document summarization, Wikipedia, ROUGE
Brent Hecht, Michael Rohs, Johannes Schoning and Antonio Kruger WikEye - Using Magic Lenses to Explore Georeferenced Wikipedia Content. 3rd International Workshop on Pervasive Mobile Interaction Devices (PERMID) in Conjunction with Pervasive Computing 2007 [84] wikipedia data-mining, magic lens, augmented reality, markerless tracking
Marek Meyer, Christoph Rensing, Ralf Steinmetz Categorizing Learning Objects Based On Wikipedia as Substitute Corpus First International Workshop on Learning Object Discovery & Exchange (LODE'07), September 18, 2007, Crete, Greece 2007 [85] Usage of Wikipedia as corpus for machine learning methods.
As metadata is often not sufficiently provided by authors of Learning Resources, automatic metadata generation methods are used to create metadata afterwards. One kind of metadata is categorization, particularly the partition of Learning Resources into distinct subject categories. A disadvantage of state-of-the-art categorization methods is that they require corpora of sample Learning Resources. Unfortunately, large corpora of well-labeled Learning Resources are rare. This paper presents a new approach for the task of subject categorization of Learning Resources. Instead of using typical Learning Resources, the free encyclopedia Wikipedia is applied as training corpus. The approach presented in this paper is to apply the k-Nearest-Neighbors method for comparing a Learning Resource to Wikipedia articles. Different parameters have been evaluated regarding their impact on the categorization performance.
Wikipedia, Categorization, Metadata, kNN, Classification, Substitute Corpus, Automatic Metadata Generation
Overell, Simon E., and Stefan Ruger Geographic co-occurrence as a tool for GIR. 4th ACM workshop on Geographical Information Retrieval. Lisbon, Portugal. 2007 [86]
In this paper we describe the development of a geographic co-occurrence model and how it can be applied to geographic information retrieval. The model consists of mining co-occurrences of placenames from Wikipedia, and then mapping these placenames to locations in the Getty Thesaurus of Geographical Names. We begin by quantifying the accuracy of our model and compute theoretical bounds for the accuracy achievable when applied to placename disambiguation in free text. We conclude with a discussion of the improvement such a model could provide for placename disambiguation and geographic relevance ranking over traditional methods.
Wikipedia, disambiguation, geographic information retrieval
Torsten Zesch, Iryna Gurevych Analysis of the Wikipedia Category Graph for NLP Applications. Proceedings of the TextGraphs-2 Workshop (NAACL-HLT) 2007 [87]
In this paper, we discuss two graphs in Wikipedia: (i) the article graph, and (ii) the category graph. We perform a graph-theoretic analysis of the category graph, and show that it is a scale-free, small world graph like other well-known lexical semantic networks. We substantiate our findings by transferring semantic relatedness algorithms defined on WordNet to the Wikipedia category graph. To assess the usefulness of the category graph as an NLP resource, we analyze its coverage and the performance of the transferred semantic relatedness algorithms.
nlp, relatedness, semantic, wikipedia
Antonio Toral and Rafael Munoz Towards a Named Entity Wordnet (NEWN) Proceedings of the 6th International Conference on Recent Advances in Natural Language Processing (RANLP). Borovets (Bulgaria). pp. 604-608. September 2007 2007 [88] poster?
Ulrik Brandes and Jurgen Lerner Visual Analysis of Controversy in User-generated Encyclopedias Proc. IEEE Symp. Visual Analytics Science and Technology (VAST ' 07) 2007 [89]
Wikipedia is a large and rapidly growing Web-based collaborative authoring environment, where anyone on the Internet can create, modify, and delete pages about encyclopedic topics. A remarkable property of some Wikipedia pages is that they are written by up to thousands of authors who may have contradicting opinions. In this paper we show that a visual analysis of the “who revises whom” network gives deep insight into controversies. We propose a set of analysis and visualization techniques that reveal the dominant authors of a page, the roles they play, and the alters they confront. Thereby we provide tools to understand how Wikipedia authors collaborate in the presence of controversy.
social network controversy editing visualisation wikipedia
V Jijkoun, M de Rijke WiQA: Evaluating Multi-lingual Focused Access to Wikipedia Proceedings EVIA, 2007 2007 [90]
We describe our experience with WiQA 2006, a pilot task aimed at studying question answering using Wikipedia. Going beyond traditional factoid questions, the task considered at WiQA 2006 was to identify, given a source article from Wikipedia, snippets from other Wikipedia articles, possibly in languages different from the language of the source article, that add new and important information to the source article, and that do so without repetition. A total of 7 teams took part, submitting 20 runs. Our main findings are two-fold: (i) while challenging, the tasks considered at WiQA are do-able as participants achieved precision@10 scores in the .5 range and MRR scores upwards of .5; (ii) on the bilingual task, substantially higher scores were achieved than on the monolingual tasks.
Sorin Adam Matei and Caius Dobrescu Ambiguity and conflict in the Wikipedian knowledge production system International Communication Association Annual Conference, Dresden, Germany 2006 [91]
The paper analyzes the manner in which the most important implicit explanatory framework, emergence theory, and the central Wikipedia policy, the "Neutral point of view," are appropriated and reinterpreted by Wikipedia actors. Analyzing mailing list messages posted on Wikipedia-l and on Wikipedia's "Neutral Point of View Policy" discussion page (a footnoting space used for coordinating the editorial process on Wikipedia) the paper comes to the conclusion that the debates are often conflictual and their solution is found in ambiguity. The overarching conclusion is that conflict and ambiguity on Wikipedia are not extraneous, but central ingredients of this wiki project. They naturally develop from the pluralist and non-hierarchic nature of the medium and of the culture that brought it to life.
wikipedia, conflict, policy, ambiguity, process, neutral point of view, rules, editors, revert wars
Martin Potthast Wikipedia in the pocket: indexing technology for near-duplicate detection and high similarity search SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval 2007 [92]
We develop and implement a new indexing technology which allows us to use complete (and possibly very large) documents as queries, while having a retrieval performance comparable to a standard term query. Our approach aims at retrieval tasks such as near duplicate detection and high similarity search. To demonstrate the performance of our technology we have compiled the search index "Wikipedia in the Pocket", which contains about 2 million English and German Wikipedia articles. This index, along with a search interface, fits on a conventional CD (0.7 gigabyte). The ingredients of our indexing technology are similarity hashing and minimal perfect hashing.
wikipedia
Minier, Zsolt; Bodo, Zalan; Csato, Lehel Wikipedia-Based Kernels for Text Categorization Symbolic and Numeric Algorithms for Scientific Computing, 2007. SYNASC. International Symposium on 2007 [93]
In recent years several models have been proposed for text categorization. Within this, one of the widely applied models is the vector space model (VSM), where independence between indexing terms, usually words, is assumed. Since training corpora sizes are relatively small - compared to what would be required for a realistic number of words - the generalization power of the learning algorithms is low. It is assumed that a bigger text corpus can boost the representation and hence the learning process. Based on the work of Gabrilovich and Markovitch [6], we incorporate Wikipedia articles into the system to give word distributional representation for documents. The extension with this new corpus causes dimensionality increase, therefore clustering of features is needed. We use Latent Semantic Analysis (LSA), Kernel Principal Component Analysis (KPCA) and Kernel Canonical Correlation Analysis (KCCA) and present results for these experiments on the Reuters corpus.
Thomas, Christopher; Sheth, Amit P. Semantic Convergence of Wikipedia Articles Web Intelligence, IEEE/WIC/ACM International Conference on 2007 [94]
Social networking, distributed problem solving and human computation have gained high visibility. Wikipedia is a well established service that incorporates aspects of these three fields of research. For this reason it is a good object of study for determining quality of solutions in a social setting that is open, completely distributed, bottom up and not peer reviewed by certified experts. In particular, this paper aims at identifying semantic convergence of Wikipedia articles; the notion that the content of an article stays stable regardless of continuing edits. This could lead to an automatic recommendation of good article tags but also add to the usability of Wikipedia as a Web Service and to its reliability for information extraction. The methods used and the results obtained in this research can be generalized to other communities that iteratively produce textual content.
Rada Mihalcea Using Wikipedia for Automatic Word Sense Disambiguation Proceedings of NAACL HLT, 2007 2007 [95]
This paper describes a method for generating sense-tagged data using Wikipedia as a source of sense annotations. Through word sense disambiguation experiments, we show that the Wikipedia-based sense annotations are reliable and can be used to construct accurate sense classifiers.
J Yu, JA Thom, A Tam Ontology evaluation using wikipedia categories for browsing Proceedings of the sixteenth ACM conference on Conference on information and knowledge management 2007 [96]
Ontology evaluation is a maturing discipline with methodologies and measures being developed and proposed. However, evaluation methods that have been proposed have not been applied to specific examples. In this paper, we present the state-of-the-art in ontology evaluation - current methodologies, criteria and measures, analyse appropriate evaluations that are important to our application - browsing in Wikipedia, and apply these evaluations in the context of ontologies with varied properties. Specifically, we seek to evaluate ontologies based on categories found in Wikipedia.
browsing, ontology evaluation, user studies, wikipedia
Reagle, Joseph M. Do as I do: authorial leadership in wikipedia WikiSym '07: Proceedings of the 2007 international symposium on Wikis 2007 [97] / [98]
In seemingly egalitarian collaborative on-line communities, like Wikipedia, there is often a paradoxical, or perhaps merely playful, use of the title "Benevolent Dictator" for leaders. I explore discourse around the use of this title so as to address how leadership works in open content communities. I first review existing literature on "emergent leadership" and then relate excerpts from community discourse on how leadership is understood, performed, and discussed by Wikipedians. I conclude by integrating concepts from existing literature and my own findings into a theory of "authorial" leadership.
Wikipedia, authorial, benevolent dictator, leadership
Martin Wattenberg, Fernanda B. Viegas and Katherine Hollenbach Visualizing Activity on Wikipedia with Chromograms Human-Computer Interaction – INTERACT 2007 2007 [99]
To investigate how participants in peer production systems allocate their time, we examine editing activity on Wikipedia, the well-known online encyclopedia. To analyze the huge edit histories of the site’s administrators we introduce a visualization technique, the chromogram, that can display very long textual sequences through a simple color coding scheme. Using chromograms we describe a set of characteristic editing patterns. In addition to confirming known patterns, such as reacting to vandalism events, we identify a distinct class of organized systematic activities. We discuss how both reactive and systematic strategies shed light on self-allocation of effort in Wikipedia, and how they may pertain to other peer-production systems.
Wikipedia - Visualization - Peer Production - Visualization
A Kittur, E Chi, BA Pendleton, B Suh, T Mytkowicz Power of the Few vs. Wisdom of the Crowd: Wikipedia and the Rise of the Bourgeoisie 25th Annual ACM Conference on Human Factors in Computing Systems (CHI 2007); 2007 April 28 - May 3; San Jose; CA. 2007 [100]
Wikipedia has been a resounding success story as a collaborative system with a low cost of online participation. However, it is an open question whether the success of Wikipedia results from a “wisdom of crowds” type of effect in which a large number of people each make a small number of edits, or whether it is driven by a core group of “elite” users who do the lion’s share of the work. In this study we examined how the influence of “elite” vs. “common” users changed over time in Wikipedia. The results suggest that although Wikipedia was driven by the influence of “elite” users early on, more recently there has been a dramatic shift in workload to the “common” user. We also show the same shift in del.icio.us, a very different type of social collaborative knowledge system. We discuss how these results mirror the dynamics found in more traditional social collectives, and how they can influence the design of new collaborative knowledge systems.
Wikipedia, Wiki, collaboration, collaborative knowledge systems, social tagging, delicious.
Meiqun Hu, Ee-Peng Lim, Aixin Sun, Hady W Lauw, Ba-Quy Vuong On improving wikipedia search using article quality WIDM '07: Proceedings of the 9th annual ACM international workshop on Web information and data management 2007 [101]
Wikipedia is presently the largest free-and-open online encyclopedia collaboratively edited and maintained by volunteers. While Wikipedia offers full-text search to its users, the accuracy of its relevance-based search can be compromised by poor quality articles edited by non-experts and inexperienced contributors. In this paper, we propose a framework that re-ranks Wikipedia search results considering article quality. We develop two quality measurement models, namely Basic and Peer Review, to derive article quality based on co-authoring data gathered from articles' edit history. Compared with Wikipedia's full-text search engine, Google and Wikiseek, our experimental results showed that (i) quality-only ranking produced by Peer Review gives comparable performance to that of Wikipedia and Wikiseek; (ii) Peer Review combined with relevance ranking outperforms Wikipedia's full-text search significantly, delivering search accuracy comparable to Google.
quality, wikipedia
Wilkinson, Dennis M. and Huberman, Bernardo A. Cooperation and quality in wikipedia WikiSym '07: Proceedings of the 2007 international symposium on Wikis. 2007 [102]
The rise of the Internet has enabled collaboration and cooperation on an unprecedentedly large scale. The online encyclopedia Wikipedia, which presently comprises 7.2 million articles created by 7.04 million distinct editors, provides a consummate example. We examined all 50 million edits made to the 1.5 million English-language Wikipedia articles and found that the high-quality articles are distinguished by a marked increase in number of edits, number of editors, and intensity of cooperative behavior, as compared to other articles of similar visibility and age. This is significant because in other domains, fruitful cooperation has proven to be difficult to sustain as the size of the collaboration increases. Furthermore, in spite of the vagaries of human behavior, we show that Wikipedia articles accrete edits according to a simple stochastic mechanism in which edits beget edits. Topics of high interest or relevance are thus naturally brought to the forefront of quality.
Wikipedia, collaborative authoring, cooperation, groupware
DPT Nguyen, Y Matsuo, M Ishizuka Subtree Mining for Relation Extraction from Wikipedia Proc. of NAACL/HLT 2007 2007 [103]
In this study, we address the problem of extracting relations between entities from Wikipedia’s English articles. Our proposed method first anchors the appearance of entities in Wikipedia’s articles using neither Named Entity Recognizer (NER) nor coreference resolution tool. It then classifies the relationships between entity pairs using SVM with features extracted from the web structure and subtrees mined from the syntactic structure of text. We evaluate our method on manually annotated data from actual Wikipedia articles.
Bongwon Suh, Ed H Chi, Bryan A Pendleton, Aniket Kittur Us vs. Them: Understanding Social Dynamics in Wikipedia with Revert Graph Visualizations Visual Analytics Science and Technology, 2007. VAST 2007. IEEE Symposium on (2007), pp. 163-170. 2007 [104]
Wikipedia is a wiki-based encyclopedia that has become one of the most popular collaborative on-line knowledge systems. As in any large collaborative system, as Wikipedia has grown, conflicts and coordination costs have increased dramatically. Visual analytic tools provide a mechanism for addressing these issues by enabling users to more quickly and effectively make sense of the status of a collaborative environment. In this paper we describe a model for identifying patterns of conflicts in Wikipedia articles. The model relies on users' editing history and the relationships between user edits, especially revisions that void previous edits, known as "reverts". Based on this model, we constructed Revert Graph, a tool that visualizes the overall conflict patterns between groups of users. It enables visual analysis of opinion groups and rapid interactive exploration of those relationships via detail drill-downs. We present user patterns and case studies that show the effectiveness of these techniques, and discuss how they could generalize to other systems.
motivation, social-network, wikipedia
Kittur, Aniket and Suh, Bongwon and Pendleton, Bryan A. and Chi, Ed H. He says, she says: conflict and coordination in Wikipedia CHI '07: Proceedings of the SIGCHI conference on Human factors in computing systems 2007 [105]
Wikipedia, a wiki-based encyclopedia, has become one of the most successful experiments in collaborative knowledge building on the Internet. As Wikipedia continues to grow, the potential for conflict and the need for coordination increase as well. This article examines the growth of such non-direct work and describes the development of tools to characterize conflict and coordination costs in Wikipedia. The results may inform the design of new collaborative knowledge systems.
Wiki, Wikipedia, collaboration, conflict, user model, visualization, web-based interaction
Davide Buscaldi and Paolo Rosso A Comparison of Methods for the Automatic Identification of Locations in Wikipedia Proceedings of GIR’07 2007 [106]
In this paper we compare two methods for the automatic identification of geographical articles in encyclopedic resources such as Wikipedia. The methods are a WordNet-based method that uses a set of keywords related to geographical places, and a multinomial Naïve Bayes classificator, trained over a randomly selected subset of the English Wikipedia. This task may be included into the broader task of Named Entity classification, a well-known problem in the field of Natural Language Processing. The experiments were carried out considering both the full text of the articles and only the definition of the entity being described in the article. The obtained results show that the information contained in the page templates and the category labels is more useful than the text of the articles.
Algorithms, Measurement, Performance, text analysis, language models
Li, Yinghao and Wing and Kei and Fu Improving weak ad-hoc queries using wikipedia as external corpus SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval 2007 [107]
In an ad-hoc retrieval task, the query is usually short and the user expects to find the relevant documents in the first several result pages. We explored the possibilities of using Wikipedia's articles as an external corpus to expand ad-hoc queries. Results show promising improvements over measures that emphasize on weak queries.
Wikipedia, external corpus, pseudo-relevance feedback
Y Watanabe, M Asahara, Y Matsumoto A Graph-based Approach to Named Entity Categorization in Wikipedia Using Conditional Random Fields Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL) 2007 [108]
This paper presents a method for categorizing named entities in Wikipedia. In Wikipedia, an anchor text is glossed in a linked HTML text. We formalize named entity categorization as a task of categorizing anchor texts with linked HTML texts which glosses a named entity. Using this representation, we introduce a graph structure in which anchor texts are regarded as nodes. In order to incorporate HTML structure on the graph, three types of cliques are defined based on the HTML tree structure. We propose a method with Conditional Random Fields (CRFs) to categorize the nodes on the graph. Since the defined graph may include cycles, the exact inference of CRFs is computationally expensive. We introduce an approximate inference method using Tree-based Reparameterization (TRP) to reduce computational cost. In experiments, our proposed model obtained significant improvements compared to baseline models that use Support Vector Machines.
Simone Braun and Andreas Schmidt Wikis as a Technology Fostering Knowledge Maturing: What we can learn from Wikipedia 7th International Conference on Knowledge Management (IKNOW '07), Special Track on Integrating Working and Learning in Business (IWL), 2007. 2007 [109]
The knowledge maturing theory opens an important macro perspective within the new paradigm of work-integrated learning. Especially wikis are interesting socio-technical systems to foster maturing activities by overcoming typical barriers. But so far, the theory has been mainly based on anecdotal evidence collected from various projects and observations. In this paper, we want to present the results of a qualitative and quantitative study of Wikipedia with respect to maturing phenomena, identifying instruments and measures indicating maturity. The findings, generalized to enterprise wikis, open the perspective on what promotes maturing on a method level and what can be used to spot maturing processes on a technology level.
knowledge management wiki wikipedia
Linyun Fu and Haofen Wang and Haiping Zhu and Huajie Zhang and Yang Wang and Yong Yu Making More Wikipedians: Facilitating Semantics Reuse for Wikipedia Authoring Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC2007), Busan, South Korea, 4825: 127--140, 2007. 2007 [110]
Wikipedia, a killer application in Web 2.0, has embraced the power of collaborative editing to harness collective intelligence. It can also serve as an ideal Semantic Web data source due to its abundance, influence, high quality and well-structuring. However, the heavy burden of up-building and maintaining such an enormous and ever-growing online encyclopedic knowledge base still rests on a very small group of people. Many casual users may still feel difficulties in writing high quality Wikipedia articles. In this paper, we use RDF graphs to model the key elements in Wikipedia authoring, and propose an integrated solution to make Wikipedia authoring easier based on RDF graph matching, expecting making more Wikipedians. Our solution facilitates semantics reuse and provides users with: 1) a link suggestion module that suggests and auto-completes internal links between Wikipedia articles for the user; 2) a category suggestion module that helps the user place her articles in correct categories. A prototype system is implemented and experimental results show significant improvements over existing solutions to link and category suggestion tasks. The proposed enhancements can be applied to attract more contributors and relieve the burden of professional editors, thus enhancing the current Wikipedia to make it an even better Semantic Web data source.
semanticWeb web2.0 wikipedia
Soren Auer and Chris Bizer and Jens Lehmann and Georgi Kobilarov and Richard Cyganiak and Zachary Ives DBpedia: A Nucleus for a Web of Open Data Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC2007), Busan, South Korea, 4825: 715--728, 2007. 2007 [111]
DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against datasets derived from Wikipedia and to link other datasets on the Web to Wikipedia data. We describe the extraction of the DBpedia datasets, and how the resulting information can be made available on the Web for humans and machines. We describe some emerging applications from the DBpedia community and show how website operators can reduce costs by facilitating royalty-free DBpedia content within their sites. Finally, we present the current status of interlinking DBpedia with other open datasets on the Web and outline how DBpedia could serve as a nucleus for an emerging Web of open data sources.
information retrieval mashup semantic Web wikipedia
Simone P. Ponzetto and Michael Strube An API for Measuring the Relatedness of Words in Wikipedia Companion Volume to the Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics: 23--30, 2007. 2007 [112]
We present an API for computing the semantic relatedness of words in Wikipedia.
api, relatedness semantic_web, sematic, wikipedia
Ponzetto, Simone P. and Strube, Michael Deriving a Large Scale Taxonomy from Wikipedia Proceedings of the 22nd National Conference on Artificial Intelligence, Vancouver, B.C., 22-26 July 2007 [113]
We take the category system in Wikipedia as a conceptual network. We label the semantic relations between categories using methods based on connectivity in the network and lexico-syntactic matching. As a result we are able to derive a large scale taxonomy containing a large amount of subsumption, i.e. isa, relations. We evaluate the quality of the created resource by comparing it with ResearchCyc, one of the largest manually annotated ontologies, as well as computing semantic similarity between words in benchmarking datasets.
api, relatedness semantic web, sematic, wikipedia
Simone Paolo Ponzetto Creating a Knowledge Base from a Collaboratively Generated Encyclopedia Proceedings of the NAACL-HLT 2007 Doctoral Consortium, pp 9-12, Rochester, NY, April 2007 2007 [114]
We present our work on using Wikipedia as a knowledge source for Natural Language Processing. We first describe our previous work on computing semantic relatedness from Wikipedia, and its application to a machine learning based coreference resolution system. Our results suggest that Wikipedia represents a semantic resource to be treasured for NLP applications, and accordingly present the work directions to be explored in the future.
Ralf Schenkel, Fabian Suchanek and Gjergji Kasneci YAWN: A Semantically Annotated Wikipedia XML Corpus BTW2007 2007 [115]
The paper presents YAWN, a system to convert the well-known and widely used Wikipedia collection into an XML corpus with semantically rich, self-explaining tags. We introduce algorithms to annotate pages and links with concepts from the WordNet thesaurus. This annotation process exploits categorical information in Wikipedia, which is a high-quality, manually assigned source of information, extracts additional information from lists, and utilizes the invocations of templates with named parameters. We give examples how such annotations can be exploited for high-precision queries.
Hugo Zaragoza, Henning Rode, Peter Mika, Jordi Atserias, Massimiliano Ciaramita & Giuseppe Attardi Ranking Very Many Typed Entities on Wikipedia CIKM '07: Proceedings of the Sixteenth ACM International Conference on Information and Knowledge Management 2007 [116]
We discuss the problem of ranking very many entities of different types. In particular we deal with a heterogeneous set of types, some being very generic and some very specific. We discuss two approaches for this problem: i) exploiting the entity containment graph and ii) using a Web search engine to compute entity relevance. We evaluate these approaches on the real task of ranking Wikipedia entities typed with a state-of-the-art named-entity tagger. Results show that both approaches can greatly increase the performance of methods based only on passage retrieval.
Soren Auer and Jens Lehmann What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content Proceedings of 4th European Semantic Web Conference; published in The Semantic Web: Research and Applications, pages 503-517 2007 [117]
Wikis are established means for the collaborative authoring, versioning and publishing of textual articles. The Wikipedia project, for example, succeeded in creating the by far largest encyclopedia just on the basis of a wiki. Recently, several approaches have been proposed on how to extend wikis to allow the creation of structured and semantically enriched content. However, the means for creating semantically enriched structured content are already available and are, although unconsciously, even used by Wikipedia authors. In this article, we present a method for revealing this structured content by extracting information from template instances. We suggest ways to efficiently query the vast amount of extracted information (e.g. more than 8 million RDF statements for the English Wikipedia version alone), leading to astonishing query answering possibilities (such as for the title question). We analyze the quality of the extracted content, and propose strategies for quality improvements with just minor modifications of the wiki systems being currently used.
George Bragues Wiki-Philosophizing in a Marketplace of Ideas: Evaluating Wikipedia's Entries on Seven Great Minds Social Science Research Network Working Paper Series (April 2007) 2007 [118]
A very conspicuous part of the new participatory media, Wikipedia has emerged as the Internet's leading source of all-purpose information, the volume and range of its articles far surpassing that of its traditional rival, the Encyclopedia Britannica. This has been accomplished by permitting virtually anyone to contribute, either by writing an original article or editing an existing one. With almost no entry barriers to the production of information, the result is that Wikipedia exhibits a perfectly competitive marketplace of ideas. It has often been argued that such a marketplace is the best guarantee that quality information will be generated and disseminated. We test this contention by examining Wikipedia's entries on seven top Western philosophers. These entries are evaluated against the consensus view elicited from four academic reference works in philosophy. Wikipedia's performance turns out to be decidedly mixed. Its average coverage rate of consensus topics is 52%, while the median rate is 56%. A qualitative analysis uncovered no outright errors, though there were significant omissions. The online encyclopedia's harnessing of the marketplace of ideas, though not unimpressive, fails to emerge as clearly superior to the traditional alternative of relying on individual expertise for information.
quality, wikipedia
Gang Wang and Yong Yu and Haiping Zhu PORE: Positive-Only Relation Extraction from Wikipedia Text Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference (ISWC/ASWC2007), Busan, South Korea 2007 [119]
Extracting semantic relations is of great importance for the creation of the Semantic Web content. It is of great benefit to semi-automatically extract relations from the free text of Wikipedia using the structured content readily available in it. Pattern matching methods that employ information redundancy cannot work well since there is not much redundancy information in Wikipedia, compared to the Web. Multi-class classification methods are not reasonable since no classification of relation types is available in Wikipedia. In this paper, we propose PORE (Positive-Only Relation Extraction), for relation extraction from Wikipedia text. The core algorithm B-POL extends a state-of-the-art positive-only learning algorithm using bootstrapping, strong negative identification, and transductive inference to work with fewer positive training examples. We conducted experiments on several relations with different amount of training data. The experimental results show that B-POL can work effectively given only a small amount of positive training examples and it significantly outperforms the original positive learning approaches and a multi-class SVM. Furthermore, although PORE is applied in the context of Wikipedia, the core algorithm B-POL is a general approach for Ontology Population and can be adapted to other domains.
annotation iswc, knowledge-extraction nlp semantic-web text-mining wikipedia
Fei Wu, Daniel S. Weld Autonomously semantifying wikipedia Proceedings of the sixteenth ACM conference on Conference on information and knowledge management 2007 [120] CIKM-07: Best paper award, The Intelligence in Wikipedia Project at University of Washington
Berners-Lee's compelling vision of a Semantic Web is hindered by a chicken-and-egg problem, which can be best solved by a bootstrapping method - creating enough structured data to motivate the development of applications. This paper argues that autonomously "Semantifying Wikipedia" is the best way to solve the problem. We choose Wikipedia as an initial data source, because it is comprehensive, not too large, high-quality, and contains enough manually-derived structure to bootstrap an autonomous, self-supervised process. We identify several types of structures which can be automatically enhanced in Wikipedia (e.g., link structure, taxonomic data, infoboxes, etc.), and we describe a prototype implementation of a self-supervised, machine learning system which realizes our vision. Preliminary experiments demonstrate the high precision of our system's extracted data - in one case equaling that of humans.
Information Extraction, Wikipedia, Semantic Web
Viegas, Fernanda The Visual Side of Wikipedia System Sciences, 2007. HICSS 2007. 40th Annual Hawaii International Conference on 2007 [121] HICSS '07: Best paper honorable mention
The name “Wikipedia” has been associated with terms such as collaboration, volunteers, reliability, vandalism, and edit-war. Fewer people might think of “images,” “maps,” “diagrams,” “illustrations” in this context. This paper presents the burgeoning but underexplored visual side of the online encyclopedia. A survey conducted with image contributors to Wikipedia reveals key differences in collaborating around images as opposed to text. The results suggest that, even though image editing is a more isolated activity, somewhat shielded from vandalism, the sense of community is an important motivation for image contributors. By examining how contributors are appropriating text-oriented wiki technology to support collective editing around visual materials, this paper reveals the potential and some of the limitations of wikis in the realm of visual collaboration.
Sean Hansen, Nicholas Berente, Kalle Lyytinen Wikipedia as Rational Discourse: An Illustration of the Emancipatory Potential of Information Systems Proceedings of Hawaiian International Conference of Systems Sciences, Big Island, Hawaii 2007 [122] HICSS '07: Best paper award
Critical social theorists often emphasize the control and surveillance aspects of information systems, building upon a characterization of information technology as a tool for increased rationalization. The emancipatory potential of information systems is often overlooked. In this paper, we apply the Habermasian ideal of rational discourse to Wikipedia as an illustration of the emancipatory potential of information systems. We conclude that Wikipedia does embody an approximation of rational discourse, while several challenges remain.
Fissaha Adafre, Sisay, Jijkoun, Valentin, de Rijke, Maarten Fact Discovery in Wikipedia Web Intelligence, IEEE/WIC/ACM International Conference on 2007 [123]
We address the task of extracting focused salient information items, relevant and important for a given topic, from a large encyclopedic resource. Specifically, for a given topic (a Wikipedia article) we identify snippets from other articles in Wikipedia that contain important information for the topic of the original article, without duplicates. We compare several methods for addressing the task, and find that a mixture of content-based, link-based, and layout-based features outperforms other methods, especially in combination with the use of so-called reference corpora that capture the key properties of entities of a common type.
nlp, relatedness, semantic, wikipedia
Li, Bing; Chen, Qing-Cai; Yeung, Daniel S.; Ng, Wing W.Y.; Wang, Xiao-Long Exploring Wikipedia and Query Log's Ability for Text Feature Representation Machine Learning and Cybernetics, 2007 International Conference on 2007 [124]
The rapid increase of internet technology requires a better management of web page contents. Many text mining researches has been conducted, like text categorization, information retrieval, text clustering. When machine learning methods or statistical models are applied to such a large scale of data, the first step we have to solve is to represent a text document into the way that computers could handle. Traditionally, single words are always employed as features in Vector Space Model, which make up the feature space for all text documents. The single-word based representation is based on the word independence and doesn't consider their relations, which may cause information missing. This paper proposes Wiki-Query segmented features to text classification, in hopes of better using the text information. The experiment results show that a much better F1 value has been achieved than that of classical single-word based text representation. This means that Wikipedia and query segmented feature could better represent a text document.
Wei Che Huang, Andrew Trotman, and Shlomo Geva Collaborative Knowledge Management: Evaluation of Automated Link Discovery in the Wikipedia SIGIR 2007 Workshop on Focused Retrieval, July 27, 2007, Amsterdam 2007 [125]
Using the Wikipedia as a corpus, the Link-the-Wiki track, launched by INEX in 2007, aims at producing a standard procedure and metrics for the evaluation of (automated) link discovery at different element levels. In this paper, we describe the preliminary procedure for the assessment, including the topic selection, submission, pooling and evaluation. Related techniques are also presented such as the proposed DTD, submission format, XML element retrieval and the concept of Best Entry Points (BEPs). Due to the task required by LTW, it represents a considerable evaluation challenge. We propose a preliminary procedure of assessment for this stage of the LTW and also discuss the further issues for improvement. Finally, an efficiency measurement is introduced for investigation since the LTW task involves two studies: the selection of document elements that represent the topic of request and the nomination of associated links that can access different levels of the XML document.
Wikipedia, Link-the-Wiki, INEX, Evaluation, DTD, Best Entry Point
Morten Rask The Richness and Reach of Wikinomics: Is the Free Web-Based Encyclopedia Wikipedia Only for the Rich Countries? Proceedings of the Joint Conference of The International Society of Marketing Development and the Macromarketing Society, June 2-5, 2007 2007 [126]
In this paper, a model of the patterns of correlation in Wikipedia, reach and richness, lays the foundation for studying whether or not the free web-based encyclopedia Wikipedia is only for developed countries. Wikipedia is used in this paper, as an illustrative case study for the enormous rise of the so-called Web 2.0 applications, a subject which has become associated with many golden promises: Instead of being at the outskirts of the global economy, the development of free or low-cost internet-based content and applications, makes it possible for poor, emerging, and transition countries to compete and collaborate on the same level as developed countries. Based upon data from 12 different Wikipedia language editions, we find that the central structural effect is on the level of human development in the current country. In other words, Wikipedia is in general, more for rich countries than for less developed countries. It is suggested that policy makers make investments in increasing the general level of literacy, education, and standard of living in their country. The main managerial implication for businesses, that will expand their social network applications to other countries, is to use the model of the patterns of correlation in Wikipedia, reach and richness, as a market screening and monitoring model.
Digital divide, Developing countries, Internet, Web 2.0, Social networks, Reach and richness, Wikipedia, Wikinomics, culture, language
Kotaro Nakayama, Takahiro Hara, Shojiro Nishio A Thesaurus Construction Method from Large Scale Web Dictionaries 21st IEEE International Conference on Advanced Information Networking and Applications (AINA) 2007 [127] Wikipedia-Lab work
Web-based dictionaries, such as Wikipedia, have become dramatically popular among the internet users in past several years. The important characteristic of Web-based dictionary is not only the huge amount of articles, but also hyperlinks. Hyperlinks have various information more than just providing transfer function between pages. In this paper, we propose an efficient method to analyze the link structure of Web-based dictionaries to construct an association thesaurus. We have already applied it to Wikipedia, a huge scale Web-based dictionary which has a dense link structure, as a corpus. We developed a search engine for evaluation, then conducted a number of experiments to compare our method with other traditional methods such as co-occurrence analysis.
Wikipedia Mining, Association Thesaurus, Link Structure Analysis, Link Text, Synonyms
Sergio Ferrandez, Antonio Toral, Oscar Ferrandez, Antonio Ferrandez and Rafael Munoz Applying Wikipedia’s Multilingual Knowledge to Cross-Lingual Question Answering Lecture Notes in Computer Science 2007 [128]
The application of the multilingual knowledge encoded in Wikipedia to an open-domain Cross-Lingual Question Answering system based on the Inter Lingual Index (ILI) module of EuroWordNet is proposed and evaluated. This strategy overcomes the problems due to ILI’s low coverage on proper nouns (Named Entities). Moreover, as these are open class words (highly changing), using a community-based up-to-date resource avoids the tedious maintenance of hand-coded bilingual dictionaries. A study reveals the importance to translate Named Entities in CL-QA and the advantages of relying on Wikipedia over ILI for doing this. Tests on questions from the Cross-Language Evaluation Forum (CLEF) justify our approach (20% of these are correctly answered thanks to Wikipedia’s Multilingual Knowledge).
G Urdaneta, G Pierre, M van Steen A Decentralized Wiki Engine for Collaborative Wikipedia Hosting 3rd International Conference on Web Information Systems and Technology (WEBIST), March 2007 2007 [129]
This paper presents the design of a decentralized system for hosting large-scale wiki web sites like Wikipedia, using a collaborative approach. Our design focuses on distributing the pages that compose the wiki across a network of nodes provided by individuals and organizations willing to collaborate in hosting the wiki. We present algorithms for placing the pages so that the capacity of the nodes is not exceeded and the load is balanced, and algorithms for routing client requests to the appropriate nodes. We also address fault tolerance and security issues.
M Hu, EP Lim, A Sun, HW Lauw, BQ Vuong Measuring article quality in wikipedia: models and evaluation Proceedings of the sixteenth ACM conference on Conference on information and knowledge management 2007 [130]
Wikipedia has grown to be the world's largest and busiest free encyclopedia, in which articles are collaboratively written and maintained by volunteers online. Despite its success as a means of knowledge sharing and collaboration, the public has never stopped criticizing the quality of Wikipedia articles edited by non-experts and inexperienced contributors. In this paper, we investigate the problem of assessing the quality of articles in collaborative authoring of Wikipedia. We propose three article quality measurement models that make use of the interaction data between articles and their contributors derived from the article edit history. Our Basic model is designed based on the mutual dependency between article quality and their author authority. The Peer Review model introduces the review behavior into measuring article quality. Finally, our Prob Review models extend Peer Review with partial reviewership of contributors as they edit various portions of the articles. We conduct experiments on a set of well-labeled Wikipedia articles to evaluate the effectiveness of our quality measurement models in resembling human judgement.
article quality, authority, collaborative authoring, peer review, wikipedia
Rodrigo B. Almeida, Barzan Mozafari, Junghoo Cho On the Evolution of Wikipedia Proc. of the Int. Conf. on Weblogs and Social Media, 2007 2007 [131]
A recent phenomenon on the Web is the emergence and proliferation of new social media systems allowing social interaction between people. One of the most popular of these systems is Wikipedia that allows users to create content in a collaborative way. Despite its current popularity, not much is known about how users interact with Wikipedia and how it has evolved over time. In this paper we aim to provide a first, extensive study of the user behavior on Wikipedia and its evolution. Compared to prior studies, our work differs in several ways. First, previous studies on the analysis of the user workloads (for systems such as peer-to-peer systems [10] and Web servers [2]) have mainly focused on understanding the users who are accessing information. In contrast, Wikipedia provides us with the opportunity to understand how users create and maintain information since it provides the complete evolution history of its content. Second, the main focus of prior studies is evaluating the implication of the user workloads on the system performance, while our study is trying to understand the evolution of the data corpus and the user behavior themselves. Our main findings include that (1) the evolution and updates of Wikipedia is governed by a self-similar process, not by the Poisson process that has been observed for the general Web [4, 6] and (2) the exponential growth of Wikipedia is mainly driven by its rapidly increasing user base, indicating the importance of its open editorial policy for its current success. We also find that (3) the number of updates made to the Wikipedia articles exhibit a power-law distribution, but the distribution is less skewed than those obtained from other studies.
Wikipedia, user behavior, social systems
David Milne Computing Semantic Relatedness using Wikipedia Link Structure Proc. of NZCSRSC, 2007 2007 [132]
This paper describes a new technique for obtaining measures of semantic relatedness. Like other recent approaches, it uses Wikipedia to provide a vast amount of structured world knowledge about the terms of interest. Our system, the Wikipedia Link Vector Model or WLVM, is unique in that it does so using only the hyperlink structure of Wikipedia rather than its full textual content. To evaluate the algorithm we use a large, widely used test set of manually defined measures of semantic relatedness as our bench-mark. This allows direct comparison of our system with other similar techniques.
Wikipedia, Data Mining, Semantic Relatedness
Dat P.T. Nguyen, Yutaka Matsuo and Mitsuru Ishizuka Relation Extraction from Wikipedia Using Subtree Mining AAAI ‘07 2007 [133]
The exponential growth and reliability of Wikipedia have made it a promising data source for intelligent systems. The first challenge of Wikipedia is to make the encyclopedia machine-processable. In this study, we address the problem of extracting relations among entities from Wikipedia’s English articles, which in turn can serve for intelligent systems to satisfy users’ information needs. Our proposed method first anchors the appearance of entities in Wikipedia articles using some heuristic rules that are supported by their encyclopedic style. Therefore, it uses neither the Named Entity Recognizer (NER) nor the Coreference Resolution tool, which are sources of errors for relation extraction. It then classifies the relationships among entity pairs using SVM with features extracted from the web structure and subtrees mined from the syntactic structure of text. The innovations behind our work are the following: a) our method makes use of Wikipedia characteristics for entity allocation and entity classification, which are essential for relation extraction; b) our algorithm extracts a core tree, which accurately reflects a relationship between a given entity pair, and subsequently identifies key features with respect to the relationship from the core tree. We demonstrate the effectiveness of our approach through evaluation of manually annotated data from actual Wikipedia articles.
David Milne, Ian H. Witten and David M. Nichols A Knowledge-Based Search Engine Powered by Wikipedia CIKM ‘07 2007 [134]
Information Retrieval, Query Expansion, Wikipedia, Data Mining, Thesauri.
Torsten Zesch, Iryna Gurevych, Max Muhlhauser Comparing Wikipedia and German Wordnet by Evaluating Semantic Relatedness on Multiple Datasets. Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT) 2007 [135]
We evaluate semantic relatedness measures on different German datasets showing that their performance depends on: (i) the definition of relatedness that was underlying the construction of the evaluation dataset, and (ii) the knowledge source used for computing semantic relatedness. We analyze how the underlying knowledge source influences the performance of a measure. Finally, we investigate the combination of wordnets and Wikipedia to improve the performance of semantic relatedness measures.
relatedness, WordNet
Jun'ichi Kazama and Kentaro Torisawa Exploiting Wikipedia as External Knowledge for Named Entity Recognition Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 698-707, 2007. 2007 [136]
We explore the use of Wikipedia as external knowledge to improve named entity recognition (NER). Our method retrieves the corresponding Wikipedia entry for each candidate word sequence and extracts a category label from the first sentence of the entry, which can be thought of as a definition part. These category labels are used as features in a CRF-based NE tagger. We demonstrate using the CoNLL 2003 dataset that the Wikipedia category labels extracted by such a simple method actually improve the accuracy of NER.
named-entities wikipedia
D. P. T. Nguyen and Y. Matsuo and M. Ishizuka Exploiting Syntactic and Semantic Information for Relation Extraction from Wikipedia IJCAI Workshop on Text-Mining & Link-Analysis (TextLink 2007), 2007. 2007 [137]
The exponential growth of Wikipedia has recently attracted the attention of a large number of researchers and practitioners. However, one of the current challenges on Wikipedia is to make the encyclopedia processable for machines. In this paper, we deal with the problem of extracting relations between entities from Wikipedia’s English articles, which can straightforwardly be transformed into Semantic Web meta data. We propose a novel method to exploit syntactic and semantic information for relation extraction. We mine frequent subsequences from the path between an entity pair in the syntactic and semantic structure in order to explore key patterns reflecting the relationship between the pair. In addition, our method can utilize the nature of Wikipedia to automatically obtain training data. The preliminary results of our experiments strongly support our hypothesis that analyzing language at a higher level is better for relation extraction on Wikipedia and show that our method is promising for text understanding.
knowledge-extraction wikipedia
J. A. Thom and J. Pehcevski and A.-M. Vercoustre Use of Wikipedia Categories in Entity Ranking Proceedings of the 12th Australasian Document Computing Symposium, Melbourne, Australia, 2007. 2007 [138]
Wikipedia is a useful source of knowledge that has many applications in language processing and knowledge representation. The Wikipedia category graph can be compared with the class hierarchy in an ontology; it has some characteristics in common as well as some differences. In this paper, we present our approach for answering entity ranking queries from the Wikipedia. In particular, we explore how to make use of Wikipedia categories to improve entity ranking effectiveness. Our experiments show that using categories of example entities works significantly better than using loosely defined target categories.
named-entities wikipedia
S. Cucerzan Large-Scale Named Entity Disambiguation Based on Wikipedia Data EMNLP 2007: Empirical Methods in Natural Language Processing, June 28-30, 2007, Prague, Czech Republic, 2007. 2007 [139]
This paper presents a large-scale system for the recognition and semantic disambiguation of named entities based on information extracted from a large encyclopedic collection and Web search results. It describes in detail the disambiguation paradigm employed and the information extraction process from Wikipedia. Through a process of maximizing the agreement between the contextual information extracted from Wikipedia and the context of a document, as well as the agreement among the category tags associated with the candidate entities, the implemented system shows high disambiguation accuracy on both news stories and Wikipedia articles.
named-entities wikipedia
Anne-Marie Vercoustre and Jovan Pehcevski and James A. Thom Using Wikipedia Categories and Links in Entity Ranking Pre-proceedings of the sixth International Workshop of the Initiative for the Evaluation of XML Retrieval (INEX 2007), 2007. 2007 [140]
This paper describes the participation of the INRIA group in the INEX 2007 XML entity ranking and ad hoc tracks. We developed a system for ranking Wikipedia entities in answer to a query. Our approach utilises the known categories, the link structure of Wikipedia, as well as the link co-occurrences with the examples (when provided) to improve the effectiveness of entity ranking. Our experiments on the training data set demonstrate that the use of categories and the link structure of Wikipedia, together with entity examples, can significantly improve entity retrieval effectiveness. We also use our system for the ad hoc tasks by inferring target categories from the title of the query. The results were worse than when using a full-text search engine, which confirms our hypothesis that ad hoc retrieval and entity retrieval are two different tasks.
information-retrieval link-mining wikipedia
Kotaro Nakayama and Takahiro Hara and Shojiro Nishio Wikipedia Mining for an Association Web Thesaurus Construction Web Information Systems Engineering (WISE) 2007 France 2007 [141] Wikipedia-Lab work
Wikipedia has become a huge phenomenon on the WWW. As a corpus for knowledge extraction, it has various impressive characteristics such as a huge amount of articles, live updates, a dense link structure, brief link texts and URL identification for concepts. In this paper, we propose an efficient link mining method pfibf (Path Frequency - Inversed Backward link Frequency) and the extension method “forward / backward link weighting (FB weighting)” in order to construct a huge scale association thesaurus. We proved the effectiveness of our proposed methods compared with other conventional methods such as cooccurrence analysis and TF-IDF.
dblp, thesaurus wikipedia
Klaus Stein, Claudia Hess Does it matter who contributes: a study on featured articles in the German wikipedia Proceedings of the 18th conference on Hypertext and hypermedia 2007 [142]
The considerable high quality of Wikipedia articles is often accredited to the large number of users who contribute to Wikipedia's encyclopedia articles, who watch articles and correct errors immediately. In this paper, we are in particular interested in a certain type of Wikipedia articles, namely, the featured articles - articles marked by a community's vote as being of outstanding quality. The German Wikipedia has the nice property that it has two types of featured articles: excellent and worth reading. We explore on the German Wikipedia whether only the mere number of contributors makes the difference or whether the high quality of featured articles results from having experienced authors contributing with a reputation for high quality contributions. Our results indicate that it does matter who contributes.
Wikipedia, collaborative working, measures of quality and reputation, statistical analysis of Wikipedia, wiki
Patrick AS Sinclair, Kirk Martinez, Paul H Lewis Dynamic link service 2.0: using wikipedia as a linkbase Proceedings of the 18th conference on Hypertext and hypermedia 2007 [143]
This paper describes how a Web 2.0 mashup approach, reusing technologies and services freely available on the web, has enabled the development of a dynamic link service system that uses Wikipedia as its linkbase.
dynamic link service, wikipedia
Tunsch, Thomas Museen und Wikipedia Gesellschaft zur Forderung angewandter Informatik, EVA Conferences International (eds). EVA 2007 Berlin, die 14. Berliner Veranstaltung der Internationalen EVA-Serie Electronic Imaging & the Visual Arts. Berlin: Gesellschaft zur Forderung angewandter Informatik, EVA Conferences International. (7th-9th Nov 2007). 87. 15-21 2007 [144] German
Suchanek Fabian M., Gjergji Kasneci, Gerhard Weikum YAGO: A Core of Semantic Knowledge Unifying WordNet and Wikipedia Proceedings of the 16th international conference on World Wide Web 2007 [145]
We present YAGO, a light-weight and extensible ontology with high coverage and quality. YAGO builds on entities and relations and currently contains more than 1 million entities and 5 million facts. This includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as HASWONPRIZE). The facts have been automatically extracted from Wikipedia and unified with WordNet, using a carefully designed combination of rule-based and heuristic methods described in this paper. The resulting knowledge base is a major step beyond WordNet: in quality by adding knowledge about individuals like persons, organizations, products, etc. with their semantic relationships - and in quantity by increasing the number of facts by more than an order of magnitude. Our empirical evaluation of fact correctness shows an accuracy of about 95%. YAGO is based on a logically clean model, which is decidable, extensible, and compatible with RDFS. Finally, we show how YAGO can be further extended by state-of-the-art information extraction techniques.
Andras Csomai and Rada Mihalcea Wikify! Linking Educational Materials to Encyclopedic Knowledge Proceedings of the International Conference on Artificial Intelligence in Education (AIED 2007), 2007. 2007 [146]
This paper describes a system that automatically links study materials to encyclopedic knowledge, and shows how the availability of such knowledge within easy reach of the learner can improve both the quality of the knowledge acquired and the time needed to obtain such knowledge.
E-NLP WSD keywords significance_testing terminology wikipedia
Rainer Hammwohner Semantic Wikipedia - Checking the Premises The Social Semantic Web 2007 - Proceedings of the 1st Conference on Social Semantic Web, 2007. 2007 [147]
Enhancing Wikipedia by means of semantic representations seems to be a promising issue. From a formal or technical point of view there are no major obstacles in the way. Nevertheless, a close look at Wikipedia, its structure and contents reveals that some questions have to be answered in advance. This paper will deal with these questions and present some first results based on empirical findings.
semantic, statistics, tagging, wikipedia
Torsten Zesch, Iryna Gurevych, Max Muhlhauser Analyzing and Accessing Wikipedia as a Lexical Semantic Resource. Biannual Conference of the Society for Computational Linguistics and Language Technology pp. 213-221 2007 [148]
We analyze Wikipedia as a lexical semantic resource and compare it with conventional resources, such as dictionaries, thesauri, semantic wordnets, etc. Different parts of Wikipedia record different aspects of these resources. We show that Wikipedia contains a vast amount of knowledge about, e.g., named entities, domain specific terms, and rare word senses. If Wikipedia is to be used as a lexical semantic resource in large-scale NLP tasks, efficient programmatic access to the knowledge therein is required. We review existing access mechanisms and show that they are limited with respect to performance and the provided access functions. Therefore, we introduce a general purpose, high performance Java-based Wikipedia API that overcomes these limitations.
api
Somnath Banerjee Boosting Inductive Transfer for Text Classification Using Wikipedia Sixth International Conference on Machine Learning and Applications (ICMLA) 2007 [149]
Inductive transfer is applying knowledge learned on one set of tasks to improve the performance of learning a new task. Inductive transfer is being applied in improving the generalization performance on a classification task using the models learned on some related tasks. In this paper, we show a method of making inductive transfer for text classification more effective using Wikipedia. We map the text documents of the different tasks to a feature space created using Wikipedia, thereby providing some background knowledge of the contents of the documents. It has been observed here that when the classifiers are built using the features generated from Wikipedia they become more effective in transferring knowledge. An evaluation on the daily classification task on the Reuters RCV1 corpus shows that our method can significantly improve the performance of inductive transfer. Our method was also able to successfully overcome a major obstacle observed in a recent work on a similar setting.
classification, knowledge-extraction, wikipedia
Brent Hecht, Nicole Starosielski, and Drew Dara-Abrams Generating Educational Tourism Narratives from Wikipedia Association for the Advancement of Artificial Intelligence Fall Symposium on Intelligent Narrative Technologies (AAAI-INT) 2007 [150] notes
We present a narrative theory-based approach to data mining that generates cohesive stories from a Wikipedia corpus. This approach is based on a data mining-friendly view of narrative derived from narratology, and uses a prototype mining algorithm that implements this view. Our initial test case and focus is that of field-based educational tour narrative generation, for which we have successfully implemented a proof-of-concept system called Minotour. This system operates on a client-server model, in which the server mines a Wikipedia database dump to generate narratives between any two spatial features that have associated Wikipedia articles. The server then delivers those narratives to mobile device clients.
narrative theory, data mining, educational tourism
Travis Kriplean, Ivan Beschastnikh, David W. McDonald, and Scott A. Golder Community, Consensus, Coercion, Control: CS*W or How Policy Mediates Mass Participation GROUP 2007 -- ACM Conference on Supporting Group Work. 2007 [151] How Wikipedia participants apply and interpret policies on the talk pages that accompany each encyclopedia article.
When large groups cooperate, issues of conflict and control surface because of differences in perspective. Managing such diverse views is a persistent problem in cooperative group work. The Wikipedian community has responded with an evolving body of policies that provide shared principles, processes, and strategies for collaboration. We employ a grounded approach to study a sample of active talk pages and examine how policies are employed as contributors work towards consensus. Although policies help build a stronger community, we find that ambiguities in policies give rise to power plays. This lens demonstrates that support for mass collaboration must take into account policy and power.
Wikipedia, collaborative authoring, community, policy, power
Felipe Ortega and Jesus M. Gonzalez-Barahona Quantitative Analysis of the Wikipedia Community of Users WikiSym 2007, 21-23 October. Montreal, Canada. 2007 [152] Identification of the core group of very active users who leads most of the contribution process to the English Wikipedia. It extends the proposed research methodology to other language editions as well.
Many activities of editors in Wikipedia can be traced using its database dumps, which register detailed information about every single change to every article. Several researchers have used this information to gain knowledge about the production process of articles, and about activity patterns of authors. In this analysis, we have focused on one of those previous works, by Kittur et al. First, we have followed the same methodology with more recent and comprehensive data. Then, we have extended this methodology to precisely identify which fraction of authors are producing most of the changes in Wikipedia's articles, and how the behaviour of these authors evolves over time. This enabled us not only to validate some of the previous results, but also to find new interesting evidences. We have found that the analysis of sysops is not a good method for estimating different levels of contributions, since it is dependent on the policy for electing them (which changes over time and for each language). Moreover, we have found new activity patterns classifying authors by their contributions during specific periods of time, instead of using their total number of contributions over the whole life of Wikipedia. Finally, we present a tool that automates this extended methodology, implementing a quick and complete quantitative analysis of every language edition in Wikipedia.
wikipedia
Felipe Ortega, Jesus M. Gonzalez-Barahona and Gregorio Robles The Top Ten Wikipedias: A quantitative analysis using WikiXRay ICSOFT 2007, July 2007. Barcelona, Spain 2007 [153] Presents initial quantitative results and conclusions about the content creation process in the top ten language editions of Wikipedia.
In a few years, Wikipedia has become one of the information systems with more public (both producers and consumers) of the Internet. Its system and information architecture is relatively simple, but has proven to be capable of supporting the largest and more diverse community of collaborative authorship worldwide. In this paper, we analyze in detail this community, and the contents it is producing. Using a quantitative methodology based on the analysis of the public Wikipedia databases, we describe the main characteristics of the 10 largest language editions, and the authors that work in them. The methodology (which is almost completely automated) is generic enough to be used on the rest of the editions, providing a convenient framework to develop a complete quantitative analysis of the Wikipedia. Among other parameters, we study the evolution of the number of contributions and articles, their size, and the differences in contributions by different authors, inferring some relationships between contribution patterns and content. These relationships reflect (and in part, explain) the evolution of the different language editions so far, as well as their future trends.
wikipedia
Reid Priedhorsky, Jilin Chen, Shyong (Tony) K. Lam, Katherine Panciera, Loren Terveen, John Riedl Creating, Destroying, and Restoring Value in Wikipedia Department of Computer Science and Engineering University of Minnesota 2007 [154] Introduces the notion that the impact of an edit is best measured by the number of times the edited version is viewed.
Wikipedia's brilliance and curse is that any user can edit any of the encyclopedia entries. We introduce the notion of the impact of an edit, measured by the number of times the edited version is viewed. Using several datasets, including recent logs of all article views, we show that an overwhelming majority of the viewed words were written by frequent editors and that this majority is increasing. Similarly, using the same impact measure, we show that the probability of a typical article view being damaged is small but increasing, and we present empirically grounded classes of damage. Finally, we make policy recommendations for Wikipedia and other wikis in light of these findings.
wikipedia
Somnath Banerjee, Krishnan Ramanathan, Ajay Gupta Clustering Short Texts using Wikipedia The 30th Annual International ACM SIGIR Conference 2007 [155]
Subscribers to the popular news or blog feeds (RSS/Atom) often face the problem of information overload as these feed sources usually deliver large number of items periodically. One solution to this problem could be clustering similar items in the feed reader to make the information more manageable for a user. Clustering items at the feed reader end is a challenging task as usually only a small part of the actual article is received through the feed. In this paper, we propose a method of improving the accuracy of clustering short texts by enriching their representation with additional features from Wikipedia. Empirical results indicate that this enriched representation of text items can substantially improve the clustering accuracy when compared to the conventional bag of words representation.
clustering, rss
R. Almeida, B. Mozafari, and J. Cho On the Evolution of Wikipedia Proceedings of ICWSM 2007, International Conference on Weblogs and Social Media, 2007 2007 [156]
A recent phenomenon on the Web is the emergence and proliferation of new social media systems allowing social interaction between people. One of the most popular of these systems is Wikipedia that allows users to create content in a collaborative way. Despite its current popularity, not much is known about how users interact with Wikipedia and how it has evolved over time.
In this paper we aim to provide a first, extensive study of the user behavior on Wikipedia and its evolution. Compared to prior studies, our work differs in several ways. First, previous studies on the analysis of the user workloads (for systems such as peer-to-peer systems [10] and Web servers [2]) have mainly focused on understanding the users who are accessing information. In contrast, Wikipedia provides us with the opportunity to understand how users create and maintain information since it provides the complete evolution history of its content. Second, the main focus of prior studies is evaluating the implication of the user workloads on the system performance, while our study is trying to understand the evolution of the data corpus and the user behavior themselves.
Our main findings include that (1) the evolution and updates of Wikipedia are governed by a self-similar process, not by the Poisson process that has been observed for the general Web [4, 6] and (2) the exponential growth of Wikipedia is mainly driven by its rapidly increasing user base, indicating the importance of its open editorial policy for its current success. We also find that (3) the number of updates made to the Wikipedia articles exhibits a power-law distribution, but the distribution is less skewed than those obtained from other studies.
Wikipedia, user behavior, social systems
Enric Senabre Hidalgo Stigmergy, meritocracy and vandalism in peer-production: how can wikis grow Towards a Social Science of Web 2.0 2007 [157] Have all links rotted? Abstract missing?
Adler, B. Thomas, and de Alfaro, Luca A Content-Driven Reputation System for the Wikipedia Proceedings of WWW 2007, the 16th International World Wide Web Conference, ACM Press, 2007 2007 [158]
We present a content-driven reputation system for Wikipedia authors. In our system, authors gain reputation when the edits they perform to Wikipedia articles are preserved by subsequent authors, and they lose reputation when their edits are rolled back or undone in short order. Thus, author reputation is computed solely on the basis of content evolution; user-to-user comments or ratings are not used. The author reputation we compute could be used to flag new contributions from low-reputation authors, or it could be used to allow only authors with high reputation to contribute to controversial or critical pages. A reputation system for the Wikipedia could also provide an incentive for high-quality contributions.
We have implemented the proposed system, and we have used it to analyze the entire Italian and French Wikipedias, consisting of a total of 691,551 pages and 5,587,523 revisions. Our results show that our notion of reputation has good predictive value: changes performed by low-reputation authors have a significantly larger than average probability of having poor quality, as judged by human observers, and of being later undone, as measured by our algorithms.
wikipedia
Gabrilovich, Evgeniy and Shaul Markovitch Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis. Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), Hyderabad, India, January 2007. 2007 [159]
semantic, text-mining, wikipedia
Tunsch, Thomas: Museum Documentation and Wikipedia.de: Possibilities, opportunities and advantages for scholars and museums J. Trant and D. Bearman (eds). Museums and the Web 2007: Proceedings. Toronto: Archives & Museum Informatics, published March 31, 2007 at http://www.archimuse.com/mw2007/papers/tunsch/tunsch.html 2007 [160] post-conference communication: museums.wikia.com
The importance of Wikipedia for the documentation and promotion of museum holdings is gaining acceptance, and the number of references to articles is growing. However, the museum world still pays little attention to the Wikipedia project as a collaborative community with intentions, structures, and special features. Although these observations are based on museums in Germany and focus on the German Wikipedia, they are just as important and applicable to other museums and other editions of Wikipedia. Universities and libraries have already taken advantage of the Wikipedia and have established functional links.

In that the mission of museums is closely related to that of universities and libraries, the value of Wikipedia for museum professionals is worthy of consideration. This paper provides the complete study to serve as reference for the selected topics to be discussed in the professional forum.

Wikipedia; documentation; collaborative; community; scholars; interconnections
Viegas, Fernanda, Martin Wattenberg, Jesse Kriss, Frank van Ham Talk Before You Type: Coordination in Wikipedia Proceedings of Hawaiian International Conference of Systems Sciences Big Island, Hawaii. 2007 [161]
Wikipedia, the online encyclopedia, has attracted attention both because of its popularity and its unconventional policy of letting anyone on the internet edit its articles. This paper describes the results of an empirical analysis of Wikipedia and discusses ways in which the Wikipedia community has evolved as it has grown. We contrast our findings with an earlier study [11] and present three main results. First, the community maintains a strong resilience to malicious editing, despite tremendous growth and high traffic. Second, the fastest growing areas of Wikipedia are devoted to coordination and organization. Finally, we focus on a particular set of pages used to coordinate work, the “Talk” pages. By manually coding the content of a subset of these pages, we find that these pages serve many purposes, notably supporting strategic planning of edits and enforcement of standard guidelines and conventions. Our results suggest that despite the potential for anarchy, the Wikipedia community places a strong emphasis on group coordination, policy, and process.
empirical study, visualization, wiki, wikipedia
Ollivier, Yann, and Senellart, Pierre Finding Related Pages Using Green Measures: An Illustration with Wikipedia. Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence (AAAI 2007) 2007 [162]
We introduce a new method for finding nodes semantically related to a given node in a hyperlinked graph: the Green method, based on a classical Markov chain tool. It is generic, adjustment-free and easy to implement. We test it in the case of the hyperlink structure of the English version of Wikipedia, the on-line encyclopedia. We present an extensive comparative study of the performance of our method versus several other classical methods in the case of Wikipedia. The Green method is found to have both the best average results and the best robustness.
PageRank, Markov chain, Green measure, Wikipedia
Robert P. Biuk-Aghai Visualizing Co-Authorship Networks in Online Wikipedia Proceedings of the International Symposium on Communications and Information Technologies 2006, Bangkok, Thailand, October 2006 2006 [163] Introduces using the concept of co-authorship of pages to infer relationships between those pages. These are then visualized in graph form. Also presents a novel visualization of results of Wikipedia searches.
The Wikipedia online user-contributed encyclopedia has rapidly become a highly popular and widely used online reference source. However, perceiving the complex relationships in the network of articles and other entities in Wikipedia is far from easy. We introduce the notion of using co-authorship of articles to determine relationship between articles, and present the WikiVis information visualization system which visualizes this and other types of relationships in the Wikipedia database in 3D graph form. A 3D star layout and a 3D nested cone tree layout are presented for displaying relationships between entities and between categories, respectively. A novel 3D pinboard layout is presented for displaying search results.
information visualization, co-authoring, Wikipedia, pinboard layout
Pedersen, Niels M. L. & Anders Due Wikipedia - viden som social handlen. Paper presented at the 3rd Nordic Conference on Rhetoric, May 19-20, Oslo, Norway 2006 [164] Danish
Rafaeli, Sheizaf, Ariel, Yaron and Hayat, Tsahi Wikipedians Sense of (Virtual) Community. Presented at The eighth International Conference General Online Research (GOR06): Bielefeld, Germany 2006 [165] English
Sigurbjornsson, Borkur, Kamps, Jaap, and de Rijke, Maarten Focused Access to Wikipedia Proceedings DIR-2006 2006 [166]
Wikipedia is a "free" online encyclopedia. It contains millions of entries in many languages and is growing at a fast pace. Due to its volume, search engines play an important role in giving access to the information in Wikipedia. The "free" availability of the collection makes it an attractive corpus for information retrieval experiments. In this paper we describe the evaluation of a search engine that provides focused search access to Wikipedia, i.e., a search engine which gives direct access to individual sections of Wikipedia pages. The main contributions of this paper are twofold. First, we introduce Wikipedia as a test corpus for information retrieval experiments in general and for semi-structured retrieval in particular. Second, we demonstrate that focused XML retrieval methods can be applied to a wider range of problems than searching scientific journals in XML format, including accessing reference works.
document structure, visualization, information searching
Rudiger Gleim, Alexander Mehler and Matthias Dehmer Web Corpus Mining by Instance of Wikipedia Proc. 2nd Web as Corpus Workshop at EACL 2006 2006 [167]
In this paper we present an approach to structure learning in the area of web documents. This is done in order to approach the goal of webgenre tagging in the area of web corpus linguistics. A central outcome of the paper is that purely structure oriented approaches to web document classification provide an information gain which may be utilized in combined approaches of web content and structure analysis.
Martin Hepp and Daniel Bachlechner and Katharina Siorpaes Harvesting Wiki Consensus - Using Wikipedia Entries as Ontology Elements Proceedings of the First Workshop on Semantic Wikis -- From Wiki to Semantics, co-located with the 3rd Annual European Semantic Web Conference (ESWC 2006), 2006. 2006 [168]
One major obstacle towards adding machine-readable annotation to existing Web content is the lack of domain ontologies. While FOAF and Dublin Core are popular means for expressing relationships between Web resources and between Web resources and literal values, we widely lack unique identifiers for common concepts and instances. Also, most available ontologies have a very weak community grounding in the sense that they are designed by single individuals or small groups of individuals, while the majority of potential users is not involved in the process of proposing new ontology elements or achieving consensus. This is in sharp contrast to natural language where the evolution of the vocabulary is under the control of the user community. At the same time, we can observe that, within Wiki communities, especially Wikipedia, a large number of users is able to create comprehensive domain representations in the sense of unique, machine-feasible, identifiers and concept definitions which are sufficient for humans to grasp the intension of the concepts. The English version of Wikipedia contains now more than one million entries and thus the same amount of URIs plus a human-readable description. While this collection is on the lower end of ontology expressiveness, it is likely the largest living ontology that is available today. In this paper, we (1) show that standard Wiki technology can be easily used as an ontology development environment for named classes, reducing entry barriers for the participation of users in the creation and maintenance of lightweight ontologies, (2) prove that the URIs of Wikipedia entries are surprisingly reliable identifiers for ontology concepts, and (3) demonstrate the applicability of our approach in a use case.
2006 ezweb folksonomy ontology wikipedia
Razvan Bunescu and Marius Pasca Using Encyclopedic Knowledge for Named Entity Disambiguation 11th Conference of the European Chapter of the Association for Computational Linguistics, pp. 9-16, 2006. 2006 [169] Despite not mentioning Wikipedia in title or abstract, the paper discusses it as one of the main examples.
We present a new method for detecting and disambiguating named entities in open domain text. A disambiguation SVM kernel is trained to exploit the high coverage and rich structure of the knowledge encoded in an online encyclopedia. The resulting model significantly outperforms a less informed baseline.
disambiguation named-entities wikipedia
Angela Beesley How and why Wikipedia works WikiSym '06: Proceedings of the 2006 international symposium on Wikis 2006 [170]
This talk discusses the inner workings of Wikipedia. Angela will address the roles, processes, and sociology that make up the project, with information on what happens behind the scenes and how the community builds and defends its encyclopedia on a daily basis. The talk will give some insight into why Wikipedia has worked so far and why we believe it will keep working in the future despite the many criticisms that can be made of it. It is hoped that this review inspires further Wikipedia research. For this, please also see our Wikipedia Research workshop on Wednesday, which is open to walk-ins.
Design, Theory
Simon Overell and Stefan Ruger Identifying and Grounding Descriptions of Places SIGIR Workshop on Geographic Information Retrieval, 2006 [171]
In this paper we test the hypothesis: given a piece of text describing an object or concept, our combined disambiguation method can disambiguate whether it is a place and ground it to a Getty Thesaurus of Geographical Names unique identifier with significantly more accuracy than naïve methods. We demonstrate a carefully engineered rule-based place name disambiguation system and give Wikipedia as a worked example with hand-generated ground truth and benchmark tests. This paper outlines our plans to apply the co-occurrence models generated with Wikipedia to solve the problem of disambiguating place names in text using supervised learning techniques.
Geographic Information Retrieval, Disambiguation, Wikipedia
A. Toral and R. Munoz A proposal to automatically build and maintain gazetteers for Named Entity Recognition by using Wikipedia EACL 2006, 2006. 2006 [172]
This paper describes a method to automatically create and maintain gazetteers for Named Entity Recognition (NER). This method extracts the necessary information from linguistic resources. Our approach is based on the analysis of an on-line encyclopedia's entries by using a noun hierarchy and optionally a PoS tagger. An important motivation is to reach a high level of language independence. This restricts the techniques that can be used but makes the method useful for languages with few resources. The evaluation carried out proves that this approach can be successfully used to build NER gazetteers for location (F 78%) and person (F 68%) categories.
gazetteers, named-entities wikipedia
Ofer Arazy, Wayne Morgan and Raymond Patterson Wisdom of the Crowds: Decentralized Knowledge Construction in Wikipedia 16th Annual Workshop on Information Technologies & Systems (WITS) 2006 [173]
Recently, Nature published an article comparing the quality of Wikipedia articles to those of Encyclopedia Britannica (Giles 2005). The article, which gained much public attention, provides evidence for Wikipedia quality, but does not provide an explanation of the underlying source of that quality. Wikipedia, and wikis in general, aggregate information from a large and diverse author-base, where authors are free to modify any article. Building upon Surowiecki's (2005) Wisdom of Crowds, we develop a model of the factors that determine wiki content quality. In an empirical study of Wikipedia, we find strong support for our model. Our results indicate that increasing size and diversity of the author-base improves content quality. We conclude by highlighting implications for system design and suggesting avenues for future research.
Wikipedia, Wisdom of the Crowds, Collective Intelligence, information quality
Aurelie Herbelot and Ann Copestake Acquiring Ontological Relationships from Wikipedia Using RMRS Proc. of the ISWC 2006 Workshop on Web Content Mining with Human Language Technologies, 2006. 2006 [174]
We investigate the extraction of ontologies from biological text using a semantic representation derived from a robust parser. The use of a semantic representation avoids the problems that traditional pattern-based approaches have with complex syntactic constructions and long-distance dependencies. The discovery of taxonomic relationships is explored in a corpus consisting of 12,200 animal-related articles from the online encyclopaedia Wikipedia. The semantic representation used is Robust Minimal Recursion Semantics (RMRS). Initial experiments show good results in systematising extraction across a variety of hyponymic constructions.
linguistics ontology semantic text-mining wikipedia
Zhang, Yuejiao Wiki means more: hyperreading in Wikipedia HYPERTEXT '06: Proceedings of the seventeenth conference on Hypertext and hypermedia 2006 [175]
Based on the open-sourcing technology of wiki, Wikipedia has initiated a new fashion of hyperreading. Reading Wikipedia creates an experience distinct from reading a traditional encyclopedia. In an attempt to disclose one of the site's major appeals to the Web users, this paper approaches the characteristics of hyperreading activities in Wikipedia from three perspectives. Discussions are made regarding reading path, user participation, and navigational apparatus in Wikipedia.
Hypertext, Hypermedia, Human Factors, Theory
Schonhofen, Peter Identifying Document Topics Using the Wikipedia Category Network WI '06: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence 2006 [176]
In the last few years the size and coverage of Wikipedia, a freely available on-line encyclopedia, has reached the point where it can be utilized similar to an ontology or taxonomy to identify the topics discussed in a document. In this paper we will show that even a simple algorithm that exploits only the titles and categories of Wikipedia articles can characterize documents by Wikipedia categories surprisingly well. We test the reliability of our method by predicting categories of Wikipedia articles themselves based on their bodies, and by performing classification and clustering on 20 Newsgroups and RCV1, representing documents by their Wikipedia categories instead of their texts.
Retrieval models, Algorithms
Sangweon Suh and Harry Halpin and Ewan Klein Extracting Common Sense Knowledge from Wikipedia Proc. of the ISWC 2006 Workshop on Web Content Mining with Human Language Technologies, 2006. 2006 [177]
Much of the natural language text found on the web contains various kinds of generic or “common sense” knowledge, and this information has long been recognized by artificial intelligence as an important supplement to more formal approaches to building Semantic Web knowledge bases. Consequently, we are exploring the possibility of automatically identifying “common sense” statements from unrestricted natural language text and mapping them to RDF. Our hypothesis is that common sense knowledge is often expressed in the form of generic statements such as “Coffee is a popular beverage”, and thus our work has focussed on the challenge of automatically identifying generic statements. We have been using the Wikipedia XML corpus as a rich source of common sense knowledge. For evaluation, we have been using the existing annotation of generic entities and relations in the ACE 2005 corpus.
linguistics semantic text-mining wcmhlt2006, wikipedia
Gabriel Weaver, Barbara Strickland, Gregory Crane Quantifying the accuracy of relational statements in Wikipedia: a methodology JCDL '06: Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries 2006 [178]
An initial evaluation of the English Wikipedia indicates that it may provide accurate data for disambiguating and finding relations among named entities.
Wikipedia, link analysis, named-entity recognition
David Milne and Olena Medelyan and Ian H. Witten Mining Domain-Specific Thesauri from Wikipedia: A case study ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI'06) 2006 [179]
Domain-specific thesauri are high-cost, high-maintenance, high-value knowledge structures. We show how the classic thesaurus structure of terms and links can be mined automatically from Wikipedia, a vast, open encyclopedia. In a comparison with a professional thesaurus for agriculture (Agrovoc) we find that Wikipedia contains a substantial proportion of its domain-specific concepts and semantic relations; furthermore it has impressive coverage of a collection of contemporary documents in the domain. Thesauri derived using these techniques are attractive because they capitalize on existing public efforts and tend to reflect contemporary language usage better than their costly, painstakingly-constructed manual counterparts.
datamining information-retrieval semantic text-mining wikipedia
Wissner-Gross, A. D. Preparation of Topical Reading Lists from the Link Structure of Wikipedia Advanced Learning Technologies, 2006. Sixth International Conference on (2006), pp. 825-829. 2006 [180]
Personalized reading preparation poses an important challenge for education and continuing education. Using a PageRank derivative and graph distance ordering, we show that personalized background reading lists can be generated automatically from the link structure of Wikipedia. We examine the operation of our new tool in professional, student, and interdisciplinary researcher learning models. Additionally, we present desktop and mobile interfaces for the generated reading lists.
information-retrieval, link-mining, wikipedia
Spek, Sander and Postma, Eric and Herik, Jaap van den Wikipedia: organisation from a bottom-up approach Paper presented at the Research in Wikipedia-workshop of WikiSym 2006, Odense, Denmark. 2006 [181]
Wikipedia can be considered an extreme form of a self-managing team, as a means of labour division. One could expect that this bottom-up approach, with the absence of top-down organisational control, would lead to chaos, but our analysis shows that this is not the case. In the Dutch Wikipedia, an integrated and coherent data structure is created, while at the same time users succeed in distributing roles by self-selection. Some users focus on an area of expertise, while others edit over the whole encyclopedic range. This supports our conclusion that Wikipedia, in general, is a successful example of a self-managing team.
wikipedia
S. P. Ponzetto and M. Strube Exploiting semantic role labeling, WordNet and Wikipedia for coreference resolution Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, pp. 192-199, 2006. 2006 [182]
In this paper we present an extension of a machine learning based coreference resolution system which uses features induced from different semantic knowledge sources. These features represent knowledge mined from WordNet and Wikipedia, as well as information about semantic role labels. We show that semantic features indeed improve the performance on different referring expression types such as pronouns and common nouns.
coreference, semantic wikipedia
Krotzsch, Markus, Denny Vrandecic, Max Volkel Semantic Wikipedia International World Wide Web Conference. Proceedings of the 15th international conference on World Wide Web 2006 [183] no open content found
Wikipedia is the world's largest collaboratively edited source of encyclopaedic knowledge. But in spite of its utility, its contents are barely machine-interpretable. Structural knowledge, e.g. about how concepts are interrelated, can neither be formally stated nor automatically processed. Also the wealth of numerical data is only available as plain text and thus cannot be processed by its actual meaning. We provide an extension to be integrated in Wikipedia, that allows the typing of links between articles and the specification of typed data inside the articles in an easy-to-use manner. Enabling even casual users to participate in the creation of an open semantic knowledge base, Wikipedia has the chance to become a resource of semantic statements, hitherto unknown regarding size, scope, openness, and internationalisation. These semantic enhancements bring to Wikipedia benefits of today's semantic technologies: more specific ways of searching and browsing. Also, the RDF export, that gives direct access to the formalised knowledge, opens Wikipedia up to a wide range of external applications, that will be able to use it as a background knowledge base. In this paper, we present the design, implementation, and possible uses of this extension.
Denoyer, Ludovic, Patrick Gallinari The Wikipedia XML corpus SIGIR Conference Proceedings. Volume 40, Issue 1 (June 2006). Workshop session: INEX. Pages 64-69. ISSN 0163-5840 2006 [184] no open content found
Wikipedia is a well known free content, multilingual encyclopedia written collaboratively by contributors around the world. Anybody can edit an article using a wiki markup language that offers a simplified alternative to HTML. This encyclopedia is composed of millions of articles in different languages.
Hypertext, Hypermedia, XML
Michael Strube and Simone Paolo Ponzetto WikiRelate! Computing Semantic Relatedness Using Wikipedia. 21st AAAI / 18th IAAI 2006, 2006. 2006 [185]
Wikipedia provides a knowledge base for computing word relatedness in a more structured fashion than a search engine and with more coverage than WordNet. In this work we present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google counts, and we show that Wikipedia outperforms WordNet when applied to the largest available dataset designed for that purpose. The best results on this dataset are obtained by integrating Google, WordNet and Wikipedia based measures. We also show that including Wikipedia improves the performance of an NLP application processing naturally occurring texts.
Wikipedia ontology relatedness semantic_web
Sergey Chernov and Tereza Iofciu and Wolfgang Nejdl and Xuan Zhou Extracting Semantic Relationships between Wikipedia Categories 1st Workshop on Semantic Wikis, 2006. 2006 [186]
The Wikipedia is the largest online collaborative knowledge sharing system, a free encyclopedia. Built upon traditional wiki architectures, its search capabilities are limited to title and full-text search. We suggest that semantic information can be extracted from Wikipedia by analyzing the links between categories. The results can be used for building a semantic schema for Wikipedia which could improve its search capabilities and provide contributors with meaningful suggestions for editing the Wikipedia pages. We analyze relevant measures for inferring the semantic relationships between page categories of Wikipedia. Experimental results show that Connectivity Ratio positively correlates with the semantic connection strength.
semantic wikipedia
McGuinness, Deborah L., Honglei Zeng, Paulo Pinheiro da Silva, Li Ding, Dhyanesh Narayanan, Mayukh Bhaowal Investigations into Trust for Collaborative Information Repositories: A Wikipedia Case Study Proceedings of the Workshop on Models of Trust for the Web 2006 [187]
As collaborative repositories grow in popularity and use, issues concerning the quality and trustworthiness of information grow. Some current popular repositories contain contributions from a wide variety of users, many of which will be unknown to a potential end user. Additionally the content may change rapidly and information that was previously contributed by a known user may be updated by an unknown user. End users are now faced with more challenges as they evaluate how much they may want to rely on information that was generated and updated in this manner. A trust management layer has become an important requirement for the continued growth and acceptance of collaboratively developed and maintained information resources. In this paper, we will describe our initial investigations into designing and implementing an extensible trust management layer for collaborative and/or aggregated repositories of information. We leverage our work on the Inference Web explanation infrastructure and exploit and expand the Proof Markup Language to handle a simple notion of trust. Our work is designed to support representation, computation, and visualization of trust information. We have grounded our work in the setting of Wikipedia. In this paper, we present our vision, expose motivations, relate work to date on trust representation, and present a trust computation algorithm with experimental results. We also discuss some issues encountered in our work that we found interesting.
Trust, Wikipedia, Inference Web, Proof Markup Language, Open Editing.
Gabrilovich, Evgeniy and Shaul Markovitch Overcoming the Brittleness Bottleneck using Wikipedia: Enhancing Text Categorization with Encyclopedic Knowledge. Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-06), pp. 1301-1306. 2006 [188]
When humans approach the task of text categorization, they interpret the specific wording of the document in the much larger context of their background knowledge and experience. On the other hand, state-of-the-art information retrieval systems are quite brittle -- they traditionally represent documents as bags of words, and are restricted to learning from individual word occurrences in the (necessarily limited) training set. For instance, given the sentence “Wal-Mart supply chain goes real time”, how can a text categorization system know that Wal-Mart manages its stock with RFID technology? And having read that “Ciprofloxacin belongs to the quinolones group”, how on earth can a machine know that the drug mentioned is an antibiotic produced by Bayer? In this paper we present algorithms that can do just that. We propose to enrich document representation through automatic use of a vast compendium of human knowledge -- an encyclopedia. We apply machine learning techniques to Wikipedia, the largest encyclopedia to date, which surpasses in scope many conventional encyclopedias and provides a cornucopia of world knowledge. Each Wikipedia article represents a concept, and documents to be categorized are represented in the rich feature space of words and relevant Wikipedia concepts. Empirical results confirm that this knowledge-intensive representation brings text categorization to a qualitatively new level of performance across a diverse collection of datasets.
information-retrieval, text-mining, wikipedia
Grassineau, Benjamin Wikipedia et le relativisme democratique OMNSH 2006 [189] French
Krizhanovsky, Andrew Synonym search in Wikipedia: Synarcher. 11-th International Conference "Speech and Computer" SPECOM'2006. Russia, St. Petersburg, June 25-29, pp. 474-477 2006 [190]
The program Synarcher for synonym (and related terms) search in the text corpus of special structure (Wikipedia) was developed. The results of the search are presented in the form of graph. It is possible to explore the graph and search for graph elements interactively. Adapted HITS algorithm for synonym search, program architecture, and program work evaluation with test examples are presented in the paper. The proposed algorithm can be applied to a query expansion by synonyms (in a search engine) and a synonym dictionary forming.
HITS, Semantic relatedness
Fissaha Adafre, Sisay and de Rijke, Maarten Finding Similar Sentences across Multiple Languages in Wikipedia EACL 2006 Workshop on New Text - Wikis and Blogs and Other Dynamic Text Sources 2006 [191]
We investigate whether the Wikipedia corpus is amenable to multilingual analysis that aims at generating parallel corpora. We present the results of the application of two simple heuristics for the identification of similar text across multiple languages in Wikipedia. Despite the simplicity of the methods, evaluation carried out on a sample of Wikipedia pages shows encouraging results.
nlp, wikipedia
Fissaha Adafre, Sisay and de Rijke, Maarten Exploratory Search in Wikipedia Proceedings SIGIR 2006 workshop on Evaluating Exploratory Search Systems (EESS) 2006 [192]
We motivate the need for studying the search, discovery and retrieval requirements of Wikipedia users. Based on a sample from an experimental Wikipedia search engine, we hypothesize that the fraction of Wikipedia searches that are exploratory in nature is at least the same as that of general web searches. We also describe a questionnaire for eliciting search, discovery and retrieval requirements from Wikipedia users.
Wikipedia, interfaces, exploratory search
Forte, Andrea, Amy Bruckman From Wikipedia to the classroom: exploring online publication and learning International Conference on Learning Sciences. Proceedings of the 7th international conference on Learning sciences 2006 [193]
Wikipedia represents an intriguing new publishing paradigm. Can it be used to engage students in authentic collaborative writing activities? How can we design wiki publishing tools and curricula to support learning among student authors? We suggest that wiki publishing environments can create learning opportunities that address four dimensions of authenticity: personal, real world, disciplinary, and assessment. We have begun a series of design studies to investigate links between wiki publishing experiences and writing-to-learn. The results of an initial study in an undergraduate government course indicate that perceived audience plays an important role in helping students monitor the quality of writing; however, students’ perception of audience on the Internet is not straightforward. This preliminary iteration resulted in several guidelines that are shaping efforts to design and implement new wiki publishing tools and curricula for students and teachers.
wikipedia, teaching
Maria R. Casado and Enrique Alfonseca and Pablo Castells From Wikipedia to Semantic Annotations: automatic relationship extraction 1st Workshop on Semantic Wikis, 2006. 2006 [194] all links have rotted? annotation semantic text-mining wikipedia
Buriol L.S., Castillo C., Donato D., Leonardi S., Millozzi S. Temporal Analysis of the Wikigraph. Proceedings of the Web Intelligence Conference (WI), Hong Kong 2006. Published by IEEE CS Press. 2006 [195]
Wikipedia (www.wikipedia.org) is an online encyclopedia, available in more than 100 languages and comprising over 1 million articles in its English version. If we consider each Wikipedia article as a node and each hyperlink between articles as an arc we have a “Wikigraph”, a graph that represents the link structure of Wikipedia. The Wikigraph differs from other Web graphs studied in the literature by the fact that there are timestamps associated with each node. The timestamps indicate the creation and update dates of each page, and this allows us to do a detailed analysis of the Wikipedia evolution over time. In the first part of this study we characterize this evolution in terms of users, editions and articles; in the second part, we depict the temporal evolution of several topological properties of the Wikigraph. The insights obtained from the Wikigraphs can be applied to large Web graphs from which the temporal data is usually not available.
analysis, wiki
Caldarelli, Guido; Capocci, Andrea; Servedio, Vito; Buriol, Luciana; Donato, Debora; Leonardi, Stefano Preferential attachment in the growth of social networks: the case of Wikipedia American Physical Society. APS March Meeting, March 13-17, 2006 2006 [196]
Here we present experimental data and a model in order to describe the evolution of a socio-technological system. The case of study presented is that of the online free encyclopedia Wikipedia, for which we have the complete series of pages addition during time. The various entries and the hyperlinks between them can be described as a graph. We find scale-invariant behaviour in the distribution of the degree and a topology similar to that of the World Wide Web. By using the information on dynamics we are able to model and reproduce the features of this system. We also find that regardless the fact that any user has the possibility of global reshape, still Wikipedia has a growth described by local rules as that of the preferential attachment.
link mining, small world, web, wikipedia
Mehler, Alexander Text Linkage in the Wiki Medium - A Comparative Study Proceedings of the EACL 2006 Workshop on New Text - Wikis and blogs and other dynamic text sources, Trento, Italy, April 3-7, pp. 1-8 2006 [198] Despite not mentioning Wikipedia in title or abstract, the paper discusses it as one of the main examples.
We analyze four different types of document networks with respect to their small world characteristics. These characteristics allow distinguishing wiki-based systems from citation and more traditional text-based networks augmented by hyperlinks. The study provides evidence that a more appropriate network model is needed which better reflects the specifics of wiki systems. It puts emphasis on their topological differences as a result of wiki-related linking compared to other text-based networks.
wikipedia
Mainguy Gaell Wikipedia and science publishing. Has the time come to end the liaisons dangereuses? paper presented at the 3rd NATO-UNESCO Advanced Research Workshop Science Education: Talent Recruitment and Public Understanding. Balatonfured, Hungary, 20-22 October 2006 2006 [199]
Structuring information into knowledge is an important challenge for the 21st century. The emergence of internet and the diffusion of collaborative practices provide new tools with which to build and share knowledge. Scientists are seeking efficient ways to get recognition and to diffuse their work while Wikipedia is seeking well grounded contributors to shape in-depth articles. Science publishing and Wikipedia are thus profoundly modifying access to knowledge and may provide suitable conditions for a reorganization of the academic landscape.
Science publishing, Wikipedia, open access, knowledge management
Ma, Cathy The Social, Cultural, Economical Implications of the Wikipedia Paper submitted to Computers and Writing Online 2005 2005 [200]
Wikipedia is a non-profit online project that aims at building an encyclopedia for everyone. It has attracted thousands of users to contribute and collaborate on a voluntary base. In this paper I argue that Wikipedia poses a new model of collaboration founded on three assumptions: trust, openness and reduced barrier of participation, as opposed to more conventional models of collaboration based on authority and hierarchy. With this new-found social structure in mind, the cultural implications of the Wikipedia will be discussed in relation to the notion of Commons-Based Peer Production (CBPP) as proposed by Benkler in 2002, concluded with an analysis of the challenges that are facing the Wikipedia project, the problem of credibility building and vandalism control.
Denise Anthony, Sean Smith, & Tim Williamson Explaining Quality in Internet Collective Goods: Zealots and Good Samaritans in the Case of Wikipedia Fall 2005 Innovation & Entrepreneurship Seminar at MIT 2005 [201]
One important innovation in information and communication technology developed over the past decade was organizational rather than merely technological. Open source production is remarkable because it converts a private commodity (typically software) into a public good. A number of studies examine the factors motivating contributions to open source production goods, but we argue it is important to understand the causes of high quality contributions to such goods. In this paper, we analyze quality in the open source online encyclopedia Wikipedia. We find that, for users who create an online persona through a registered user name, the quality of contributions increases as the number of contributions increases, consistent with the idea of experts motivated by reputation and committed to the Wikipedia community. Unexpectedly, however, we find the highest quality contributions come from the vast numbers of anonymous “Good Samaritans” who contribute infrequently. Our findings that Good Samaritans as well as committed “Zealots” contribute high quality content to Wikipedia suggest that open source production is remarkable as much for its organizational as its technological innovation that enables vast numbers of anonymous one-time contributors to create high quality, essentially public goods.
Stvilia, B., Twidale, M. B., Gasser, L., Smith, L. C. Information quality in a community-based encyclopedia Knowledge Management: Nurturing Culture, Innovation, and Technology - Proceedings of the 2005 International Conference on Knowledge Management (pp. 101-113) 2005 [202]
We examine the Information Quality aspects of Wikipedia. By a study of the discussion pages and other process-oriented pages within the Wikipedia project, it is possible to determine the information quality dimensions that participants in the editing process care about, how they talk about them, what tradeoffs they make between these dimensions and how the quality assessment and improvement process operates. This analysis helps in understanding how high quality is maintained in a project where anyone may participate with no prior vetting. It also carries implications for improving the quality of more conventional datasets.
information quality, negotiations
Stvilia, B., Twidale, M. B., Gasser, L., Smith, L. C. Assessing information quality of a community-based encyclopedia Proceedings of the International Conference on Information Quality - ICIQ 2005. Cambridge, MA. 442-454 2005 [203]
Effective information quality analysis needs powerful yet easy ways to obtain metrics. The English version of Wikipedia provides an extremely interesting yet challenging case for the study of Information Quality dynamics at both macro and micro levels. We propose seven IQ metrics which can be evaluated automatically and test the set on a representative sample of Wikipedia content. The methodology of the metrics construction and the results of tests, along with a number of statistical characterizations of Wikipedia articles, their content construction, process metadata and social context are reported.
information quality
Ruiz M. Casado and Enrique Alfonseca and Pablo Castells Automatic Extraction of Semantic Relationships for WordNet by Means of Pattern Learning from Wikipedia Natural Language Processing and Information Systems: 10th International Conference on Applications of Natural Language to Information Systems, NLDB 2005, Alicante, Spain, June 15-17, 2005: Proceedings, 2005 2005 [204]
This paper describes an automatic approach to identify lexical patterns which represent semantic relationships between concepts, from an on-line encyclopedia. Next, these patterns can be applied to extend existing ontologies or semantic networks with new relations. The experiments have been performed with the Simple English Wikipedia and WordNet 1.7. A new algorithm has been devised for automatically generalising the lexical patterns found in the encyclopedia entries. We have found general patterns for the hyperonymy, hyponymy, holonymy and meronymy relations and, using them, we have extracted more than 1200 new relationships that did not appear in WordNet originally. The precision of these relationships ranges between 0.61 and 0.69, depending on the relation.
learning, semantic wikipedia
Emigh, William and Herring, Susan C. Collaborative Authoring on the Web: A Genre Analysis of Online Encyclopedias Paper presented at the 39th Hawaii International Conference on System Sciences. "Collaboration Systems and Technology Track", Hawaii. 2005 [205]
This paper presents the results of a genre analysis of two web-based collaborative authoring environments, Wikipedia and Everything2, both of which are intended as repositories of encyclopedic knowledge and are open to contributions from the public. Using corpus linguistic methods and factor analysis of word counts for features of formality and informality, we show that the greater the degree of post-production editorial control afforded by the system, the more formal and standardized the language of the collaboratively-authored documents becomes, analogous to that found in traditional print encyclopedias. Paradoxically, users who faithfully appropriate such systems create homogeneous entries, at odds with the goal of open-access authoring environments to create diverse content. The findings shed light on how users, acting through mechanisms provided by the system, can shape (or not) features of content in particular ways. We conclude by identifying sub-genres of web-based collaborative authoring environments based on their technical affordances.
collaboration
Rafaeli, Sheizaf, Hayat, Tsahi and Ariel, Yaron Wikipedia Participants and "Ba": Knowledge Building and Motivations. Paper Presented at Cyberculture 3rd Global Conference. Prague, Czech Republic 2005 [206] English
Rafaeli, Sheizaf, Hayat, Tsahi and Ariel, Yaron Wikipedians' sense of community, motivations, and knowledge building. Proceedings of Wikimania 2005 - The First International Wikimedia Conference, Frankfurt, Germany 2005 [207] English
In this paper, we examine the discursive situation of Wikipedia. The primary goal is to explore principal ways of analyzing and characterizing the various forms of communicative user interaction using Foucault's discourse theory. First, the communicative situation of Wikipedia is addressed and a list of possible forms of communication is compiled. Second, the current research on the linguistic features of Wikis, especially Wikipedia, is reviewed. Third, some key issues of Foucault's theory are explored: the notion of "discourse", the discursive formation, and the methods of archaeology and genealogy, respectively. Finally, first steps towards a qualitative discourse analysis of the English Wikipedia are elaborated. The paper argues that Wikipedia can be understood as a discursive formation that regulates and structures the production of statements. Most of the discursive regularities named by Foucault are established in the collaborative writing processes of Wikipedia, too. Moreover, the editing processes can be described in Foucault's terms as discursive knowledge production.
Krotzsch, Markus, Denny Vrandecic, Max Volkel Wikipedia and the Semantic Web - The Missing Links Wikimania'05 2005 [208] Follow-up? [209]
Wikipedia is the biggest collaboratively created source of encyclopaedic knowledge. Growing beyond the borders of any traditional encyclopaedia, it is facing new problems of knowledge management: The current excessive usage of article lists and categories witnesses the fact that 19th century content organization technologies like inter-article references and indices are no longer sufficient for today's needs. Rather, it is necessary to allow knowledge processing in a computer assisted way, for example to intelligently query the knowledge base. To this end, we propose the introduction of typed links as an extremely simple and unintrusive way for rendering large parts of Wikipedia machine readable. We provide a detailed plan on how to achieve this goal in a way that hardly impacts usability and performance, propose an implementation plan, and discuss possible difficulties on Wikipedia's way to the semantic future of the World Wide Web. The possible gains of this endeavor are huge; we sketch them by considering some immediate applications that semantic technologies can provide to enhance browsing, searching, and editing Wikipedia.
Semantic web, Wikipedia
Buntine, Wray Static Ranking of Web Pages, and Related Ideas Open Source Web Information Retrieval 2005 [210] Link-based analysis
Voss, Jakob Measuring Wikipedia. Proceedings International Conference of the International Society for Scientometrics and Informetrics : 10th, Stockholm (Sweden) 2005 [211]
Wikipedia, an international project that uses Wiki software to collaboratively create an encyclopaedia, is becoming more and more popular. Everyone can directly edit articles and every edit is recorded. The version history of all articles is freely available and allows a multitude of examinations. This paper gives an overview on Wikipedia research. Wikipedia’s fundamental components, i.e. articles, authors, edits, and links, as well as content and quality are analysed. Possibilities of research are explored including examples and first results. Several characteristics that are found in Wikipedia, such as exponential growth and scale-free networks are already known in other context. However the Wiki architecture also possesses some intrinsic specialities. General trends are measured that are typical for all Wikipedias but vary between languages in detail.
social web, wikipedia
Bellomi, Francesco and Roberto Bonato Network Analysis for Wikipedia Proceedings of Wikimania 2005, Frankfurt, Germany. 2005 [212]
Network analysis is concerned with properties related to connectivity and distances in graphs, with diverse applications like citation indexing and information retrieval on the Web. HITS (Hyperlink-Induced Topic Search) is a network analysis algorithm that has been successfully used for ranking web pages related to a common topic according to their potential relevance. HITS is based on the notions of hub and authority: a good hub is a page that points to several good authorities; a good authority is a page that is pointed at by several good hubs. HITS exclusively relies on the hyperlink relations existing among the pages, to define the two mutually reinforcing measures of hub and authority. It can be proved that for each page these two weights converge to fixed points, the actual hub and authority values for the page. Authority is used to rank pages resulting from a given query (and thus potentially related to a given topic) in order of relevance. The hyperlinked structure of Wikipedia and the ongoing, incremental editing process behind it make it an interesting and unexplored target domain for network analysis techniques. In particular, we explored the relevance of the notion of HITS's authority on this encyclopedic corpus. We've developed a crawler that extensively scans through the structure of English language Wikipedia articles, and that keeps track for each entry of all other Wikipedia articles pointed at in its definition. The result is a directed graph (roughly 500000 nodes, and more than 8 million links), which consists for the most part of a big loosely connected component. Then we applied the HITS algorithm to the latter, thus getting a hub and authority weight associated to every entry. First results seem to be meaningful in characterizing the notion of authority in this peculiar domain. Highest-rank authorities seem to be for the most part lexical elements that denote particular and concrete rather than universal and abstract entities.
More precisely, at the very top of the authority scale there are concepts used to structure space and time like country names, city names and other geopolitical entities (such as United States and many European countries), historical periods and landmark events (World War II, 1960s). "Television", "scientific classification" and "animal" are the first three most authoritative common nouns. We will also present the first results issued from the application of well-known PageRank algorithm (Google's popular ranking metrics detailed in [2]) to the Wikipedia entries collected by our crawler.
link-mining, wikipedia
Reagle, Joseph M. A Case of Mutual Aid: Wikipedia, Politeness, and Perspective Taking Proceedings of Wikimania 2005 -- The First International Wikimedia Conference, Frankfurt, Germany. 2005 [213]
The anarchist Peter Kropotkin once wrote that “Mutual aid is as much a law of animal life as mutual struggle” (1902). At the time, he was responding to arguments arising from Darwin's The Origin of Species: that in nature and society individual creatures ceaselessly struggle against each other for dominance. Kropotkin took pains to explain and provide examples of how animals and humans survive by cooperating with each other. Interestingly, Kropotkin also contributed the article on anarchism to the 1911 Encyclopædia Britannica, a collaborative product of the Scottish Enlightenment and a precursor to the Wikipedia, a collaborative, on-line, and free encyclopedia. This paper explores the character of “mutual aid” and interdependent decision making within the Wikipedia. I provide a brief introduction to Wikipedia, the key terms associated with group decision making, and the Wikipedia dispute resolution process. I then focus on the cultural norms (e.g., “good faith”) within Wikipedia that frame participation as a cooperative endeavor. In particular, I argue that the “neutral point of view” policy is not a source of conflict, as it is often perceived to be, but a resolution shaping norm. However, the naive understanding that this policy is about an unbiased neutrality is also problematic. I conclude by identifying some notions from negotiation literature that may be inappropriate or require adaptation to the Wikipedia case.
collaboration, collective action, mutual aid, wiki, wikipedia
Fissaha Adafre, Sisay and de Rijke, Maarten Discovering Missing Links in Wikipedia Proceedings of the Workshop on Link Discovery: Issues, Approaches and Applications (LinkKDD-2005) 2005 [214]
In this paper we address the problem of discovering missing hypertext links in Wikipedia. The method we propose consists of two steps: first, we compute a cluster of highly similar pages around a given page, and then we identify candidate links from those similar pages that might be missing on the given page. The main innovation is in the algorithm that we use for identifying similar pages, LTRank, which ranks pages using co-citation and page title information. Both LTRank and the link discovery method are manually evaluated and show acceptable results, especially given the simplicity of the methods and conservativeness of the evaluation criteria.
missing links, wikipedia, clustering, system issues
Bryant, Susan, Andrea Forte and Amy Bruckman Becoming Wikipedian: Transformation of participation in a collaborative online encyclopedia Proceedings of GROUP International Conference on Supporting Group Work, 2005. pp. 1-10. 2005 [215]
Traditional activities change in surprising ways when computer-mediated communication becomes a component of the activity system. In this descriptive study, we leverage two perspectives on social activity to understand the experiences of individuals who became active collaborators in Wikipedia, a prolific, cooperatively-authored online encyclopedia. Legitimate peripheral participation provides a lens for understanding participation in a community as an adaptable process that evolves over time. We use ideas from activity theory as a framework to describe our results. Finally, we describe how activity on the Wikipedia stands in striking contrast to traditional publishing and suggests a new paradigm for collaborative systems.
community, incentives, wikipedia
Ahn, David, Jijkoun, Valentin, Mishne, Gilad, Muller, Karin, de Rijke, Maarten, and Schlobach, Stefan Using Wikipedia at the TREC QA Track The Thirteenth Text Retrieval Conference (TREC 2004) 2005 [216]
We describe our participation in the TREC 2004 Question Answering track. We provide a detailed account of the ideas underlying our approach to the QA task, especially to the so-called "other" questions. This year we made essential use of Wikipedia, the free online encyclopedia, both as a source of answers to factoid questions and as an importance model to help us identify material to be returned in response to "other" questions.
question-answering, semantic text-mining, wikipedia
Augur, Naomi, Ruth Raitman and Wanlei Zhou Teaching and learning online with wikis 21st Annual Conference of the Australasian Society for Computers in Learning in Tertiary Education. Perth, Australia: Australasian Society for Computers in Learning in Tertiary Education (ASCILITE). (5th-8th Dec 2004). 95-104. 2004 [217] Despite Wikipedia not being mentioned in title or abstract, it is a common example and heavily discussed in article itself.
Wikis are fully editable websites; any user can read or add content to a wiki site. This functionality means that wikis are an excellent tool for collaboration in an online environment. This paper presents wikis as a useful tool for facilitating online education. Basic wiki functionality is outlined and different wikis are reviewed to highlight the features that make them a valuable technology for teaching and learning online. Finally, the paper discusses a wiki project underway at Deakin University. This project uses a wiki to host an icebreaker exercise which aims to facilitate ongoing interaction between members of online learning groups. Wiki projects undertaken in America are outlined and future wiki research plans are also discussed. These wiki projects illustrate how e-learning practitioners can and are moving beyond their comfort zone by using wikis to enhance the process of teaching and learning online.
wiki, teaching
Bellomi F., Bonato R. Lexical Authorities in an Encyclopedic Corpus: a Case Study with Wikipedia. Paper presented at the International Colloquium on ‘Word structure and lexical systems: models and applications’, December 16 - 18, 2004, University of Pavia, Pavia, Italy. 2004 [218] Blog description only? link-mining, wikipedia
Lih, Andrew Wikipedia as Participatory Journalism: Reliable Sources? Paper presented at the 5th International Symposium on Online Journalism, April 16 - 17, 2004, Austin, Texas, United States. 2004 [219]
Wikipedia is an Internet-based, user contributed encyclopedia that is collaboratively edited, and utilizes the wiki concept -- the idea that any user on the Internet can change any page within the Web site, even anonymously. Paradoxically, this seemingly chaotic process has created a highly regarded reference on the Internet. Wikipedia has emerged as the largest example of participatory journalism to date -- facilitating many-to-many communication among users editing articles, all working towards maintaining a neutral point of view -- Wikipedia’s mantra. This study examines the growth of Wikipedia and analyzes the crucial technologies and community policies that have enabled the project to prosper. It also analyzes Wikipedia’s articles that have been cited in the news media, and establishes a set of metrics based on established encyclopedia taxonomies and analyzes the trends in Wikipedia being used as a source.
wikipedia, journalism
Viegas, F. B., Wattenberg, M. and Dave, K. Studying cooperation and conflict between authors with history flow visualizations CHI 2004, 575-582. 2004 [220]
The Internet has fostered an unconventional and powerful style of collaboration: “wiki” web sites, where every visitor has the power to become an editor. In this paper we investigate the dynamics of Wikipedia, a prominent, thriving wiki. We make three contributions. First, we introduce a new exploratory data analysis tool, the history flow visualization, which is effective in revealing patterns within the wiki context and which we believe will be useful in other collaborative situations as well. Second, we discuss several collaboration patterns highlighted by this visualization tool and corroborate them with statistical analysis. Third, we discuss the implications of these patterns for the design and governance of online collaborative social spaces. We focus on the relevance of authorship, the value of community surveillance in ameliorating antisocial behavior, and how authors with competing perspectives negotiate their differences.
collaborative writing, social informatics, visualization, wikis
Smolenski, Nikola Wikipedia in Serbian language and Cyrillic script. Presentation at scientific-technical conference "Contemporary informatic technologies - Internet and Cyrillic script", November 25, Bijeljina. 2003 [221] Serbian?
Moller, Erik Loud and clear: How Internet media can work. Presentation at Open Cultures conference, June 5 - 6, Vienna. 2003 [222] Video and no abstract?
Winkler, Stefan Selbstorganisation der Kommunikation Wissenschaft - Offentlichkeit im virtuellen Raum, Koblenz, Forschungsstelle Wissenstransfer. Unknown 2003 German
Primo, Alex Fernando Teixeira and Recuero, Raquel da Cunha Hipertexto cooperativo: Uma analise da escrita coletiva a partir dos blogs e da Wikipedia. Paper presented at Seminario Internacional da Comunicacao. "Da aldeia global ao ciberespaco: Tecnologias do imaginario como extensao do homem", Porto Alegre 2003 [223] Portuguese
The article aims to analyze and discuss the characteristics of collective writing, according to the concept of cooperative hypertext. On that basis, it discusses how blogs and Wikipedia (a digital encyclopedia built online) make possible the realization of a "living web", that is, one written and interlinked by internet users themselves.

Journal articles

Authors Title Source Year Online Notes Abstract Keywords
Lucky, Robert W. A Billion Amateurs. IEEE Spectrum Volume 44 Pages 96 2007
The author reflects on the positive impact and potential of the internet. He believes that the technology has unleashed creativity and generosity from amateurs around the world. He cites YouTube, Facebook, Wikipedia, Internet Movie Database and Flickr, among others, as examples of websites whose content is provided free by amateurs.
Pekarek, Martin & Potzsch, Stefanie A comparison of privacy issues in collaborative workspaces and social networks Identity in the Information Society Volume 2 2009 [224]
Johnson, P.T.; Chen, J.K.; Eng, J.; Makary, M.A. & Fishman, E.K. A comparison of World Wide Web resources for identifying medical information Academic Radiology Volume 15 2008 [225]
Morse, G A conversation with Jimmy Wales Harvard Business Review Volume 86 2008 [226]
The founder of Wikipedia analyzes why wikis are becoming popular tools for sharing knowledge in the workplace. He encourages managers to provide institutional support for these highly practical forums but to be judicious about direct participation.
Judd, Terry & Kennedy, Gregor A five-year study of on-campus Internet use by undergraduate biomedical students Computers and Education Volume 55 2010 [227]
This paper reports on a five-year study (2005-2009) of biomedical students' on-campus use of the Internet. Internet usage logs were used to investigate students' sessional use of key websites and technologies. The most frequented sites and technologies included the university's learning management system, Google, email and Facebook. Email was the primary method of electronic communication. However, its use declined over time, with a steep drop in use during 2006 and 2007 appearing to correspond with the rapid uptake of the social networking site Facebook. Both Google and Wikipedia gained in popularity over time while the use of other key information sources, including the library and biomedical portals, remained low throughout the study. With the notable exception of Facebook, most 'Web 2.0' technologies attracted little use. The 'Net Generation' students involved in this study were heavy users of generalist information retrieval tools and key online university services, and preferred to use externally hosted tools for online communication. These and other findings have important implications for the selection and provision of services by universities. 2010 Elsevier Ltd. All rights reserved.
Murugeshan, Meenakshi Sundaram; Lakshmi, K. & Mukherjee, Saswati A negative category based approach for Wikipedia document classification International Journal of Knowledge Engineering and Data Mining Volume 1 2010 [228]
Castelluccio, M. A New Year, a New Internet Strategic Finance Volume 89 Pages 59 2008
A wiki, according to the guy who invented them, is the simplest online database that could possibly work. Ward Cunningham launched his first wiki in 1995, and the format has been widely adopted since by academics, artists, hackers, and business professionals. The most famous wiki is Wikipedia, the online encyclopedia. Like other wikis, Wikipedia has an open editing system where the readers are the contributing editors and proofreaders. The readers write the articles. One of the problems with defining wikis is that the word, which actually means "quick" in Hawaiian, can refer to the software, the community, or the database. The community can be seen and operated as an intranet or a common workspace for collaborators. The reality is a little amorphous, so why not go to the wiki (Wikipedia) for their take on it -- they should know.
Farhoodi, M.; Yari, A. & Mahmoudi, M. A Persian Web Page Classifier Applying a Combination of Content-Based and Context-Based Features International Journal of Information Studies Volume 1 2009
There are many automatic classification methods and algorithms that have been proposed for content-based or context-based features of web pages. In this paper we analyze these features and try to exploit a combination of features to improve categorization accuracy of Persian web page classification. In this work we have suggested a linear combination of different features, adjusting the optimum weighting during application. To show the outcome of this approach, we have conducted various experiments on a dataset consisting of all pages belonging to Persian Wikipedia in the field of computer. These experiments demonstrate the usefulness of using content-based and context-based web page features in a linear weighted combination.
Ward, Rod A request for help to improve the coverage of the NHS and UK healthcare issues on Wikipedia He@lth Information on the Internet Volume 53 2006
Lawler, Cormac A resource review of Wikipedia. Counselling & Psychotherapy Research Volume 6 2006
The article offers information on Wikipedia, an online encyclopedia. The articles and definitions published in Wikipedia can be edited. Articles usually start as a single sentence and they grow over time through collaborative writing and editing. A discussion page for every article is also provided for people interested in or concerned with the content of that article.


Ray, Santosh Kumar; Singh, Shailendra & Joshi, B.P. A semantic approach for question classification using WordNet and Wikipedia Pattern Recognition Letters Volume 31 2010 [229]
Question Answering Systems, unlike search engines, are providing answers to the users' questions in succinct form which requires the prior knowledge of the expectation of the user. Question classification module of a Question Answering System plays a very important role in determining the expectations of the user. In the literature, incorrect question classification has been cited as one of the major factors for the poor performance of the Question Answering Systems and this emphasizes on the importance of question classification module designing. In this article, we have proposed a question classification method that exploits the powerful semantic features of the WordNet and the vast knowledge repository of the Wikipedia to describe informative terms explicitly. We have trained our system over a standard set of 5500 questions (by UIUC) and then tested it over five TREC question collections. We have compared our results with some standard results reported in the literature and observed a significant improvement in the accuracy of question classification. The question classification accuracy suggests the effectiveness of the method which is promising in the field of open-domain question classification. Judging the correctness of the answer is an important issue in the field of question answering. In this article, we are extending question classification as one of the heuristics for answer validation. We are proposing a World Wide Web based solution for answer validation where answers returned by open-domain Question Answering Systems can be validated using online resources such as Wikipedia and Google. We have applied several heuristics for answer validation task and tested them against some popular web based open-domain Question Answering Systems over a collection of 500 questions collected from standard sources such as TREC, the Worldbook, and the Worldfactbook. The proposed method seems to be promising for automatic answer validation task. 2010 Elsevier B.V. All rights reserved.
Cress, Ulrike & Kimmerle, Joachim A systemic and cognitive view on collaborative knowledge building with wikis International Journal of Computer-Supported Collaborative Learning Volume 3 2008 [230]
Spence, Des A wicked encyclopaedia. BMJ: British Medical Journal Volume 339 Pages 700 2009
The author reflects on the use of Wikipedia, a free online encyclopedia, by doctors and patients. He states that Wikipedia is a common source for doctors and patients when searching for medical topics. According to research, half of doctors have used Wikipedia and the site is increasingly becoming the standard medical textbook. The author also mentions a debate on whether there should be a specific medical Wiki.
Pak, Alexander N. & Chung, Chin-Wan A wikipedia matching approach to contextual advertising World Wide Web Volume 13 2010 [231]
Contextual advertising is an important part of today's Web. It provides benefits to all parties: Web site owners and an advertising platform share the revenue, advertisers receive new customers, and Web site visitors get useful reference links. The relevance of selected ads for a Web page is essential for the whole system to work. Problems such as homonymy and polysemy, low intersection of keywords and context mismatch can lead to the selection of irrelevant ads. Therefore, a simple keyword matching technique gives a poor accuracy. In this paper, we propose a method for improving the relevance of contextual ads. We propose a novel "Wikipedia matching" technique that uses Wikipedia articles as "reference points" for ads selection. We show how to combine our new method with existing solutions in order to increase the overall performance. An experimental evaluation based on a set of real ads and a set of pages from news Web sites is conducted. Test results show that our proposed method performs better than existing matching strategies and using the Wikipedia matching in combination with existing approaches provides up to 50% lift in the average precision. TREC standard measure bpref-10 also confirms the positive effect of using Wikipedia matching for the effective ads selection. 2010 Springer Science+Business Media LLC.
Eijkman, H. Academics and Wikipedia: Reframing Web 2.0+ as a disruptor of traditional academic power-knowledge arrangements Campus-Wide Information Systems Volume 27 2010 [232]
Purpose - There is much hype about academics' attitude to Wikipedia. This paper seeks to go beyond anecdotal evidence by drawing on empirical research to ascertain how academics respond to Wikipedia and the implications these responses have for the take-up of Web 2.0+. It aims to test the hypothesis that Web 2.0+, as a platform built around the socially constructed nature of knowledge, is inimical to conventional power-knowledge arrangements in which academics are traditionally positioned as the key gatekeepers to knowledge. Design/methodology/approach - The research relies on quantitative and qualitative data to provide an evidence-based analysis of the attitudes of academics towards the student use of Wikipedia and towards Web 2.0+. These data were provided via an online survey made available to a number of universities in Australia and abroad. As well as the statistical analysis of quantitative data, qualitative data were subjected to thematic analysis using relational coding. Findings - The data by and large demonstrate that Wikipedia continues to be a divisive issue among academics, particularly within the soft sciences. However, Wikipedia is not as controversial as popular publicity would lead one to believe. Many academics use it extensively though cautiously themselves, and therefore tend to support a cautious approach to its use by students. However, evidence supports the assertion that there is an implicit if not explicit awareness among academics that Wikipedia, and possibly by extension Web 2.0+, are disruptors of conventional academic power-knowledge arrangements. Practical implications - It is clear that academics respond differently to the disruptive effects that Web 2.0+ has on the political economy of academic knowledge construction. Contrary to popular reports, responses to Wikipedia are not overwhelmingly focused on resistance but encompass both cautious and creative acceptance. It is becoming equally clear that the increasing uptake of Web 2.0+ in higher education makes it inevitable that academics will have to address the political consequences of this reframing of the ownership and control of academic knowledge production. Originality/value - The paper demonstrates originality and value by providing a unique, evidence-based insight into the different ways in which academics respond to Wikipedia as an archetypal Web 2.0+ application and by positioning Web 2.0+ within the political economy of academic knowledge construction.
Smith, David M. D.; Onnela, Jukka-Pekka & Johnson, Neil F. Accelerating networks New Journal of Physics Volume 9 2007 [233]
Evolving out-of-equilibrium networks have been under intense scrutiny recently. In many real-world settings the number of links added per new node is not constant but depends on the time at which the node is introduced in the system. This simple idea gives rise to the concept of accelerating networks, for which we review an existing definition and, after finding it somewhat constrictive, offer a new definition. The new definition provided here views network acceleration as a time dependent property of a given system as opposed to being a property of the specific algorithm applied to grow the network. The definition also covers both unweighted and weighted networks. As time-stamped network data becomes increasingly available, the proposed measures may be easily applied to such empirical datasets. As a simple case study we apply the concepts to study the evolution of three different instances of Wikipedia, namely, those in English, German, and Japanese, and find that the networks undergo different acceleration regimes in their evolution. IOP Publishing Ltd and Deutsche Physikalische Gesellschaft.
Veltman, Kim H. Access, claims and quality on the internet - Future challenges Progress in Informatics 2005 [234]
The vision of access to human knowledge has existed explicitly at least since the time of Aristotle. In 1934, Otlet outlined a vision of comprehensive access to knowledge. Progress towards this vision entailed initial visions of hypertext, markup languages, the semantic web, Wikipedia and more recently a series of developments with respect to Open Source. A brief survey of these developments is provided. The rhetoric of the Internet insists that everything should be accessible by everyone at anytime. This poses obvious technical challenges and serious philosophical problems of method. If everything is accessible then how do we separate the chaff from the grain and how do we identify quality? Following a survey of important developments, this essay suggests five dimensions that need to be included in a future web: 1) variants and multiple claims; 2) levels of certainty in making a claim; 3) levels of authority in defending a claim; 4) levels of significance in assessing a claim; 5) levels of thoroughness in dealing with a claim. 2005 National Institute of Informatics.
Lizorkin, Dmitry; Velikhov, Pavel; Grinev, Maxim & Turdakov, Denis Accuracy estimate and optimization techniques for SimRank computation VLDB Journal Volume 19 2010 [235]
The measure of similarity between objects is a very useful tool in many areas of computer science, including information retrieval. SimRank is a simple and intuitive measure of this kind, based on a graph-theoretic model. SimRank is typically computed iteratively, in the spirit of PageRank. However, existing work on SimRank lacks accuracy estimation of iterative computation and has discouraging time complexity. In this paper, we present a technique to estimate the accuracy of computing SimRank iteratively. This technique provides a way to find out the number of iterations required to achieve a desired accuracy when computing SimRank. We also present optimization techniques that improve the computational complexity of the iterative algorithm from O(n^4) in the worst case to min(O(nl), O(n^3 / log_2 n)), with n denoting the number of objects, and l denoting the number of object-to-object relationships. We also introduce a threshold sieving heuristic and its accuracy estimation that further improves the efficiency of the method. As a practical illustration of our techniques, we computed SimRank scores on a subset of English Wikipedia corpus, consisting of the complete set of articles and category links. Springer-Verlag 2009.
Lawler, C. Action research as a congruent methodology for understanding wikis: the case of Wikiversity Journal of Interactive Media in Education 2008 [236]
Zhou, Aoying; Zhang, Rong; Qian, Weining; Vu, Quang Hieu & Hu, Tianming Adaptive indexing for content-based search in P2P systems Data and Knowledge Engineering Volume 67 2008 [237]
One of the major challenges in Peer-to-Peer (P2P) file sharing systems is to support content-based search. Although there have been some proposals to address this challenge, they share the same weakness of using either servers or super-peers to keep global knowledge, which is required to identify importance of terms to avoid popular terms in query processing. As a result, they are not scalable and are prone to the bottleneck problem, which is caused by the high visiting load at the global knowledge maintainers. To that end, in this paper, we propose a novel adaptive indexing approach for content-based search in P2P systems, which can identify importance of terms without keeping global knowledge. Our method is based on an adaptive indexing structure that combines a Chord ring and a balanced tree. The tree is used to aggregate and classify terms adaptively, while the Chord ring is used to index terms of nodes in the tree. Specifically, at each node of the tree, the system classifies terms as either important or unimportant. Important terms, which can distinguish the node from its neighbor nodes, are indexed in the Chord ring. On the other hand, unimportant terms, which are either popular or rare terms, are aggregated to higher level nodes. Such classification enables the system to process queries on the fly without the need for global knowledge. Besides, compared to the methods that index terms separately, term aggregation reduces the indexing cost significantly. Taking advantage of the tree structure, we also develop an efficient search algorithm to tackle the bottleneck problem near the root. Finally, our extensive experiments on both benchmark and Wikipedia datasets validated the effectiveness and efficiency of the proposed method. 2008.
Jordan, Chris & Watters, Carolyn Addressing gaps in knowledge while reading Journal of the American Society for Information Science and Technology Volume 60 2009 [238]
Reading is a common everyday activity for most of us. In this article, we examine the potential for using Wikipedia to fill in the gaps in one's own knowledge that may be encountered while reading. If gaps are encountered frequently while reading, then this may detract from the reader's final understanding of the given document. Our goal is to increase access to explanatory text for readers by retrieving a single Wikipedia article that is related to a text passage that has been highlighted. This approach differs from traditional search methods where the users formulate search queries and review lists of possibly relevant results. This explicit search activity can be disruptive to reading. Our approach is to minimize the user interaction involved in finding related information by removing explicit query formulation and providing a single relevant result. To evaluate the feasibility of this approach, we first examined the effectiveness of three contextual algorithms for retrieval. To evaluate the effectiveness for readers, we then developed a functional prototype that uses the text of the abstract being read as context and retrieves a single relevant Wikipedia article in response to a passage the user has highlighted. We conducted a small user study where participants were allowed to use the prototype while reading abstracts. The results from this initial study indicate that users found the prototype easy to use and that using the prototype significantly improved their stated understanding and confidence in that understanding of the academic abstracts they read. 2009 ASIS&T.
Konieczny, Piotr Adhocratic Governance in the Internet Age: A Case of Wikipedia Journal of Information Technology & Politics Volume 7 2010 [239]
In recent years, a new realm has appeared for the study of political and sociological phenomena: the Internet. This article will analyze the decision-making processes of one of the largest online communities, Wikipedia. Founded in 2001, Wikipedia, now among the top-10 most popular sites on the Internet, has succeeded in attracting and organizing millions of volunteers and creating the world's largest encyclopedia. To date, however, little study has been done of Wikipedia's governance. There is substantial confusion about its decision-making structure. The organization's governance has been compared to many decision-making and political systems, from democracy to dictatorship, from bureaucracy to anarchy. It is the purpose of this article to go beyond the earlier simplistic descriptions of Wikipedia's governance in order to advance the study of online governance, and of organizations more generally. As the evidence will show, while Wikipedia's governance shows elements common to many traditional governance models, it appears to be closest to the organizational structure known as adhocracy.


Chua, Alton Y. K.; Kaynak, Selcan & Foo, Schubert S. B. An analysis of the delayed response to Hurricane Katrina through the lens of knowledge management Journal of the American Society for Information Science and Technology Volume 58 2007 [240]
In contrast to many recent large-scale catastrophic events, such as the Turkish earthquake in 1999, the 9/11 attack in New York in 2001, the Bali Bombing in 2002, and the Asian Tsunami in 2004, the initial rescue effort towards Hurricane Katrina in the U.S. in 2005 had been sluggish. Even as Congress has promised to convene a formal inquiry into the response to Katrina, this article offers another perspective by analyzing the delayed response through the lens of knowledge management (KM). A KM framework situated in the context of disaster management is developed to study three distinct but overlapping KM processes, namely, knowledge creation, knowledge transfer, and knowledge reuse. Drawing from a total of more than 400 documents - including local, national, and foreign news articles, newswires, congressional reports, and television interview transcripts, as well as Internet resources such as Wikipedia and blogs - 14 major delay causes in Katrina are presented. The extent to which the delay causes were a result of the lapses in KM processes within and across the government agencies are discussed. 2006 Wiley Periodicals, Inc.
Halavais, A. & Lackaff, D. An analysis of topical coverage of Wikipedia Journal of Computer Mediated Communication Volume 13 2008
Many have questioned the reliability and accuracy of Wikipedia. This article addresses a different but closely related question: How broad is the coverage of Wikipedia? Differences in the interests and attention of Wikipedia's editors mean that some areas, in the traditional sciences, for example, are better covered than others. Two approaches to measuring this coverage are presented. The first maps the distribution of topics on Wikipedia to the distribution of books published. The second compares the distribution of topics in three established, field-specific academic encyclopedias to the articles found in Wikipedia. Unlike the top-down construction of traditional encyclopedias, Wikipedia's topical coverage is driven by the interests of its users, and as a result, the reliability and completeness of Wikipedia is likely to be different depending on the subject area of the article.
Rahman, M. An Analysis of Wikipedia JITTA : Journal of Information Technology Theory and Application Volume 9 Pages 81 2008
Wikipedia is defined by its founders as "the free encyclopedia that anyone can edit." This property, we argue, makes Wikipedia a public good and hence subject to under-provision. A puzzling feature of Wikipedia, however, is its enormous size, at roughly seven times that of its commercial counterparts. What is driving this growth? And how can we assess the reliability of this giant encyclopedia arising solely from free-editing? We model contribution to Wikipedia and its reliability. We demonstrate that Wikipedia is indeed subject to free-riding and offer a novel explanation for the mitigation of under-provision under such circumstances. We also find that the public-good feature of Wikipedia and free-riding introduce a lower-bound in the quality of Wikipedia. This finding is consistent with a previous empirical study that established Wikipedia's surprisingly high level of quality. We identify Wikipedia as part of a general Internet phenomenon that we call the Collaborative Net and that includes features such as citizen journalism and online reviews.


Stokes, N; Li, Y; Moffat, A & Rong, JW An empirical study of the effects of NLP components on Geographic IR performance International Journal of Geographical Information Science Volume 22 2008 [241]
Natural language processing (NLP) techniques, such as toponym detection and resolution, are an integral part of most geographic information retrieval (GIR) architectures. Without these components, synonym detection, ambiguity resolution and accurate toponym expansion would not be possible. However, there are many important factors affecting the success of an NLP approach to GIR, including toponym detection errors, toponym resolution errors and query overloading. The aim of this paper is to determine how severe these errors are in state-of-the-art systems, and to what extent they affect GIR performance. We show that a careful choice of weighting schemes in the IR engine can minimize the negative impact of these errors on GIR accuracy. We provide empirical evidence from the GeoCLEF 2005 and 2006 datasets to support our observations.
Friedlin, Jeff & McDonald, Clement J An evaluation of medical knowledge contained in Wikipedia and its use in the LOINC database Journal of the American Medical Informatics Association: JAMIA Volume 17 2010 [242]
The logical observation identifiers names and codes (LOINC) database contains 55 000 terms consisting of more atomic components called parts. LOINC carries more than 18 000 distinct parts. It is necessary to have definitions/descriptions for each of these parts to assist users in mapping local laboratory codes to LOINC. It is believed that much of this information can be obtained from the internet; the first effort was with Wikipedia. This project focused on 1705 laboratory analytes (the first part in the LOINC laboratory name). Of the 1705 parts queried, 1314 matching articles were found in Wikipedia. Of these, 1299 (98.9%) were perfect matches that exactly described the LOINC part, 15 (1.14%) were partial matches (the description in Wikipedia was related to the LOINC part, but did not describe it fully), and 102 (7.76%) were mis-matches. The current release of RELMA and LOINC include Wikipedia descriptions of LOINC parts obtained as a direct result of this project.
Ha, Jae Kyung & Kim, Yong-Hak An Exploration on On-line Mass Collaboration: focusing on its motivation structure. International Journal of Social Sciences Volume 4 2009
The Internet has become an indispensable part of our lives. Witnessing recent web-based mass collaboration, e.g. Wikipedia, people are questioning whether the Internet has made fundamental changes to the society or whether it is merely a hyperbolic fad. It has long been assumed that collective action for a certain goal yields the problem of free-riding, due to its non-exclusive and non-rival characteristics. Then, thanks to recent technological advances, the on-line space experienced the following changes that enabled it to produce public goods: 1) decrease in the cost of production or coordination 2) externality from networked structure 3) production function which integrates both self-interest and altruism. However, this research doubts the homogeneity of on-line mass collaboration and argues that a more sophisticated and systematical approach is required. The alternative that we suggest is to connect the characteristics of the goal to the motivation. Despite various approaches, previous literature fails to recognize that motivation can be structurally restricted by the characteristic of the goal. First we draw a typology of on-line mass collaboration with 'the extent of expected beneficiary' and 'the existence of externality', and then we examine each combination of motivation using Benkler's framework. Finally, we explore and connect such typology with its possible dominant participating motivation.
Francke, H. & Sundin, O. An inside view: credibility in Wikipedia from the perspective of editors Information Research Volume 15 2010
Introduction. The question of credibility in participatory information environments, particularly Wikipedia, has been much debated. This paper investigates how editors on Swedish Wikipedia consider credibility when they edit and read Wikipedia articles. Method. The study builds on interviews with 11 editors on Swedish Wikipedia, supported by a document analysis of policies on Swedish Wikipedia. Analysis. The interview transcripts have been coded qualitatively according to the participants' use of Wikipedia and what they take into consideration in making credibility assessments. Results. The participants use Wikipedia for purposes where it is not vital that the information is correct. Their credibility assessments are mainly based on authorship, verifiability, and the editing history of an article. Conclusions. The situations and purposes for which the editors use Wikipedia are similar to other user groups, but they draw on their knowledge as members of the network of practice of Wikipedians to make credibility assessments, including knowledge of certain editors and of the MediaWiki architecture. Their assessments have more similarities to those used in traditional media than to assessments springing from the wisdom of crowds.


Iba, Takashi; Nemoto, Keiichi; Peters, Bernd & Gloor, Peter A. Analyzing the Creative Editing Behavior of Wikipedia Editors: Through Dynamic Social Network Analysis Procedia - Social and Behavioral Sciences Volume 2 2010 [243]
Kinsella, Sheila; Breslin, John G.; Passant, Alexandre & Decker, Stefan Applications of Semantic Web Methodologies and Techniques to Social Networks and Social Websites Reasoning Web 2008 [244]
One of the most visible trends on the Web is the emergence of "Social Web" sites which facilitate the creation and gathering of knowledge through the simplification of user contributions via blogs, tagging and folksonomies, wikis, podcasts, and the deployment of online social networks. The Social Web has enabled community-based knowledge acquisition, with efforts like the Wikipedia demonstrating the "wisdom of the crowds" in creating the world's largest online encyclopaedia. Although it is difficult to define the exact boundaries of what structures or abstractions belong to the Social Web, a common property of such sites is that they facilitate collaboration and sharing between users with low technical barriers, although usually on single sites. As more social websites form around the connections between people and their objects of interest, and as these "object-centred networks" grow bigger and more diverse, more intuitive methods are needed for representing and navigating the content items in these sites: both within and across social websites. Also, to better enable user access to multiple sites, interoperability among social websites is required in terms of both the content objects and the person-to-person networks expressed on each site. This requires representation mechanisms to interconnect people and objects on the Social Web in an interoperable and extensible way. The Semantic Web provides such representation mechanisms: it can be used to link people and objects by representing the heterogeneous ties that bind us all to each other (either directly or indirectly). In this paper we will describe methods that build on agreed-upon Semantic Web formats to describe people, content objects, and the connections that bind them together explicitly or implicitly, enabling social websites to interoperate by appealing to some common semantics. We will also focus on how developers can use the Semantic Web to augment the ways in which they create, reuse, and link content on social networking sites and social websites.
Tann, Chadwyn & Sanderson, Mark Are web-based informational queries changing? Journal of the American Society for Information Science and Technology Volume 60 2009 [245]
This brief communication describes the results of a questionnaire examining certain aspects of the Web-based information seeking practices of university students. The results are contrasted with past work showing that queries to Web search engines can be assigned to one of a series of categories: navigational, informational, and transactional. The survey results suggest that a large group of queries, which in the past would have been classified as informational, have become at least partially navigational. We contend that this change has occurred because of the rise of large Web sites holding particular types of information, such as Wikipedia and the Internet Movie Database.
Ferriter, Meghan M "Arguably the Greatest": Sport Fans and Communities at Work on Wikipedia Sociology of Sport Journal Volume 26 2009
This article explores the socially constructed space of Wikipedia and how the process and structure of Wikipedia enable it to act both as a vehicle for communication between sport fans and to subtly augment existing public narratives about sport. As users create article narratives, they educate fellow fans in relevant social and sport meanings. This study analyzes two aspects of Wikipedia for sports fans, application of statistical information and connecting athletes with other sports figures and organizations, through a discourse analysis of article content and the discussion pages of ten sample athletes. These pages of retired celebrity athletes provide a means for exploring the multidirectional production processes used by the sport fan community to celebrate recorded events of sporting history in clearly delineated and verifiable ways, thus maintaining the sport fans' community social values. Adapted from the source document.
Chen, Ching-Jung Art history: a guide to basic research resources Collection Building Volume 28 2009 [246]
The purpose of this paper is to present basic resources and practical strategies for undergraduate art history research. The paper is based on the author's experience as both an art librarian and instructor for a core requirement art history course. The plan detailed in this paper covers every step of the research process, from exploring the topic to citing the sources. The resources listed, which include subscription databases as well as public Web sites, are deliberately limited to a manageable number. Additional topics include defining the scope of inquiry and making appropriate use of Internet resources such as Wikipedia. The paper provides the academic librarian with clear guidance on basic research resources in art history.


Turdakov, D.Yu. & Kuznetsov, S.D. Automatic word sense disambiguation based on document networks Programming and Computer Software Volume 36 2010 [247]
In this paper, a survey of works on word sense disambiguation is presented, and the method used in the Texterra system [1] is described. The method is based on calculation of semantic relatedness of Wikipedia concepts. Comparison of the proposed method and the existing word sense disambiguation methods on various document collections is given. © 2010 Pleiades Publishing, Ltd.


Pamkowska, M. Autopoiesis in virtual organizations Informatica Economica Volume 12 2008
Virtual organizations continuously gain popularity because of the benefits created by them. Generally, they are defined as temporal adhocracies: project-oriented, knowledge-based network organizations. The goal of this paper is to present the hypothesis that the knowledge system developed by a virtual organization is an autopoietic system. The term "autopoiesis" was introduced by Maturana for self-productive systems. In this paper Wikipedia is described as an example of an autopoietic system. The first part of the paper covers discussion on virtual organizations. Next, interpretations of autopoiesis are delivered and the value of autopoiesis for governance of virtual organizations is presented. The last parts of the work comprise a short presentation of Wikipedia, its principles, and conclusions on Wikipedia as an autopoietic system.
George, A. Avoiding tragedy in the wiki-commons Virginia Journal of Law and Technology Volume 12 2007
Thousands of volunteers contribute to Wikipedia, with no expectation of remuneration or direct credit and with the constant risk of their work being altered. As a voluntary public good, it seems that Wikipedia ought to face a problem of noncontribution. Yet Wikipedia overcomes this problem, like much of the open-source movement, by locking in a core group of dedicated volunteers who are motivated by a desire to join and gain status within the Wikipedia community. Still, undesirable contribution is just as significant a risk to Wikipedia as noncontribution. Bad informational inputs, including vandalism and anti-intellectualism, put the project at risk because Wikipedia requires a degree of credibility to maintain its lock-in effect. At the same time, Wikipedia is so dependent on the work of its core community that governance strategies to exclude these bad inputs must be delicately undertaken. This article argues that to maximize useful participation, Wikipedia must carefully combat harmful inputs while preserving the zeal of its core community, as failure to do either may result in tragedy.
Reagle, Joseph M., Jr. "Be Nice": Wikipedia norms for supportive communication New Review of Hypermedia and Multimedia Volume 16 Pages 01/02/2011 2010 [248]
Wikipedia is acknowledged to have been home to some bitter disputes. Indeed, conflict at Wikipedia is said to be "as addictive as cocaine". Yet such observations are not cynical commentary but motivation for a collection of social norms. These norms speak to the intentional stance and communicative behaviors Wikipedians should adopt when interacting with one another. In the following pages I provide a survey of these norms on the English Wikipedia and argue that they can be characterized as supportive based on Jack Gibb's classic communication article "Defensive Communication". © 2010 Taylor & Francis.
Achterman, D. Beyond Wikipedia? Teacher Librarian Volume 34 2006 [249]
While polarized views of reading methodologies, filtering, digital rights management (DRM), open source, copyright/copyleft, constructivism, e-books, computer labs, fixed schedules, Mac/PC/Linux, and the One Laptop per Child project all make for entertaining reading and a raised blood pressure, I often wonder if radical stances actually create educational change or affect educational institutions enough to change kids' chances for success. Separate or integrate tech/info lit curriculum; Encyclopedia Britannica or Wikipedia; evolutionary or revolutionary change; content knowledge or process skills; testing or assessment; mandated skills or teacher choice; print or online; libraries or technology; fixed or flex scheduling. It is this sort of black and white thinking that makes stimulating reading and engenders reader outpourings of love or hate.
Haider, J. & Sundin, O. Beyond the legacy of the Enlightenment? Online encyclopaedias as digital heterotopias First Monday Volume 15 2010
This article explores how we can understand contemporary participatory online encyclopaedic expressions, particularly Wikipedia, in their traditional role as continuation of the Enlightenment ideal, as well as in the distinctly different space of the Internet. Firstly we position these encyclopaedias in a historical tradition. Secondly, we assign them a place in contemporary digital networks which marks them out as sites in which Enlightenment ideals of universal knowledge take on a new shape. We argue that the Foucauldian concept of heterotopia, that is special spaces which exist within society, transferred online, can serve to understand Wikipedia and similar participatory online encyclopaedias in their role as unique spaces for the construction of knowledge, memory and culture in late modern society.
Shachaf, Pnina & Hara, Noriko Beyond vandalism: Wikipedia trolls Journal of Information Science Volume 36 2010 [250]
Research on trolls is scarce, but their activities challenge online communities; one of the main challenges of the Wikipedia community is to fight against vandalism and trolls. This study identifies Wikipedia trolls' behaviours and motivations, and compares and contrasts hackers with trolls; it extends our knowledge about this type of vandalism and concludes that Wikipedia trolls are one type of hacker. This study reports that boredom, attention seeking, and revenge motivate trolls; they regard Wikipedia as an entertainment venue, and find pleasure in causing damage to the community and other people. Findings also suggest that trolls' behaviours are characterized as repetitive, intentional, and harmful actions that are undertaken in isolation and under hidden virtual identities, involving violations of Wikipedia policies, and consisting of destructive participation in the community. © The Author(s), 2010.
Hwang, Heasoo; Balmin, Andrey; Reinwald, Berthold & Nijkamp, Erik BinRank: Scaling dynamic authority-based search using materialized subgraphs IEEE Transactions on Knowledge and Data Engineering Volume 22 2010 [251]
Dynamic authority-based keyword search algorithms, such as ObjectRank and personalized PageRank, leverage semantic link information to provide high quality, high recall search in databases and the Web. Conceptually, these algorithms require a query-time PageRank-style iterative computation over the full graph. This computation is too expensive for large graphs, and not feasible at query time. Alternatively, building an index of precomputed results for some or all keywords involves very expensive preprocessing. We introduce BinRank, a system that approximates ObjectRank results by utilizing a hybrid approach inspired by materialized views in traditional query processing. We materialize a number of relatively small subsets of the data graph in such a way that any keyword query can be answered by running ObjectRank on only one of the subgraphs. BinRank generates the subgraphs by partitioning all the terms in the corpus based on their co-occurrence, executing ObjectRank for each partition using the terms to generate a set of random walk starting points, and keeping only those objects that receive non-negligible scores. The intuition is that a subgraph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to one of these terms. We demonstrate that BinRank can achieve subsecond query execution time on the English Wikipedia data set, while producing high-quality search results that closely approximate the results of ObjectRank on the original graph. The Wikipedia link graph contains about 10^8 edges, which is at least two orders of magnitude larger than what prior state of the art dynamic authority-based search systems have been able to demonstrate. Our experimental evaluation investigates the trade-off between query execution time, quality of the results, and storage requirements of BinRank.


Xiang, Evan Wei; Cao, Bin; Hu, Derek Hao & Yang, Qiang Bridging domains using world wide knowledge for transfer learning IEEE Transactions on Knowledge and Data Engineering Volume 22 2010 [252]
A major problem of classification learning is the lack of ground-truth labeled data. It is usually expensive to label new data instances for training a model. To solve this problem, domain adaptation in transfer learning has been proposed to classify target domain data by using some other source domain data, even when the data may have different distributions. However, domain adaptation may not work well when the differences between the source and target domains are large. In this paper, we design a novel transfer learning approach, called BIG (Bridging Information Gap), to effectively extract useful knowledge in a worldwide knowledge base, which is then used to link the source and target domains for improving the classification performance. BIG works when the source and target domains share the same feature space but different underlying data distributions. Using the auxiliary source data, we can extract a bridge that allows cross-domain text classification problems to be solved using standard semisupervised learning algorithms. A major contribution of our work is that with BIG, a large amount of worldwide knowledge can be easily adapted and used for learning in the target domain. We conduct experiments on several real-world cross-domain text classification tasks and demonstrate that our proposed approach can outperform several existing domain adaptation approaches significantly.
Cantador, Ivan; Konstas, Ioannis & Jose, Joemon M. Categorising Social Tags to Improve Folksonomy-based Recommendations Web Semantics: Science, Services and Agents on the World Wide Web Pages Accepted Manuscript 2010
Ratkiewicz, Jacob; Fortunato, Santo; Flammini, Alessandro; Menczer, Filippo & Vespignani, Alessandro Characterizing and modeling the dynamics of online popularity Physical Review Letters Volume 105 2010 [253]
Online popularity has an enormous impact on opinions, culture, policy, and profits. We provide a quantitative, large scale, temporal analysis of the dynamics of online content popularity in two massive model systems: the Wikipedia and an entire country's Web space. We find that the dynamics of popularity are characterized by bursts, displaying characteristic features of critical systems such as fat-tailed distributions of magnitude and interevent time. We propose a minimal model combining the classic preferential popularity increase mechanism with the occurrence of random popularity shifts due to exogenous factors. The model recovers the critical features observed in the empirical analysis of the systems analyzed here, highlighting the key factors needed in the description of popularity dynamics. © 2010 The American Physical Society.
Korosec, L; Limacher, P A; Luthi, H P & Brandle, M P Chemical Information Media in the Chemistry Lecture Hall: A Comparative Assessment of Two Online Encyclopedias CHIMIA 2010
The chemistry encyclopedia Rompp Online and the German universal encyclopedia Wikipedia were assessed by first-year university students on the basis of a set of 30 articles about chemical thermodynamics. Criteria with regard to both content and form were applied in the comparison; 619 ratings (48% participation rate) were returned. While both encyclopedias obtained very good marks and performed nearly equally with regard to their accuracy, the average overall mark for Wikipedia was better than for Rompp Online, which obtained lower marks with regard to completeness and length. Analysis of the results and participants' comments shows that students attach importance to completeness, length and comprehensibility rather than accuracy, and also attribute less value to the availability of sources which validate an encyclopedia article. Both encyclopedias can be promoted as a starting reference to access a topic in chemistry. However, it is recommended that instructors should insist that students do not rely solely on encyclopedia texts, but use and cite primary literature in their reports.
Shaw, Donna Citing Wikipedia. American Journalism Review Volume 30 Pages 43 2008
The article presents several scenarios where the online encyclopedia Wikipedia was cited in a newspaper story. The author highlights the editors, the stories, and the editors' explanations as to why Wikipedia was cited as a source. John Leach of the "Arizona Republic" states that a citation of Wikipedia led to the creation of rules governing its use. Lois Wilson of the "Star-Gazette" in Elmira, New York, states that she decides whether Wikipedia can be used on a case-by-case basis and that generally reporters use it as a part of their research.
Guo, Tao; Schwartz, D.G.; Burstein, F. & Linger, H. Codifying collaborative knowledge: using Wikipedia as a basis for automated ontology learning Knowledge Management Research & Practice Volume 7 2009 [254]
In the context of knowledge management, ontology construction can be considered a part of capturing the body of knowledge of a particular problem domain. Traditionally, ontology construction assumes a tedious codification of the domain experts' knowledge. In this paper, we describe a new approach to ontology engineering that has the potential of bridging the dichotomy between codification and collaboration by turning to Web 2.0 technology. We propose to shift the primary source of ontology knowledge from the expert to socially emergent bodies of knowledge such as Wikipedia. Using Wikipedia as an example, we demonstrate how core terms and relationships of a domain ontology can be distilled from this socially constructed source. As an illustration, we describe how our approach achieved over 90% conceptual coverage compared with gold-standard hand-crafted ontologies, such as Cyc. What emerges is not a folksonomy, but rather a formal ontology that has nonetheless found its roots in social knowledge.


Tumlin, M; Harris, SR; Buchanan, H; Schmidt, K & Johnson, K Collectivism vs. individualism in a wiki world: Librarians respond to Jaron Lanier's essay "Digital Maoism: The Hazards of the New Online Collectivism" Serials Review Volume 33 2007 [255]
Jaron Lanier's essay "Digital Maoism: The Hazards of the New Online Collectivism" is a self-described rant about the dangers of the hive mentality in suppressing individual human intelligence, as demonstrated in online resources such as Wikipedia and MySpace. He sees merit in collective decision-making and problem-solving if evaluation is uncontroversial, but argues that individuals are essential in providing judgment, taste, and user experiences in many situations. Lanier's essay appeared in the online progressive publication Edge and received responses from a variety of technologists, academics, and writers. In this "Balance Point" column, four academic librarians provide a library public services viewpoint in responding to Lanier's essay.


Poderi, G. Comparing featured article groups and revision patterns correlations in Wikipedia First Monday Volume 14 2009
Collaboratively written by thousands of people, Wikipedia produces entries which are consistent with criteria agreed by Wikipedians and of high quality. This article focuses on Wikipedia's featured articles and shows that not every contribution can be considered as being of equal quality. Two groups of articles are analysed by focusing on the edits distribution and the main editors' contribution. The research shows how these aspects of the revision patterns can change dependent upon the category to which the articles belong.
Stone, B.; Dennis, S. & Kwantes, P. J. Comparing Methods for Single Paragraph Similarity Analysis Topics in Cognitive Science 2010 [256]
The focus of this paper is two-fold. First, similarities generated from six semantic models were compared to human ratings of paragraph similarity on two datasets: 23 World Entertainment News Network paragraphs and 50 ABC newswire paragraphs. Contrary to findings on smaller textual units such as word associations (Griffiths, Tenenbaum, & Steyvers, 2007), our results suggest that when single paragraphs are compared, simple nonreductive models (word overlap and vector space) can provide better similarity estimates than more complex models (LSA, Topic Model, SpNMF, and CSM). Second, various methods of corpus creation were explored to facilitate the semantic models' similarity estimates. Removing numeric and single characters, and also truncating document length, improved performance. Automated construction of smaller Wikipedia-based corpora proved to be very effective, even improving upon the performance of corpora that had been chosen for the domain. Model performance was further improved by augmenting corpora with dataset paragraphs.
Rector, L.H. Comparison of Wikipedia and other encyclopedias for accuracy, breadth, and depth in historical articles Reference Services Review Volume 36 2008 [257]
This paper seeks to provide reference librarians and faculty with evidence regarding the comprehensiveness and accuracy of Wikipedia articles compared with respected reference resources. This content analysis evaluated nine Wikipedia articles against comparable articles in Encyclopaedia Britannica, The Dictionary of American History and American National Biography Online in order to compare Wikipedia's comprehensiveness and accuracy. The researcher used a modification of a stratified random sampling and a purposive sampling to identify a variety of historical entries and compared each text in terms of depth, accuracy, and detail. The study did reveal inaccuracies in eight of the nine entries and exposed major flaws in at least two of the nine Wikipedia articles. Overall, Wikipedia's accuracy rate was 80 percent compared with 95-96 percent accuracy within the other sources. This study does support the claim that Wikipedia is less reliable than other reference resources. Furthermore, the research found at least five unattributed direct quotations and verbatim text from other sources with no citations. More research must be undertaken to analyze Wikipedia entries in other disciplines in order to judge the source's accuracy and overall quality. This paper also shows the need for analysis of Wikipedia articles' histories and editing process. This research provides a methodology for further content analysis of Wikipedia articles. Although generalizations cannot be made from this paper alone, the paper provides empirical data to support concerns regarding the accuracy and authoritativeness of Wikipedia.


Wang, Rui-Qin & Kong, Fan-Sheng Computing semantic relatedness using structured information of Wikipedia Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science) Volume 43 2009 [258]
A novel semantic relatedness measurement technique based on the link information of Wikipedia is presented. Compared with the WordNet repository, Wikipedia has a wider range, more comprehensive knowledge, and a faster update speed, which makes it an ideal resource for semantic management. Unlike other Wikipedia-based semantic relatedness computing approaches, the new technique uses only Wikipedia's link structures rather than its full text content, which avoids burdensome text processing. During relatedness computation, the positive effects of incoming and outgoing links were taken into account, while a link number adjustment factor was used to eliminate bias. Using several widely used test sets of manually defined measures of semantic relatedness as benchmarks, the proposed method substantially improved the correlation of computed relatedness scores with human judgments compared with previous WordNet-based methods and other Wikipedia-based methods.
Zeng, Honglei; Alhossaini, Maher A; Ding, Li; Fikes, Richard & McGuinness, Deborah L Computing trust from revision history. STAR Volume 44 2006
A new model of distributed, collaborative information evolution is emerging. As exemplified in Wikipedia, online collaborative information repositories are being generated, updated, and maintained by a large and diverse community of users. Issues concerning trust arise when content is generated and updated by diverse populations. Since these information repositories are constantly under revision, trust determination is not simply a static process. In this paper, we explore ways of utilizing the revision history of an article to assess the trustworthiness of the article. We then present an experiment where we used this revision history-based trust model to assess the trustworthiness of a chain of successive versions of articles in Wikipedia and evaluated the assessments produced by the model.
Gunnels, Claire B. & Sisson, Amy Confessions of a Librarian or: How I Learned to Stop Worrying and Love Google. Community & Junior College Libraries Volume 15 2009
Have you ever stopped to think about life before Google? We will make the argument that Google is the first manifestation of Web 2.0, of the power and promise of social networking and the ubiquitous wiki. We will discuss the positive influence of Google and how Google and other social networking tools afford librarians leading-edge technologies and new opportunities to teach information literacy. Finally, we will include a top seven list of googlesque tools that no librarian should be without.
Liao, Han-Teng Conflict and consensus in the Chinese version of Wikipedia IEEE Technology and Society Magazine Volume 28 2009 [259]
It is not easy to initiate a new language version of Wikipedia. Although anyone can propose a new language version without financial cost, certain Wikipedia policies for establishing a new language version must be followed [30]. Once approved and created, the new language version needs tools to facilitate writing and reading in the new language. Even if a team tackles these technical and linguistic issues, a nascent community then has to develop its own editorial and administrative policies and guidelines, sometimes by translating and ratifying the policies in another language version (usually English). Given that Wikipedia does not impose a universal set of editorial and administrative policies and guidelines, the cultural and political nature of such communities remains open-ended.
Letia, Mihai; Preguica, Nuno & Shapiro, Marc Consistency without concurrency control in large, dynamic systems SOSP Workshop on Large Scale Distributed Systems and Middleware (LADIS) Volume 44 2010 [260]
Replicas of a commutative replicated data type (CRDT) eventually converge without any complex concurrency control. We validate the design of a non-trivial CRDT, a replicated sequence, with performance measurements in the context of Wikipedia. Furthermore, we discuss how to eliminate a remaining scalability bottleneck: Whereas garbage collection previously required a system-wide consensus, here we propose a flexible two-tier architecture and a protocol for migrating between tiers. We also discuss how the CRDT concept can be generalised, and its limitations.
Madison, MJ; Frischmann, BM & Strandburg, KJ Constructing Commons in the Cultural Environment Cornell Law Review Volume 95 2010 [261]
This Article sets out a framework for investigating sharing and resource-pooling arrangements for information- and knowledge-based works. We argue that adapting the approach pioneered by Elinor Ostrom and her collaborators to commons arrangements in the natural environment provides a template for examining the construction of commons in the cultural environment. The approach promises to lead to a better understanding of how participants in commons and pooling arrangements structure their interactions in relation to the environments in which they are embedded, in relation to information and knowledge resources that they produce and use, and in relation to one another. Some examples of the types of arrangements we have in mind are patent pools (such as the Manufacturer's Aircraft Association), open source software development projects (such as Linux), Wikipedia, the Associated Press, certain jamband communities, medieval guilds, and modern research universities. These examples are illustrative and far from exhaustive. Each involves a constructed cultural commons worthy of independent study, but independent studies get us only so far. A more systematic approach is needed. An improved understanding of cultural commons is critical for obtaining a more complete perspective on intellectual property doctrine and its interactions with other legal and social mechanisms for governing creativity and innovation, in particular, and information and knowledge production, conservation, and consumption, generally. We propose an initial framework for evaluating and comparing the contours of different commons arrangements. The framework will allow us to develop an inventory of structural similarities and differences among cultural commons in different industries, disciplines, and knowledge domains and shed light on the underlying contextual reasons for such differences. Structural inquiry into a series of case studies will provide a basis for developing theories to explain the emergence, form, and stability of the observed variety of cultural commons and, eventually, to design models to explicate and inform institutional design. The proposed approach would draw upon case studies from a wide range of disciplines. Among other things, we argue that theoretical approaches to constructed cultural and use of pooled resources, internal licensing conditions, management of external relationships, and institutional forms, along with the degree of collaboration among members, sharing of human capital, degrees of integration among participants, and any specified purpose to the arrangement.
Page, James Co-ordinating Peace Research and Education in Australia: A Report on the Canberra Forum of 2 May, 2008. International Review of Education / Internationale Zeitschrift für Erziehungswissenschaft Volume 55 Pages 02/03/2011 2009
Information about several papers discussed during the Australian university teachers forum on peace and conflict studies in Canberra, Australian Capital Territory on May 2, 2008 is presented. The forum highlights the discussion on how to better organize and co-ordinate university-level peace education in Australia. It further features the issue concerning peace education through Wikipedia networking and innovative teaching methods.
Urdaneta, Guido; Pierre, Guillaume & van Steen, Maarten Corrigendum to "Wikipedia workload analysis for decentralized hosting" [Computer Networks 53 (11) (2009) 1830-1845] (DOI:10.1016/j.comnet.2009.02.019) Computer Networks Volume 54 2010 [262]
Maracke, Catharina Creative Commons International: The International License Porting Project JIPITEC Volume 1 2010 [263]
When Creative Commons (CC) was founded in 2001, the core Creative Commons licenses were drafted according to United States Copyright Law. Since their first introduction in December 2002, Creative Commons licenses have been enthusiastically adopted by many creators, authors, and other content producers, not only in the United States, but in many other jurisdictions as well. Global interest in the CC licenses prompted a discussion about the need for national versions of the CC licenses. To best address this need, the international license porting project ("Creative Commons International", formerly known as "International Commons") was launched in 2003. Creative Commons International works to port the core Creative Commons licenses to different copyright legislations around the world. The porting process includes both linguistically translating the licenses and legally adapting the licenses to a particular jurisdiction such that they are comprehensible in the local jurisdiction and legally enforceable but concurrently retain the same key elements. Since its inception, Creative Commons International has found many supporters all over the world. With Finland, Brazil, and Japan as the first completed jurisdiction projects, experts around the globe have followed their lead and joined the international collaboration with Creative Commons to adapt the licenses to their local copyright. This article aims to present an overview of the international porting process, explain and clarify the international license architecture, its legal and promotional aspects, as well as its most recent challenges.
Hoorn, E. & van Hoorn, D. Critical assessment of using wikis in legal education JILT-Journal of Information Law & Technology 2008
Wikis serve to support collaborative writing on the Web. The best known example of a wiki is Wikipedia, an open encyclopedia on the web. This paper is meant to explore possible uses of a wiki-environment in legal education. Firstly, it describes the actual use of a closed wiki environment in a class on Cybercrime in the Netherlands. Secondly, the paper explores the possibilities for international collaboration of students without face to face contact. Innovative use of wikis in learning situations demands insights in educational design and best practices of educators. We will show that for students as well as educators the use of a wiki is an easy and effective way of using technology in order to get engaged in new forms of learning. The paper is intended for legal educators who share an interest in innovative approaches to legal education.
Hara, Noriko; Shachaf, Pnina & Hew, Khe Foon Cross-cultural analysis of the Wikipedia community Journal of the American Society for Information Science and Technology Volume 61 2010 [264]
This article reports a cross-cultural analysis of four Wikipedias in different languages and demonstrates their roles as communities of practice (CoPs). Prior research on CoPs and on the Wikipedia community often lacks cross-cultural analysis. Despite the fact that over 75% of Wikipedia is written in languages other than English, research on Wikipedia primarily focuses on the English Wikipedia and tends to overlook Wikipedias in other languages. This article first argues that Wikipedia communities can be analyzed and understood as CoPs. Second, norms of behaviors are examined in four Wikipedia languages (English, Hebrew, Japanese, and Malay), and the similarities and differences across these four languages are reported. Specifically, typical behaviors on three types of discussion spaces (talk, user talk, and Wikipedia talk) are identified and examined across languages. Hofstede's dimensions of cultural diversity as well as the size of the community and the function of each discussion area provide lenses for understanding the similarities and differences. As such, this article expands the research on online CoPs through an examination of cultural variations across multiple CoPs and increases our understanding of Wikipedia communities in various languages.
Ah-Pine, Julien; Bressan, Marco; Clinchant, Stephane; Csurka, Gabriela; Hoppenot, Yves & Renders, Jean-Michel Crossing textual and visual content in different application scenarios Multimedia Tools and Applications Volume 42 2009 [265]
This paper deals with multimedia information access. We propose two new approaches for hybrid text-image information processing that can be straightforwardly generalized to the more general multimodal scenario. Both approaches fall in the trans-media pseudo-relevance feedback category. Our first method proposes using a mixture model of the aggregate components, considering them as a single relevance concept. In our second approach, we define trans-media similarities as an aggregation of monomodal similarities between the elements of the aggregate and the new multimodal object. We also introduce the monomodal similarity measures for text and images that serve as basic components for both proposed trans-media similarities. We show how one can frame a large variety of problems in order to address them with the proposed techniques: image annotation or captioning, text illustration and multimedia retrieval and clustering. Finally, we present how these methods can be integrated in two applications: a travel blog assistant system and a tool for browsing the Wikipedia taking into account the multimedia nature of its content. 2008 Springer Science+Business Media, LLC.
Potthast, Martin; Barrón-Cedeño, Alberto; Stein, Benno & Rosso, Paolo Cross-language plagiarism detection Language Resources and Evaluation 2010 [266]
Holley, R. Crowdsourcing: How and Why Should Libraries Do It? D-Lib Magazine Volume 16 Pages 03/04/2011 2010 [267]
The definition and purpose of crowdsourcing and its relevance to libraries is discussed with particular reference to the Australian Newspapers service, FamilySearch, Wikipedia, Distributed Proofreaders, Galaxy Zoo and The Guardian MP's Expenses Scandal. These services have harnessed thousands of digital volunteers who transcribe, create, enhance and correct text, images and archives. Known facts about crowdsourcing are presented and helpful tips and strategies for libraries beginning to crowdsource are given.


Daughton, Suzanne M. "Cursed with self-awareness": gender-bending subversion Pages Women's Studies in Communication 2010
Larsson, G. Cyber-Islamophobia?: The case of WikiIslam Contemporary Islam Volume 1, Number 1, 53-67 2007 [268]
Rizzo, Skip CyberSightings. CyberPsychology & Behavior Volume 10 2007
The article presents a list of websites focusing on the topic of exergaming. Information is given on the current status in the Mental Health, Rehabilitation and Internet, Multimedia, Virtual Reality, and Technology arena. They include the Wikipedia entry for exergaming, energy expenditure of sedentary screen time, and other sites.
Birbal, Ria; Maharajh, Hari D; Birbal, Risa; Clapperton, Maria; Jarvis, Johnathan; Ragoonath, Anushka & Uppalapati, Kali Cybersuicide and the adolescent population: challenges of the future? International Journal of Adolescent Medicine and Health Volume 21 2009 [269]
Cybersuicide is a term used in reference to suicide and its ideations on the Internet. Cybersuicide is associated with websites that lure vulnerable members of society and empower them with various methods and approaches to deliberate self-harm. Ease of accessibility to the Internet and the rate at which information is dispersed contribute to the promotion of 'offing' one's self which is particularly appealing to adolescents. This study aims to explore this phenomenon, which seems to be spreading across generations, cultures, and races. Information and articles regarding Internet suicide and other terminology, as well as sub-classifications concerning this new form of suicide, were reviewed. Through search engines such as Google, Yahoo and Wikipedia, we investigated the differentiations between 'web cam' suicide, 'net suicide packs', sites that merely offer advice on how to commit suicide and sites that are essential in providing the means of performing the act. Additionally, materials published in scientific journals and data published by the Public Health Services, Centers for Disease Control, and materials from private media agencies were reviewed. Resources were also sourced from The Faculty of Medical Sciences Library, UWI at Mt. Hope. Cybersuicide is a worldwide problem among adolescents and a challenge of the future.
Bizer, Christian; Lehmann, Jens; Kobilarov, Georgi; Auer, Soren; Becker, Christian; Cyganiak, Richard & Hellmann, Sebastian DBpedia - A crystallization point for the Web of Data Journal of Web Semantics Volume 7 2009 [270]
The DBpedia project is a community effort to extract structured information from Wikipedia and to make this information accessible on the Web. The resulting DBpedia knowledge base currently describes over 2.6 million entities. For each of these entities, DBpedia defines a globally unique identifier that can be dereferenced over the Web into a rich RDF description of the entity, including human-readable definitions in 30 languages, relationships to other resources, classifications in four concept hierarchies, various facts as well as data-level links to other Web data sources describing the entity. Over the last year, an increasing number of data publishers have begun to set data-level links to DBpedia resources, making DBpedia a central interlinking hub for the emerging Web of Data. Currently, the Web of interlinked data sources around DBpedia provides approximately 4.7 billion pieces of information and covers domains such as geographic information, people, companies, films, music, genes, drugs, books, and scientific publications. This article describes the extraction of the DBpedia knowledge base, the current status of interlinking DBpedia with other data sources on the Web, and gives an overview of applications that facilitate the Web of Data around DBpedia. 2009 Elsevier B.V. All rights reserved.
Forte, Andrea; Larco, Vanesa & Bruckman, Amy Decentralization in wikipedia governance Journal of Management Information Systems Volume 26 2009 [271]
How does "self-governance" happen in Wikipedia? Through in-depth interviews with 20 individuals who have held a variety of responsibilities in the English-language Wikipedia, we obtained rich descriptions of how various forces produce and regulate social structures on the site. Although Wikipedia is sometimes portrayed as lacking oversight, our analysis describes Wikipedia as an organization with highly refined policies, norms, and a technological architecture that supports organizational ideals of consensus building and discussion. We describe how governance on the site is becoming increasingly decentralized as the community grows and how this is predicted by theories of commons-based governance developed in offline contexts. We also briefly examine local governance structures called WikiProjects through the example of WikiProject Military History, one of the oldest and most prolific projects on the site. 2009 M.E. Sharpe, Inc.
McElligott, T. Defining service management [data services] Telephony Volume 246 2005
A certain ill-defined slice of the operations support system portfolio has been in need of a Webster's fix since the first non-voice service was introduced onto the telecom network. Now that IPTV, the most complex data service to date, is about to hit the market, this field, known loosely as "service management," needs more than Webster or Wikipedia; it needs someone to use it in a sentence. Syndesis deployed its service delivery management (SDM) solution for IPTV. With triple-play services becoming a reality and requiring more than conceptual solutions, we see if SDM makes the grade.


ERZSEBET, B.; MARIA, C.; DUMITRU, Z.; ADELINA, D.; ADRIAN, Z.; GEORGETA, S. & MIHAI, B. DESCRIPTION OF SOME SPONTANEUS SPECIES AND THE POSSIBILITIES OF USE THEM IN THE ROCKY GARDENS Journal of Plant Development Volume 16 2009
Ng, Khar Thoe; Fong, Soon Fook & Soon, Seng Thah Design and Development of a Fluid Intelligence Instrument for a technology-enhanced PBL Programme Global Learn Asia Pacific 2010 [272]
Herr, Bruce W; Huang, Weixia; Penumarthy, Shashikant & Börner, Katy Designing highly flexible and usable cyberinfrastructures for convergence Annals of the New York Academy of Sciences Volume 1093 2006 [273]
This article presents the results of a 7-year-long quest into the development of a "dream tool" for our research in information science and scientometrics and, more recently, network science. The results are two cyberinfrastructures (CI): The Cyberinfrastructure for Information Visualization and the Network Workbench that enjoy a growing national and interdisciplinary user community. Both CIs use the cyberinfrastructure shell (CIShell) software specification, which defines interfaces between data sets and algorithms/services and provides a means to bundle them into powerful tools and (Web) services. In fact, CIShell might be our major contribution to progress in convergence. Just as Wikipedia is an "empty shell" that empowers lay persons to share text, a CIShell implementation is an "empty shell" that empowers user communities to plug-and-play, share, compare, and combine data sets, algorithms, and compute resources across national and disciplinary boundaries. It is argued here that CIs will not only transform the way science is conducted but also will play a major role in the diffusion of expertise, data sets, algorithms, and technologies across multiple disciplines and business sectors, leading to a more integrative science.
Alemu, G. A Development and maintenance of the Ethiopian legal information website Afrika Focus Volume 20 2007
Hickerson, C. A & Thompson, S. R Dialogue through wikis: A pilot exploration of dialogic public relations and wiki websites PRism Online PR Journal Volume 6 2009
Lange, Kathy Differences Between Statistics and Data Mining DM Review Volume 16 Pages 32 2006
From a business perspective, it doesn't really matter what you call it: statistics, data mining or predictive analytics. Competitive advantage comes from making better decisions faster and more confidently. A deceptively simple question triggers lively debate among analytical professionals: What is the difference between statistics and data mining? Wikipedia defines statistics as, "A mathematical science pertaining to collection, analysis, interpretation and presentation of data."
Poyntz, Nick Digital history: all contributions welcome. History Today Volume 60 Pages 53 2010
This article looks at the opportunities and potential perils for historians brought about by the enormous growth in user-generated content on the internet. Developments such as the wiki enable the sharing of information and resources in new ways, one example being the YourArchives site provided by the National Archives since 2007. In terms of both its size and the amount of controversy it generates, Wikipedia, the online encyclopedia, surpasses all other secondary sources, and in using it historians need to be as cautious and as careful as they are when assessing the reliability of information contained in any primary source. Databases of photos and moving images such as Flickr and YouTube are certain to become essential tools for historians seeking sources on life in the early 21C but effective use of them depends on accurate written descriptions provided with the images. The system known as Captcha, which ensures comments are not generated by computer programmes, is capable of digitising a huge volume of printed primary sources. (Quotes from original text)
Chandler-Olcott, Kelly Digital Literacies. A Tale of Two Tasks: Editing in the Era of Digital Literacies Journal of Adolescent & Adult Literacy Volume 53 2009
This article argues that editing in the era of digital literacies is a complex, collaborative endeavor that requires a sophisticated awareness of audience and purpose and a knowledge of multiple conventions for conveying meaning and ensuring accuracy. It compares group editing of an article about the New York Yankees baseball team on Wikipedia, the popular online encyclopedia, to the decontextualized proofreading task required of seventh graders on a state-level examination. It concludes that literacy instruction in schools needs to prepare students for the multiple dimensions of editing in both print and online environments, which means teaching them to negotiate meanings with others, not merely to correct surface-feature errors. (Contains 1 figure.)
Liu, W H; Yang, W; Wu, X Q & Lin, Z X Direct determination of ethanol by laser Raman spectra with internal standard method CHINESE JOURNAL OF ANALYTICAL CHEMISTRY 2007
Immediate quantitative analysis of ethanol using an internal standard by laser Raman spectra has been studied. Good linear correlation between the intensity ratios of the 884 cm(-1) (the -CCO band of ethanol) to the 3200 cm(-1) (band of water) and the concentration of ethanol was obtained. The linear range of ethanol concentration is 4%-40%, the correlation coefficient was 0.9975, and the detection limit of ethanol is 1.02%. The method has been used to determine distilled spirit, wikipedia, waxberry vino and alcohol cotton; the results show the alcohol concentrations are 36.14%, 15.50%, 23.71% and 79.10%, and the RSD are 0.2%, 1.8%, 2.5% and 2.8%, respectively. The non-destructive, non-intrusive nature of the method makes internal standard laser Raman spectra a convenient, accurate quantitative analysis method for ethanol.
Lin, Tan Disco as operating system, part one Criticism Volume 50 Pages 83 2008 [274]
Chute, C G Distributed biomedical terminology development: from experiments to open process Yearbook of Medical Informatics 2010 [275]
OBJECTIVE: Can social computing efforts materially alter the distributed creation and maintenance of complex biomedical terminologies and ontologies; a review of distributed authoring history and status. BACKGROUND: Social computing projects, such as Wikipedia, have dramatically altered the perception and reality of large-scale content projects and the labor required to create and maintain them. Health terminologies have become large, complex, interdependent content artifacts of increasing importance to biomedical research and the community's understanding of biology, medicine, and optimal healthcare practices. The question naturally arises as to whether social computing models and distributed authoring platforms can be applied to the voluntary, distributed authoring of high-quality terminologies and ontologies. METHODS: An historical review of distributed authoring developments. RESULTS: The trajectory of description logic-driven authoring tools, group process, and web-based platforms suggests that public distributed authoring is likely feasible and practical; however, no compelling example on the order of Wikipedia is yet extant. Nevertheless, several projects, including the Gene Ontology and the new revision of the International Classification of Disease (ICD-11) hold promise.
Sagy, Ornit & Hazzan, Orit Diversity in Excellence Fostering Programs: The Case of the Informatics Olympiad Journal of Computers in Mathematics and Science Teaching Volume 26 2007 [276]
Travis, John Do Wandering Albatrosses Care about Math? Science Volume 318 2007 [277]
Repudiating a decade-old study of sea birds, a new report questions a popular model of how animals--as well as fishing boats and people--search for food.
Skiba, Diane J. Do Your Students Wiki? Nursing Education Perspectives Volume 26 2005
The article presents information about wiki, which is the Hawaiian word for quick. The term was first coined in 1995 by Ward Cunningham, when he designed the Portland Pattern Repository as a community to discuss and share ideas about pattern languages. Wikipedia defines wiki as a website that allows users to add content, as on an Internet forum, but also allows anyone to edit the content. Wiki also refers to the collaborative software used to create such a website. The defining characteristics of a wiki are: social software that allows the ability to edit and add to a wiki document with relative ease; a simplified hypertext markup language for creating documents; and an open editing philosophy in which the community can edit and add to the document.
Muhlhauser, Ingrid & Oser, Friederike Does WIKIPEDIA provide evidence-based health care information? A content analysis Zeitschrift für Evidenz, Fortbildung und Qualität im Gesundheitswesen Volume 102 2008 [278]
Patients and consumers are increasingly searching the Internet for medical and healthcare information. Using the criteria of evidence-based medicine, the present study analyses the websites of Wikipedia and two major German statutory health insurances for content and presentation of patient information. 22 senior students of health sciences and education evaluated one topic each. In a first step, they identified the evidence for their specific question. Afterwards they used their results as reference for the evaluation of the three websites. Using a check list, each student and a second researcher independently rated content and presentation of the information offered. All these websites failed to meet relevant criteria, and key information such as the presentation of probabilities of success on patient-relevant outcomes, probabilities of unwanted effects, and unbiased risk communication was missing. On average, items related to the objectives of interventions, the natural course of disease and treatment options were only rated as "partially fulfilled". Overall, there were only minor differences between the three providers except for items related to the specific nature of the websites, such as disclosure of authorship, conflict of interest and support offers. In addition, the Wikipedia information tended to achieve lower comprehensibility. In conclusion, the quality of the healthcare information provided by Wikipedia and two major German statutory health insurances is comparable. They do not meet important criteria of evidence-based patient and consumer information, though.
Locander, W. & Luechauer, D. Dr. Seuss's Sneetches Marketing Management Volume 17 Pages 44 2007 [279]
The first Dr. Seuss book to which a sales and marketing executive might turn for business lessons in this era of diversity initiatives is The Sneetches and Other Stories. A contributor to online encyclopedia Wikipedia suggests that the story is an obvious parable for the cycle of fashion and how snobbery and insecurity drive consumerism to consumers' own detriment. Although it has a powerful message to those in marketing, the leadership lesson in this story relates to the "divisiveness" of creating divisions (formal or informal) between people. Perhaps the most deleterious Sneetch-like division experienced in organizations is the artificial distinction between the two kindred functions of marketing and sales.
O'Brien, Katerine Drop everything and read American Printer Pages 2006
An overview of the major printing processes and some suggestions provided to opt for the best processes are discussed. The New Medium of Print offers concise descriptions of gravure, offset, screen, digital and other print processes. It provides an introduction to the underlying systems for the creation and distribution of print and an exploration of its contemporary uses. Wikipedia is the free encyclopedia that anyone can edit and verify any material prior to sharing it with customers. Howstuffworks.com offset printing provides a lucid description of the printing process. Some books on sales techniques are also being provided which include Little Red Book of Selling, The New Strategic Selling, Spin Selling and Price Doesn't Count.
Zhang, Ying; Jones, Gareth J. & Zhang, Ke Dublin City University at CLEF 2007: Cross-Language Speech Retrieval Experiments Advances in Multilingual and Multimodal Information Retrieval 2008 [280]
The Dublin City University participation in the CLEF 2007 CL-SR English task concentrated primarily on issues of topic translation. Our retrieval system used the BM25F model and pseudo relevance feedback. Topics were translated into English using the Yahoo! BabelFish free online service combined with domain-specific translation lexicons gathered automatically from Wikipedia. We explored alternative topic translation methods using these resources. Our results indicate that extending machine translation tools using automatically generated domain-specific translation lexicons can provide improved CLIR effectiveness for this task.
Service, Robert F. DuPont Scientist Accused of Stealing Company's Trade Secrets Science Volume 325 Pages 1485 2009 [281]
Aron, D. Dynamic collaboration: a personal reflection Journal of Information Technology Volume 24 2009 [282]
This paper explores the nature of, and possibilities arising from, dynamic collaboration, where large numbers of people can collaborate on an evolving set of initiatives, without prior knowledge of each other. It references early examples of dynamic collaboration including Topcoder, Innocentive, Zopa, and Wikipedia. It then speculates about the future of dynamic collaboration.
Otto, P. & Simon, M. Dynamic perspectives on social characteristics and sustainability in online community networks System Dynamics Review Volume 24 2008 [283]
Online community networks can help organizations improve collaboration. However, in spite of their potential value, there has been little empirical research into two important network factors that determine their success: social characteristics of users and changes in operations that result from network evolution. Our research addresses these deficiencies by using a cultural framework. Derived from anthropology, it extends previous system dynamics research on online community networks. The framework acts as a lens, enabling a better understanding of the effects that changes in these factors bring to online community networks. Using data collected from Wikipedia for model calibration, our findings suggest that, contrary to conventional wisdom, removing policies that focus on building group commitment does not lower performance. The results also show that online networks need structural control, otherwise their attractiveness, credibility and, subsequently, content value might all decrease. To ensure sustainability the network must be monitored, especially during the early stages of its evolution, so that rules and regulations that ensure value and validity can be selectively employed. Copyright 2008 John Wiley & Sons, Ltd.
Jahnke, Isa Dynamics of social roles in a knowledge management community Computers in Human Behavior Volume 26 2010 [284]
With the emergence of community-oriented Information and Communication Technology (ICT) applications, e.g., Wikipedia, the popularity of socio-technical phenomena in society has increased. This development emphasises the need to further our understanding of how computer-supported social group structures change over time and what forms emerge. This contribution presents the results of a qualitative field study of a Socio-Technical Community (STC). The STC is described from its founding (in 2001) to its sustainable development (in 2006) as well as its transformation phase (2007-2008). The design-based research approach revealed changes of social structures by social roles within the STC over time. The central conclusion is that such STCs - networks of computer-mediated communication and human interaction - evolve a specific kind of social structure, which is formal rather than informal. The results indicate that a group evolves from an informal trust-based community with few formal roles to an STC where the social mechanisms, and not the software architecture, support knowledge management processes. 2009 Elsevier Ltd. All rights reserved.
Magnus, P.D. Early response to false claims in Wikipedia First Monday Volume 13 2008
A number of studies have assessed the reliability of entries in Wikipedia at specific times. One important difference between Wikipedia and traditional media, however, is the dynamic nature of its entries. An entry assessed today might be substantially extended or reworked tomorrow. This study assesses the frequency with which small, inaccurate changes are quickly corrected.
Knapp, Margaret M. eBay, Wikipedia, and the future of the footnote Theatre History Studies Volume 28 Pages 36 2008 [285]
Peter, Martina Ectomycorrhizal Fungi: Fairy Rings and the Wood-Wide Web New Phytologist Volume 171 2006 [286]
Ross, Jeffrey & Shanty, Frank Editing Encyclopedias for Fun and Aggravation. Publishing Research Quarterly Volume 25 2009
This collaborative, retrospective autoethnography begins by offering an overview of the encyclopedias with which we have been involved, as both contributors and consulting editors, over the past decade. We then review our strategies for recruiting authors and maintaining their interest to ensure the highest quality entries; it also covers the mechanics of processing these entries. Next, we discuss the actual and perceived benefits of editing an encyclopedia, the most significant issues we encountered, and our solutions. Finally, we contextualize the previous information in light of recent changes in the scholarly publishing industry.
Buzzi, Marina & Leporini, Barbara Editing Wikipedia content by screen reader: easier interaction with the Accessible Rich Internet Applications suite Disability and Rehabilitation. Assistive Technology Volume 4 2009 [287]
PURPOSE: This study aims to improve Wikipedia usability for the blind and promote the application of standards relating to Web accessibility and usability. METHOD: First, accessibility and usability of Wikipedia home, search result and edit pages are analysed using the JAWS screen reader; next, suggestions for improving interaction are proposed and a new Wikipedia editing interface built. Most of the improvements were obtained using the Accessible Rich Internet Applications (WAI-ARIA) suite, developed by the World Wide Web Consortium (W3C) within the framework of the Web Accessibility Initiative (WAI). Last, a scenario of use compares interaction of blind people with the original and the modified interfaces. RESULTS: Our study highlights that although all contents are accessible via screen reader, usability issues exist due to the user's difficulties when interacting with the interface. The scenario of use shows how building an editing interface with the W3C WAI-ARIA suite eliminates many obstacles that can prevent blind users from actively contributing to Wikipedia. CONCLUSION: The modified Wikipedia editing page is simpler to use via a screen reader than the original one because ARIA ensures a page overview, rapid navigation, and total control of what is happening in the interface.
Crawford, Diane Editorial Pointers. Communications of the ACM Volume 50 Pages 5 2007
The article introduces several features contained in the current issue, including contributions from Oded Nov on what inspires people to provide content to Wikipedia, Matt Bishop and David Wagner on the state of California's e-voting machines, and David Lorge Parnas on the shortcomings of evaluating researchers according to how many papers they publish.
Noruzi, A. Editorial-Wikipedia popularity from a citation analysis point of view Webology Volume 6 2009
Haas, Christina Editor's Introduction: Writing and New Media Special Issue. Written Communication Volume 25 2008
The article discusses various reports published within the issue, including one by Jeff Bezemer and Gunther Kress on social semiotics of writing, and another by John Jones on revision patterns in Wikipedia articles.
Boyer, C Education and consumer informatics Yearbook of Medical Informatics 2010 [288]
OBJECTIVES: To evaluate the extent to which the Internet is accessed for health information and perceived as useful to varying groups classified primarily according to age. METHOD: Synopsis of the articles on education and consumer health informatics selected for the IMIA Yearbook of Medical Informatics 2010. RESULTS: A growing number of individuals are actively seeking health information through a varying selection of resources. The Internet is now seen as a major source of health information alongside books and other means of paper-based literature. However, it is not clear how the Internet is perceived by varied groups such as those coming from differing age groups. CONCLUSION: The papers selected attempt to obtain a better understanding about how the public perceives and uses the Internet as an information gathering tool, especially for health information. The papers also explore how the Internet is used by different groups of people. As all online health information is not of uniform quality, it is important to access and rely on quality medical information. This issue is also dealt with, where the popularity of Wikipedia is measured against the popularity of reliable web sources such as Medline Plus and NHS Direct.
Chen, Nian-Shing; Hsieh, Sheng-Wen & Kinshuk Effects of short-term memory and content representation type on mobile language learning Language, Learning & Technology Volume 12 Pages 93 2008 [289]
Rosales, R. Eight Simple Ways to Embrace the "Froom" EDUCAUSE Quarterly Volume 32 2009
College students have become active participants in the learning process. The concept of collaboration, for example, is now considered central to their learning DNA, whether it's building an online wiki or doing multi-user editing using Google docs and other platforms. (For many insights into the learning styles of children and young adults, see the research series published by the MacArthur Foundation on Digital Media and Learning.) The new generation of college students could be considered a living manifestation of the Google Age. Nicholas Carr, citing major studies in his extensive piece titled "Is Google Making Us Stupid?" in The Atlantic (July/August 2008), asserted that the "media and other technologies we use in learning...play an important part in shaping the neural circuits of our brains." Digital media technologies have engendered a new way for many of us, especially the young, to learn: in bits and pieces and in a continuous download mode. If one has to juxtapose this new cognitive reality with the dominant campus learning environment, that is, the old industrial block-by-block schedule, it should be no surprise to educators if we find students who are a little bored. How do we motivate students to learn and do class work a little more effectively? And how do we get them more excited about the courses they're taking? The answer lies in the froom: forever in a classroom. As educators, we must find ways to use digital technology and establish a 360-degree classroom environment where students have the option to engage and tinker with ideas not just in class but also online and at any time. All kinds of popular new media must be deployed to motivate students to do course work. Being in the froom frame of mind requires getting used to the idea of nonlinear hybrid learning. Using computer lingo, classrooms must become a place for "rendering" knowledge, for problem solving and analysis, and for fostering creativity.
More specifically, in the froom world, professors must be willing to offer options for students to send text messages, do instant messaging, and chat via iChat or Skype. Instructors must also set up a faculty or social networking site so that students have several avenues to get feedback.
Gonzalez, Pedro Urra El enfoque de colaboración de Wikipedia y el proyecto Wikiprofesional. ACIMED Volume 18 2008
This article considers the scientific collaboration project WikiProteins, which focuses on collecting and integrating knowledge about proteins and their importance in biology and medicine. The initiative is carried out within the project framework of Wiki technologies and WikiProfessional (http://www.wikiprofessional.org/conceptweb).
R.M. Harden E-Learning and all that jazz. Medical Teacher Volume 28 Pages 98 2006
The article presents various reports in online education. These include: using learning objects, and online problem-based learning wherein Wheeler et al. studied the function of online problem-based learning in a Masters Module for teachers. The teacher in the virtual classroom provides tips on the difference between a face-to-face teacher and virtual teaching. It also mentions educational quotes that were published by the University of Ohio Office of Faculty Development. In the Universal Wiki, Wikipedia allows the reader to edit and contribute to the articles. The Moodle Online Learning Environment is an open source software package from the UK Open University in Great Britain.
Stoltenkamp, Juliet; Taliep, Tasneem; Braaf, Norina & Kasuto, Okasute eLearning at a higher education institution: Exponential growth and pain Global Learn Asia Pacific 2010 [290]
Pena-Bandalaria, Melinda M. Dela E-Learning in the Philippines: Trends, Directions, and Challenges International Journal on E-Learning Volume 8 2009 [291]


Godwin-Jones, Robert Emerging technologies focusing on form: tools and strategies Language, Learning & Technology Volume 13 Pages 5 2009 [292]
Iandoli, L.; Klein, M. & Zollo, G. Enabling on-line deliberation and collective decision-making through large-scale argumentation: a new approach to the design of an Internet-based mass collaboration platform International Journal of Decision Support System Technology Volume 1 2009
The successful emergence of on-line communities, such as open source software and Wikipedia, seems due to an effective combination of intelligent collective behavior and Internet capabilities. However, current Internet technologies, such as forums, wikis and blogs, appear to be less supportive of knowledge organization and consensus formation. In particular, very few attempts have been made to support large, diverse, and geographically dispersed groups to systematically explore and come to decisions concerning complex and controversial systemic challenges. In order to overcome the limitations of current collaborative technologies, in this article we present a new large-scale collaborative platform based on argumentation mapping. To date, argumentation mapping has been effectively used for small-scale, co-located groups. The main research questions this work faces are: can argumentation scale? Will large-scale argumentation outperform current collaborative technologies in collective problem solving and deliberation? We present some preliminary results obtained from a first field test of an argumentation platform with a moderate-sized user community (a few hundred users).
Witzleb, Norman ENGAGING WITH THE WORLD: STUDENTS OF COMPARATIVE LAW WRITE FOR WIKIPEDIA. Legal Education Review Volume 19 Pages 01/02/2011 2009
Improving students' computer literacy, instilling a critical approach to Internet resources and preparing them for collaborative work are important educational aims today. This article examines how a writing exercise in the style of a Wikipedia article can be used to develop these skills. Students in an elective unit in Comparative Law were asked to create, and review, a Wikipedia entry on an issue, concept or scholar in this field. This article describes the rationale for adopting this writing task, how it was integrated into the teaching and assessment structure of the unit, and how students responded to the exercise. In addition to critically evaluating the potential of this novel teaching tool, the article aims to provide some practical guidance on when Wikipedia assignments might be usefully employed.
Pehcevski, J.; Thom, J.; Vercoustre, A. & Naumovski, V. Entity ranking in Wikipedia: utilising categories, links and topic difficulty prediction Information Retrieval Volume 13 Pages 568 2010 [293]
Entity ranking has recently emerged as a research field that aims at retrieving entities as answers to a query. Unlike entity extraction where the goal is to tag names of entities in documents, entity ranking is primarily focused on returning a ranked list of relevant entity names for the query. Many approaches to entity ranking have been proposed, and most of them were evaluated on the INEX Wikipedia test collection. In this paper, we describe a system we developed for ranking Wikipedia entities in answer to a query. The entity ranking approach implemented in our system utilises the known categories, the link structure of Wikipedia, as well as the link co-occurrences with the entity examples (when provided) to retrieve relevant entities as answers to the query. We also extend our entity ranking approach by utilising the knowledge of predicted classes of topic difficulty. To predict the topic difficulty, we generate a classifier that uses features extracted from an INEX topic definition to classify the topic into an experimentally pre-determined class. This knowledge is then utilised to dynamically set the optimal values for the retrieval parameters of our entity ranking system. Our experiments demonstrate that the use of categories and the link structure of Wikipedia can significantly improve entity ranking effectiveness, and that topic difficulty prediction is a promising approach that could also be exploited to further improve the entity ranking performance.
Antoch, Jaromir Environment for statistical computing Computer Science Review Volume 2 2008 [294]
Dobozy, Eva & Gross, Julia e-Partnerships: Library information acquisition in the comfort of students' digital homes Global Learn Asia Pacific 2010 [295]
Fallis, D. & Whitcomb, D. Epistemic values and information management Information Society Volume 25 2009
In contemporary life, some of the most important decisions that people must make are about the management of information (e.g., about the collection, organization, distribution, and evaluation of information). Legislatures have to decide which privacy and intellectual property laws to adopt, libraries have to decide which information resources (books, journals, databases, etc.) to collect and how to organize them, and individuals have to decide whether to trust the information that they find on Wikipedia or on the Internet in general. This article combines epistemology and decision analysis in an attempt to better equip people to make such information management decisions.
Brown, James J. Essjay's Ethos: Rethinking Textual Origins and Intellectual Property College Composition and Communication Volume 61 2009
Discussions of intellectual property are often the focus of rhetoric and composition research, and the question of textual origins grounds these discussions. Through an examination of Wikipedia, the online encyclopedia anyone can edit, this essay addresses disciplinary concerns about textual origins and intellectual property through a discussion of situated and constructed ethos. (Contains 8 notes.)
Inceoglu, Mustafa M. Establishing a K-12 circuit design program IEEE Transactions on Education Volume 53 2010 [296]
Outreach, as defined by Wikipedia, is an effort by an organization or group to connect its ideas or practices to the efforts of other organizations, groups, specific audiences, or the general public. This paper describes a computer engineering outreach project of the Department of Computer Engineering at Ege University, Izmir, Turkey, to a local elementary school. A group of 14 K-12 students was chosen by a four-stage selection method to participate in this project. This group was then taught discrete mathematics and logic design courses from the core curriculum of the Computer Engineering program. The two 11-week courses have a total of 132 contact hours. The course contents are conveyed through both theoretical lessons and laboratory sessions. All of the laboratory sessions were carried out by K-12 students. Volunteer teachers from the elementary school participated in the project. The evaluations carried out during and at the end of the project indicated the degree of satisfaction on the part of students and teachers. The project is still ongoing with the same methodology in its third year.
Lu, Jianguo & Li, Dingding Estimating deep web data source size by capture-recapture method Information Retrieval Volume 13 2010 [297]


Lindsey, D. Evaluating quality control of Wikipedia's feature articles First Monday Volume 15 2010
The purpose of this study was to evaluate the effectiveness of Wikipedia's premier internal quality control mechanism, the "featured article" process, which assesses articles against a stringent set of criteria. To this end, scholars were asked to evaluate the quality and accuracy of Wikipedia featured articles within their area of expertise. A total of 22 usable responses were collected from a variety of disciplines. Out of the Wikipedia articles assessed, only 12 of 22 were found to pass Wikipedia's own featured article criteria, indicating that Wikipedia's process is ineffective. This finding suggests both that Wikipedia must take steps to improve its featured article process and that scholars interested in studying Wikipedia should be careful not to naively believe its assertions of quality.
Brokowski, Laurie & Sheehan, Amy Heck Evaluation of pharmacist use and perception of Wikipedia as a drug information resource The Annals of Pharmacotherapy Volume 43 2009 [298]
Chander, Anupam & Sunder, Madhavi Everyone's a Superhero: A Cultural Theory of "Mary Sue" Fan Fiction as Fair Use California Law Review Volume 95 Pages 597 2007
Fan fiction spans all genres of popular culture, from anime to literature. In every fan literature, there is the Mary Sue. According to Wikipedia, a "Mary Sue" is a fictional character who is portrayed in an idealized way and lacks noteworthy flaws, and appears in the form of a new character beamed into the story or a marginal character brought out from the shadows. "Mary Sue" is often a pejorative expression used to deride fan fiction perceived as narcissistic. In this essay, Mary Sue is rehabilitated as a figure of subaltern critique and empowerment.
Menzies, Tim & Hihn, Jairus Evidence-Based Cost Estimation for Better-Quality Software IEEE Software Volume 23 2006 [299]
Evidence-based reasoning is becoming common in many fields. It's widely enshrined in the practice and teaching of medicine, law, and management, for example. Evidence-based approaches demand that, among other things, practitioners systematically track down the best evidence relating to some practice; critically appraise that evidence for validity, impact, and applicability; and carefully document it. One proponent of evidence-based software engineering is David Budgen of Durham University. In the Internet age, he argues, many sources of supposed knowledge--Google, Wikipedia, digg.com, and so on--surround us. At his keynote address at the 2006 Conference on Software Engineering Education and Training, Budgen asks, how should we train students to assess all that information and to separate the sense from the nonsense? In his view, before we can denounce some inaccuracy in, say, Wikipedia, we must first look to our own work and audit our own results.
Grayson, George W.; Klesner, Joseph L.; Wuhs, Steven T. & González, Francisco E. Evolution of Mexico and Other Single-Party States International Studies Review Volume 9 2007 [300]
Mehler, Andrew & Skiena, Steven Expanding network communities from representative examples ACM Transactions on Knowledge Discovery from Data Volume 3 2009 [301]
We present an approach to leverage a small subset of a coherent community within a social network into a much larger, more representative sample. Our problem becomes identifying a small conductance subgraph containing many (but not necessarily all) members of the given seed set. Starting with an initial seed set representing a sample of a community, we seek to discover as much of the full community as possible. We present a general method for network community expansion, demonstrating that our methods work well in expanding communities in real world networks starting from small given seed groups (20 to 400 members). Our approach is marked by incremental expansion from the seeds with retrospective analysis to determine the ultimate boundaries of our community. We demonstrate how to increase the robustness of the general approach through bootstrapping multiple random partitions of the input set into seed and evaluation groups. We go beyond statistical comparisons against gold standards to careful subjective evaluations of our expanded communities. This process explains the causes of most disagreement between our expanded communities and our gold-standards - arguing that our expansion methods provide more reliable communities than can be extracted from reference sources/gazetteers such as Wikipedia.
Hicks, Troy Expanding the Conversation: A Commentary Toward Revision of Swenson, Rozema, Young, McGrail, and Whitin Contemporary Issues in Technology and Teacher Education Volume 6 2006 [302]
Judd, T. & Kennedy, G. Expediency-based practice? Medical students' reliance on Google and Wikipedia for biomedical inquiries British Journal of Educational Technology 2009 [303]
Internet usage logs captured during self-directed learning sessions were used to determine how undergraduate medical students used five popular sites to locate and access biomedical resources. Students' perceptions of each site's usefulness and reliability were determined through a survey. Google and Wikipedia were the most frequently used sites despite students rating them as the least reliable of the five sites investigated. The library, the students' primary point of access to online journals, was the least used site, and when using Google, less than 40% of pages or resources located by students were from 'high quality' sources. Students' use of all sites' search tools was unsophisticated. Despite being avid users of online information and search tools, the students targeted in this study appeared to lack the requisite information-seeking skills to make the most of online resources. Although there is evidence that these skills improved over time, a greater emphasis on information literacy skills training may be required to ensure that graduates are able to locate the best available evidence to support their professional practice.
Wannemacher, Klaus Experiences and perspectives of Wikipedia use in higher education International Journal of Management in Education Volume 5 2011
University teaching is confronted with strong challenges through the emergence of new participatory web applications. While social software tools are strongly applied by students, instructors are often reluctant to adapt them in their teaching practice. Only recently have instructors begun to more strongly apply one of the most commonly used Web 2.0 applications, the online encyclopaedia Wikipedia, in university teaching. Based on an overview of international university projects, this contribution presents general data on the background, objectives, teaching approaches, assignment and feedback forms of Wikipedia-related courses and discusses adequate methods of enabling instructors to apply wiki systems within their teaching.
Markham, Selby; Krishnaswami, Shonali; Hurst, John; Cunningham, Steven; Saeedzadeh, Behrang; Gillick, Brett & Labbe, Cyril Experiencing a Context Aware Learning and Teaching Tool Global Learn Asia Pacific 2010 [304]
Gurevych, Iryna & Wolf, Elisabeth Expert-Built and Collaboratively Constructed Lexical Semantic Resources Language and Linguistics Compass Volume 4 2010 [305]
Prasarnphanich, P & Wagner, C Explaining the Sustainability of Digital Ecosystems based on the Wiki Model through Critical Mass Theory Industrial Electronics, IEEE Transactions on Pages 1 2009
This research investigates the sustainability of a type of digital ecosystem, namely knowledge sharing communities built on the wiki model. Sustainability is hypothesized to result from the participation of contributors with varying levels of resources and interests. The differences in resources and interests, according to critical mass theory, enable such communities to overcome typical start-up and growth problems. The article describes a preliminary empirical test of critical mass theory in this context, with Wikipedia as test case that demonstrates sustainability as well as resource and interest heterogeneity, based on a survey of 78 Wikipedians. The characteristic patterns of success exhibited in Wikipedia are expected to inform the management of other wiki based information assets.
Aneesh, T. Exploit A Major Breakthrough of Your Lifetime: Learn The Secrets of Anti-Aging Science Journal of Pharmacy Research 2009
Ferrandez, Sergio; Toral, Antonio; Ferrandez, Oscar; Ferrandez, Antonio & Munoz, Rafael Exploiting Wikipedia and EuroWordNet to solve Cross-Lingual Question Answering Information Sciences Volume 179 2009 [306]
This paper describes a new advance in solving Cross-Lingual Question Answering (CL-QA) tasks. It is built on three main pillars: (i) the use of several multilingual knowledge resources to reference words between languages (the Inter Lingual Index (ILI) module of EuroWordNet and the multilingual knowledge encoded in Wikipedia); (ii) the consideration of more than only one translation per word in order to search candidate answers; and (iii) the analysis of the question in the original language without any translation process. This novel approach overcomes the errors caused by the common use of Machine Translation (MT) services by CL-QA systems. We also expose some studies and experiments that justify the importance of analyzing whether a Named Entity should be translated or not. Experimental results in bilingual scenarios show that our approach performs better than an MT based CL-QA approach, achieving an average improvement of 36.7%.
Roussinov, Dmitri & Turetken, Ozgur Exploring models for semantic category verification Information Systems Volume 34 2009 [307]
Many artificial intelligence tasks, such as automated question answering, reasoning, or heterogeneous database integration, involve verification of a semantic category (e.g. "coffee" is a drink, "red" is a color, while "steak" is not a drink and "big" is not a color). In this research, we explore completely automated on-the-fly verification of a membership in any arbitrary category which has not been expected a priori. Our approach does not rely on any manually codified knowledge (such as WordNet or Wikipedia) but instead capitalizes on the diversity of topics and word usage on the World Wide Web, and thus can be considered "knowledge-light" and complementary to the "knowledge-intensive" approaches. We have created a quantitative verification model and established (1) what specific variables are important and (2) what ranges and upper limits of accuracy are attainable. While our semantic verification algorithm is entirely self-contained (not involving any previously reported components that are beyond the scope of this paper), we have tested it empirically within our fact seeking engine on the well known TREC conference test questions. Due to our implementation of semantic verification, the answer accuracy has improved by up to 16% depending on the specific models and metrics used.


Kay, Robin & Lauricella, Sharon Exploring the Benefits and Challenges of Using Laptops in Higher Education Classrooms Global Learn Asia Pacific 2010 [308]
Schackman, Daniel Exploring the new frontiers of collaborative community. New Media & Society Volume 11 2009
The article reviews several books on collaborative structures in online communication, including "Coming of Age in Second Life: An Anthropologist Explores the Virtually Human" by Tom Boellstorff, "Blogs, Wikipedia, Second Life, and Beyond: From Production to Produsage" by Axel Bruns, and "The Second Life Herald: The Virtual Tabloid That Witnessed the Dawn of the Metaverse" by Peter Ludlow and Mark Wallace.
Stephens, Michael Exploring Web 2.0 and Libraries. Library Technology Reports Volume 42 2006
The article presents information on Web 2.0. Web 2.0 is the most recent incarnation of the World Wide Web which allows users to create, change, and publish dynamic web content using digital tools. The article discusses blogs, suggesting that they can help librarians create a communication channel with patrons by creating and maintaining a library blog. The article includes an in-depth discussion of the website Wikipedia's entry on Web 2.0. The author also discusses a workshop that he runs for librarians around the country, which showcases new technologies for librarians and helps them to create user-centered services for patrons. A glossary of selected terms is offered.
Li, Y; Huang, K Y; Ren, F J & Zhong, Y X Exploring Words with Semantic Relations from Chinese Wikipedia INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL 2008
This paper introduces a way of exploring words with semantic relations from Chinese Wikipedia documents. A corpus with structured documents is generated from Chinese Wikipedia pages. Then, considering the hyperlinks, text overlaps and word frequencies, word pairs with semantic relations are explored. Words can be self-clustered into groups with tight semantic relations. We roughly measure the semantic relatedness with different document-based algorithms and analyze the reliability of our measures in a comparative experiment.
Saito, K.; Yamada, T. & Kazama, K. Extracting communities from complex networks by the k-dense method IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 2008 [309]
To understand the structural and functional properties of large-scale complex networks, it is crucial to efficiently extract a set of cohesive subnetworks as communities. Several such community extraction methods have been proposed in the literature, including the classical k-core decomposition method and, more recently, the k-clique based community extraction method. The k-core method, although computationally efficient, is often not powerful enough for uncovering a detailed community structure and it produces only coarse-grained and loosely connected communities. The k-clique method, on the other hand, can extract fine-grained and tightly connected communities but requires a substantial amount of computational load for large-scale complex networks. In this paper, we present a new notion of a subnetwork called k-dense, and propose an efficient algorithm for extracting k-dense communities. We applied our method to three different types of networks assembled from real data, namely, from blog trackbacks, word associations and Wikipedia references, and demonstrated that the k-dense method could extract communities almost as efficiently as the k-core method, while the qualities of the extracted communities are comparable to those obtained by the k-clique method.
Nadamoto, Akiyo; Aramaki, Eiji; Abekawa, Takeshi & Murakami, Yohei Extracting content holes by comparing community-type content with Wikipedia International Journal of Web Information Systems Volume 6 2010 [310]


Yu, Yang; Lin, Zhangxi & Xia, Guoping Extracting thematic communities from Wikipedia Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics Volume 35 2009
The current search module in Wikipedia has low search efficiency due to the search method, which is built on simple keyword matching. To improve the efficiency of knowledge retrieval from the Wikipedia spheres with more accurate links among them, the algorithm named term distance based on linkage (TDL) was proposed. TDL defines a new measure of distance between two keywords, which reorients and organizes those keywords into clusters. It is based on link structure analysis underpinned by computational models. The mechanism of ranking and recommending was imported. The experiment, which was based on the snapshot of Wikipedia (May 2009), indicates that TDL would significantly increase the accuracy of knowledge retrieval in Wikipedia and that this new algorithm can improve the users' satisfaction by 7% compared with the present one.
Dorji, Tshering; Atlam, El-Sayed; Yata, Susumu; Fuketa, Masao; Morita, Kazuhiro & Aoe, Jun-ichi Extraction, selection and ranking of Field Association (FA) Terms from domain-specific corpora for building a comprehensive FA terms dictionary Knowledge and Information Systems 2010 [311]
Vechtomova, Olga Facet-based opinion retrieval from blogs Information Processing and Management Volume 46 2010 [312]
The paper presents methods of retrieving blog posts containing opinions about an entity expressed in the query. The methods use a lexicon of subjective words and phrases compiled from manually and automatically developed resources. One of the methods uses the Kullback-Leibler divergence to weight subjective words occurring near query terms in documents, another uses proximity between the occurrences of query terms and subjective words in documents, and the third combines both factors. Methods of structuring queries into facets, facet expansion using Wikipedia, and a facet-based retrieval are also investigated in this work. The methods were evaluated using the TREC 2007 and 2008 Blog track topics, and proved to be highly effective.
Marie, Janyne Ste Favorite Reference Books. Key Words Volume 15 2007
The article highlights several medical references used in the discussion of indexing and suitable for the medical profession in the U.S. These include Taber's Cyclopedic Medical Dictionary, Dorland's Illustrated Medical Dictionary, Stedman's Medical Dictionary, Wikipedia, PubMed, Google Inc., Sigma catalogs, Merck manuals, MediLexicon, ChemNetBase, Toxnet, and SciMed indexing Web site. The Web sites provided offer various information aside from the medical field and are accessible and easy to use.
Wei, Z. Feminist invitational collaboration in a digital age: Looking over disciplinary and national borders Gender, Communication, and Technology Volume 31 2008 [313]
Anonymous Fernanda Bertini Viegas and Martin Wattenberg. Issues in Science & Technology Volume 24 Pages 57 2008
The article provides information on the history flow visualization designed by Martin Wattenberg and Fernanda Bertini Viegas, which presents the flow of editing that takes place on all Wikipedia entries. An example is presented that shows the history of the popular entry for chocolate in 2003, in which each color corresponds to a different contributor. The revision lines correspond to the beginning of changed or updated text, and a line's length shows the length of the text. The visualization thus briefly conveys the level of debate and controversy surrounding a topic.
Kupiainen, Reijo; Suoranta, Juha & Vaden, Tere Fire Next Time: Or Revisioning Higher Education in the Context of Digital Social Creativity E-Learning Volume 4 2007
This article presents an idea of "digital social creativity" as part of social media and examines an approach emphasising openness and experimentation and collaborative learning in the world of information and communication technologies. Wikipedia and similar digital tools provide both challenges to and possibilities for building learning sites in higher education and other forms of education and socialisation that recognise various forms of information and knowledge creation. The dialogical nature of knowledge and the emphasis on social interaction create a tremendous opportunity for education but at the same time form new hegemonic battlegrounds in terms of various uses of social media. (Contains 1 table.)
Lee, Kangpyo; Kim, Hyunwoo; Jang, Chungsu & Kim, Hyoung-Joo FolksoViz: A subsumption-based folksonomy visualization using the wikipedia Journal of KISS: Computing Practices Volume 14 2008
Folksonomy, which is created through the collaborative tagging from many users, is one of the driving factors of Web 2.0. Tags are said to be the web metadata describing a web document. If we are able to find the semantic subsumption relationships between tags created through the collaborative tagging, it can help users understand the metadata more intuitively. In this paper, targeting del.icio.us tag data, we propose a method named FolksoViz for deriving subsumption relationships between tags by using Wikipedia texts. For this purpose, we propose a statistical model for deriving subsumption relationships based on the frequency of each tag on the Wikipedia texts, and a TSD (Tag Sense Disambiguation) method for mapping each tag to a corresponding Wikipedia text. The derived subsumption pairs are visualized effectively on the screen. The experiment shows that our proposed algorithm managed to find the correct subsumption pairs with high accuracy.
Brown, James, Jr. From Friday to Sunday: The hacker ethic and shifting notions of labour, leisure and intellectual property Leisure Studies Volume 27 2008 [314]
Leisure studies scholars have theorised how the Web is changing leisure experiences, and this essay continues that work by discussing the Web and shifting notions of leisure, labour and intellectual property. Much online activity is described under the umbrella term of 'piracy'. By discussing online cultural production in terms of what Pekka Himanen calls the hacker ethic, we can rethink rhetorics of piracy and better understand the positive and negative aspects of online activities. Rather than thinking of online activity as derivative, we can reframe Web texts as doing what all cultural texts do - build upon the past. The ethic of the Web is built on a hacker approach to work, play, collaboration, intellectual property. Facebook applications and Wikipedia entries are just two examples of Web users' embrace of the hacker ethic. But is this labour or leisure? Is Wikipedia, a text edited and maintained by volunteers, the result of work or play? Himanen provides a new way to view online activities that sit in between the categories of labour and leisure. Further, the hacker ethic allows us to understand the contested terms of labour and leisure alongside a third contested term: intellectual property. This paper provides a framework to help us better understand the new immaterial aspects of leisure activity happening on the Web. A discussion of these activities in terms of the hacker ethic allows scholars to explore shifting notions of labour, leisure and intellectual property without resorting to rhetorics of piracy.
Klemp, N. J. From Town-Halls to Wikis: Exploring Wikipedia's Implications for Deliberative Democracy. Journal of Public Deliberation Volume 6 2010
Wales, Jimmy Future Web Setting knowledge free. Index on Censorship Volume 36 Pages 165 2007
The author discusses the challenges facing the Internet industry concerning the implementation of censorship by governments worldwide. He stresses that the Internet offers various advantages to end users, which include easy access to information and provision of informal education. He states that Wikipedia is one of the numerous services found on the Internet that give people an opportunity to search for information for free. The author argues that the action carried out by governments to limit Internet access will prevent individuals from utilizing its benefits.


Lim, S & Kwon, N Gender differences in information behavior concerning Wikipedia, an unorthodox information source? LIBRARY & INFORMATION SCIENCE RESEARCH Volume 32 2010 [315]
This study examined gender differences in information behavior concerning Wikipedia. Data were collected using a Web survey in spring 2008. The study used a convenient sample that consisted of students who had taken an introductory undergraduate course at a large public university in the Midwestern United States. A total of 134 out of 409 students participated in the study. As information consumers, male students used Wikipedia more frequently than their female counterparts did. With respect to the purposes of Wikipedia use, male students used Wikipedia for entertainment or idle reading more than their female counterparts, while there were no gender differences regarding Wikipedia use for other purposes. Male students were more likely to discount the risks involved when using Wikipedia information compared to their female counterparts. Furthermore, male students had higher ratings than female students regarding most aspects of Wikipedia, including outcome expectations, perceptions about its information quality, belief in the Wikipedia project itself, emotional states while using Wikipedia, confidence in evaluating information quality, and further exploration. Finally, there was no gender difference regarding the number of years of Wikipedia use. However, male students reported having more positive experiences with the information quality of Wikipedia than their female counterparts. Overall, the findings of this study were consistent with those of previous studies concerning gender. Given the acknowledgment of the knowledge value of Wikipedia in recent literature, it seems that there are more advantages to using Wikipedia than there are disadvantages. The current study shows that male students seem to enjoy such benefits more than female students and may have more opportunities to develop their information literacy skills than female students by actively using Wikipedia. 
This suggests that educators need to encourage female students in particular to explore Wikipedia strategically as an initial information source so that they can develop their information literacy skills for unconventional sources. © 2010 Elsevier Inc. All rights reserved.
Mehler, Alexander; Pustylnikov, Olga & Diewald, Nils Geography of social ontologies: Testing a variant of the Sapir-Whorf Hypothesis in the context of Wikipedia Computer Speech & Language Volume 25, Issue 3 Pages 716-740 2010 [316]
In this article, we test a variant of the Sapir-Whorf Hypothesis in the area of complex network theory. This is done by analyzing social ontologies as a new resource for automatic language classification. Our method is to solely explore structural features of social ontologies in order to predict family resemblances of languages used by the corresponding communities to build these ontologies. This approach is based on a reformulation of the Sapir-Whorf Hypothesis in terms of distributed cognition. Starting from a corpus of 160 Wikipedia-based social ontologies, we test our variant of the Sapir-Whorf Hypothesis by several experiments, and find out that we outperform the corresponding baselines. All in all, the article develops an approach to classify linguistic networks of tens of thousands of vertices by exploring a small range of mathematically well-established topological indices. © 2010 Elsevier Ltd. All rights reserved.
Rowe, Sylvia & Alexander, Nick Getting It Right in the Coming Communications Twilight Zone. Nutrition Today Volume 43 2008
The article evaluates the evolution of technologies that affect consumer understanding of nutrition science, both by quickening the pace of research itself and by allowing communication. It also discusses the paradigm changes in science communications from traditional closed models to systems more likened to a Wikipedia model where Internet surfers themselves become the experts.
Plaza, Beatriz Google Analytics for measuring website performance Tourism Management Pages Corrected Proof 2010
Choolhun, Natasha Google: to use, or not to use. What is the question? Legal Information Management Volume 9 Pages 168 2009 [317]
Wielsch, Dan Governance of Massive Multiauthor Collaboration - Linux, Wikipedia, and Other Networks: Governed by Bilateral Contracts, Partnerships, or Something in Between? JIPITEC Volume 1 2010 [318]
Open collaborative projects are moving to the foreground of knowledge production. Some online user communities develop into long-term projects that generate a highly valuable and at the same time freely accessible output. Traditional copyright law that is organized around the idea of a single creative entity is not well equipped to accommodate the needs of these forms of collaboration. In order to enable a peculiar network-type of interaction participants instead draw on public licensing models that determine the freedoms to use individual contributions. With the help of these access rules the operational logic of the project can be implemented successfully. However, as the case of the Wikipedia GFDL-CC license transition demonstrates, the adaptation of access rules in networks to new circumstances raises collective action problems and suffers from pitfalls caused by the fact that public licensing is grounded in individual copyright. Legal governance of open collaboration projects is a largely unexplored field. The article argues that the license steward of a public license assumes the position of a fiduciary of the knowledge commons generated under the license regime. Ultimately, the governance of decentralized networks translates into a composite of organizational and contractual elements. It is concluded that the production of global knowledge commons relies on rules of transnational private law.


Zhang, Xiaoquan (Michael) & Zhu, Feng Group Size and Incentives to Contribute: A Natural Experiment at Chinese Wikipedia American Economic Review 2011 [319]
The literature on the private provision of public goods suggests an inverse relationship between incentives to contribute and group size. We find, however, that after an exogenous reduction of group size at Chinese Wikipedia, the nonblocked contributors decrease their contributions by 42.8% on average. We attribute the cause to social effects: Contributors receive social benefits that increase with both the amount of their contributions and group size, and the shrinking group size weakens these social benefits. Consistent with our explanation, we find that the more contributors value social benefits, the more they reduce their contributions after the block. Keywords: group size, incentives to contribute, Internet censorship, public goods, social effects, Wikipedia
Belden, Dreanna; Stephens, Michael (Editor) & Cox, Christopher N. (Editor) Harnessing Social Networks to Connect with Audiences: If You Build It, Will They Come 2.0? Internet Reference Services Quarterly Volume 13 2008
Digital libraries offer users a wealth of online resources, but most of these materials remain hidden to potential users. Established strategies for outreach and promotion bring limited success when trying to connect with users accustomed to Googling their way through research. Social networks provide an opportunity for connecting with audiences in the places they habitually seek information. The University of North Texas Libraries' Portal to Texas History (http://texashistory.unt.edu/) has experienced dramatic increases in Web usage and reference requests by harnessing the power of social networks such as Wikipedia and MySpace.


Harouni, H High School Research and Critical Literacy: Social Studies With and Despite Wikipedia HARVARD EDUCATIONAL REVIEW Volume 79 2009 [320]
Drawing on experiences in his social studies classroom, Houman Harouni evaluates both the challenges and possibilities of helping high school students develop critical research skills. The author describes how he used Wikipedia to design classroom activities that address issues of authorship, neutrality, and reliability in information gathering. The online encyclopedia is often lamented by teachers, scholars, and librarians, but its widespread use necessitates a new approach to teaching research. In describing the experience, Harouni concludes that teaching research skills in the contemporary context requires ongoing observations of the research strategies and practices students already employ as well as the active engagement of student interest and background knowledge.


de Laat, P. B How can contributors to open-source communities be trusted? On the assumption, inference, and substitution of trust Ethics and Information Technology Volume 12 2010
Open-source communities that focus on content rely squarely on the contributions of invisible strangers in cyberspace. How do such communities handle the problem of trusting that strangers have good intentions and adequate competence? This question is explored in relation to communities in which such trust is a vital issue: peer production of software (FreeBSD and Mozilla in particular) and encyclopaedia entries (Wikipedia in particular). In the context of open-source software, it is argued that trust was inferred from an underlying 'hacker ethic', which already existed. The Wikipedian project, by contrast, had to create an appropriate ethic along the way. In the interim, the assumption simply had to be that potential contributors were trustworthy; they were granted 'substantial trust'. Subsequently, projects from both communities introduced rules and regulations which partly substituted for the need to perceive contributors as trustworthy. They faced a design choice in the continuum between a high-discretion design (granting a large amount of trust to contributors) and a low-discretion design (leaving only a small amount of trust to contributors). It is found that open-source designs for software and encyclopaedias are likely to converge in the future towards a mid-level of discretion. In such a design the anonymous user is no longer invested with unquestioning trust.


Grzega, Joachim How Onomasiologists Can Help with Contributing to Wikipedia Onomasiology Online Volume 7 2006
In this article Wikipedia is presented as the most important everyday venue for knowledge management. The three different main styles are described: articles, article talk pages, and user pages. Then several aspects are commented on from an onomasiologist's perspective: (1) content management on talk pages (e.g. thematic structures should be preferred over linear structures); (2) evaluation of cited sources (e.g. authors should be experts, results should have appeared in acknowledged venues, facts and opinions should be distinguished); (3) expert-layperson communication (e.g. different types of definitions including the use of examples should be used, jargon can be used if explained, contents should be structured from the general to the specific, description instead of evaluation should be used); (4) linking (including setting links to one's own article in other articles); and (5) categorizing into conceptual fields. Examples are taken from the English version of Wikipedia and are generalizable to other versions. The final section of the paper gives a few ideas for integrating the observations of the article into high-school and university education: In every subject students should be encouraged to practice expert-novice communication through collaborating in Wikipedia; students are offered guidelines on contributing (to) articles (e.g. concerning the creation and understanding of definitions, text structure, jargon, neutral point of view, linking and categorizing) and guidelines on contributing to talk pages (e.g. the use of an "integrative style" which aims at achieving consensus between contributors and not at having administrators decide on the content of articles). Adapted from the source document.
Head, A.J. & Eisenberg, M.B. How today's college students use Wikipedia for course-related research First Monday Volume 15 2010
Findings are reported from student focus groups and a large-scale survey about how and why students (enrolled at six different U.S. colleges) use Wikipedia during the course-related research process. A majority of respondents frequently used Wikipedia for background information, but less often than they used other common resources, such as course readings and Google. Architecture, engineering, and science majors were more likely to use Wikipedia for course-related research than respondents in other majors. The findings suggest Wikipedia is used in combination with other information resources. Wikipedia meets the needs of college students because it offers a mixture of coverage, currency, convenience, and comprehensibility in a world where credibility is less of a given or an expectation from today's students.
Scott R. Sailor "I Thought Wikis Were Creatures in Star Wars!" Athletic Therapy Today Volume 11 2006
This article reports on the practice and use of Wiki. The author defines the practice of Wiki as "a piece of server software that allows users to freely create and edit Web page content using any Web browser." Wikipedia, an international web-based free-content encyclopedia project, is an example of a Wiki.
Kostakis, V. Identifying and understanding the problems of Wikipedia's peer governance: The case of inclusionists versus deletionists First Monday Volume 15 2010
Wikipedia has been hailed as one of the most prominent peer projects that led to the rise of the concept of peer governance. However, criticism has been levelled against Wikipedia's mode of governance. This paper, using the Wikipedia case as a point of departure and building upon the conflict between inclusionists and deletionists, tries to identify and draw some conclusions on the problematic issue of peer governance.
Silva, F.N.; Travencolo, B.A.N.; Viana, M.P. & da Fontoura Costa, L. Identifying the borders of mathematical knowledge Journal of Physics A: Mathematical and Theoretical Volume 43 2010 [321]
Based on a divide and conquer approach, knowledge about nature has been organized into a set of interrelated facts, allowing a natural representation in terms of graphs: each 'chunk' of knowledge corresponds to a node, while relationships between such chunks are expressed as edges. This organization becomes particularly clear in the case of mathematical theorems, with their intense cross-implications and relationships. We have derived a web of mathematical theorems from Wikipedia and, thanks to the powerful concept of entropy, identified its more central and frontier elements. Our results also suggest that the central nodes are the oldest theorems, while the frontier nodes are those recently added to the network. The network communities have also been identified, allowing further insights about the organization of this network, such as its highly modular structure.
Rahurkar, M.; Tsai, S.-F.; Dagli, C. & Huang, T.S. Image Interpretation Using Large Corpus: Wikipedia Proceedings of the IEEE Volume 98 2010 [322]
Image is a powerful medium for expressing one's ideas and rightly confirms the adage, "One picture is worth a thousand words." In this work, we explore the application of world knowledge in the form of Wikipedia to achieve this objective, literally. In the first part, we disambiguate and rank semantic concepts associated with ambiguous keywords by exploiting link structure of articles in Wikipedia. In the second part, we explore an image representation in terms of keywords which reflect the semantic content of an image. Our approach is inspired by the desire to augment low-level image representation with massive amounts of world knowledge, to facilitate computer vision tasks like image retrieval based on this information. We represent an image as a weighted mixture of a predetermined set of concrete concepts whose definition has been agreed upon by a wide variety of audience. To achieve this objective, we use concepts defined by Wikipedia articles, e.g., sky, building, or automobile. An important advantage of our approach is availability of vast amounts of highly organized human knowledge in Wikipedia. Wikipedia evolves rapidly, steadily increasing its breadth and depth over time.
Pentzold, Christian Imagining the Wikipedia community: what do Wikipedia authors mean when they write about their 'community'? New Media & Society XX(X) 1–18 2011 [323]
This article examines the way Wikipedia authors write their 'community' into being. Mobilizing concepts regarding the communicative constitution of communities, the computer-mediated conversations between editors were investigated using Grounded Theory procedures. The analysis yielded an empirically grounded theory of the users' self-understanding of the Wikipedia community as ethos-action community. Hence, this study contributes to research on online community-building as it shifts the focus from structural criteria for communities to the discursive level of community formation.
Moy, CL; Locke, JR; Coppola, BP & McNeil, AJ Improving Science Education and Understanding through Editing Wikipedia JOURNAL OF CHEMICAL EDUCATION Volume 87 2010 [324]
Erdmann, Maike; Nakayama, Kotaro; Hara, Takahiro & Nishio, Shojiro Improving the extraction of bilingual terminology from Wikipedia ACM Transactions on Multimedia Computing, Communications and Applications Volume 5 2009 [325]
Research on the automatic construction of bilingual dictionaries has achieved impressive results. Bilingual dictionaries are usually constructed from parallel corpora, but since these corpora are available only for selected text domains and language pairs, the potential of other resources is being explored as well. In this article, we want to further pursue the idea of using Wikipedia as a corpus for bilingual terminology extraction. We propose a method that extracts term-translation pairs from different types of Wikipedia link information. After that, an SVM classifier trained on the features of manually labeled training data determines the correctness of unseen term-translation pairs. © 2009 ACM.


Luyt, Brendan & Tan, Daniel Improving Wikipedia's credibility: References and citations in a sample of history articles Journal of the American Society for Information Science and Technology Volume 61 2010 [326]
This study evaluates how well the authors of Wikipedia history articles adhere to the site's policy of assuring verifiability through citations. It does so by examining the references and citations of a subset of country histories. The findings paint a dismal picture. Not only are many claims not verified through citations, those that are suffer from the choice of references used. Many of these are from only a few US government Websites or news media and few are to academic journal material. Given these results, one response would be to declare Wikipedia unsuitable for serious reference work. But another option emerges when we jettison technological determinism and look at Wikipedia as a product of a wider social context. Key to this context is a world in which information is bottled up as commodities requiring payment for access. Equally important is the problematic assumption that texts are undifferentiated bearers of knowledge. Those involved in instructional programs can draw attention to the social nature of texts to counter these assumptions and by so doing create an awareness for a new generation of Wikipedians and Wikipedia users of the need to evaluate texts (and hence citations) in light of the social context of their production and use. © 2010 ASIST.
Murley, D. In defense of Wikipedia Law Library Journal Volume 100 2008
Ms. Murley explains how Wikipedia articles are created and edited and how to use Wikipedia's tools to evaluate articles. She argues that research instructors should teach students to use Wikipedia properly, rather than trying to convince them not to use it. Finally, she suggests ways in which Wikipedia can be used to help teach the importance of evaluating sources.
Sundin, O. & Francke, H. In search of credibility: pupils' information practices in learning environments Information Research Volume 14 2009
Introduction. We aim to create an in-depth understanding of how pupils in upper secondary school negotiate the credibility and authority of information as part of their practices of learning. Particular focus is on the use of user-created resources, such as Wikipedia, where authorship is collective and/or hard to determine. Method. An ethnographic study was conducted in an upper secondary school class. Methods included observation, group interviews and information seeking diaries in the form of blogs. Analysis. The empirical material from the class room study was categorised and aggregated into five themes, which emerged as a result of the interplay between the empirical material and a perspective based in socio-cultural theory. Results. The pupils make credibility assessments based on methods developed for traditional media where, for instance, origin and authorship are important. They employ some user-created sources, notably Wikipedia, because these are easily available, but they are uncertain about when these sources should be considered credible. Conclusions. In an increasingly diverse media world, pupils' credibility assessments need to be informed by a socio-technical understanding of sources which takes both social and material aspects into account. The diversity of resources requires that pupils assess credibility for the particular situation in which they use information.
Garud, R; Jain, S & Tuertscher, P Incomplete by design and designing for incompleteness ORGANIZATION STUDIES Volume 29 2008 [327]
The traditional scientific approach to design extols the virtues of completeness. However, in environments characterized by continual change, there are challenges in adopting such an approach. We examine Linux and Wikipedia as two exemplary cases to explore the nature of design in such a protean world. Our observations highlight a pragmatic approach to design in which incompleteness is harnessed in a generative manner. This suggests a change in the meaning of the word 'design' itself - from one that separates the process of design from its outcome, to one that considers design as both the medium and outcome of action.
Adamic, L.A.; Wei, Xiao; Yang, Jiang; Gerrish, S.; Nam, K.K. & Clarkson, G.S. Individual focus and knowledge contribution First Monday Volume 15 2010
Before contributing new knowledge, individuals must attain requisite background knowledge or skills through schooling, training, practice, and experience. Given limited time, individuals often choose either to focus on few areas, where they build deep expertise, or to delve less deeply and distribute their attention and efforts across several areas. In this paper we measure the relationship between the narrowness of focus and the quality of contribution across a range of both traditional and recent knowledge sharing media, including scholarly articles, patents, Wikipedia, and online question and answer forums. Across all systems, we observe a small but significant positive correlation between focus and quality.
Davis, Chris; Nikolic, Igor & Dijkema, Gerard P.J. Industrial ecology 2.0 Journal of Industrial Ecology Volume 14 2010 [328]
Summary: Industrial ecology (IE) is an ambitious field of study where we seek to understand systems using a wide perspective ranging from the scale of molecules to that of the planet. Achieving such a holistic view is challenging and requires collecting, processing, curating, and sharing immense amounts of data and knowledge. We are not capable of fully achieving this due to the current state of tools used in IE and current community practices. Although we deal with a vastly interconnected world, we are not so good at efficiently interconnecting what we learn about it. This is not a problem unique to IE, and other fields have begun to use tools supported by the World Wide Web to meet these challenges. We discuss these sets of tools and illustrate how community driven data collection, processing, curation, and sharing is allowing people to achieve more than ever before. In particular, we discuss standards that have been created to allow for interlinking of data dispersed across multiple Web sites. This is currently visible in the Linking Open Data initiative, which among others contains interlinked datasets from the U.S. and U.K. governments, biology databases, and Wikipedia. Since the types of technologies and standards involved are outside the normal scope of work by many industrial ecologists, we attempt to explain the relevance, implications, and benefits through a discussion of many real examples currently on the Web. From these, we discuss several best practices, which can be enabling factors for how IE and the community can more efficiently and effectively meet its ambitions: an agenda for Industrial Ecology 2.0. © 2010 by Yale University.
Arazy, O; Nov, O; Patterson, R & Yeo, L Information Quality in Wikipedia: The Effects of Group Composition and Task Conflict Journal of Management Information Systems 2011


Hahn, J. Information seeking with Wikipedia on the iPod Touch Reference Services Review Volume 38 2010 [329]
Purpose - The purpose of this paper is to present the results of a usability study which inquired into undergraduate student information seeking with Wikipedia on the iPod touch. Design/methodology/approach - Data are drawn from iPod search logs and student survey responses. Search log data are coded with FRBR subject entities (group 3 entity sets) for analysis. Findings - Students characterize the overall nature of information searched for with the Wikipedia app to be for recreational and for short factual information. Recreational searching as a way in which undergraduate students utilize mobile technology is an earlier finding of Wikipedia iPod usage, and is verified as a trend of undergraduate student search using the iPod. All undergraduate student participants of the Wikipedia app on a mobile interface report this tool as helping to become more efficient in their research. Students viewed Wikipedia articles about people and concepts more so than other article types. Originality/value - Undergraduate student mobile search log analysis over a specific type of information resource on the iPod Touch is an original usability project. Previous mobile search log analysis analyzes thousands of unknown users and millions of anonymous queries, where the devices used for searching are not always identifiable and trends about touch screens cannot be ascertained.
Nov, O Information Sharing and Social Computing: Why, What, and Where? ADVANCES IN COMPUTERS 2009
Why do people share content, metainformation, and programming knowledge with people they don't know, in return for no money? In a series of studies, the different drivers for information sharing in social computing systems are identified, and the effect of these drivers on actual levels of sharing is estimated, using a combination of survey and system data from Wikipedia, Flickr and a number of open source software projects. This way, we gain deeper understanding of why people share information, what types of information they share, and what are the venues used for the different types of sharing.
Rubin, A. & Rubin, E. Informed Investors and the Internet Journal of Business Finance & Accounting Volume 37 Pages 07/08/2011 2010 [330]
During the last decade the Internet has become an increasingly important source for gathering company related information. We employ Wikipedia editing frequency as an instrument that captures the degree in which the population is engaged with the processing of company-related information. We find that firms whose information is processed by the population more frequently are associated with lower analysts' forecast errors, smaller analysts' forecast dispersions, and significant changes in bid-ask spreads on analysts' recommendation days. These results indicate that information processing over the Internet is related to the degree to which investors and analysts are informed about companies.
Abilock, Debbie INQUIRY EVALUATION. Knowledge Quest Volume 38 2010
The article focuses on series of judgment calls that end in a summative assessment of credibility for librarians for teaching evaluation on students as inquiry. It presents a model for credibility assessment which is an iterative process and is based on several factors. It refers to Wikipedia's list of projects, Wikimedia 2009c, for ideas on teaching evaluation from educators.
Lehmann, Simon; Schwanecke, Ulrich & Dorner, Ralf Interactive visualization for opportunistic exploration of large document collections Information Systems Volume 35 2010 [331]
Finding relevant information in a large and comprehensive collection of cross-referenced documents like Wikipedia usually requires a quite accurate idea where to look for the pieces of data being sought. A user might not yet have enough domain-specific knowledge to form a precise search query to get the desired result on the first try. Another problem arises from the usually highly cross-referenced structure of such document collections. When researching a subject, users usually follow some references to get additional information not covered by a single document. With each document, more opportunities to navigate are added and the structure and relations of the visited documents get harder to understand. This paper describes the interactive visualization Wivi which enables users to intuitively navigate Wikipedia by visualizing the structure of visited articles and emphasizing relevant other topics. Combining this visualization with a view of the current article results in a custom browser specially adapted for exploring large information networks. By visualizing the potential paths that could be taken, users are invited to read up on subjects relevant to the current point of focus and thus opportunistically find relevant information. Results from a user study indicate that this visual navigation can be easily used and understood. A majority of the participants of the study stated that this method of exploration supports them in finding information in Wikipedia. © 2009 Elsevier B.V. All rights reserved.
Giles, Jim Internet encyclopaedias go head to head Nature Volume 438 2005 [332]
Jimmy Wales' Wikipedia comes close to Britannica in terms of the accuracy of its science entries, a Nature investigation finds.
Rogers, Richard Internet Research: The Question of Method - A Keynote Address from the YouTube and the 2008 Election Cycle in the United States Conference Journal of Information Technology & Politics Volume 7 2010 [333]
Digital studies on culture may be distinguished from cultural studies of the digital, at least in terms of method. This lecture takes up the question of the distinctiveness of "digital methods" for researching Internet cultures. It asks, initially, should the methods of study change, however slightly or wholesale, given the specificity of the new medium? The larger digital methods project thereby engages with "virtual methods," the current, dominant "e-science" approach to the study of the Internet, and the consequences for research of importing standard methods from the social sciences in particular. What kinds of contributions are made to digital media studies, and the Internet in particular, when traditional methods are imported from the social sciences and the humanities onto the medium? Which research opportunities are foreclosed? Second, I ask, what kinds of new approaches are worthwhile, given an emphasis on the "natively digital" as opposed to digitization? The goal is also to change the focus of humanities and humanities computing away from the opportunities afforded by transforming ink into bits. The effort is to develop the study of natively digital objects (the link, the tag, etc.) and devices (engines and other recommendation machines) that make use of them. After critically reviewing existing approaches to the study of the digital, which largely import method onto the medium, I subsequently propose research strategies that follow the medium. How can one learn from methods in the medium, and repurpose them for social and cultural research? The lecture launches a novel strand of study: digital methods.
Okoli, Chitu & Oh, Wonseok Investigating recognition-based performance in an open content community: A social capital perspective Information and Management Volume 44 2007 [334]
As the open source movement grows, it becomes important to understand the dynamics that affect the motivation of participants who contribute their time freely to such projects. One important motivation that has been identified is the desire for formal recognition in the open source community. We investigated the impact of social capital in participants' social networks on their recognition-based performance; i.e., the formal status they are accorded in the community. We used a sample of 465 active participants in the Wikipedia open content encyclopedia community to investigate the effects of two types of social capital and found that network closure, measured by direct and indirect ties, had a significant positive effect on increasing participants' recognition-based performance. Structural holes had mixed effects on participants' status, but were generally a source of social capital. 2007 Elsevier B.V. All rights reserved.
Muller-Seitz, G. & Reger, G. Is open source software living up to its promises? Insights for open innovation management from two open source software-inspired projects R&D Management Volume 39 2009 [335]
At present, several virtual initiatives claim to be acting according to the open source software (OSS) arena, which is often deemed a role model for open innovation. Against this background, this research focuses on a comparative case study of two non-profit project networks that attempt to operate in line with the OSS phenomenon: Wikipedia, the online encyclopedia, and the development of an automobile, Open Source car. We show that many parallels to the OSS arena can be drawn in both cases. However, this analysis must be performed cautiously, as several factors limit the applicability of OSS principles to non-software-related arenas. We conclude with a discussion of implications for open innovation research and managerial practice.
Sullivan, Francis Is this the party to whom I am speaking? Computing in Science and Engineering Volume 9 Pages 96 2007 [336]
Francis Sullivan shares his views on the evils of Internet technologies that make it easy to send out numerous unwanted emails, and on the information presented on Wikipedia. Despite the fact that entities like Wikipedia are prone to error, they offer a better chance of getting the information right because they are constantly checked and updated. Sullivan also discusses the different responses from different fields to Wikipedia. He reports that articles on physics range from good to excellent, while articles on literature are more varied and more contentious.
Pender, Michael P; Lasserre, Kaye E; Mar, Christopher Del; Kruesi, Lisa & Anuradha, Satyamurthy Is Wikipedia unsuitable as a clinical information resource for medical students? Medical Teacher Volume 31 2009 [337]


McFedries, Paul It's a Wiki, Wiki World. IEEE Spectrum Volume 43 Pages 88 2006
The article offers information on a method being used for easy access to information. Ward Cunningham first used the wiki- prefix in a software context back in the mid-'90s when he developed a site called WikiWikiWeb. It is noted that Wikipedia is by far the best-known wiki, but there are thousands of others. Relative to this, the "Los Angeles Times" launched the Wikitorial on June 19, 2005. However, it shut down after three days because the site was flooded with pornographic images and obscene language. Wikis are part of a larger phenomenon labeled "crowdsourcing."
Lin, C C; Wang, Y C & Tsai, R T H Japanese-Chinese Information Retrieval With an Iterative Weighting Scheme Journal of Information Science and Engineering 2010
This paper describes our Japanese-Chinese cross language information retrieval system. We adopt the "query-translation" approach and employ both a conventional Japanese-Chinese bilingual dictionary and Wikipedia to translate query terms. We propose that Wikipedia can be regarded as a good dictionary for named entity translation. According to the nature of the Japanese writing system, we propose that query terms should be processed differently based on their written forms. We use an iterative method for weight-tuning and term disambiguation which is based on the PageRank algorithm. When evaluating on the NTCIR-5 test set, our system achieves as high as 0.2217 and 0.2276 in relax MAP (Mean Average Precision) measurement of T-runs and D-runs.
Hughes, Benjamin; Joshi, Indra; Lemonde, Hugh & Wareham, Jonathan Junior physician's use of Web 2.0 for information seeking and medical education: A qualitative study International Journal of Medical Informatics Volume 78 2009 [338]
Background: Web 2.0 internet tools and methods have attracted considerable attention as a means to improve health care delivery. Despite evidence demonstrating their use by medical professionals, there is no detailed research describing how Web 2.0 influences physicians' daily clinical practice. Hence this study examines Web 2.0 use by 35 junior physicians in clinical settings to further understand their impact on medical practice. Method: Diaries and interviews encompassing 177 days of internet use or 444 search incidents, analyzed via thematic analysis. Results: Results indicate that 53% of internet visits employed user-generated or Web 2.0 content, with Google and Wikipedia used by 80% and 70% of physicians, respectively. Despite awareness of information credibility risks with Web 2.0 content, it has a role in information seeking for both clinical decisions and medical education. This is enabled by the ability to cross check information and the diverse needs for background and non-verified information. Conclusion: Web 2.0 use represents a profound departure from previous learning and decision processes which were normally controlled by senior medical staff or medical schools. There is widespread concern with the risk of poor quality information with Web 2.0 use, and the manner in which physicians are using it suggests that effective use derives from the mitigating actions by the individual physician. Three alternative policy options are identified to manage this risk and improve efficiency in Web 2.0's use. 2009 Elsevier Ireland Ltd. All rights reserved.


Li, Decong; Li, Sujian; Li, Wenjie; Gu, Congyun & Li, Yun Keyphrase extraction based on topic relevance and term association Journal of Information and Computational Science Volume 7 2010
Keyphrases are concise representations of documents and usually are extracted directly from the original text. This paper proposes a novel approach to extract keyphrases. This method proposes two metrics, named topic relevance and term association respectively, for determining whether a term is a keyphrase. Using Wikipedia knowledge and betweenness computation, we compute these two metrics and combine them to extract important phrases from the text. Experimental results show the effectiveness of the proposed approach for keyphrase extraction. Copyright 2010 Binary Information Press.


Leinonen, T.; Vaden, T. & Suoranta, J. Learning in and with an open Wiki project: Wikiversity's potential in global capacity building First Monday Volume 14 2009
There is a chance that Wikiversity will become the Internet's free university just as Wikipedia is the free encyclopedia on the Internet. The building of an educational entity demands considering a number of philosophical and practical questions such as pedagogy and organization. In this paper we will address some of these, starting by introducing several earlier approaches and ideas related to wikis' potential for education. We continue by presenting three commonly used metaphors of learning: acquisition, participation and knowledge creation. Then we will present the main principles of two existing alternative educational approaches: free adult education and free school movement. To test these educational approaches and practices on Wikiversity and increase our understanding of the possibilities of this initiative, in the spring of 2008 we implemented an experimental course in Wikiversity. We conclude with several recommendations essentially advocating for Wikiversity and the use of wikis in education. However, more than just presenting our opinions, as authors we aim to make an educated - traditionally and in the wiki way - contribution to the international discussion about the future of education for all in the digital era.
Bai, Bing; Weston, Jason; Grangier, David; Collobert, Ronan; Sadamasa, Kunihiko; Qi, Yanjun; Chapelle, Olivier & Weinberger, Kilian Learning to rank with (a lot of) word features Information Retrieval Volume 13 2010 [339]
In this article we present Supervised Semantic Indexing which defines a class of nonlinear (quadratic) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). However, unlike LSI our models are trained from a supervised signal directly on the ranking task of interest, which we argue is the reason for our superior results. As the query and target texts are modeled separately, our approach is easily generalized to different retrieval tasks, such as cross-language retrieval or online advertising placement. Dealing with models on all pairs of words features is computationally challenging. We propose several improvements to our basic model for addressing this issue, including low rank (but diagonal preserving) representations, correlated feature hashing and sparsification. We provide an empirical study of all these methods on retrieval tasks based on Wikipedia documents as well as an Internet advertisement task. We obtain state-of-the-art performance while providing realistically scalable methods.
Mika, P.; Ciaramita, M.; Zaragoza, H. & Atserias, J. Learning to tag and tagging to learn: a case study on Wikipedia IEEE Intelligent Systems Volume 23 2008 [340]
The problem of semantically annotating Wikipedia inspires a novel method for dealing with domain and task adaptation of semantic taggers in cases where parallel text and metadata are available.
Lin, Chu-Cheng; Wang, Yu-Chun; Yeh, Chih-Hao; Tsai, Wei-Chi & Tsai, Richard Tzong-Han Learning weights for translation candidates in Japanese-Chinese information retrieval Expert Systems with Applications Volume 36 2009 [341]
This paper describes our Japanese-Chinese information retrieval system. Our system takes the "query-translation" approach. Our system employs both a more conventional bilingual Japanese-Chinese dictionary and Wikipedia for translating query terms. We propose that Wikipedia can be used as a good NE bilingual dictionary. By exploiting the nature of the Japanese writing system, we propose that query terms be processed differently based on the forms they are written in. We use an iterative method for weight-tuning and term disambiguation which is based on the PageRank algorithm. When evaluating on the NTCIR-5 test set, our system achieves as high as 0.2217 and 0.2276 in relax MAP (mean average precision) measurement of T-runs and D-runs. 2008 Elsevier Ltd. All rights reserved.
Messner, Marcus & South, Jeff LEGITIMIZING WIKIPEDIA -- How US national newspapers frame and use the online encyclopedia in their coverage Journalism Practice 2010 [342]
Within only a few years, the collaborative online encyclopedia Wikipedia has become one of the most popular websites in the world. At the same time, Wikipedia has become the subject of much controversy because of inaccuracies and hoaxes found in some of its entries. Journalists, therefore, have remained skeptical about the reliability and accuracy of Wikipedia's information, despite the fact that research has consistently shown an overall high level of accuracy compared to traditional encyclopedias. This study analyzed the framing of Wikipedia and its use as a news source by five US national newspapers over an eight-year period. A content analysis of 1486 Wikipedia references in The New York Times, The Washington Post, The Wall Street Journal, USA Today and The Christian Science Monitor found that Wikipedia is framed predominantly as neutral and positive, and that it is increasingly used as a news source. By framing Wikipedia as credible and accurate, the newspapers help legitimize the use of the online encyclopedia. By allowing Wikipedia to influence their news agendas as a source, the newspapers confirm the growing reliability of Wikipedia.
White, John S. Let's Get it Right: Prismatic Habit and Other Fusses. Rocks & Minerals Volume 82 2007
The article focuses on the common usage of prismatic to describe crystals, the color photographs and captions in a calendar, and Wikipedia's information on quartz. The habitual use of prismatic to describe crystals is misleading, except when used for crystal cleavage. Criticisms of the color photographs and wrong captions of quartz and minerals in "Rocks and Crystals 2007" are detailed. The article states that Wikipedia's information on quartz is inadequate, disorganized and questionable.


Luyt, B; Ally, Y; Low, NH & Ismail, NB Librarian Perception of Wikipedia: Threats or Opportunities for Librarianship? Libri Volume 60 2010 [343]
The rapid rise of Wikipedia as an information source has placed the traditional role of librarians as information gatekeepers and guardians under scrutiny with much of the professional literature suggesting that librarians are polarized over the issue of whether Wikipedia is a useful reference tool. This qualitative study examines the perceptions and behaviours of National Library Board (NLB) of Singapore librarians with regards to information seeking and usage of Wikipedia. It finds that instead of polarized attitudes, most librarians, although cautious about using Wikipedia in their professional capacity, hold a range of generally positive attitudes towards the online encyclopaedia, believing that it has a valid role to play in the information seeking of patrons today. This is heartening because it suggests the existence within the librarian population of attitudes that can be tapped to engage constructively with Wikipedia. Three of these in particular are briefly discussed at the end of the article: Wikipedia's ability to appeal to the so-called digital natives, its role as a source of non-Western information, and its potential to enable a revitalization of the role of librarians as public intellectuals contributing to a democratic information commons.
Gunnels, C. Librarians on the verge of an epistemological breakdown Community & Junior College Libraries Volume 14 2007 [344]
During the enlightenment of eighteenth-century France, the encyclopedists created a systematic compilation of all human knowledge in order to dispel current disinformation imposed by kings and clergy. The resultant Encyclopedie has been considered the turning point of the enlightenment, where knowledge became power and the power was made accessible to the people. This article explores the digital phenomenon of Web 2.0 and questions whether we are experiencing another epistemological shift similar to the Encyclopedie. It then discusses teaching information literacy and gives practical ways for community college librarians to incorporate Wikipedia, Google, and other digital sources into their instruction to teach research skills and critical thinking.
Jacobs, M.L. Libraries and the mobile revolution: remediation=relevance Reference Services Review Volume 37 2009 [345]
Purpose - The purpose of this paper is to look at the big picture of where academic libraries fit into the mobile revolution. Design/methodology/approach - Using Jim Hahn's accompanying article, "On the remediation of Wikipedia and the iPod," the author comments on what remediation means for the academic library culture as a whole. The reflections are based on observations of current trends in technology and the emergence of a mobile culture. A definition of this generation of library users is suggested - the ING (information now generation). Editorial in nature, the paper also discusses some new technologies and how they might be applicable to the technological growth of libraries. Findings - This reflection of current trends encourages librarians to look/listen, explore, apply, prevail when it comes to applying emerging technologies to the library world. Originality/value - The paper offers insights into how librarians can prepare themselves for the remediation revolution.
Duguid, P. Limits of self-organization: Peer production and "laws of quality" First Monday Volume 11 2006
People often implicitly ascribe the quality of peer production projects such as Project Gutenberg or Wikipedia to what the author calls "laws" of quality. These are drawn from open source software development and it is not clear how applicable they are outside the realm of software. In this article, the author looks at examples from peer production projects to ask whether faith in these laws does not so much guarantee quality as hide the need for improvement. The author concludes that, given the bulk of these projects (52 million tracks in the Gracenote database, 1 million entries on the English Wikipedia site, 17,000 books on Project Gutenberg), sampling for quality is both difficult and tendentious. Clearly, the author's is not a scientific survey. Nor was his intention simply to find flaws. Rather, the author used these examples to try, however inadequately, to raise questions about the transferability of open source quality assurance to other domains. The author's underlying argument is that the social processes of open source software production may transfer to other fields of peer production, but, with regard to quality, software production remains a special case.
Csomai, A. & Mihalcea, R. Linking documents to encyclopedic knowledge IEEE Intelligent Systems Volume 23 2008 [346]
Wikipedia has become one of the largest online repositories of encyclopedic knowledge. Wikipedia editions are available for more than 200 languages, with entries varying from a few pages to more than 1 million articles per language. Embedded in each Wikipedia article is an abundance of links connecting the most important words or phrases in the text to other pages, thereby letting users quickly access additional information. An automatic text-annotation system combines keyword extraction and word-sense disambiguation to identify relevant links to Wikipedia pages.
Furbach, Ulrich; Glöckner, Ingo; Helbig, Hermann & Pelzer, Björn Logic-Based Question Answering KI - Künstliche Intelligenz Volume 24 2010 [347]
Weiss, Stephane; Urso, Pascal & Molli, Pascal Logoot-undo: Distributed collaborative editing system on P2P networks IEEE Transactions on Parallel and Distributed Systems Volume 21 2010 [348]
Peer-to-peer systems provide scalable content distribution cheaply and resist censorship attempts. However, P2P networks mainly distribute immutable content and provide poor support for highly dynamic content such as that produced by collaborative systems. A new class of algorithms called CRDT (Commutative Replicated Data Type), which ensures consistency of highly dynamic content on P2P networks, is emerging. However, while existing CRDT algorithms support the "edit anywhere, anytime" feature, they do not support the "undo anywhere, anytime" feature. In this paper we present the Logoot-Undo CRDT algorithm, which integrates the "undo anywhere, anytime" feature. We compare the performance of the proposed algorithm with related algorithms and measure the impact of the undo feature on the global performance of the algorithm. We prove that the cost of the undo feature remains low on a corpus of data extracted from Wikipedia.
Czarnecka-Kujawa, Kasia; Abdalian, Rupert & Grover, Samir C. M1042 The Quality of Open Access and Open Source Internet Material in Gastroenterology: Is Wikipedia Appropriate for Knowledge Transfer to Patients? Gastroenterology Volume 134 2008
Gibson, David Make It a Two-Way connection: A Response to "Connecting Informal and Formal Learning Experiences in the Age of Participatory Media" Contemporary Issues in Technology and Teacher Education Volume 8 2008 [349]
Beer, D Making Friends with Jarvis Cocker: Music Culture in the Context of Web 2.0 Cultural Sociology Volume 2 2008 [350]
The movement toward what has been described as Web 2.0 has brought with it some significant transformations in the practices, organization and relations of music culture. The user-generated and web-top applications of Web 2.0 are already popular and widely used, the social networking site MySpace already having more than 130 million members worldwide. By focusing specifically upon the presence of the popular music performer Jarvis Cocker across various Web 2.0 applications, this article seeks to open up a series of questions and create opportunities for research into what is happening in contemporary music culture. This exploratory article lays out an agenda for research into music culture and Web 2.0 that is not only concerned with the implications of Web 2.0 for music, but which also attempts to understand the part played by music in making the connections that form the collaborative and participatory cultures of Web 2.0 and the flickering friendships of social networking sites.
McKibbin, Ross Making History: The Changing Face of the Profession in Britain English Historical Review 2010 [351]
Staley, David J Managing the Platform: Higher Education and the Logic of Wikinomics EDUCAUSE Review Volume 44 2009
Wikipedia is an online free-content encyclopedia that anyone can edit and an efficient way to marshal the talents of many bright, capable people to produce knowledge. But the real significance of Wikipedia and similar Web 2.0 technologies is the way in which they organize people and activities, not simply the way in which they create and distribute information. Don Tapscott and Anthony Williams call this new organization of activities "wikinomics." At its heart, wikinomics involves motivated amateurs who voluntarily produce knowledge and information in a new form of social and managerial organization. Socially, a wiki-ized system cannot exist without an agreement among the members of that system to behave in a certain fashion. Managerially, wikinomics is built on the idea of the "platform." Wikipedia and other social networking sites provide a space or platform upon which all kinds of activities can flourish, with the idea of a platform transcending any particular technology or application and referring to either virtual or physical worlds. Collaboration among many users upon such a platform often produces unplanned and emergent results--results frequently unattainable in a command-and-control management setting. In a wiki-ized setting, leadership thus involves "managing the platform," with leaders ensuring the vitality and stability of the platform rather than regulating the actions and activities of the people who use the platform. Wikinomics and Web 2.0 technologies represent as important a historical phenomenon as the birth of bureaucracy; indeed, people should refer to this moment in time as signaling a participatory turn in their culture. Yet whereas this participatory turn is rewriting the rules for many industries, most notably the software industry, people have yet to witness the full effects on the university--specifically on how they might organize, manage, and lead colleges and universities in the future.
In this article, the author discusses how the logic of Web 2.0, the logic of commons-based peer production, and the logic of platform management might transform the idea of the university and the very activities--teaching and learning, research, and publishing--that lie at the heart of this enterprise. (Contains 16 notes.)
Frančula, Nedjeljko Map Projections in Wikipedia. Cartography & Geoinformation Volume 8 Pages 120 2009
A discussion of an article about map projection, published in Wikipedia, is presented. The article presents a definition of a map projection. It tackles various issues concerning map projections including the construction of a map projection, selection of a model for the shape of the Earth, and an analysis of pseudocylindrical and azimuthal projections. The author claims that the article provides a reliable source of information on map projections.
Ormeling, F. Mapping out Map Libraries Liber Quarterly Volume 18 Pages 239 2008
Meishar-Tal, H. & Tal-Elhasid, E. Measuring collaboration in educational wikis - a methodological discussion International Journal of Emerging Technologies in Learning 2008 [352]
Measuring the collaboration in collaborative learning scenarios is important for assessment and research purposes. This paper describes the methodology developed in the Open University of Israel (OUI) to measure collaboration among students in wikis. It opens with an overview of the methods used to measure collaboration in Wikipedia, proceeds with explaining why these methods are not suitable enough for measuring collaboration in an educational wiki setting, and concludes by presenting a new method for measuring collaboration in educational wikis.
Rand, Angela Doucet Mediating at the Student-Wikipedia Intersection. Journal of Library Administration Volume 50 Pages 07/08/2011 2010
Wikipedia is a free online encyclopedia. The encyclopedia is openly edited by registered users. Wikipedia editors can edit their own and others' entries, and some abuse of this editorial power has been unveiled. Content authors have also been criticized for publishing less than accurate content. Educators and students acknowledge casual use of Wikipedia in spite of its perceived inaccuracies. Use of the online encyclopedia as a reference resource in scholarly papers is still debated. The increasing popularity of Wikipedia has led to an influx of research articles analyzing the validity and content of the encyclopedia. This study provides an analysis of relevant articles on academic use of Wikipedia. This analysis attempts to summarize the status of Wikipedia in relation to the scope (breadth) and depth of its contents and looks at content validity issues that are of concern to the use of Wikipedia for higher education. The study seeks to establish a reference point from which educators can make informed decisions about scholarly use of Wikipedia as a reference resource.
Fiore, Francine Medications in Wikipedia. Comparison of reliability Perspective Infirmière: Revue Officielle De l'Ordre Des Infirmières Et Infirmiers Du Québec Volume 6 Pages 11 2009 [353]
Patch, P. Meeting Student Writers Where They Are: Using Wikipedia to Teach Responsible Scholarship Teaching English in the Two-Year College Volume 37 2010 [354]
Medelyan, Olena; Milne, David; Legg, Catherine & Witten, Ian H. Mining meaning from Wikipedia International Journal of Human Computer Studies Volume 67 2009 [355]
Wikipedia is a goldmine of information; not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval and information extraction; and as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced. 2009 Elsevier Ltd. All rights reserved.
Stewart, Graham Mirage of us: a reflection on the role of the web in widening access to references on Southern African arts, culture and heritage Tydskrif vir Letterkunde Volume 47 Pages 129 2010 [356]
Carpineto, Claudio; Mizzaro, Stefano; Romano, Giovanni & Snidero, Matteo Mobile information retrieval with search results clustering: Prototypes and evaluations Journal of the American Society for Information Science and Technology Volume 60 2009 [357]
Web searches from mobile devices such as PDAs and cell phones are becoming increasingly popular. However, the traditional list-based search interface paradigm does not scale well to mobile devices due to their inherent limitations. In this article, we investigate the application of search results clustering, used with some success for desktop computer searches, to the mobile scenario. Building on CREDO (Conceptual Reorganization of Documents), a Web clustering engine based on concept lattices, we present its mobile versions, Credino and SmartCREDO, for PDAs and cell phones, respectively. Next, we evaluate the retrieval performance of the three prototype systems. We measure the effectiveness of their clustered results compared to a ranked list of results on a retrieval task, by means of the device-independent notion of subtopic reach time with a reusable test collection built from Wikipedia ambiguous entries. Then, we make a cross-comparison of methods (i.e., clustering and ranked list) and devices (i.e., desktop, PDA, and cell phone), using an interactive information-finding task performed by external participants. The main finding is that clustering engines are a viable complementary approach to plain search engines both for desktop and mobile searches, especially, but not only, for multitopic informational queries.
Javanmardi, Sara; Lopes, Cristina & Baldi, Pierre Modeling user reputation in wikis Statistical Analysis and Data Mining Volume 3 2010 [358]
Collaborative systems available on the Web allow millions of users to share information through a growing collection of tools and platforms such as wikis, blogs, and shared forums. By their very nature, these systems contain resources and information with different quality levels. The open nature of these systems, however, makes it difficult for users to determine the quality of the available information and the reputation of its providers. Here, we first parse and mine the entire English Wikipedia history pages in order to extract detailed user edit patterns and statistics. We then use these patterns and statistics to derive three computational models of a user's reputation. Finally, we validate these models using ground-truth Wikipedia data associated with vandals and administrators. When used as a classifier, the best model produces an area under the receiver operating characteristic (ROC) curve (AUC) of 0.98. Furthermore, we assess the reputation predictions generated by the models on other users, and show that all three models can be used efficiently for predicting user behavior in Wikipedia. 2010 Wiley Periodicals, Inc.


Yang, Heng-Li & Lai, Cheng-Yu Motivations of Wikipedia content contributors Computers in Human Behavior Volume 26 2010 [359]
Rapidly developing web technologies have increased the prevalence of user-generated Internet content. Of the many websites with user-generated content on the Internet, one of the most renowned is Wikipedia, which is the largest multilingual free-content encyclopedia written by users collaboratively. Nevertheless, although contributing to Wikipedia takes time and knowledge, contributors are rarely compensated. As a result, there is a need to understand why individuals share their knowledge in Wikipedia. The aim of this study was to evaluate the effects of both conventional and self concept-based motivation on individual willingness to share knowledge in Wikipedia. After performing an online questionnaire survey, SEM was applied to assess the proposed model and hypotheses. The analytical results showed that internal self-concept motivation is the key motivation for knowledge sharing on Wikipedia. 2010 Elsevier Ltd. All rights reserved.
Neal, Lisa My life as a Wikipedian eLearn eLearn Homepage Pages 1 2007 [360]
An abstract is not available.
Sharrow, Steven H. Natural Resource Management on the Other Side of the World: The Nagorno Karabakh Republic Rangelands Volume 29 2007 [361]
Blackman, S Nature has Wikipedia in its cites SCIENTIST Volume 20 2006 [362]
Pöllä, Matti & Honkela, Timo Negative Selection of Written Language Using Character Multiset Statistics Journal of Computer Science and Technology Volume 25 2010 [363]


Hemphill, C. NETWORK NEUTRALITY AND THE FALSE PROMISE OF ZERO-PRICE REGULATION Yale Journal on Regulation Volume 25 Pages 135 2008
This Article examines zero-price regulation, the major distinguishing feature of many modern "network neutrality" proposals. A zero-price rule prohibits a broadband Internet access provider from charging an application or content provider (collectively "content provider") to send information to consumers. The Article differentiates two access provider strategies thought to justify a zero-price rule. Exclusion is anticompetitive behavior that harms a content provider to favor its rival. Extraction is a toll imposed upon content providers to raise revenue. Neither strategy raises policy concerns that justify implementation of a broad zero-price rule. First, there is no economic exclusion argument that justifies the zero-price rule as a general matter, given existing legal protections against exclusion. A stronger but narrow argument for regulation exists in certain cases in which the output of social producers, such as Wikipedia, competes with ordinary market-produced content. Second, prohibiting direct extraction is undesirable and counterproductive, in part because it induces costly and unregulated indirect extraction. I conclude, therefore, that recent calls for broad-based zero-price regulation are mistaken.
Nelson, Rolf New Media review. Visual Studies Volume 23 2008
The article reviews two online information resources, Wikipedia and Wikimedia Commons.
Neal Baxter, Robert New technologies and terminological pressure in lesser-used languages: The Breton Wikipedia, from terminology consumer to potential terminology provider Language Problems & Language Planning Volume 33 2009
Taking the impact of the Wikipedia on the Breton language as a case in point, whilst highlighting the huge potential benefits that new technologies have to offer to economically less viable languages as a whole, this article discusses the way internet-based systems can have an impact on the terminological pressure exerted on such languages in many specialised areas. The article goes on to analyse possible conflict resolution mechanisms for competing terminological strategies and the relative merits and shortcomings of each. While centred on the specific case of a European "minority" or "lesser used" language, the article shows the extent to which the discussion and findings can also be relevant to the way other equally economically challenged languages around the globe can evolve and develop unfettered thanks to the use of free-access virtual resources such as Wikipedia.


Keim, B. News feature: WikiMedia Nature Medicine Volume 13 2007
G LÍ NIDA Versus Wikipedia. Science Volume 315 Pages 743 2007
The article reports on the removal of sections of the article on the National Institute on Drug Abuse (NIDA) on the Wikipedia Web site by an anonymous employee of the agency in the U.S. The employee replaced them with prose about unprecedented opportunities at the institute and its aim of improving the health of the nation. The sections deleted during an editing battle with Wikipedians, the citizens who write and monitor the entries, included mentions of debates over the potency of marijuana given to NIDA-funded weed researchers at the University of Mississippi and complaints that government statistics on emergency room visits overstate the dangers of pot.
Goldspink, Christopher NORMATIVE BEHAVIOUR IN WIKIPEDIA Information Volume 13 2010
This paper examines the effect of norms and rules on editor communicative behaviour in Wikipedia. Specifically, processes of micro-coordination through speech acts are examined as a basis for norm establishment, maintenance, reinforcement and effectiveness. This is pursued by analysing discussion pages taken from a sample of controversial and featured articles. The results reveal some unexpected patterns. Despite the Wikipedia community generating a large number of rules, etiquettes and guidelines, the explicit invocation of rules and/or the use of wider social norms is rare and appears to play a very small role in influencing editor behaviour. The emergent pattern of communicative exchange is not well aligned either with rules established by Wikipedia contributors or with the characteristics of a coherent community and nor is it consistent with the behaviour needed to reach agreement on controversial topics. The paper concludes by offering some tentative hypotheses as to why this may be so and outlines possible future research which may help distinguish between alternatives. Adapted from the source document.
Farrell, Henry & Schwartzberg, Melissa Norms, Minorities, and Collective Choice Online Ethics & International Affairs Volume 22 2008
Building on case studies of Wikipedia and the Daily Kos, this essay argues that different kinds of rules shape relations between members of the majority and of the minority in these communities in important and consequential ways. Adapted from the source document.
Kitchen, R. Not an authority. British Dental Journal Volume 206 Pages 241 2009
A letter to the editor is presented in response to the article "Wikipedia use" by E. Shawkat in the 2009 issue.
Kohn, R.S. Of Descartes And Of Train Schedules: Evaluating The Encyclopedia Judaica, Wikipedia, And Other General And Jewish Studies Encyclopedias Library Review Volume 59 2010 [364]
Purpose - The purpose of this paper is to discuss the second edition of the Encyclopaedia Judaica (2007) within its broader historical context of the production of encyclopedias in the twentieth and the twenty-first centuries. The paper contrasts the 2007 edition of the Encyclopaedia Judaica to the Jewish Encyclopedia published between 1901 and 1905, and to the first edition of the Encyclopaedia Judaica published in 1972; then contrasts the 2007 edition of the Encyclopaedia Judaica to Wikipedia and to other projects of online encyclopedias. Design/methodology/approach - The paper provides a personal reflective review of the sources in question. Findings - That Encyclopaedia Judaica in its latest edition does not adequately replace the original first edition in terms of depth of scholarly work. It is considered that the model offered by Wikipedia could work well for the Encyclopaedia Judaica, allowing it to retain the core of the expert knowledge, and at the same time channel the energy of volunteer editors which has made Wikipedia such a success. Practical implications - The paper is of interest to those with an interest in encyclopedia design or Jewish studies. Originality/value - This paper provides a unique reflection on the latest edition of the encyclopedia and considers future models for its publication based on traditional and non-traditional methods.
Kim, Won; Jeong, Ok-Ran & Lee, Sang-Won On social Web sites Information Systems Volume 35 2010 [365]
Today hundreds of millions of Internet users are using thousands of social Web sites to stay connected with their friends, discover new friends and to share user-created contents, such as photos, videos, social bookmarks, and blogs. There are so many social Web sites, and their features are evolving rapidly. There is controversy about the benefits of these sites, and there are social issues these sites have given rise to. There are lots of press articles, Wikipedia articles, and blogs (in varying degrees of authoritativeness, clarity and accuracy) about some of the social Web sites, uses of the sites, and some social problems and business challenges faced by the sites. In this paper, we attempt to organize the status, uses, and issues of social Web sites into a comprehensive framework for discussing, understanding, using, building, and forecasting the future of social Web sites. 2009 Elsevier B.V. All rights reserved.
LeLoup, Jean W. & Ponterio, Robert On the net: Wikipedia: a multilingual treasure trove Language, Learning & Technology Volume 10 Pages 4 2006 [366]
Zimmer, Carl On the Origin of Eukaryotes Science Volume 325 2009 [367]
Zimmer, Carl On the Origin of Sexual Reproduction Science Volume 324 2009 [368]
Travis, John On the Origin of the Immune System Science Volume 324 2009 [369]
Miller, Greg On the Origin of the Nervous System Science Volume 325 2009 [370]
Zimmer, Carl On the Origin of Tomorrow Science Volume 326 2009 [371]
Krizhanovsky, A.A. & Smirnov, A.V. On the problem of wiki texts indexing Journal of Computer and Systems Sciences International Volume 48 2009 [372]
A new type of document called a "wiki page" is winning the Internet. This is expressed not only in an increase of the number of Internet pages of this type, but also in the popularity of Wiki projects (in particular, Wikipedia); therefore, the problem of parsing in Wiki texts is becoming more and more topical. A new method for indexing Wikipedia texts in three languages (Russian, English, and German) is proposed and implemented. The architecture of the indexing system, including the software components GATE and Lemmatizer, is considered. The rules of converting Wiki texts into texts in a natural language are described. Index bases for the Russian Wikipedia and Simple English Wikipedia are constructed. The validity of Zipf's laws is tested for the Russian Wikipedia and Simple English Wikipedia. 2009 Pleiades Publishing Ltd.
Hahn, J. On the remediation of Wikipedia to the iPod Reference Services Review Volume 37 2009 [373]
The purpose of this paper is to present the results of a usability study of information search on mobile devices, seeking to understand mobile computing best practice in the design of library services. Three second-year undergraduate students took part in this semester-long study. They are loaned iPods with a Wikipedia copy to use as desired. Usability data are drawn from search logs recording titles of the articles searched and an Internet-based survey completed by students. Students characterize the nature of information searched for on the Wikipedia iPods as recreational. Students did not utilize the iPods for academic research. Search logs show students viewed articles primarily about objects. The results of this paper do not show generalized principles of mobile search. More data collected from additional sets of users are needed in order to articulate principles of mobile search. If it is the case that students will primarily make use of mobile computing for recreational or leisurely purposes, then library services on mobile computing platforms must be designed accordingly. The paper presents methods for the study of information search through mobile computing and poses questions resulting from this paper that require further study.


Svoboda, E. One-click content, no guarantees [online encyclopedia reliability] IEEE Spectrum Volume 43 2006
As the first-ever major reference work with a democratic premise, the free online encyclopedia, Wikipedia, has generated shared scholarly efforts to rival those of any literary or philosophical movement in history. As such, Wikipedia is vulnerable to user-generated articles that are inaccurate or irrelevant. While a carefully executed and multilayered review process is performed by a team of volunteers, critics believe that the lack of formal gatekeeping procedures ensures that the lowest common denominator will prevail and, since no experts or editors are hired to review the articles, no clear standards exist for accuracy or writing quality. Despite its imperfections, Wikipedia users claim that it works well in practice. Nevertheless, readers are advised to check their online finds against other sources and to be aware of Wikipedia's unique strengths and weaknesses, especially when gathering information for research projects.
Pickering, B. Online news and reference services Information World Review 2006
With blogs, bulletin boards, podcasts and Webcasts, the Internet is an ever-evolving source of information. The online information industry continues to evolve rapidly as information providers experiment with new models of content distribution on a global scale, and new technologies develop that allow people to distribute information (from blogs, podcasts and Webcasts for news suppliers to the Wikipedia, Project Citizendium and Digital Universe models on the reference side). A significant new development for news aggregators is the speed at which news dissemination is now happening on a global scale.
Greysen, S Ryan; Kind, Terry & Chretien, Katherine C Online professionalism and the mirror of social media Journal of General Internal Medicine Volume 25 2010 [374]
The rise of social media (content created by Internet users and hosted by popular sites such as Facebook, Twitter, YouTube, and Wikipedia, and blogs) has brought several new hazards for medical professionalism. First, many physicians may find applying principles for medical professionalism to the online environment challenging in certain contexts. Second, physicians may not consider the potential impact of their online content on their patients and the public. Third, a momentary lapse in judgment by an individual physician to create unprofessional content online can reflect poorly on the entire profession. To overcome these challenges, we encourage individual physicians to realize that as they "tread" through the World Wide Web, they leave behind a "footprint" that may have unintended negative consequences for them and for the profession at large. We also recommend that institutions take a proactive approach to engage users of social media in setting consensus-based standards for "online professionalism." Finally, given that professionalism encompasses more than the avoidance of negative behaviors, we conclude with examples of more positive applications for this technology. Much like a mirror, social media can reflect the best and worst aspects of the content placed before it for all to see.
Iorio, Angelo Di; Musetti, Alberto; Peroni, Silvio & Vitali, Fabio Ontology-driven generation of wiki content and interfaces New Review of Hypermedia and Multimedia Volume 16 Pages 01/02/2011 2010 [375]
The planetary success of Wikipedia has opened the road to using wikis as shared resources for communities to collect and organize facts, concepts, and structures that constitute both the shared knowledge of the community and, more often than not, the very reason for the community to exist. The ease of creating, editing, and debating one's own and each other's contributions to the wiki knowledge base are key aspects of the success and livelihood of the community itself. The need for semantic wiki data cannot be separated from the need of friendly authoring environments for those data. This paper introduces a framework that allows users to easily create semantic wiki content by exploiting ontology-driven forms and templates. The system, called OWiki, is an instantiation of a more general model, named GAFFE, that exploits ontologies to generate metadata editors. Both GAFFE and OWiki are presented in this paper, with particular attention to the way they exploit ontologies to model the community shared knowledge, the interfaces used to create that knowledge, and the way it evolves. 2010 Taylor & Francis.


Nov, Oded & Kuk, George Open source content contributors' response to free-riding: The effect of personality and context Computers in Human Behavior Volume 24 2008 [376]
We address concerns about the sustainability of the open source content model by examining the effect of external appropriation, whereby the product of open source contributors' efforts is monetized by a party that did not contribute to the project, on intended effort withdrawal (reduction in contribution level). We examine both the personality of contributors and their contextual motivations to contribute, using a scenario-based survey of Wikipedia contributors. The findings suggest that perceived justice of the open source license terms, and intrinsic motivations are both negatively related with effort withdrawal intentions. Moreover, we find that the effect of the fairness personality trait on effort withdrawal is stronger for individuals who are low in perceived justice and weaker for individuals high in justice. The findings of factors predicting effort withdrawal contribute to the open source literature, which tends to focus on contribution and motivations, but not on what impacts changes in individual contribution levels. 2008 Elsevier Ltd. All rights reserved.


Mateos-Garcia, Juan & Steinmueller, W. Edward Open, But How Much? Growth, Conflict, and Institutional Evolution in Open Source Communities Community, Economic Creativity, and Organization Volume 1 2008 [377]
Talbot, David OurTube Technology Review Volume 112 2009
Open video, which could bring great change in Web innovation, is discussed. The transformation of video would allow trouble-free playback of any video, which also means that any innovation, such as a new way to search, would apply to all videos, allowing new technologies to spread more rapidly. The Wikimedia Foundation is also working towards realization of this vision, to create video companions to the online encyclopedia's text entries. This will enable users to search the Web for snippets of video, import them to a Wikipedia article, and keep track of all edits using open technologies that don't require video plug-ins or software purchases. YouTube has helped make video a mainstay of the web: anyone can open a YouTube account and upload videos, all for free. The open format lets users watch videos freely and jump between relevant clips, and open licensing is a crucial part of the open video format.
Denoyer, Ludovic & Gallinari, Patrick Overview of the INEX 2008 XML Mining Track Advances in Focused Retrieval 2009 [378]
We describe here the XML Mining Track at INEX 2008. This track was launched for exploring two main ideas: first, identifying key problems for mining semi-structured documents and new challenges of this emerging field, and second, studying and assessing the potential of machine learning techniques for dealing with generic Machine Learning (ML) tasks in the structured domain, i.e. classification and clustering of semi-structured documents. This year, the track focuses on the supervised classification and the unsupervised clustering of XML documents using link information. We consider a corpus of about 100,000 Wikipedia pages with the associated hyperlinks. The participants have developed models using the content information, the internal structure information of the XML documents and also the link information between documents.
Jijkoun, Valentin & Rijke, Maarten Overview of the WiQA Task at CLEF 2006 Evaluation of Multilingual and Multi-modal Information Retrieval 2007 [379]
We describe WiQA 2006, a pilot task aimed at studying question answering using Wikipedia. Going beyond traditional factoid questions, the task considered at WiQA 2006 was, given a source page from Wikipedia, to identify snippets from other Wikipedia pages, possibly in languages different from the language of the source page, that add new and important information to the source page, and that do so without repetition. A total of 7 teams took part, submitting 20 runs. Our main findings are two-fold: (i) while challenging, the tasks considered at WiQA are do-able as participants achieved impressive scores as measured in terms of yield, mean reciprocal rank, and precision, (ii) on the bilingual task, substantially higher scores were achieved than on the monolingual tasks.
Wood, Andrew & Struthers, Kate Pathology education, Wikipedia and the Net generation Medical Teacher Volume 32 Pages 618 2010 [380]
Jones, J Patterns of revision in online writing - A study of Wikipedia's featured articles WRITTEN COMMUNICATION Volume 25 2008 [381]
This study examines the revision histories of 10 Wikipedia articles nominated for the site's Featured Article Class (FAC), its highest quality rating, 5 of which achieved FAC and 5 of which did not. The revisions to each article were coded, and the coding results were combined with a descriptive analysis of two representative articles in order to determine revision patterns. All articles in both groups showed a higher percentage of additions of new material compared to deletions and revisions that rearranged the text. Although the FAC articles had roughly equal numbers of content and surface revisions, the non-FAC articles had fewer surface revisions and were dominated by content revisions. Although the unique features of the Wikipedia environment inhibit strict comparisons between these results and those of earlier revision studies, these results suggest revision in this environment places unique structural demands on writers, possibly leading to unique revision patterns.
Stehr, Henning; Duarte, Jose M.; Lappe, Michael; Bhak, Jong & Bolser, Dan M. PDBWiki: added value through community annotation of the Protein Data Bank Database 2010 [382]
The success of community projects such as Wikipedia has recently prompted a discussion about the applicability of such tools in the life sciences. Currently, there are several such 'science-wikis' that aim to collect specialist knowledge from the community into centralized resources. However, there is no consensus about how to achieve this goal. For example, it is not clear how to best integrate data from established, centralized databases with that provided by 'community annotation'. We created PDBWiki, a scientific wiki for the community annotation of protein structures. The wiki consists of one structured page for each entry in the Protein Data Bank (PDB) and allows the user to attach categorized comments to the entries. Additionally, each page includes a user editable list of cross-references to external resources. As in a database, it is possible to produce tabular reports and 'structure galleries' based on user-defined queries or lists of entries. PDBWiki runs in parallel to the PDB, separating original database content from user annotations. PDBWiki demonstrates how collaboration features can be integrated with primary data from a biological database. It can be used as a system for better understanding how to capture community knowledge in the biological sciences. For users of the PDB, PDBWiki provides a bug-tracker, discussion forum and community annotation system. To date, user participation has been modest, but is increasing. The user editable cross-references section has proven popular, with the number of linked resources more than doubling from 17 originally to 39 today. Database URL: http://www.pdbwiki.org
Fitzpatrick, Kathleen Peer-to-peer review and the future of scholarly authority Cinema Journal Volume 48 Pages 124 2009 [383]
Kubiszewski, Ida; Noordewier, Thomas & Costanza, Robert Perceived Credibility of Internet Encyclopedias Computers & Education Pages Accepted Manuscript 2010
Rectanus, Mark W. Performing Knowledge: Cultural Discourses, Knowledge Communities, and Youth Culture. Telos 2010
The article discusses the destabilization of expert knowledge and the de-centering of the book in youth culture. The current fundamental shifts in the social construction of knowledge involve a number of interrelated topics such as the status of the book and scholarly publishing, the digitization and virtualization of libraries and the role of search engines, databases and books like Google Book Search, and the creation of encyclopedic projects like Wikipedia. It also explores the development of media culture in the U.S. and Germany.


Anon PICTURE DONATION: FEDERAL ARCHIVE CO-OPERATES WITH WIKIPEDIA ZEITSCHRIFT FUR BIBLIOTHEKSWESEN UND BIBLIOGRAPHIE Volume 56 2009 [384]
Minol, Klaus; Spelsberg, Gerd; Schulte, Elisabeth & Morris, Nicholas Portals, blogs and co.: the role of the Internet as a medium of science communication Biotechnology Journal Volume 2 2007 [385]
While the use of the Internet for the exchange of scientific data was characterised by exclusivity during its pioneer era, the active employment of the medium today, by a broad social spectrum of users in the exchange of information, for dialogue and in the accumulation of knowledge, displays an almost unbounded inclusion. Blogs and online encyclopaedias based on the 'Wikipedia' model have contributed to the formation of a marketplace in which the free expression of opinions and the relaying of information occur. Counted among the ideas which have been popularised in the wake of this phenomenon, "lay journalism" and the "wisdom of the masses" are seen to be integral to the new 'web 2.0'. Consequently, the ever-increasing information disseminated in the web has been diluted in quality and authenticity, resulting in the presentation of new challenges to online science journalism. In reference to the public debate surrounding green gene technology, the communications platform bioSicherheit.de, which receives more than one million visitors per year, will be examined as an example of an agent that retrieves and mobilises information on biological safety research and that successfully has established itself as an intermediary between the scientific community and the broader public.
Rogers, Kenneth Positive Outcomes With Information Sharing. Athletic Therapy Today Volume 11 Pages 1 2006
The article discusses the importance of information sharing to the future of the National Athletic Trainers' Association (NATA) in the United States. The association is evaluating the use of Wikipedia, blogs, public/private access to collaborative work sites and more use of the NATA Web page. Sharing information can allow the association to respond better to issues.
Ekins, S. & Williams, J. Precompetitive preclinical ADME/Tox data: set it free on the web to facilitate computational model building and assist drug development Lab on a Chip Volume 10 2010 [386]
Web-based technologies coupled with a drive for improved communication between scientists have resulted in the proliferation of scientific opinion, data and knowledge at an ever-increasing rate. The increasing array of chemistry-related computer-based resources now available provides chemists with a direct path to the discovery of information, once previously accessed via library services and limited to commercial and costly resources. We propose that preclinical absorption, distribution, metabolism, excretion and toxicity data as well as pharmacokinetic properties from studies published in the literature (which use animal or human tissues in vitro or from in vivo studies) are precompetitive in nature and should be freely available on the web. This could be made possible by curating the literature and patents, data donations from pharmaceutical companies and by expanding the currently freely available ChemSpider database of over 21 million molecules with physicochemical properties. This will require linkage to PubMed, PubChem and Wikipedia as well as other frequently used public databases that are currently used, mining the full text publications to extract the pertinent experimental data. These data will need to be extracted using automated and manual methods, cleaned and then published to the ChemSpider or other database such that it will be freely available to the biomedical research and clinical communities. The value of the data being accessible will improve development of drug molecules with good ADME/Tox properties, facilitate computational model building for these properties and enable researchers to not repeat the failures of past drug discovery studies.


Taylor-Mendes, Cosette Proceed with caution: using Wikipedia as a reference Neonatal Network: NN Volume 26 2007 [387]


Rebillard, Franck & Touboul, Annelise Promises unfulfilled? 'Journalism 2.0', user participation and editorial policy on newspaper websites. Media, Culture & Society Volume 32 2010
In this article the authors reflect on the ideology surrounding Web 2.0 services for journalism. They present their analysis of the ideological assumptions regarding the effectiveness of journalism 2.0, especially on online interaction and social networking sites. They also explore the material concretization of these assumptions, particularly on users of participatory websites like Wikipedia or YouTube links and newsmaking within a corpus of news media websites in Europe and America.
Conrad, M. Public History and its Discontents or History in the Age of Wikipedia Journal of the Canadian Historical Association Volume 18 Pages 1 2007 [388]
Cross, Tom Puppy smoothies: improving the reliability of open, collaborative wikis First Monday Volume 11 2006
The reliability of information collected from at large Internet users by open collaborative wikis such as Wikipedia has been a subject of widespread debate. This paper provides a practical proposal for improving user confidence in wiki information by coloring the text of a wiki article based on the venerability of the text. This proposal relies on the philosophy that bad information is less likely to survive a collaborative editing process over large numbers of edits. Colorization would provide users with a clear visual cue as to the level of confidence that they can place in particular assertions made within a wiki article.
Pressley, L. & McCallum, C.J. Putting the library in Wikipedia Online Volume 32 2008
Few online resources provoke as much controversy in the library community as Wikipedia. Some librarians hate it, arguing that since anyone can edit it, it can't be trusted. Others love it, because it is fast, easy to use, and a good starting point for research. In the March/April 2008 issue of Online, William Badke wondered, in his InfoLit Land column, "What to Do With Wikipedia" (www.infotoday.com/online/mar08/Badke.shtml). The column describes how this online encyclopedia is snubbed by academia but widely accepted by many others as a valid place to find information. He proposes that academia should participate in Wikipedia and makes several suggestions as to how professors and their students could improve Wikipedia by contributing new scholarly content, evaluating existing articles, and editing those that are less than scholarly. Badke's article fails to mention one academic group that could positively impact the content and scholarship in Wikipedia: librarians.
Monaci, Sara Quality assessment process in Wikipedia's Vetrina: the role of the community's policies and rules Observatorio (OBS*) Volume 3 2009 [389]
The increasing growth of Wikipedia poses many questions about its organizational model and its development as a free-open knowledge repository. Yochai Benkler describes Wikipedia as a CBPP (commons-based peer production) system: a platform which enables users to easily generate knowledge contents and to manage them collaboratively and on free-voluntary basis. Quality is one of the main concerns related to such a system. How would a CBPP environment guarantee at the same time the openness of its organization and a good level of accreditation? The paper offers an overview of the quality assessment processes in it.wiki's Vetrina section. It also suggests an explanation to quality assessment which questions Benkler's hypothesis. Thanks to a qualitative analysis carried out through in-depth interviews to Wikipedia users and through a period of ethnographic observation, the paper outlines Vetrina's organization and the factors related to the evaluation of quality contents. Keywords: wiki, wikipedia, web, open content
Elia, Antonella QUANTITATIVE DATA AND GRAPHICS ON LEXICAL SPECIFICITY AND INDEX OF READABILITY: THE CASE OF WIKIPEDIA. RaeL: Revista Electronica de Linguistica Aplicada 2009
Issue 8 p248; Subject Term: ELECTRONIC encyclopedias; Subject Term: DISCOURSE analysis; Subject Term: LINGUISTICS; Subject Term: READABILITY (Literary style); Subject Term: QUANTITATIVE research; Author-Supplied Keyword: Discourse Analysis; Author-Supplied Keyword: Encyclopedia Britannica Online; Author-Supplied Keyword: Index of Readability; Author-Supplied Keyword: Online Encyclopedias; Author-Supplied Keyword: Quantitative Analysis; Author-Supplied Keyword: Wikipedia; Author-Supplied Keyword: Analyse du Discours; Author-Supplied Keyword: Analyse Quantitative; Author-Supplied Keyword: Encyclopédies en Ligne; Author-Supplied Keyword: Index de Lisibilité; Language of Keywords: English; Language of Keywords: Spanish; Reviews & Products: WIKIPEDIA; Number of Pages: 24p; Illustrations: 1 Color Photograph, 5 Charts, 4 Graphs; Document Type: Article
This paper is part of a wider corpus-based study focused on Web encyclopedias (Elia 2008). It is built on and extends the comparative analysis of Emigh and Herring (2005). In particular, attention is focused on the English edition of Wikipedia. A quantitative analysis compares Wikipedia vs. Britannica encyclopedic entries. Linguistic features such as type/token ratio, word and sentence length, and Index of Readability are analyzed. The findings show to what extent collaboratively produced Wikipedia entries are readable and standardized in a way not very dissimilar from those produced by experts in the Encyclopaedia Britannica Online. (English)
Baltzersen, R.K. Radical transparency: Open access as a key concept in wiki pedagogy Australasian Journal of Educational Technology Volume 26 2010
Educators have just started to use wikis and most of the educational research to date has focused primarily on the use of local wikis with access limitations. There seems to be little research related to how students can contribute in global, transparent wiki communities such as Wikipedia and Wikibooks. The purpose of this article is to argue that we need to extend our understanding of transparency as a pedagogical concept if we want to use these open, global wiki communities in an educational setting. By describing one wiki-based course in detail, I argue that these kinds of radically transparent learning environments in tertiary education challenge traditional pedagogy and our ordinary perceptions of what a class and working assignment is. The course data in this article include a course description and teacher and student reflections on assessed group projects which produced student-written, collaboratively edited textbooks on Wikibooks. Student perceptions indicate positive attitudes towards global learning environments if the didactical design is carefully planned. In the article I suggest that "outsiders" and "former and future students" should be included as categories in a pedagogical definition of transparency. These categories represent a radical expansion of course space and course availability.
Pollard, E. A. Raising the Stakes: Writing about Witchcraft on Wikipedia The History Teacher Volume 42 2008 [390]
Lewandowski, D. & Spree, U. Ranking of Wikipedia articles in search engines revisited: Fair ranking for reasonable quality? Journal of the American Society for Information Science and Technology 2010 [391]
This paper aims to review the fiercely discussed question of whether the ranking of Wikipedia articles in search engines is justified by the quality of the articles. After an overview of current research on information quality in Wikipedia, a summary of the extended discussion on the quality of encyclopedic entries in general is given. On this basis, a heuristic method for evaluating Wikipedia entries is developed and applied to Wikipedia articles that scored highly in a search engine retrieval effectiveness test and compared with the relevance judgment of jurors. In all search engines tested, Wikipedia results are unanimously judged better by the jurors than other results on the corresponding results position. Relevance judgments often roughly correspond with the results from the heuristic evaluation. Cases in which high relevance judgments are not in accordance with the comparatively low score from the heuristic evaluation are interpreted as an indicator of a high degree of trust in Wikipedia. One of the systemic shortcomings of Wikipedia lies in its necessarily incoherent user model. A further tuning of the suggested criteria catalog, for instance, the different weighing of the supplied criteria, could serve as a starting point for a user model differentiated evaluation of Wikipedia articles. Approved methods of quality evaluation of reference works are applied to Wikipedia articles and integrated with the question of search engine evaluation.
Arazy, O; Stroulia, E; Ruecker, S; Arias, C; Fiorentino, C; Ganev, V & Yau, T Recognizing Contributions in Wikis: Authorship Categories, Algorithms, and Visualizations Journal of the American Society for Information Science and Technology Volume 61 2010 [392]
Wikis are designed to support collaborative editing, without focusing on individual contribution, such that it is not straightforward to determine who contributed to a specific page. However, as wikis are increasingly adopted in settings such as business, government, and education, where editors are largely driven by career goals, there is a perceived need to modify wikis so that each editor's contributions are clearly presented. In this paper we introduce an approach for assessing the contributions of wiki editors along several authorship categories, as well as a variety of information glyphs for visualizing this information. We report on three types of analysis: (a) assessing the accuracy of the algorithms, (b) estimating the understandability of the visualizations, and (c) exploring wiki editors' perceptions regarding the extent to which such an approach is likely to change their behavior. Our findings demonstrate that our proposed automated techniques can estimate fairly accurately the quantity of editors' contributions across various authorship categories, and that the visualizations we introduced can clearly convey this information to users. Moreover, our user study suggests that such tools are likely to change wiki editors' behavior. We discuss both the potential benefits and risks associated with solutions for estimating and visualizing wiki contributions.
Korsgaard, Thomas Rune & Jensen, Christian D. Reengineering the Wikipedia for Reputation Electronic Notes in Theoretical Computer Science Volume 244 2009 [393]
The Wikipedia is a free online encyclopedia collaboratively edited by Internet users with a minimum of administration. Anybody can write an article for the Wikipedia and there is no verification of the author's expertise on the particular subject. This may lead to problems relating to the quality of articles, especially completeness and correctness of information, and inaccuracies in the Wikipedia have been rumoured to cause students to fail courses; innocent people have been associated with the killing of John F. Kennedy, etc. Providing a means to assess the correctness, completeness and impartiality of information in the Wikipedia is therefore vitally important for the users to build trust in the Wikipedia and ensure the continued success and growth of the system. Integrating a reputation system into the Wikipedia would help users assess the quality of articles and provide a powerful incentive for authors to improve the quality of their articles. There are currently more than 7.5 million articles in the Wikipedia, and more than a thousand new articles are added daily, so the investment in the existing system is significant. The introduction of a recommendation system should therefore not require any modifications to the existing Wikipedia software. In this paper we examine the problem of reengineering a large and popular system, in this case the Wikipedia, in order to include a reputation system. We propose a recommendation system, which allows Wikipedia users to calculate a personalised rating for any article based on feedback (recommendations) provided by other Wikipedia users. The recommendation system developed for the Wikipedia is based on a general architecture, which we believe applies to many existing applications for online collaboration. The proposed recommendation system is implemented in a proxy placed between the user's web-browser and the Wikipedia server, e.g., on the user's own machine, so there is no need to modify Wikipedia servers or software. 
A simple prototype of the proposed recommendation system is presented in this paper along with a preliminary evaluation of the prototype. 2009 Elsevier B.V. All rights reserved.
Notess, Greg R. Re-evaluating Web evaluation Online (Wilton, Connecticut) Volume 30 2006
The evaluation of content continues to be crucial, as the Web becomes increasingly prevalent as an information source and finding tool. Critical evaluation of information sources is important to the academic process and to any advanced information seeker. One of the problems in dealing with evaluation of online sources is that an increasing number of library resources are made available via the Web. The more typical evaluation criteria, such as those listed in the Texas Information Literacy Tutorial (TILT), work well to validate many sites. Wikipedia deserves credit for keeping track of the changes under the History tab so that specific versions of an article can be cited.
Reiner, Laura & Smith, Allen REFERENCE SOURCES. Journal of Academic Librarianship Volume 32 Pages 343 2006
The article presents abstracts on academic librarianship. They include "Business News Web Sites," "Can You Trust Wikipedia?," "The Changing Format of Reference Collections: Are Research Libraries Favoring Electronic Access Over Print?," and "Evolving Internet Reference Resources."
Lucky, Robert W. Reflections: A billion amateurs IEEE Spectrum Volume 44 Pages 96 2007 [394]
The Internet is now a medium for all sorts of people to share whatever comment or idea they have. There are a billion people out there who use the Net at any one time. It seems that open-source or shared information is the current trend, as seen in the 10 most popular Websites. For instance, Wikipedia, the online encyclopedia, is about sharing any information from just about anyone who comes and visits the site. Another is Flickr, which contains thousands upon thousands of pictures of every known place, taken from all angles and under all lighting conditions. Another is blogging, in which 80 million people are involved at any one time.
Stettler, R. Reframing semiotic telematic knowledge spaces, and the anthropological challenge to designing interhuman relations Technoetic Arts: A Journal of Speculative Research Volume 6 Number 2 2008 [395]
Bauerlein, M. REPN TRI to the Fullest!!! EDUCATION NEXT Volume 8 Pages 81 2008 [396]
Poudat, C & Loiseau, S Representation and lexical characterization of sciences in Wikipedia Revue française de linguistique appliquée Volume 12 2007 [397]
The free and online encyclopaedia project Wikipedia has become in less than six years one of the most prominent examples of commons-based peer production. The way the project works and evolves is now at stake for academics eager to explore auto-organized structures. Although many studies have been conducted on the connections between contributors, the linguistic properties of Wikipedia productions remain almost unexplored. In this article, we focus on the way sciences are represented within the project and examine the general and epistemic lexical characteristics of the articles by comparing a set of corpora extracted from Wikipedia's category system.


Anthony, Denise; Smith, Sean W & Williamson, Timothy Reputation and Reliability in Collective Goods Rationality and Society Volume 21 2009
An important organizational innovation enabled by the revolution in information technologies is 'open source' production which converts private commodities into essentially public goods. Similar to other public goods, incentives for reputation and group identity appear to motivate contributions to open source projects, overcoming the social dilemma inherent in producing such goods. In this paper we examine how contributor motivations affect the type of contributions made to the open source online encyclopedia Wikipedia. As expected, we find that registered participants, motivated by reputation and commitment to the Wikipedia community, make many contributions with high reliability. Surprisingly, however, we find the highest reliability from the vast numbers of anonymous 'Good Samaritans' who contribute only once. Our findings of high reliability in the contributions of both Good Samaritans and committed 'zealots' suggest that open source production succeeds by altering the scope of production such that a critical mass of contributors can participate. [Reprinted by permission of Sage Publications Inc., copyright holder.]
Ardia, David S. Reputation in a Networked World: Revisiting the Social Foundations of Defamation Law. Harvard Civil Rights-Civil Liberties Law Review Volume 45 2010
The article explores the social foundations of defamation law as of 2010 and the concept of reputation amid the emergence of online platforms such as blogs, social networks and discussion forums. It recounts the definition of reputation and its importance in humans and other social species as part of a set of feedback mechanisms within human social systems and a major factor in evolution. Described is how reputational information is used, created, and disseminated by a networked society. The court case about the editing of celebrity Ron Livingston's Wikipedia entry to suggest that he is gay is also discussed. It is inferred that private online intermediaries like content hosts and search providers would be helpful in mitigating reputational harms.
Yu, Jonathan; Thom, James A. & Tam, Audrey Requirements-oriented methodology for evaluating ontologies Information Systems Volume 34 2009 [398]
Many applications benefit from the use of a suitable ontology but it can be difficult to determine which ontology is best suited to a particular application. Although ontology evaluation techniques are improving as more measures and methodologies are proposed, the literature contains few specific examples of cohesive evaluation activity that links ontologies, applications and their requirements, and measures and methodologies. In this paper, we present ROMEO, a requirements-oriented methodology for evaluating ontologies, and apply it to the task of evaluating the suitability of some general ontologies (variants of sub-domains of the Wikipedia category structure) for supporting browsing in Wikipedia. The ROMEO methodology identifies requirements that an ontology must satisfy, and maps these requirements to evaluation measures. We validate part of this mapping with a task-based evaluation method involving users, and report on our findings from this user study.
Zhao, Fei; Zhou, Tao; Zhang, Liang; Ma, Ming-Hui; Liu, Jin-Hu; Yu, Fei; Zha, Yi-Long & Li, Rui-Qi Research progress on wikipedia Dianzi Keji Daxue Xuebao/Journal of the University of Electronic Science and Technology of China Volume 39 2010 [399]
The rapid development of web technology has promoted the emergence and organization of collaborative wiki systems. This paper introduces Wikipedia's history, macro-level statistical properties, evolution regularities, and so on. The application of the motivation and methods of complex network study to analyzing Wikipedia is especially emphasized. Wikipedia's significance and impacts on society, economy, culture and education are also discussed. Finally, some open questions are outlined for future research, especially the connection between Wikipedia and new developments in complexity sciences, such as the studies of complex networks and human dynamics.
Gorman, J. Respect My Authority Code4Lib Journal 2008
Anonymous REVOLUTIONIZING KNOWLEDGE WORK Leader to Leader Pages 55 2008
Social media such as MySpace, Wikipedia, and LinkedIn have revolutionized the Internet experience for millions of people around the world. New technologies such as wikis, tagging, blogs, and social networks are democratizing content creation and distribution. By putting the power of the Web into the hands of individuals, social media have helped transform the Internet from a mechanism for one-way dissemination of information to a platform for many-to-many interaction. But businesses -- and the men and women who lead them -- have been slow to adopt social media technologies. By bringing social media into the enterprise, forward-thinking leaders are reshaping their business strategies. As knowledge workers gain more control over online information, they create a more collaborative and efficient work environment, improving employee productivity and driving competitive advantage.


Thelwall, Mike & Stuart, David RUOK? Blogging Communication Technologies During Crises Journal of Computer-Mediated Communication Volume 12 2007
This article compares communication technologies within and across crises, using evidence from contemporary postings in 68,022 blogs and news feeds and using a semi-automatic method to detect words that increase in usage during a crisis. Three case studies from 2005 are used: the July 7 London attacks, the New Orleans hurricane, and the Pakistan-Kashmir earthquake. The results highlight the information provision importance for bloggers of Web 2.0 resources such as Wikinews, the Wikipedia, and the Flickr picture sharing site, although these still play a minor role in comparison to the mass media. Some personal communication methods were also mentioned significantly, including SMS and cellphones, but the newest technologies of those mentioned were all Web 2.0. The importance of electronic communication for bloggers was found to depend on the nature of the crisis: For example, despite the heavy Pakistan-Kashmir earthquake death toll, there was relatively little interest in related communication issues from English language bloggers and news sources.
Anonymous Sanford Berman Women and Language Volume 30 Pages 53 2007 [400]
Kim, Jung-Mn; Shin, Hyopil & Kim, Hyoung-Joo Schema and constraints-based matching and merging of Topic Maps Information Processing and Management Volume 43 2007 [401]
In this paper, we propose a multi-strategic matching and merging approach to find correspondences between ontologies based on the syntactic or semantic characteristics and constraints of the Topic Maps. Our multi-strategic matching approach consists of a linguistic module and a Topic Map constraints-based module. A linguistic module computes similarities between concepts using morphological analysis, string normalization and tokenization and language-dependent heuristics. A Topic Map constraints-based module takes advantage of several Topic Maps-dependent techniques such as a topic property-based matching, a hierarchy-based matching, and an association-based matching. This is a composite matching procedure and need not generate a cross-pair of all topics from the ontologies because unmatched pairs of topics can be removed by characteristics and constraints of the Topic Maps. Merging between Topic Maps follows the matching operations. We set up the MERGE function to integrate two Topic Maps into a new Topic Map, which satisfies such merge requirements as entity preservation, property preservation, relation preservation, and conflict resolution. For our experiments, we used oriental philosophy ontologies, western philosophy ontologies, Yahoo western philosophy dictionary, and Wikipedia philosophy ontology as input ontologies. Our experiments show that the automatically generated matching results conform to the outputs generated manually by domain experts and can be of great benefit to the following merging operations.


Muller-Birn, Claudia; Meuthrath, Benedikt; Erber, Andreas; Burkhart, Sebastian; Baumgrass, Anne; Lehmann, Janette & Schmidl, Robert Seeing similarity in the face of difference: enabling comparison of online production systems Social Network Analysis and Mining 2010 [402]


Krotzsch, Markus; Vrandecic, Denny; Volkel, Max; Haller, Heiko & Studer, Rudi Semantic Wikipedia Web Semantics Volume 5 2007 [403]
Wikipedia is the world's largest collaboratively edited source of encyclopaedic knowledge. But in spite of its utility, its content is barely machine-interpretable and only weakly structured. With Semantic MediaWiki we provide an extension that enables wiki-users to semantically annotate wiki pages, based on which the wiki contents can be browsed, searched, and reused in novel ways. In this paper, we give an extended overview of Semantic MediaWiki and discuss experiences regarding performance and current applications. 2007 Elsevier B.V. All rights reserved.
Stealth, PR Shillipedia FORBES Volume 177 2006 [404]
Wikipedia, the most democratic encyclopedia, is the new battleground for corporate spin. By Evan Hessel.
O'Neil, M. Shirky and Sanger, or the costs of crowdsourcing JCOM Volume 9 Pages 1 2010
Stutzel, W. & Sins, H. Skywiki - The Corporate-Wide Knowledge Portal of Fraport AG Information Management & Consulting Volume 24 2009
In 2007, the airport operator Fraport AG set up its corporate wiki, Skywiki. To ensure high acceptance of Skywiki, MediaWiki, the software used by Wikipedia, was implemented. After a two-year test period, 2,100 articles had been written. The reason for the high number of users was not the technology but the active communication of the tools and intensive recruitment of potential authors.
Chandler, C. J. Sleeping with the Enemy: Wikipedia in the College Classroom The History Teacher Volume 43 2010 [405]
Anonymous Social Computing. AI Magazine Volume 31 Pages 10 2010
The article offers information on the social computing issues discussed by the council of the Association for the Advancement of Artificial Intelligence (AAAI). It says that the social computing issues formulated by Martha Pollack include establishment of a Wikipedia team, Facebook Inc. presence, YouTube channel, and a community blog. It mentions that Carol Hamilton will conduct talks and paper lectures which will be videotaped by VideoLectures and be posted on their site.
Yang, Qiang; Zhou, Zhi-Hua; Mao, Wenji; Li, Wei & Liu, Nathan Nan Social Learning IEEE Intelligent Systems Volume 25 2010 [406]
In recent years, social behavioral data have been exponentially expanding due to the tremendous success of various outlets on the social Web (aka Web 2.0) such as Facebook, Digg, Twitter, Wikipedia, and Delicious. As a result, there's a need for social learning to support the discovery, analysis, and modeling of human social behavioral data. The goal is to discover social intelligence, which encompasses a spectrum of knowledge that characterizes human interaction, communication, and collaborations. The social Web has thus become a fertile ground for machine learning and data mining research. This special issue gathers the state-of-the-art research in social learning and is devoted to exhibiting some of the best representative works in this area.
Goldspink, C. Social self-regulation in computer mediated communities: the case of Wikipedia International Journal of Agent Technologies & Systems Volume 1 2009
This article documents the findings of research into the governance mechanisms within the distributed on-line community known as Wikipedia. It focuses in particular on the role of normative mechanisms in achieving social self regulation. A brief history of the Wikipedia is provided. This concentrates on the debate about governance and also considers characteristics of the wiki technology which can be expected to influence governance processes. The empirical findings are then presented. These focus on how Wikipedians use linguistic cues to influence one another on a sample of discussion pages drawn from both controversial and featured articles. Through this analysis a tentative account is provided of the agent-level cognitive mechanisms which appear necessary to explain the apparent behavioural coordination. The findings are to be used as a foundation for the simulation of normative behaviour. The account identifies some of the challenges that need to be addressed in such an attempt including a mismatch between the case findings and assumptions used in past attempts to simulate normative behaviour.
Willinsky, J. Socrates Back on the Street: Wikipedia's Citing of the Stanford Encyclopedia of Philosophy? International Journal of Communication Volume 2 2008 [407]
Spinellis, Diomidis Start with the most difficult part IEEE Software Volume 26 2009 [408]
The process of putting together wpl, a small system that extends arbitrary Web pages with links to Wikipedia entries, demonstrates the value of being able to choose between bottom-up versus top-down design and implementation. By starting our work with the most difficult task, we ensure that we'll face the fewest possible constraints and therefore have the maximum freedom to tackle it. This approach allows the early shrinking of the project's cone of uncertainty, while ensuring that we undertake it with a beginner's enthusiasm and motivation. We can also apply the principle when ordering elements of the software life cycle: requirements elicitation, high- and low-level design, coding, debugging, testing, and maintenance.
Badke, W Stepping Beyond WIKIPEDIA EDUCATIONAL LEADERSHIP Volume 66 2009 [409]
Cimini, Nicholas Struggles online over the meaning of 'Down's syndrome': A 'dialogic' interpretation Health (London, England: 1997) Volume 14 2010 [410]
Bakhtin's suggestion that a unified truth demands a 'multiplicity of consciousnesses' seems particularly relevant in the 'globally connected age'. At a time when the DIY/'punk ethic' seems to prevail online, and Wikipedia and blogging mean that anyone with access to the Internet can enter into public deliberation, it is worth considering the potential for mass communication systems to create meaningful changes in the way that 'disability' is theorized. Based on the findings of qualitative research, this study explores competing interpretations of disability, specifically dialogue online over the meaning of Down's syndrome, from the vantage point of an approach towards language analysis that emanates from the work of the Bakhtin Circle. It will be shown that, suitably revised and supplemented, elements of Bakhtinian theory provide powerful tools for understanding online relations and changes in the notion of disability. It will also be shown that, while activists in the disabled people's movement have managed to effect modest changes to the way that disability is theorized, both online and in the 'real world', there remains a great deal still to be achieved. This study allows us to understand better the social struggles faced by disabled people and the opportunities open to them.
Camihort, Karin Moyano Students as Creators of Knowledge: When Wikipedia Is the Assignment. Athletic Therapy Today Volume 14 2009
The article examines the use of the online encyclopedia Wikipedia in higher education. Wikipedia is said not to be suitable to be cited as a reference in academic writing, but is described as a valuable teaching aid in the classroom. Means of having college students present their research and classroom learning on the online encyclopedia are discussed.
Goodman, Rachel Students Contribute to a Global Community through Improvement of Wikipedia The American Biology Teacher Volume 70 Pages 138 2008 [411]
Loncarek, Karmen Surfing, Diving, and Epistemological Pleasure. Croatian Medical Journal Volume 50 2009
The author discusses the Internet surfing of medical patients, who may not submerge deeply enough in the activity to gain real knowledge, as modern scientific production has been deprived of the benefits of epistemology. She urges medical professionals to contribute to Wikipedia. The author disapproves of several aspects of modern scientific production, such as its being highly regulated, professionalized, and profit-driven.
Achterman, Doug Surviving Wikipedia. Knowledge Quest Volume 33 2005
Discusses the issue of information literacy in school library media programs in the United States. Impact of information literacy and teacher collaboration on student search habits; Criticism of information literacy; Reason why information literacy is actually harmful.
McCrae, J. & Collier, N. Synonym set extraction from the biomedical literature by lexical pattern discovery BMC bioinformatics Volume 9 Pages 159 2008
Hall, Elton TAKING NOTE The Chronicle of the Early American Industries Association, Inc. Pages 4 2010 [412]
A fellow EAIA member recently told me of an online encyclopedia called Wikipedia. I had never heard of it before, so I put the name into a Google search and was immediately confronted with the first ten out of about 387,000,000 results. Every time I looked again the number had grown by a few million. Here is a site that is clearly on the move. It's a fascinating approach to gathering and disseminating information.
Griffiths, M. Talking physics in the social Web Physics World Volume 20 2007
The Web is becoming an interactive medium for communicating and accessing information among physicists. Web 2.0 is making it easier for people to create and share content, ranging from digital photos to entries in user-edited encyclopedias. Several science magazines and academic journals have set up blogs, featuring reports from conferences or updates on the latest science news. In addition, many professional physicists have blogs of their own. Physics blogs are starting to have a real impact on the way researchers communicate. Physicists are using Wikipedia, which is almost as accurate as the Encyclopaedia Britannica. In addition to Web 2.0, MySpace, a social networking site, also features social tagging and trust networks and allows each of its users to produce a personal homepage with photos and details of their likes and dislikes.
Capocci, A.; Rao, F. & Caldarelli, G. Taxonomy and clustering in collaborative systems: the case of the on-line encyclopedia wikipedia Europhysics Letters Volume 81 2008 [413]
In this paper we investigate the nature and structure of the relation between imposed classifications and real clustering in a particular case of a scale-free network given by the on-line encyclopedia Wikipedia. We find a statistical similarity in the distributions of community sizes both by using the top-down approach of the categories division present in the archive and in the bottom-up procedure of community detection given by an algorithm based on the spectral properties of the graph. Regardless of the statistically similar behaviour, the two methods provide a rather different division of the articles, thereby signaling that the nature and presence of power laws is a general feature for these systems and cannot be used as a benchmark to evaluate the suitability of a clustering method.
Fox, Bob Teaching Through Technology: Changing Practices in Two Universities International Journal on E-Learning Volume 6 2007 [414]
Winder, D. Team working Information World Review 2005
Wikipedia is the perfect example of how social networking and online collaboration can combine to produce the perfect content creation and communication tool. And a lot of people are using Wikipedia - the popular buzz on it is akin to how the world embraced Google a few years ago. The word "wiki" derives from the Hawaiian for quick ("wiki wiki") and perfectly describes the way that anyone can edit, create, or delete content within a wiki site. It sounds like a recipe for disaster but in practice works amazingly well because mistakes can be corrected immediately without having to jump through hoops to get someone else to do it. In this paper the author deals with wiki services and social computing and casts his eye over what these powerful collaborative working tools, namely Confluence, SocialText.org, EditHe
Lucky, R. Technical Publications and the Internet IEEE Spectrum Volume 45 Pages 25 2008
Not all technical publications are freely accessible on the Web. The first argument that comes to mind is that institutions must restrict their publications to their members to keep those members. There are a number of other arguments against free access to technical publications, including the revenue that libraries and publications bring to the institution. The Internet community has been inventing new ways to convey information and to collaborate in understanding it -- consumer reviews, discussion forums, blogs, community filtering, and the Wikipedia model. An interesting experiment that has come to the author's attention is a new policy called publish first, review later.
McFedries, Paul Technically speaking: It's a wiki, wiki world IEEE Spectrum Volume 43 Pages 88 2006 [415]
Stephen Colbert of the US cable TV show "The Colbert Report" has coined new words that have to do with Wikipedia. In one of his programs he urged his viewers to "apply [Wikipedia] principles to all information." Wikipedia is by far the most famous wiki, but there are thousands of others. In general, crowdsourcing has become so big because of thousands of Wikipedians.
Aragon, Janni TECHNOLOGIES AND PEDAGOGY: HOW YOUTUBING, SOCIAL NETWORKING, AND OTHER WEB SOURCES COMPLEMENT THE CLASSROOM. Feminist Collections: A Quarterly of Women's Studies Resources Volume 28 Pages 45 2007
The author shares her increasing use of online sources as teaching tools in her women's studies and political science courses. She discloses that she has been using YouTube, MySpace and Facebook for her classroom discussions. She explains that these online sources attract the attention of students because of the ease of access to such sites. Her students send her video clips from YouTube, which relate to the course content. Meanwhile, she warns her students not to cite Wikipedia in their research papers.
Kimura, Bert & Ho, Curtis Technology Trends in Learning and Implications for Intercultural Exchange Global Learn Asia Pacific 2010 [416]
Caverly, David C. & Ward, Anne Techtalk: Wikis and Collaborative Knowledge Construction. Journal of Developmental Education Volume 32 2008
The article explores the use of wikis in college classrooms, particularly for developmental education (DE) students. A wiki refers to a variety of dynamic Web pages that can be edited using Web browsers. Examples of wikis include Wikipedia, MySpace and YouTube. It discusses the vulnerability of wikis like Wikipedia to present a constructed reality dependent on those who post. Information is presented on the instructional applications of wikis, namely resource wikis, presentation wikis, gateway wikis, simulation wikis and illuminated wikis.
Morgan, Sarah Kline TeenLibWiki: The Teen Librarian's Wikipedia. Young Adult Library Services Volume 5 Pages 51 2007
The article evaluates the web site TeenLibWiki: The Teen Librarian's Wikipedia, available at http://yalibrarian.com/yalib_wiki.
Logan, Darren W.; Sandal, Massimo; Gardner, Paul P.; Manske, Magnus & Bateman, Alex Ten Simple Rules for Editing Wikipedia. PLoS Computational Biology Volume 6 2010
The article offers tips on how to edit Wikipedia, an online encyclopedia containing millions of English language articles. It suggests that Internet users create a user account in Wikipedia because it offers privacy and security. It reminds Internet users that Wikipedia is different from blogs that encourage editorializing and tells users to treat other editors as collaborators. It also advises them to know the audience and avoid infringing copyright.
Cho, Hichang; Chen, Meihui & Chung, Siyoung Testing an Integrative Theoretical Model of Knowledge-Sharing Behavior in the Context of Wikipedia Journal of the American Society for Information Science and Technology Volume 61 2010 [417]
This study explores how and why people participate in collaborative knowledge-building practices in the context of Wikipedia. Based on a survey of 223 Wikipedians, this study examines the relationship between motivations, internal cognitive beliefs, social-relational factors, and knowledge-sharing intentions. Results from structural equation modeling (SEM) analysis reveal that attitudes, knowledge self-efficacy, and a basic norm of generalized reciprocity have significant and direct relationships with knowledge-sharing intentions. Altruism (an intrinsic motivator) is positively related to attitudes toward knowledge sharing, whereas reputation (an extrinsic motivator) is not a significant predictor of attitude. The study also reveals that a social-relational factor, namely, a sense of belonging, is related to knowledge-sharing intentions indirectly through different motivational and social factors such as altruism, subjective norms, knowledge self-efficacy, and generalized reciprocity. Implications for future research and practice are discussed.
Leary, John Testing Wiki Credibility. Communications of the ACM Volume 49 Pages 12 2006
A letter to the editor is presented in response to the article "Wikipedia Risks" by Peter Denning in the December 2005 issue.
Verbert, K.; Ochoa, X. & Duval, E. The ALOCOM framework: towards scalable content reuse JoDI - Journal of Digital Information Volume 9 2009
This paper presents a framework that enables flexible content reuse. Unlike the usual practice where document components, such as images, definitions, text fragments, tables or diagrams, are assembled manually through copy-and-paste, the framework enables on-the-fly access and reuse. Retrieval of relevant components is enabled by automatic decomposition of legacy documents and storage of individual components, enriched with metadata. Furthermore, the automatic assembly of these components in mainstream authoring tools is supported. The paper describes the framework and its current support for reassembling PowerPoint, Wikipedia and SCORM components in authoring tools. In addition, an evaluation is presented that aims to assess the effectiveness and efficiency of such content reuse for presentations.
Perry, M. The appliance of science: Web 2.0 Information World Review 2008
This paper shows that scientists use Web 2.0, which serves as another outlet for them to discuss and easily publish their scientific research. It also allows people to share information and replicate what a university environment would be like. The online encyclopedia Wikipedia and social networking sites such as MySpace and Facebook are transforming the dissemination of research.
Anthony, Denise; Smith, Sean W. & Williamson, Timothy THE CASE OF THE ONLINE ENCYCLOPEDIA WIKIPEDIA. Rationality & Society Volume 21 2009
An important organizational innovation enabled by the revolution in information technologies is 'open source' production which converts private commodities into essentially public goods. Similar to other public goods, incentives for reputation and group identity appear to motivate contributions to open source projects, overcoming the social dilemma inherent in producing such goods. In this paper we examine how contributor motivations affect the type of contributions made to the open source online encyclopedia Wikipedia. As expected, we find that registered participants, motivated by reputation and commitment to the Wikipedia community, make many contributions with high reliability. Surprisingly, however, we find the highest reliability from the vast numbers of anonymous 'Good Samaritans' who contribute only once. Our findings of high reliability in the contributions of both Good Samaritans and committed 'zealots' suggest that open source production succeeds by altering the scope of production such that a critical mass of contributors can participate.
Purdy, J.P. The Changing Space of Research: Web 2.0 and the Integration of Research and Writing Environments Computers and Composition Volume 27 2010 [418]
Web 2.0 challenges the artificial compartmentalization of research and writing that often characterizes instruction in composition classes. In Web 2.0, writing and researching activities are increasingly integrated both spatially and conceptually. This article contends that, with this integration, Web 2.0 technologies showcase how research and writing together participate in knowledge production. Through analyzing specific technologies that incorporate Web 2.0 features, including Wikipedia, JSTOR, ARTstor, and del.icio.us, this article argues that including Web 2.0 technologies in composition courses as objects of analysis and as writing and researching resources offers a means to bridge the gap between students' online proficiencies and academic writing tasks.
Cheong, Pauline Hope; Halavais, Alexander & Kwon, Kyounghee The Chronicles of Me: Understanding Blogging as a Religious Practice. Journal of Media & Religion Volume 7 2008
Blogs represent an especially interesting site of online religious communication. Analysis of the content of 200 blogs with mentions of topics related to Christianity, as well as interviews of a subset of these bloggers, suggests that blogs provide an integrative experience for the faithful, not a "third place," but a melding of the personal and the communal, the sacred and the profane. Religious bloggers operate outside the realm of the conventional nuclear church as they connect and link to mainstream news sites, other nonreligious blogs, and online collaborative knowledge networks such as Wikipedia. By chronicling how they experience faith in their everyday lives, these bloggers aim to communicate not only to their communities and to a wider public but also to themselves. This view of blogging as a contemplative religious experience differs from the popular characterization of blogging as a trivial activity.


Malone, TW; Laubacher, R & Dellarocas, C The Collective Intelligence Genome MIT SLOAN MANAGEMENT REVIEW Volume 51 2010 [419]
Google. Wikipedia. Threadless. All are platinum exemplars of collective intelligence in action. Two of them are famous. The third is getting there. Each of the three helps demonstrate how large, loosely organized groups of people can work together electronically in surprisingly effective ways, sometimes even without knowing that they are working together, as in the case of Google. In the authors' work at MIT's Center for Collective Intelligence, they have gathered nearly 250 examples of web-enabled collective intelligence. After examining these examples in depth, they identified a relatively small set of building blocks that are combined and recombined in various ways in different collective intelligence systems. This article offers a new framework for understanding those systems - and more important, for understanding how to build them. It identifies the underlying building blocks - the "genes" - that are at the heart of collective intelligence systems. It explores the conditions under which each gene is useful. And it begins to suggest the possibilities for combining and recombining these genes to not only harness crowds in general, but to harness them in just the way that your organization needs.
Polukarova, N.A. The concept of open editing from the copyright viewpoint Automatic Documentation and Mathematical Linguistics Volume 41 2007 [420]
The principles of open editing in wiki technology are described as illustrated by the example of Wikipedia, a popular free Web encyclopedia. Information is given about GNU software, Uniform Computer Information Transaction Act, licenses for free software and their corresponding free user manuals, as well as about legal issues arising in connection with the free documentation license.


Wallace, D. P. The Democratization of Information? Wikipedia as a Reference Resource Reference & User Services Quarterly Volume 45 2005 [421]
Yuan, YC; Cosley, D; Welser, HT; Xia, L & Gay, G The Diffusion of a Task Recommendation System to Facilitate Contributions to an Online Community JOURNAL OF COMPUTER-MEDIATED COMMUNICATION Volume 15 2009 [422]
This paper studies the diffusion of SuggestBot, an intelligent task recommendation system that helps people find articles to edit in Wikipedia. We investigate factors that predict who adopts SuggestBot and its impact on adopters' future contributions to this online community. Analyzing records of participants' activities in Wikipedia, we found that both individual characteristics and social ties influence adoption. Specifically, we found that highly involved contributors were more likely to adopt SuggestBot; interpersonal exposure to innovation, cohesion, and tie homophily all substantially increased the likelihood of adoption. However, connections to prominent, high-status contributors did not influence adoption. Finally, although the SuggestBot innovation saw limited distribution, adopters made significantly more contributions to Wikipedia after adoption than nonadopter counterparts in the comparison group.
Hartling, F The Digital Author?: Authorship in the Digital Era PRIMERJALNA KNJIZEVNOST 2009
Since the birth of the World Wide Web as the most successful application of the Internet, there have been hopes of literary theorists (Landow, Bolter) that the new digital media would finally allow for the "death of the author" and the birth of the "writing reader". The hypertext as a new genre of text seemed to be powerful enough to fulfill the older hopes of the poststructuralists (Barthes, Foucault). Although these euphoric hopes have been abandoned by literary theory for the most part, the Internet in the actual literary production still seems to have the power to be an "author-less" media in principle: in the oft-discussed encyclopaedia "Wikipedia", the collaboratively written text supposedly is more important than the authors. Literary experiments in the digital media are exploring how text can be written just by text-algorithms. These projects finally do not need writers anymore; they are using data taken from search engines. But this somewhat naive idea of an "authorless" digital media clearly can be reined. First, the author has been revived by the new media and continues to thrive within it. Second, in contrast to the prediction of huge "authorless" collaborative text-production in online journalism, it is hard to find any collaborative works of literature. Third, even with collaborative projects or "codeworks", the function of an author does not disappear but is spread over different persons, which can even lead to a "dissociated" authorship. The author cannot disappear or "die" in the Internet because its characteristics will not allow this to happen. Therefore the Internet does not stand for the "death" of the author; it actually appears to be a fountain of youth for literary authorship instead. These findings are discussed using recent experiments with authorship in digital literature.
Sandars, John The e-learning site. Education for Primary Care Volume 19 2008
The article reports on e-learning. It focuses on the effectiveness of educational wikis that can be accessed and edited by anyone who wants to do so. This paper highlights the use of the Wikipedia resource in medical education, which provides collaborative protocols and various attachments available on the website.


Oboler, Andre; Steinberg, Gerald & Stern, Rephael The Framing of Political NGOs in Wikipedia through Criticism Elimination Journal of Information Technology & Politics Volume 7 2010 [423]
This article introduces criticism elimination, a type of information removal leading to a framing effect that impairs Wikipedia's delivery of a neutral point of view (NPOV) and ultimately facilitates a new form of gatekeeping with political science and information technology implications. This article demonstrates a systematic use of criticism elimination and categorizes the editors responsible into four types. We show that some types use criticism elimination to dominate and manipulate articles to advocate political and ideological agendas. We suggest mitigation approaches to criticism elimination. The research is interdisciplinary and based on empirical analysis of the public edit histories.
Huss, Jon W; Lindenbaum, Pierre; Martone, Michael; Roberts, Donabel; Pizarro, Angel; Valafar, Faramarz; Hogenesch, John B & Su, Andrew I The Gene Wiki: community intelligence applied to human gene annotation Nucleic Acids Research Volume 38 Pages Database issue 2010 [424]
Annotating the function of all human genes is a critical, yet formidable, challenge. Current gene annotation efforts focus on centralized curation resources, but it is increasingly clear that this approach does not scale with the rapid growth of the biomedical literature. The Gene Wiki utilizes an alternative and complementary model based on the principle of community intelligence. Directly integrated within the online encyclopedia, Wikipedia, the goal of this effort is to build a gene-specific review article for every gene in the human genome, where each article is collaboratively written, continuously updated and community reviewed. Previously, we described the creation of Gene Wiki 'stubs' for approximately 9000 human genes. Here, we describe ongoing systematic improvements to these articles to increase their utility. Moreover, we retrospectively examine the community usage and improvement of the Gene Wiki, providing evidence of a critical mass of users and editors. Gene Wiki articles are freely accessible within the Wikipedia web site, and additional links and information are available at http://en.wikipedia.org/wiki/Portal:Gene_Wiki.
Marche, Stephen The iPad and Twenty-First-Century Humanism Queen's Quarterly Volume 117 Pages 195 2010 [425]
Hilbert, Martin The Maturing Concept of E-Democracy: From E-Voting and Online Consultations to Democratic Value Out of Jumbled Online Chatter Journal of Information Technology & Politics Volume 6 2009
Early literature on e-democracy was dominated by euphoric claims about the benefits of e-voting (digital direct democracy) or continuous online citizen consultations (digital representative democracy). High expectations have gradually been replaced with more genuine approaches that aim to break with the dichotomy of traditional notions of direct and representative democracy. The ensuing question relates to the adequate design of information and communication technology (ICT) applications to foster such visions. This article contributes to this search and discusses issues concerning the adequate institutional framework. Recently, so-called Web 2.0 applications, such as social networking and Wikipedia, have proven that it is possible for millions of users to collectively create meaningful content online. While these recent developments are not necessarily labeled e-democracy in the literature, this article argues that they and related applications have the potential to fulfill the promise of breaking with the longstanding democratic trade-off between group size (direct mass voting on predefined issues) and depth of argument (deliberation and discourse in a small group). Complementary information-structuring techniques are at hand to facilitate large-scale deliberations and the negotiation of interests between members of a group. This article presents three of these techniques in more depth: weighted preference voting, argument visualization, and the Semantic Web initiative. Notwithstanding these developments, the maturing concept of e-democracy still faces serious challenges. Questions remain in political and computer science disciplines that ask about adequate institutional frameworks, the omnipresent democratic challenges of equal access and free participation, and the appropriate technological design.
Lucia, Leão The mirror labyrinth: reflections on bodies and consciousness at cybertimes Technoetic Arts: a journal of speculative research Volume 3 2005
Discusses references to the body in cyber artworks, and how new technologies such as the World Wide Web are changing perceptions of the body. The author defines "the body" for the purpose of the article as encompassing consciousness and involved in the process of all actions, and outlines theoretical relations between body and cyberspace. She discusses concepts of multiple bodies in belief systems including Ancient India, Theosophy and Afro-Brazilian religions, and discusses instances of the Egyptian concept of "ka" in the arts. The author analyses avatars as representations of the double body in the digital world, which are capable of interacting with others from all over the earth, and discusses their design by artist Rebecca Allen. She discusses multiple body-related cyber artworks by Mark Napier, Natalie Bookchin, Tina LaPorta, the Critical Art Ensemble, Diana Domingues and the Tsunami group, and describes the collaborative 'Wikipedia' encyclopaedia as an example of a cyberactivism project.
Lewis, Paul; Davies, Christie; Kuipers, Giselinde; Martin, Rod A.; Oring, Elliott & Raskin, Victor The Muhammad cartoons and humor research: A collection of essays. Humor: International Journal of Humor Research Volume 21 2008
At the 2006 conference of the International Society for Humor Studies (Danish University of Education, Copenhagen), several panels addressed issues raised by the Muhammad cartoon story. Among these, a colloquium organized by Paul Lewis and decorously titled Transnational Ridicule and Response focused on the implications for humor research of the events surrounding the publication of the cartoons. Along with other materials, panelists were encouraged to review summaries of and timelines for the story available from the BBC and Wikipedia. Of the questions considered by panelists, the following drew interesting and, at times, provocative responses: Were the cartoons humorous; if so, did they represent a distinct or new kind of humor? Were the modes of global transmission of the cartoons new? Does the story have implications for ongoing humor research and advocacy? The goal was to approach the controversy not as partisans with particular political outlooks but as students of humor. The brief essays collected here were written following the conference by members of the panel (Christie Davies, Giselinde Kuipers, Paul Lewis, and Victor Raskin) and by two others who attended the colloquium (Elliott Oring and Rod A. Martin). After reviewing the essays, HUMOR editor Salvatore Attardo suggested that each of the participants be invited to read what the others had written and submit a brief response. Responses included here were received from Davies, Kuipers, Lewis, Oring, and Raskin.
Hyatt, J. The Oh-So-Practical Magic of Open-Source Innovation MIT Sloan Management Review Volume 50 Pages 15 2008
MySQL AB, the business Marten Mickos has built since 2001, has committed itself to open-source innovation since its founding in 1995, with results successful enough that Sun Microsystems Inc. acquired what is the world's fastest-growing database vendor earlier this year for $1 billion. Like such well-known proponents as Linux, the operating system, and Wikipedia, the online encyclopedia, MySQL shares its source code for free, giving programmers everywhere permission to debug, add features or otherwise modify the product before redistributing it. In an interview, Mickos, now a senior vice president at Sun, discusses freely sharing his ideas about why this Internet-age version of a barn-raising produces superior innovation, what murky motivations keep all those developers devoted and why Leonardo da Vinci is the father of the open-source movement.
Powell, Louie The Paradox of Wikipedia IEEE Industry Applications Magazine Volume 14 Pages 2 2008 [426]
No abstract available
Shachaf, P. The paradox of expertise: is the Wikipedia reference desk as good as your library? Journal of Documentation Volume 65 2009 [427]
Purpose - The purpose of this paper is to examine the quality of answers on the Wikipedia reference desk, and to compare it with library reference services. It aims to examine whether Wikipedia volunteers outperform expert reference librarians and exemplify the paradox of expertise. Design/methodology/approach - The study applied content analysis to a sample of 434 messages (77 questions and 357 responses) from the Wikipedia reference desk and focused on three SERVQUAL quality variables: reliability (accuracy, completeness, verifiability), responsiveness, and assurance. Findings - The study reports that on all three SERVQUAL measures quality of answers produced by the Wikipedia reference desk is comparable with that of library reference services. Research limitations/implications - The collaborative social reference model matched or outperformed the dyadic reference interview and should be further examined theoretically and empirically. The generalizability of the findings to other similar sites is questionable. Practical implications - Librarians and library science educators should examine the implications of the social reference on the future role of reference services. Originality/value - The study is the first to: examine the quality of the Wikipedia Reference Desk; extend research on Wikipedia quality; use SERVQUAL measures in evaluating QA sites; and compare QA sites with traditional reference services.


Kim, Ji Yeon; Gudewicz, Thomas M; Dighe, Anand S & Gilbertson, John R The pathology informatics curriculum wiki: Harnessing the power of user-generated content Journal of Pathology Informatics Volume 1 2010 [428]
BACKGROUND: The need for informatics training as part of pathology training has never been so critical, but pathology informatics is a wide and complex field and very few programs currently have the resources to provide comprehensive educational pathology informatics experiences to their residents. In this article, we present the "pathology informatics curriculum wiki", an open, on-line wiki that indexes the pathology informatics content in a larger public wiki, Wikipedia (and other online content), and organizes it into educational modules based on the 2003 standard curriculum approved by the Association for Pathology Informatics (API). METHODS AND RESULTS: In addition to implementing the curriculum wiki at http://pathinformatics.wikispaces.com, we have evaluated pathology informatics content in Wikipedia. Of the 199 non-duplicate terms in the API curriculum, 90% have at least one associated Wikipedia article. Furthermore, evaluation of articles on a five-point Likert scale showed high scores for comprehensiveness (4.05), quality (4.08), currency (4.18), and utility for the beginner (3.85) and advanced (3.93) learners. These results are compelling and support the thesis that Wikipedia articles can be used as the foundation for a basic curriculum in pathology informatics. CONCLUSIONS: The pathology informatics community now has the infrastructure needed to collaboratively and openly create, maintain and distribute the pathology informatics content worldwide (Wikipedia) and also the environment (the curriculum wiki) to draw upon its own resources to index and organize this content as a sustainable basic pathology informatics educational resource. The remaining challenges are numerous, but largest by far will be to convince the pathologists to take the time and effort required to build pathology informatics content in Wikipedia and to index and organize this content for education in the curriculum wiki.
Leslie, Mitch The People's Encyclopedia. Science Volume 301 Pages 1299 2003
The do-it-yourself spirit flourishes on the Internet, where for the last two-and-a-half years, readers have been writing and editing their own encyclopedia, known as Wikipedia. It now has more than 152,000 articles under way in English, and the project's participants aim to create the world's largest encyclopedia. Wikipedia offers a substantial science section, with biographies of scientists such as the late paleontologist Stephen Jay Gould, backgrounds on subjects such as relativity and acid-base reactions, and overviews of major disciplines. These articles brim with links to other Wikipedia entries and outside sources. Instead of undergoing formal peer review by experts, these articles endure the scrutiny of readers, who can edit, correct, and polish the prose.
Chen, Hsin-liang The perspectives of higher education faculty on Wikipedia Electronic Library Volume 28 2010 [429]
Purpose - The purpose of this paper is to investigate whether higher education instructors use information from Wikipedia for teaching and research. Design/methodology/approach - This is an explorative study to identify important factors regarding user acceptance and use of emerging information resources and technologies in the academic community. A total of 201 participants around the world answered an online questionnaire administered by a commercial provider. The questionnaire consisted of 16 Likert-scaled questions to assess participants' agreement with each question along with an optional open-ended explanation. Findings - The findings of this project confirm that internet access was related to faculty technology use. Online resources and references were ranked the first choice by the participants when searching for familiar and unfamiliar topics. The investigator found that participants' academic ranking status, frequency of e-mail use and academic discipline were related to their use of online databases, web-based information and directing students to information from the Web. Although the participants might often use online resources for research and teaching, Wikipedia's credibility was the participants' major concern. Research limitations/implications - This project is an exploratory study and more considerations are needed for this research area. Originality/value - The paper shows that participants who used online databases more often showed a negative attitude toward Wikipedia. Those participants who used Wikipedia for teaching and research also allowed students to use information from Wikipedia and were more likely to be contributors to Wikipedia.
Weiss, Aaron The power of collective intelligence netWorker - Beyond file-sharing: collective intelligence Volume 9 2005 [430]
Though the overall health of the tech sector may have looked bleak a few years back, at least in the eyes of financial analysts, a blend of old and new ideas, evolving technologies, and changing cultural values have recently given the online world new vigor. With content derived primarily by community contribution, popular and influential services like Flickr and Wikipedia represent the emergence of "collective intelligence" as the new driving force behind the evolution of the Internet.
Rask, M. The reach and richness of Wikipedia: is Wikinomics only for rich countries? First Monday Volume 13 2008
This study examined the impact of technological and economic factors on the global diffusion of Wikinomics among developed and developing countries. Examining different language editions of Wikipedia, this study found significant correlation between a variety of socio-economic factors and involvement in Wikipedia.
Preece, Jennifer & Shneiderman, Ben The Reader-to-Leader Framework: Motivating Technology-Mediated Social Participation AIS Transactions on Human-Computer Interaction Volume 1 2009 [431]
Billions of people participate in online social activities. Most users participate as readers of discussion boards, searchers of blog posts, or viewers of photos. A fraction of users become contributors of user-generated content by writing consumer product reviews, uploading travel photos, or expressing political opinions. Some users move beyond such individual efforts to become collaborators, forming tightly connected groups with lively discussions whose outcome might be a Wikipedia article or a carefully edited YouTube video. A small fraction of users become leaders, who participate in governance by setting and upholding policies, repairing vandalized materials, or mentoring novices. We analyze these activities and offer the Reader-to-Leader Framework with the goal of helping researchers, designers, and managers understand what motivates technology-mediated social participation. This will enable them to improve interface design and social support for their companies, government agencies, and non-governmental organizations. These improvements could reduce the number of failed projects, while accelerating the application of social media for national priorities such as healthcare, energy sustainability, emergency response, economic development, education, and more.
Daub, Jennifer; Gardner, Paul P; Tate, John; Ramsköld, Daniel; Manske, Magnus; Scott, William G; Weinberg, Zasha; Griffiths-Jones, Sam & Bateman, Alex The RNA WikiProject: community annotation of RNA families RNA (New York, N.Y.) Volume 14 2008 [432]
The online encyclopedia Wikipedia has become one of the most important online references in the world and has a substantial and growing scientific content. A search of Google with many RNA-related keywords identifies a Wikipedia article as the top hit. We believe that the RNA community has an important and timely opportunity to maximize the content and quality of RNA information in Wikipedia. To this end, we have formed the RNA WikiProject (http://en.wikipedia.org/wiki/Wikipedia:WikiProject_RNA) as part of the larger Molecular and Cellular Biology WikiProject. We have created over 600 new Wikipedia articles describing families of noncoding RNAs based on the Rfam database, and invite the community to update, edit, and correct these articles. The Rfam database now redistributes this Wikipedia content as the primary textual annotation of its RNA families. Users can, therefore, for the first time, directly edit the content of one of the major RNA databases. We believe that this Wikipedia/Rfam link acts as a functioning model for incorporating community annotation into molecular biology databases.
Prasarnphanich, P. & Wagner, C. The role of wiki technology and altruism in collaborative knowledge creation Journal of Computer Information Systems Volume 49 2009
Collaborative knowledge creation is presently being reshaped by the use of Web 2.0 technologies such as wikis. Wikipedia, arguably the most successful application of wiki technology, demonstrates the feasibility and success of this form of collaborative knowledge creation (in a broad sense) within a self-organizing, open access community. The study seeks to understand the success of the public wiki model, with Wikipedia as the test case, assessing both technology and participant motivations. The study finds that, contrary to the motivation in open source software development, altruism is a prevalent driver for participation, although mixed motives clearly exist. In particular, while participants have both individualistic and collaborative motives, collaborative (altruistic) motives dominate. The success of the collaboration model embedded in Wikipedia thus appears to be related to wiki technology and the "wiki way" (i.e. social norms) of collaboration.
Kane, G. C. & Fichman, R. G. The Shoemaker's Children: Using Wikis for Information Systems Teaching, Research, and Publication MIS Quarterly Volume 33 2009
This paper argues that Web 2.0 tools, specifically wikis, have begun to influence business and knowledge sharing practices in many organizations. Information Systems researchers have spent considerable time exploring the impact and implications of these tools in organizations, but those same researchers have not spent sufficient time considering whether and how these new technologies may provide opportunities for us to reform our core practices of research, review, and teaching. To this end, this paper calls for the IS discipline to engage in two actions related to wikis and other Web 2.0 tools. First, the IS discipline ought to engage in critical reflection about how wikis and other Web 2.0 tools could allow us to conduct our core processes differently. Our existing practices were formulated during an era of paper-based exchange; wikis and other Web 2.0 tools may enable processes that could be substantively better. Nevertheless, users can appropriate information technology tools in unexpected ways, and even when tools are appropriated as expected there can be unintended negative consequences. Any potential changes to our core processes should, therefore, be considered critically and carefully, leading to our second recommended action. We advocate and describe a series of controlled experiments that will help assess the impact of these technologies on our core processes and the associated changes that would be necessary to use them. We argue that these experiments can provide needed information regarding Web 2.0 tools and related practice changes that could help the discipline better assess whether or not new practices would be superior to existing ones and under which circumstances.
Grier, David Alan The Spirit of Combination Computer Volume 43 2010
We find new ideas by starting from where we are and asking the simple question, "Where can we go from here?"


Chen, Hsin-liang The use and sharing of information from Wikipedia by high-tech professionals for work purposes Electronic Library Volume 27 2009 [433]
Purpose - The aim of this paper is to focus on discovering whether high-tech professionals as a user community search for information from Wikipedia to fulfill their job duties and, if they do, how they share information with co-workers and clients. Design/methodology/approach - An online questionnaire was used, administered by a commercial provider. The questionnaire consisted of 15 Likert-scaled questions to assess participants' agreement with each question along with an optional open-ended explanation. A total of 68 participants successfully answered the questionnaire. Participants' Likert rating scores were analyzed by two-way ANOVA, one-way ANOVA and correlational analyses using SPSS. Findings - The analyses examined relationships among participants' characteristics, their use of information resources for research and teaching, information-sharing behaviors, and use/non-use of Wikipedia. Findings indicated that the participants treated Wikipedia as a ready reference for general information. Their concern is that Wikipedia only has a limited number of entries available at this point. They suggested that Wikipedia needed to improve the contribution and editorial process and to make it more rigorous. Originality/value - Personal information infrastructure affects how the high-tech professionals surveyed use and share information from Wikipedia for work. In the current situation, the participants consider Wikipedia to be a developing information resource and show less interest in contributing to it. The project is an exploratory study and more considerations are needed for this research area.
Fiedler, T. The Web's Pathway to Accuracy Nieman Reports Volume 62 Pages 40 2008
Wikipedia is the wildly popular Internet encyclopedia that proudly operates on the idea that there is more wisdom to be found in its crowds of anonymous readers than in the brains of editors and academics. Here, Fiedler details how Wikipedia's credibility as a source of information was badly damaged. He argues that the damage came shortly after Wikipedia carried a bogus entry stating that John Seigenthaler, Sr., a prominent journalist at USA Today, may have played a role in the assassinations of President John F. Kennedy in 1963 and Robert Kennedy in 1968. He notes that none of the claims were true, yet Wikipedia kept the entry posted for four months and it was picked up and reproduced without change on two other web sites.
Mendoza, Hannah Rose The WikiID: An Alternative Approach to the Body of Knowledge. Journal of Interior Design Volume 34 2009
A discussion of the locus of design knowledge is currently underway, as well as a search for clear boundaries defined by a formal Body of Knowledge (BoK). Most attempts to define a BoK involve the creation of "jurisdictional boundaries of knowledge" that "allow those who possess this knowledge to claim authority over its application" (Guerin & Thompson, 2004, p. 1). This claim is attractive, but such control may no longer be an option in the Internet Age, when even the call for the discussion of the BoK definition process is on the Web. Marshall-Baker (2005) argued that "the moment knowledge is bordered it is no longer knowledge" (p. xiv). Whereas data and information are easily captured and generalized, knowledge is specific to users and their evolving understandings, implying purposeful application over time. This paper explores knowledge as process, transcending boundaries, and seeks to answer not "where" the locus lies but rather "what" that locus could be. Using a feminist framework, I argue that, in conjunction with the work done thus far, we should move toward the creation of an inclusive model for the BoK. In such a model, the value of the profession is felt as a result of inclusion in and interaction with the knowledge creation process. I propose that the BoK should not be a printed document but a Web-based organizational system that supports change and innovation. Wikipedia provides this type of inclusive, mutable system, and the same framework could be applied to the creation of a systemic BoK. I call this creation the WikiID. (Author abstract)
Rose, A. The Wikinews Ace Columbia Journalism Review Volume 47 Pages 22 2009
Rose features David Miller, a star Wikinews reporter. Miller's journalistic sideline began in 2005 after he dropped out of Fordham Law School. He couldn't afford the tuition for his final year because he missed a few credit-card payments and didn't qualify for loans. His older sister gave him a low-end digital camera for his birthday and he began snapping photos around the city, which he'd then upload to relevant Wikipedia articles that had no images. It was around this time he created his pseudonym, David Shankbone. Eventually, Miller got tired of just taking photos. He'd always considered himself a writer (he wrote about the war in Iraq and the aftermath of Hurricane Katrina for a student news blog at Fordham), so when a volunteer Wikipedia editor suggested he check out the fledgling Wikinews, Miller decided to broaden his journalistic repertoire.
Austin, M. The wikipedia phenomenon Information World Review 2005
Wikipedia is a rapidly expanding encyclopaedia of communally assembled information, but can information professionals really trust the knowledge that is being put together here? Or is the world now embracing a new form of knowledge repository that undermines many of our sacred principles? The author gets deep inside the wiki concept of collective wisdom. Wikipedia is an online collaborative encyclopaedia created by nerds and geeks and often, if not always, inaccurate in some way. It uses a wiki as its base to control the version management and handle the database and user information.
Arney, C. The Wikipedia Revolution Mathematics and Computer Education Volume 44 2010 [434]
Kowalsky, David The Wikipedia Revolution: How a Bunch of Nobodies Created the World's Greatest Encyclopedia. Technical Communication Volume 57 2010
The article reviews the book The Wikipedia Revolution: How a Bunch of Nobodies Created the World's Greatest Encyclopedia by Andrew Lih.
Denoyer, Ludovic & Gallinari, Patrick The Wikipedia XML corpus ACM SIGIR Forum Volume 40 2006 [435]
Wikipedia is a well-known free content, multilingual encyclopedia written collaboratively by contributors around the world. Anybody can edit an article using a wiki markup language that offers a simplified alternative to HTML. This encyclopedia is composed of millions of articles in different languages.
Villano, P. The wizard of oz effect and a new Emerald city On the Horizon Volume 18 2010 [436]
Purpose - The purpose of this paper is to develop three key concepts to the future of knowledge work: knowledge work is a natural, ever-changing process - not something that can be certified; open education, connection and interaction are the way of the future; and the future of knowledge work hinges on enabling shared practical knowledge globally. Design/methodology/approach - The paper is filled with metaphor mixed with research from recognized knowledge management (KM) experts as well as extensive social media sources such as Wikipedia. The intent is to demonstrate as well as describe the natural process and potential of global connection and interaction. Findings - Knowledge can be found in one's own back yard (or as close as one's pocket) and one's ability to connect to the world. Open education will be increasingly available to support community-generated certification of knowledge workers. Originality/value - The paper uses a unique approach to forward a new, inclusive way of looking at knowledge worker certification. It also suggests pragmatic approaches for accomplishing community-generated certification.
Kasneci, Gjergji; Ramanath, Maya; Suchanek, Fabian & Weikum, Gerhard The YAGO-NAGA approach to knowledge discovery SIGMOD Record Volume 37 2008 [437]
This paper gives an overview on the YAGO-NAGA approach to information extraction for building a conveniently searchable, large-scale, highly accurate knowledge base of common facts. YAGO harvests infoboxes and category names of Wikipedia for facts about individual entities, and it reconciles these with the taxonomic backbone of WordNet in order to ensure that all entities have proper classes and the class system is consistent. Currently, the YAGO knowledge base contains about 19 million instances of binary relations for about 1.95 million entities. Based on intensive sampling, its accuracy is estimated to be above 95 percent. The paper presents the architecture of the YAGO extractor toolkit, its distinctive approach to consistency checking, its provisions for maintenance and further growth, and the query engine for YAGO, coined NAGA. It also discusses ongoing work on extensions towards integrating fact candidates extracted from natural-language text sources.
Rader, Heather "they call me newbie." Teacher Librarian Volume 34 2006
The article presents information on the term newbie. According to Wikipedia: The Free Encyclopedia, a newbie is a newcomer to a particular field, the term being commonly used on the Internet, where it might refer to new, inexperienced, or ignorant users of a game, a newsgroup, an operating system, or the Internet itself. The term is generally regarded as an insult, although in many cases it is used for purposes of negative reinforcement by more experienced or knowledgeable people.
Bush, Gail Thinking Around the Corner: The Power of Information Literacy. Phi Delta Kappan Volume 90 2009
The article discusses methods teachers can use to increase information literacy in students who must function in an information society. The author suggests teachers should train students to examine sources of information, such as websites or newspapers, for accuracy and bias. She recommends that students be instructed to verify facts and notes the unreliability of the Internet encyclopedia Wikipedia. Teachers can use global topics such as globalization and environmental issues to help students explore local connections and develop critical thinking skills. Social networking websites and multi-user virtual environments can be used to promote cooperative learning.
Reagle, Joseph TIMELINES: Wikipedia: the happy accident interactions - Design Fiction Volume 16 2009 [438]
Joseph Reagle's work on Wikipedia and its predecessors opened my eyes to a fascinating history. I'm delighted he has provided this account of the origin of the most interesting digital object since the Web itself. - Jonathan Grudin
Atanassova, V. Topics of Bioengineering in Wikipedia International Journal Volume 13 2009
Theobald, Martin; Bast, Holger; Majumdar, Debapriyo; Schenkel, Ralf & Weikum, Gerhard TopX: efficient and versatile top-k query processing for semistructured data The VLDB Journal - The International Journal on Very Large Data Bases Volume 17 2008 [439]
Recent IR extensions to XML query languages such as XPath 1.0 Full-Text or the NEXI query language of the INEX benchmark series reflect the emerging interest in IR-style ranked retrieval over semistructured data. TopX is a top-k retrieval engine for text and semistructured data. It terminates query execution as soon as it can safely determine the k top-ranked result elements according to a monotonic score aggregation function with respect to a multidimensional query. It efficiently supports vague search on both content- and structure-oriented query conditions for dynamic query relaxation with controllable influence on the result ranking. The main contributions of this paper unfold into four main points: (1) fully implemented models and algorithms for ranked XML retrieval with XPath Full-Text functionality, (2) efficient and effective top-k query processing for semistructured data, (3) support for integrating thesauri and ontologies with statistically quantified relationships among concepts, leveraged for word-sense disambiguation and query expansion, and (4) a comprehensive description of the TopX system, with performance experiments on large-scale corpora like TREC Terabyte and INEX Wikipedia.
Yu, Haiyuan; Jansen, Ronald; Stolovitzky, Gustavo & Gerstein, Mark Total ancestry measure: quantifying the similarity in tree-like classification, with genomic applications Bioinformatics (Oxford, England) Volume 23 2007 [440]
MOTIVATION: Many classifications of protein function such as Gene Ontology (GO) are organized in directed acyclic graph (DAG) structures. In these classifications, the proteins are terminal leaf nodes; the categories 'above' them are functional annotations at various levels of specialization and the computation of a numerical measure of relatedness between two arbitrary proteins is an important proteomics problem. Moreover, analogous problems are important in other contexts in large-scale information organization, e.g. the Wikipedia online encyclopedia and the Yahoo and DMOZ web page classification schemes. RESULTS: Here we develop a simple probabilistic approach for computing this relatedness quantity, which we call the total ancestry method. Our measure is based on counting the number of leaf nodes that share exactly the same set of 'higher up' category nodes in comparison to the total number of classified pairs (i.e. the chance for the same total ancestry). We show such a measure is associated with a power-law distribution, allowing for the quick assessment of the statistical significance of shared functional annotations. We formally compare it with other quantitative functional similarity measures (such as shortest path within a DAG, lowest common ancestor shared and Azuaje's information-theoretic similarity) and provide concrete metrics to assess differences. Finally, we provide a practical implementation for our total ancestry measure for GO and the MIPS functional catalog and give two applications of it in specific functional genomics contexts. AVAILABILITY: The implementations and results are available through our supplementary website at: http://gersteinlab.org/proj/funcsim. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Damrosch, D. Toward a History of World Literature New Literary History Volume 39 2008 [441]


Naslund, J. A. Towards School Library 2.0: An Introduction to Social Software Tools for Teacher Librarians School Libraries Worldwide Volume 14 2008 [442]
Devereux, Barry; Pilkington, Nicholas; Poibeau, Thierry & Korhonen, Anna Towards Unrestricted, Large-Scale Acquisition of Feature-Based Conceptual Representations from Corpus Data Research on Language and Computation Volume 7 Pages 02/04/2011 2009 [443]
In recent years a number of methods have been proposed for the automatic acquisition of feature-based conceptual representations from text corpora. Such methods could offer valuable support for theoretical research on conceptual representation. However, existing methods do not target the full range of concept-relation-feature triples occurring in human-generated norms (e.g. flute produce sound) but rather focus on concept-feature pairs (e.g. flute - sound) or triples involving specific relations only (e.g. is-a or part-of relations). In this article we investigate the challenges that need to be met in both methodology and evaluation when moving towards the acquisition of more comprehensive conceptual representations from corpora. In particular, we investigate the usefulness of three types of knowledge in guiding the extraction process: encyclopedic, syntactic and semantic. We present first a semantic analysis of existing, human-generated feature production norms, which reveals information about co-occurring concept and feature classes. We introduce then a novel method for large-scale feature extraction which uses the class-based information to guide the acquisition process. The method involves extracting candidate triples consisting of concepts, relations and features (e.g. deer have antlers, flute produce sound) from corpus data parsed for grammatical dependencies, and re-weighting the triples on the basis of conditional probabilities calculated from our semantic analysis. We apply this method to an automatically parsed Wikipedia corpus which includes encyclopedic information and evaluate its accuracy using a number of different methods: direct evaluation against the McRae norms in terms of feature types and frequencies, human evaluation, and novel evaluation in terms of conceptual structure variables. Our investigation highlights a number of issues which require addressing in both methodology and evaluation when aiming to improve the accuracy of unconstrained feature extraction further.
Santana, Adele & Wood, Donna J. Transparency and social responsibility issues for Wikipedia Ethics and Information Technology Volume 11 2009 [444]
Wikipedia is known as a free online encyclopedia. Wikipedia uses largely transparent writing and editing processes, which aim at providing the user with quality information through a democratic collaborative system. However, one aspect of these processes is not transparent: the identity of contributors, editors, and administrators. We argue that this particular lack of transparency jeopardizes the validity of the information being produced by Wikipedia. We analyze the social and ethical consequences of this lack of transparency in Wikipedia for all users, but especially students; we assess the corporate social performance issues involved, and we propose courses of action to compensate for the potential problems. We show that Wikipedia has the appearance, but not the reality, of responsible, transparent information production.
Wong, Wilson; Liu, Wei & Bennamoun, Mohammed Tree-traversing ant algorithm for term clustering based on featureless similarities Data Mining and Knowledge Discovery Volume 15 2007 [445]
Many conventional methods for concepts formation in ontology learning have relied on the use of predefined templates and rules, and static resources such as WordNet. Such approaches are not scalable, difficult to port between different domains and incapable of handling knowledge fluctuations. Their results are far from desirable, either. In this paper, we propose a new ant-based clustering algorithm, Tree-Traversing Ant (TTA), for concepts formation as part of an ontology learning system. With the help of Normalized Google Distance (NGD) and n of Wikipedia (nW) as measures for similarity and distance between terms, we attempt to achieve an adaptable clustering method that is highly scalable and portable across domains. Evaluations with seven datasets show promising results with an average lexical overlap of 97% and ontological improvement of 48%. At the same time, the evaluations demonstrated several advantages that are not simultaneously present in standard ant-based and other conventional clustering methods.
Milojicic, Dejan Trend wars: Web 2.0 and enterprise IT IEEE Distributed Systems Online Volume 8 2007
According to Wikipedia (http://en.wikipedia.org/wiki/Web_2.0), Web 2.0 "refers to a perceived second generation of Web-based communities and hosted services, such as social networking sites, wikis, and folksonomies, which aim to facilitate collaboration and sharing between users." Web 2.0 is especially widely used in the consumer space for those people who are their own IT administrators. However, this paper focuses on enterprise IT.
Chen, Hsinchun Trends & Controversies [Business and Market Intelligence 2.0] IEEE Intelligent Systems Volume 25 2010 [446]
Business Intelligence (BI), a term coined in 1989, has gained much traction in the IT practitioner community and academia over the past two decades. According to Wikipedia, BI refers to "the skills, technologies, applications, and practices used to support decision making" (http://en.wikipedia.org/wiki/Business_intelligence). On the basis of a survey of 1400 CEOs, the Gartner Group projected BI revenue to reach US$3 billion in 2009. Through BI initiatives, businesses are gaining insights from the growing volumes of transaction, product, inventory, customer, competitor, and industry data generated by enterprise-wide applications such as enterprise resource planning (ERP), customer relationship management (CRM), supply-chain management (SCM), knowledge management, collaborative computing, Web analytics, and so on. The same Gartner survey also showed that BI surpassed security as the top business IT priority in 2006.
Henderson, L. Tribal Knowledge Applied Clinical Trials Volume 19 Pages 12 2010 [447]
"Tribal knowledge is any unwritten information that is not commonly known by others within a company. This term is used most when referencing information that may need to be known by others in order to produce quality product or service. The information may be key to quality performance, but it may also be totally incorrect. Unlike similar forms of artisan intelligence, tribal knowledge can be converted into company property. It is often a good source of test factors during improvement efforts." That is from Wikipedia, sourced from Sixsigma.com. Six Sigma is the methodology that companies apply to achieve optimal efficiencies and performance. It all sounds well and good until they start talking about belts. I recently heard the term tribal knowledge applied to the outsourcing process. That is, when transferring a job function to the outsourcer, the outsourced is shadowed by the outsourcer for a time to acquire said tribal knowledge. That knowledge then becomes part of the outsourcers' knowledge. And the outsourcer can apply Six Sigma and kung fu the process right up to optimal efficiency, I suppose.
Bates, Mary Ellen Truth and fiction on the Web Online (Wilton, Connecticut) Volume 30 Pages 64 2006
The views of Mary Ellen Bates on the research tool, Wikipedia, which prevents false information from being added to it and needs to be added to a new article in the registered English-language version of the project, are presented. She suggests that efforts must be made to improve the accuracy of the tool by checking articles that are about the organization. Much debate has occurred on blogs and e-mail discussion lists, yet there are still people who wouldn't know about the blog. The challenge for us info pros is to manage the expectations of the clients and patrons, and teach them how to trust and verify.
Elveren, Erhan & Yumuşak, Nejat Tuberculosis Disease Diagnosis Using Artificial Neural Network Trained with Genetic Algorithm Journal of Medical Systems 2009 [448]
Tuberculosis is a common and often deadly infectious disease caused by mycobacterium; in humans it is mainly Mycobacterium tuberculosis (Wikipedia 2009). It is a great problem for most developing countries because of the low diagnosis and treatment opportunities. Tuberculosis has the highest mortality level among the diseases caused by a single type of microorganism. Thus, tuberculosis is a great health concern all over the world, and in Turkey as well. This article presents a study on tuberculosis diagnosis, carried out with the help of multilayer neural networks (MLNNs). For this purpose, an MLNN with two hidden layers and a genetic algorithm as the training algorithm has been used. The tuberculosis dataset was taken from a state hospital's database, based on patients' epicrisis reports.
Houghton-Jan, Sarah Twenty Steps to Marketing Your Library Online Journal of Web Librarianship Volume 1 2008 [449]
Libraries are quite practiced at outreach activities in the physical world, but now, just as our services and resources have moved online, so must our outreach efforts. This article provides a list of twenty practical things libraries can do to begin to delve into the world of online outreach. Topics covered include listing your library in Wikipedia, listing library events in local community calendars, listing librarians in expert-finding directories, pushing newsletters out via RSS, being present in online game and other environments, and much more. The requirements for online outreach at libraries will always be evolving, but this starter list will provide a place for all libraries to begin their foray into online outreach and marketing.
Jaques, R. Twitter ye not [Web 2.0 and social networks] Financial Director Pages 22 2008
There is more than a slight hint of old wine in new bottles about the hype surrounding Web 2.0 technology. And the nagging scepticism is hardly surprising as firms of all sizes have long been successfully using IT systems such as email, shared documents and instant messaging to improve collaboration and communication among staff members. Given that these technologies are now well proven, why should firms need to resort to anything as exotic as a wiki (a collection of web pages that enables anyone who accesses it to contribute or modify content, says the ultimate wiki, Wikipedia), or as apparently puerile as a social networking site? It is argued in more conservative corporate circles that this new-fangled social networking fad should remain the preserve of Facebooking and Twittering teenagers. So the advice to savvy firms is clear: whether you like it or not, social software is coming your way and cannot be ignored. While the bottles in which the technology is packaged may appear a trifle dusty, the smart thing to do is raise your glasses to the brave new(ish) world of Web 2.0.
Zhirov, A. O.; Zhirov, O. V. & Shepelyansky, D. L. Two-dimensional ranking of Wikipedia articles The European Physical Journal B - Condensed Matter and Complex Systems 2010 [450]
Xu, Liang; Takeda, Hideaki; Hamasaki, Masahiro & Wu, Huayu Typing software articles with Wikipedia category structure NII Technical Reports 2010
In this paper we present a low-cost method for typing Named Entities with Wikipedia. Different from other text analysis-based approaches, our approach relies only on the structural features of Wikipedia and the use of external linguistic resources is optional. We perform binary classification of an article by analyzing the names of its categories as well as the structure. The evaluation shows our method can be successfully applied to the 'software' category (F 80%).
Yang, Heng-Li & Lai, Cheng-Yu Understanding Knowledge Sharing Behaviour in Wikipedia Behaviour & Information Technology 2010 [451]
Wikipedia is the world's largest multilingual free-content encyclopaedia written by users collaboratively. It is interesting to investigate why individuals are willing to spend their time and knowledge to engage in it. In this study, we try to explore the influence of self-concept-based motivation and individual attitudes toward Wikipedia on individual's knowledge sharing intention in Wikipedia. Members from Wikipedia were invited to participate in the investigation. An online questionnaire and structural equation modelling (SEM) technology were utilized to test the proposed model and hypotheses. Analytical results indicate that internal self-concept-based motivation significantly influences individual's knowledge sharing intention. Further, both information and system quality have significant effects on individual's attitude toward Wikipedia, and therefore, influence the intention to share knowledge in it.
Shao, Guosong Understanding the appeal of user-generated media: a uses and gratification perspective Internet Research Volume 19 2009 [452]
Purpose - User-generated media (UGM) like YouTube, MySpace, and Wikipedia have become tremendously popular over the last few years. The purpose of this paper is to present an analytical framework for explaining the appeal of UGM. Design/methodology/approach - This paper is mainly theoretical due to a relative lack of empirical evidence. After an introduction on the emergence of UGM, this paper investigates in detail how and why people use UGM, and what factors make UGM particularly appealing, through a uses and gratifications perspective. Finally, the key elements of this study are summarized and the future research directions about UGM are discussed. Findings - This paper argues that individuals take with UGM in different ways for different purposes: they consume contents for fulfilling their information, entertainment, and mood management needs; they participate through interacting with the content as well as with other users for enhancing social connections and virtual communities; and they produce their own contents for self-expression and self-actualization. These three usages are separate analytically but interdependent in reality. This paper proposes a model to describe such interdependence. Furthermore, it argues that two usability attributes of UGM, "easy to use" and "let users control," enable people to perform the aforementioned activities efficiently so that people can derive greater gratification from their UGM use. Originality/value - UGM are an extremely important topic in new media scholarship, and this study represents the first step toward understanding the appeal of UGM in an integrated way.
Ruiz, Antonio Toral; Puşcaşu, Georgiana; Monteagudo, Lorenza Moreno; Beviá, Rubén Izquierdo & Boró, Estela Saquete University of Alicante at WiQA 2006 Evaluation of Multilingual and Multi-modal Information Retrieval 2007 [453]
This paper presents the participation of University of Alicante at the WiQA pilot task organized as part of the CLEF 2006 campaign. For a given set of topics, this task presupposes the discovery of important novel information distributed across different Wikipedia entries. The approach we adopted for solving this task uses Information Retrieval, query expansion by feedback, novelty re-ranking, as well as temporal ordering. Our system has participated both in the Spanish and English monolingual tasks. For each of the two participations the results are promising because, by employing a language independent approach, we obtain scores above the average. Moreover, in the case of Spanish, our result is very close to the best achieved score. Apart from introducing our system, the present paper also provides an in-depth result analysis, and proposes future lines of research, as well as follow-up experiments. Categories and Subject Descriptors: H.3 [Information Storage and Retrieval]: H.3.1 Content Analysis and Indexing; H.3.3 Information Search and Retrieval; H.3.4 Systems and Software.
Anonymous Unplugging Leaks BioTechniques 2009 [454]
Johnson, D. Up on Angels Landing ISHN Volume 44 Pages 12 2010 [455]
Here's how Wikipedia describes the journey: "After a series of steep switchbacks the trail goes through a gradual ascent. Walter's Wiggles a series of 21 steep switchbacks are the last hurdle before Scout's Lookout. Scout's Lookout is generally the turnaround point for those who are unwilling to make the final summit push to the top of Angels Landing. The last half-mile of the trail is strenuous and littered with sharp drop offs and narrow paths."
Kaplan, A.M. & Haenlein, M. Users of the world, unite! The challenges and opportunities of Social Media Business Horizons Volume 53 2010 [456]
The concept of Social Media is top of the agenda for many business executives today. Decision makers, as well as consultants, try to identify ways in which firms can make profitable use of applications such as Wikipedia, YouTube, Facebook, Second Life, and Twitter. Yet despite this interest, there seems to be very limited understanding of what the term Social Media exactly means; this article intends to provide some clarification. We begin by describing the concept of Social Media, and discuss how it differs from related concepts such as Web 2.0 and User Generated Content. Based on this definition, we then provide a classification of Social Media which groups applications currently subsumed under the generalized term into more specific categories by characteristic: collaborative projects, blogs, content communities, social networking sites, virtual game worlds, and virtual social worlds. Finally, we present 10 pieces of advice for companies which decide to utilize Social Media.
Vallance, Michael Using a Database Application to Support Reflective Practice. TechTrends: Linking Research & Practice to Improve Learning Volume 52 2008
The article discusses the tools that can be used to facilitate reflective practice, or reflection, as part of the learning process. Reflective practice can be used to support teachers as they begin to use more considered, cognitive actions in their teaching, making them better teachers. T. Farrell proposed a five-component model to support reflective practice, which is taught as part of a communication course at a modern science university in Japan using digital resources such as Wikipedia. These five steps are aimed at helping new undergraduate university students consider their learning as reflective practice. Recommendations for implementing reflective practice are also included.
Ciesielka, D. Using a Wiki to Meet Graduate Nursing Education Competencies in Collaboration and Community Health J Nurs Educ Volume 47 2008 [457]


Overell, S. & Ruger, S. Using co-occurrence models for place name disambiguation International Journal of Geographical Information Science Volume 22 2008 [458]
This paper describes the generation of a model capturing information on how place names co-occur together. The advantages of the co-occurrence model over traditional gazetteers are discussed and the problem of place name disambiguation is presented as a case study. We begin by outlining the problem of ambiguous place names. We demonstrate how analysis of Wikipedia can be used in the generation of a co-occurrence model. The accuracy of our model is compared to a handcrafted ground truth; then we evaluate alternative methods of applying this model to the disambiguation of place names in free text (using the GeoCLEF evaluation forum). We conclude by showing how the inclusion of place names in both the text and geographic parts of a query provides the maximum mean average precision and outline the benefits of a co-occurrence model as a data source for the wider field of geographic information retrieval (GIR).
Wells, S. & Reed, C. Using dialogical argument as an interface to complex debates Potentials, IEEE Volume 27 2008
Over the last two decades, many online argumentation systems have been developed that support humans in arguing with one another on specific topics. Many of these have been studies in the academic laboratory, though a few of the larger-scale projects have been used in the wild. More recently, spurred perhaps by high-visibility arguments with strong, explicit argumentative structure such as the Iraq Study Group Report, there has been spontaneous interest in argument coming from the online community. Two high profile systems are Convinceme.net and Debatepedia.com. Convinceme.net uses paired message boards to collect the arguments for and against an issue. These arguments are then voted upon with the most popular becoming king of the hill. Debatepedia.com uses a Wikipedia-style interface to enable users to build logic trees in which a thesis is broken down into a series of subquestions with the aim of collating a body of evidence regarding an issue and to help users to rapidly understand a complex debate.
Mathis, T. & Galloway, S. Using Podcasts to Improve Safety Professional Safety Volume 55 Pages 22 2010 [459]
Communicating information is a challenge that has plagued professionals for many years. Several innovative safety managers have identified a new solution: producing podcasts. Wikipedia defines podcast as "a series of digital media files (either audio or video) that are released episodically and downloaded through Web syndication." This article discusses three major projects in which the authors discovered potential uses for podcasts. Based on experience the authors believe podcasts can help to improve safety in several ways: 1. overcome logistical challenges 2. ensure message uniformity 3. eliminate message drift 4. multiply or leverage leaders' and experts' ability to communicate 5. facilitate international messages 6. support traditional channels and media and 7. reduce communication costs.
Gilbert, Eric & Karahalios, Karrie Using social visualization to motivate social production IEEE Transactions on Multimedia Volume 11 2009 [460]
In this paper we argue that social visualization can motivate contributors to social production projects, such as Wikipedia and open source development. As evidence, we present CodeSaw, a social visualization of open source software development that we studied with real open source communities. CodeSaw mines open source archives to visualize group dynamics that currently lie buried in textual databases. Furthermore, CodeSaw becomes an active social space itself by supporting comments directly inside the visualization. To demonstrate CodeSaw, we apply it to a popular open source project, showing how the visualization reveals group dynamics and individual roles. The paper concludes by presenting evidence that CodeSaw, and social visualization more generally, can motivate contributors to social production projects if the visualization leaves the laboratory and makes it to the community visualized.
Perea-Ortega, Jose M.; Montejo-Raez, Arturo; Martin-Valdivia, M.Teresa & Urena-Lopez, L.Alfonso Using web sources for improving video categorization Journal of Intelligent Information Systems Volume 36, Number 1, 117-130 2010 [461]
In this paper, several experiments about video categorization using a supervised learning approach are presented. To this end, the VideoCLEF 2008 evaluation forum has been chosen as experimental framework. After an analysis of the VideoCLEF corpus, it was found that video transcriptions are not the best source of information in order to identify the thematic of video streams. Therefore, two web-based corpora have been generated in the aim of adding more informational sources by integrating documents from Wikipedia articles and Google searches. A number of supervised categorization experiments using the test data of VideoCLEF have been accomplished. Several machine learning algorithms have been proved to validate the effect of the corpus on the final results: Naive Bayes, K-nearest-neighbors (KNN), Support Vectors Machine (SVM) and the j48 decision tree. The results obtained show that web can be a useful source of information for generating classification models for video data.
Wang, Pu; Hu, Jian; Zeng, Hua-Jun & Chen, Zheng Using Wikipedia knowledge to improve text classification Knowledge and Information Systems Volume 19 2009 [462]
Text classification has been widely used to assist users with the discovery of useful information from the Internet. However, traditional classification methods are based on the "Bag of Words" (BOW) representation, which only accounts for term frequency in the documents, and ignores important semantic relationships between key terms. To overcome this problem, previous work attempted to enrich text representation by means of manual intervention or automatic document expansion. The achieved improvement is unfortunately very limited, due to the poor coverage capability of the dictionary, and to the ineffectiveness of term expansion. In this paper, we automatically construct a thesaurus of concepts from Wikipedia. We then introduce a unified framework to expand the BOW representation with semantic relations (synonymy, hyponymy, and associative relations), and demonstrate its efficacy in enhancing previous approaches for text classification. Experimental results on several data sets show that the proposed approach, integrated with the thesaurus built from Wikipedia, can achieve significant improvements with respect to the baseline algorithm.
Weld, Daniel S.; Hoffmann, Raphael & Wu, Fei Using Wikipedia to bootstrap open information extraction ACM SIGMOD Record Volume 37 2009 [463]
An abstract is not available.
Lally, A.M. & Dunford, C.E. Using Wikipedia to extend digital collections D-Lib Magazine Volume 13 Number 5/6 2007
In May 2006, the University of Washington Libraries Digital Initiatives unit began a project to integrate the UW Libraries Digital Collections into the information workflow of our students by inserting links into the online encyclopedia Wikipedia. The idea for this project grew out of our reading of OCLC's 2005 report Perceptions of Libraries and Information Resources which states that only 2% of college and university students begin searching for information at a library Web site. It is, therefore, incumbent upon Librarians to look for new ways to reach out to our users where they begin their information search. The explosive growth of Wikipedia made it a prime candidate for our efforts at pushing information about the Libraries out to where users conduct their research. It should be noted here that our digital collections are already harvested and heavily used by people all over the world; in fact, Google and its affiliates are the top referrers of people to our collections.
Jennings, E. Using Wikipedia to teach information literacy College & Undergraduate Libraries Volume 15 2008 [464]
Today's college student often starts his research by using a search engine. Because of this, Wikipedia is increasingly becoming the go to reference resource for the newest generation of students. However, many students do not know about the problems (e.g., vandalism) associated with this tool other than ambiguous warnings from librarians and faculty who say that it should not be used for research. Librarians and faculty should help remove the stigma associated with Wikipedia by embracing this Website and its imperfections as a way to make information literacy instruction valuable for the twenty-first-century student.
Younger, Paula Using wikis as an online health information resource. Nursing Standard Volume 24 2010
Wikis can be a powerful online resource for the provision and sharing of information, with the proviso that information found on them should be independently verified. This article defines wikis and sets them in context with recent developments on the internet. The article discusses the use of Wikipedia and other wikis as potential sources of health information for nurses.
Louridas, Panagiotis Using Wikis in software development IEEE Software Volume 23 2006 [465]
Wikis have become one of the most popular tool shells. You can find them just about everywhere that demands effective collaboration and knowledge sharing at a low budget. Wikipedia has certainly enhanced their popularity, but they also have a place in intranet-based applications such as defect tracking, requirements management, test-case management, and project portals. The author describes wiki essentials and nicely distinguishes a variety of types.
Rech, J.; Bogner, C. & Haas, V. Using Wikis to tackle reuse in software projects IEEE Software Volume 24 2007
Software projects in small- and medium-sized enterprises (SMEs) produce similar work products when building interactive software systems. For each project, software engineers create requirements, design specifications, source code, data schemes, and so forth, gain experience with these work products, and create associated products such as test cases for the source code or inspection plans for the requirements. All of this constitutes knowledge that the engineers can reuse in new variants of the software system.
Ebner, Martin; Kickmeier-Rust, Michael & Holzinger, Andreas Utilizing Wiki-Systems in higher education classes: A chance for universal access? Universal Access in the Information Society Volume 7 2008 [466]


Kalantidis, Yannis; Tolias, Giorgos; Avrithis, Yannis; Phinikettos, Marios; Spyrou, Evaggelos; Mylonas, Phivos & Kollias, Stefanos VIRaL: Visual Image Retrieval and Localization Multimedia Tools and Applications 2010 [467]
New applications are emerging every day exploiting the huge data volume in community photo collections. Most focus on popular subsets, e.g., images containing landmarks or associated to Wikipedia articles. In this work we are concerned with the problem of accurately finding the location where a photo is taken without needing any metadata, that is, solely by its visual content. We also recognize landmarks where applicable, automatically linking them to Wikipedia. We show that the time is right for automating the geo-tagging process, and we show how this can work at large scale. In doing so, we do exploit redundancy of content in popular locations, but unlike most existing solutions, we do not restrict to landmarks. In other words, we can compactly represent the visual content of all thousands of images depicting e.g., the Parthenon and still retrieve any single, isolated, non-landmark image like a house or a graffiti on a wall. Starting from an existing, geo-tagged dataset, we cluster images into sets of different views of the same scene. This is a very efficient, scalable, and fully automated mining process. We then align all views in a set to one reference image and construct a 2D scene map. Our indexing scheme operates directly on scene maps. We evaluate our solution on a challenging one million urban image dataset and provide public access to our service through our online application, VIRaL.
Perona, P. Vision Of A Visipedia Proceedings of the IEEE Volume 98 2010 [468]
The web is not perfect: while text is easily searched and organized, pictures (the vast majority of the bits that one can find online) are not. In order to see how one could improve the web and make pictures first-class citizens of the web, I explore the idea of Visipedia, a visual interface for Wikipedia that is able to answer visual queries and enables experts to contribute and organize visual knowledge. Five distinct groups of humans would interact through Visipedia: users, experts, editors, visual workers, and machine vision scientists. The latter would gradually build automata able to interpret images. I explore some of the technical challenges involved in making Visipedia happen. I argue that Visipedia will likely grow organically, combining state-of-the-art machine vision with human labor.
Kimmerle, Joachim; Moskaliuk, Johannes; Harrer, Andreas & Cress, Ulrike VISUALIZING CO-EVOLUTION OF INDIVIDUAL AND COLLECTIVE KNOWLEDGE Information, Communication & Society 2010 [469]
This paper describes how processes of knowledge building with wikis may be visualized, citing the user-generated online encyclopedia Wikipedia as an example. The underlying theoretical basis is a framework for collaborative knowledge building with wikis that describes knowledge building as a co-evolution of individual and collective knowledge. These co-evolutionary processes may be visualized graphically, applying methods from social network analysis, especially those methods that take dynamic changes into account. For this purpose, we have undertaken to analyse, on the one hand, the temporal development of a Wikipedia article and related articles that are linked to this core article. On the other hand, we analysed the temporal development of those users who worked on these articles. The resulting graphics show an analogous process, both with regard to the articles that refer to the core article and to the users involved. These results provide empirical support for the co-evolution model.
Spoerri, A. Visualizing the overlap between the 100 most visited pages on Wikipedia for September 2006 to January 2007 First Monday Volume 12 2008
This paper compares the monthly lists of the 100 most visited Wikipedia pages for the period of September 2006 to January 2007. searchCrystal is used to visualize the overlap between the five monthly Top 100 lists to show which pages are highly visited in all five months; which pages in four of the five months and so on. It is shown that almost 40 percent of a month's top 100 pages are visited in all five months, whereas 25 percent are highly visited only in a single month. The presented visualizations make it possible to gain quick insights into the overlap and topical relationships between the monthly lists.
Schroer, J. & Hertel, G. Voluntary engagement in an open Web-based encyclopedia: Wikipedians and why they do it Media Psychology Volume 12 2009
The online encyclopedia Wikipedia is a highly successful "open content" project, written and maintained completely by volunteers. Little is known, however, about the motivation of these volunteers. Results from an online survey among 106 contributors to the German Wikipedia project are presented. Both motives derived from social sciences (perceived benefits, identification with Wikipedia, etc.) as well as perceived task characteristics (autonomy, skill variety, etc.) were assessed as potential predictors of contributors' satisfaction and self-reported engagement. Satisfaction ratings were particularly determined by perceived benefits, identification with the Wikipedia community, and task characteristics. Engagement was particularly determined by high tolerance for opportunity costs and by task characteristics, the latter effect being partially mediated by intrinsic motivation. Relevant task characteristics for contributors' engagement and satisfaction were perceived autonomy, task significance, skill variety, and feedback. Models from social sciences and work psychology complemented each other by suggesting that favorable task experiences might counter perceived opportunity costs in Wikipedia contributors. Moreover, additional data reported by Wikipedia authors indicate the importance of generativity motives.
Baytiyeh, H. Volunteers in Wikipedia: Why the Community Matters Educational Technology & Society, Volume 13 (2), 128–140 2010 [470]
Greenstein, Shane Wagging Wikipedia's long tail IEEE Micro Volume 27 2007 [471]
In 2005, Wikipedia surpassed Encarta as the Internet's most popular reference site. Wikipedia calls itself the free encyclopedia that anyone can edit and it has grown rapidly since its founding in 2001. As an educator and parent, Greenstein finds himself struggling to come to terms with the economics of Wikipedia, which have shaped a resource that is at times very good, but occasionally poor. The inconsistency is a result of Wikipedia's long tail, a characteristic that requires some explanation. And thereby hangs a tale.
Eijkman, H. Web 2.0 as a non-foundational network-centric learning space Campus-Wide Information Systems Volume 25 2008 [472]
This paper aims to initiate a timely discussion about the epistemological advantages of Web 2.0 as a non-foundational network-centric learning space in higher education. A philosophical analysis of the underpinning design principles of Web 2.0 social media and of conventional foundational and emergent non-foundational learning and which uses Wikipedia as a case study. For academics in higher education to take a more informed approach to the use of Web 2.0 in formal learning settings and begin to consider integrating Web 2.0's architecture of participation with a non-foundational architecture of learning, focused on acculturation into networks of practice. The paper argues that the continuing dominance and therefore likely application of conventional old paradigm foundational learning theory will work against the grain of, if not undermine, the powerful affordances Web 2.0 social media provides for learning focused on social interaction and collaborative knowledge construction. The paper puts the case for non-foundational learning and draws attention to the importance of aligning Web 2.0's architecture of participation with a non-foundational architecture of acculturation as the latter is better epistemologically placed to more fully realise the potential of Web 2.0 to position students on trajectories of acculturation into their new networks of practice. This paper exposes the epistemological dilemma Web 2.0's participatory culture poses for academics wedded to conventional ideas about the nature of knowledge and learning as is, for instance, clearly evidenced by their sceptical disposition towards or outright rejection of, Wikipedia.
Bleicher, Paul Web 2.0 Revolution: Power to the People. Applied Clinical Trials Volume 15 2006
The article highlights developments in the use of websites and weblogs citing the range of services available on the world wide web. One information source is the Wikipedia, which is an online encyclopedia of information that uses an open source software. On the other hand, blogging software sites allow an unsophisticated user to set up a blog in minutes and begin publishing on a topic of their choice almost immediately.
Wijaya, Senoaji; Spruit, Marco; Scheper, Wim & Versendaal, Johan Web 2.0-based webstrategies for three different types of organizations Computers in Human Behavior 2010 [473]
Lately, web technology has gained strategic importance. It can be seen in the growing number of organizations that realize the importance of a proper webstrategy in this globalization era, where distributed work environment, knowledge-based economy and collaborative business models have emerged. The phenomenon of web 2.0 technologies has led many internet companies and communities, such as Google, Amazon, Wikipedia, and Facebook, to successfully adjust their webstrategy by adopting web 2.0 concepts to sustain their advantage and reach their objectives. As a consequence, interest has risen from more traditional organizations to benefit from web 2.0 concepts in enhancing their competitive advantage. This paper discusses the influence of web 2.0 concepts in the webstrategy formulation for organizations with differing requirements, characteristics and objectives. The research categorizes organization types into Customer Intimacy, Operational Excellence and Product Leadership, based on the Value Disciplines model.
Dohn, N.B. Web 2.0-mediated competence-implicit educational demands on learners Electronic Journal of e-Learning Volume 7 2009
The employment of Web 2.0 within higher educational settings has become increasingly popular. Reasons for doing so include student motivation, didactic considerations of facilitating individual and collaborative knowledge construction, and the support Web 2.0 gives the learner in transgressing and resituating content and practices between the formal and informal learning settings in which s/he participates. However, introducing Web 2.0-practices into educational settings leads to tensions and challenges in practice because of conceptual tensions between the views of knowledge and learning inherent in Web 2.0-practices and in the educational system: Implicit in Web 2.0-practices is a conception of 'knowledge' as, on the one side, process and activity, i.e. as use, evaluation, transformation and reuse of material, and, on the other, the product side, as a distributed attribute of a whole system (such as Wikipedia) or community of practice (such as the community of practice of Wikipedia contributors). In contrast, 'knowledge' within the educational system is traditionally viewed as a state possessed by the individual, and learning as the acquisition of this state. This paper is an analysis of the challenges which these tensions lead to for the learners. The argument is that Web 2.0-mediated learning activities within an educational setting place implicit competence demands on the students, along with the more explicit ones of reflexivity, participation and knowledge construction. These demands are to some extent in conflict with each other as well as with the more explicit ones. A simple example of such conflicting competence demands is experienced when students develop a course wiki: The Web 2.0-competence demands here concern the doing something with the material. The copy-pasting of e.g. a Wikipedia-article without referencing it from this point of view is a legitimate contribution to the knowledge building of the course wiki. 
In contrast, educational competence demands require the student to participate actively in the formulation of the course wiki-articles. Copy-pasting without reference from this point of view is cheating. Here, the student is met with the incoherent requirement of authoring entries that display the acquisition of a knowledge state in a context where authorship is renounced and knowledge is understood dynamically and distributively. More generally, in Web 2.0-mediated educational learning activities, the student is required to manoeuvre in a field of interacting, yet conflicting, demands, and the assessment of his/her competence stands the risk of being more of an evaluation of the skill to so manoeuvre than of skills and knowledge explicitly pursued in the course.
Bar-Ilan, Judit Web links and search engine ranking: The case of Google and the query "Jew" Journal of the American Society for Information Science and Technology Volume 57 2006 [474]
The World Wide Web has become one of our more important information sources, and commercial search engines are the major tools for locating information; however, it is not enough for a Web page to be indexed by the search engines; it also must rank high on relevant queries. One of the parameters involved in ranking is the number and quality of links pointing to the page, based on the assumption that links convey appreciation for a page. This article presents the results of a content analysis of the links to two top pages retrieved by Google for the query "jew" as of July 2004: the "jew" entry on the free online encyclopedia Wikipedia and the home page of "Jew Watch," a highly anti-Semitic site. The top results for the query "jew" gained public attention in April 2004 when it was noticed that the "Jew Watch" homepage ranked number 1. From this point on both sides engaged in "Googlebombing" (i.e. increasing the number of links pointing to these pages). The results of the study show that most of the links to these pages come from blogs and discussion links and the number of links pointing to these pages in appreciation of their content is extremely small. These findings have implications for ranking algorithms based on link counts and emphasize the huge difference between Web links and citations in the scientific community.
Anonymous Web Watch Quality Progress Volume 38 Pages 22 2005
Wikis are new forms of social software, designed to promote information sharing. They are Web pages that can be edited by the user. Everyone can share information in real time and with little trouble. Wikis remove the bureaucracy and promote the community of people with like interests. Information on a wiki site does not start out as authoritative. The software makes no attempt to check the accuracy and stature of the person who shares information. Information presented is corrected and refined by peers. This is peer review on a grand scale in that the community can be worldwide. The most successful wiki today is the Wikipedia Project, a huge encyclopedia generated by the citizens of the world at http://en.wikipedia.org. Wikis can be used to promote the understanding of quality and how it's practiced.
Wang, Yu-Chun; Tsai, Richard Tzong-Han & Hsu, Wen-Lian Web-based pattern learning for named entity translation in Korean-Chinese cross-language information retrieval Expert Systems with Applications Volume 36 Pages 2 PART 2 2009 [475]
Named entity (NE) translation plays an important role in many applications, such as information retrieval and machine translation. In this paper, we focus on translating NEs from Korean to Chinese in order to improve Korean-Chinese cross-language information retrieval (KCIR). The ideographic nature of Chinese makes NE translation difficult because one syllable may map to several Chinese characters. We propose a hybrid NE translation system. First, we integrate two online databases to extend the coverage of our bilingual dictionaries. We use Wikipedia as a translation tool based on the inter-language links between the Korean edition and the Chinese or English editions. We also use Naver.com's people search engine to find a query name's Chinese or English translation. The second component of our system is able to learn Korean-Chinese (K-C), Korean-English (K-E), and English-Chinese (E-C) translation patterns from the web. These patterns can be used to extract K-C, K-E and E-C pairs from Google snippets. We found KCIR performance using this hybrid configuration over five times better than that of a dictionary-based configuration using only Naver people search. Mean average precision was as high as 0.3385 and recall reached 0.7578. Our method can handle Chinese, Japanese, Korean, and non-CJK NE translation and improve performance of KCIR substantially.
Svoboda, Elizabeth Websights: One-click content, no guarantees IEEE Spectrum Volume 43 2006
Wikipedia is the first-ever major reference work with a democratic premise, that anyone could contribute an article or edit an entry. Since inception, Wikipedia has generated shared scholarly efforts to rival those of any literary or philosophical movement in history. However, its signature strength is also its greatest vulnerability. User-generated articles are often inaccurate or irrelevant, and vandals are a constant threat. As a result, the role of the encyclopedia's gatekeepers assumes added importance. Readers are advised to check their online finds against other sources and to be aware of Wikipedia's unique strengths and weaknesses, especially when gathering information for research projects.
Finlayson, Alexander Westminster and Wikipedia: The Westminster Seminar in the Twenty-First Century Westminster Theological Journal 2007
Surveys how libraries have developed with particular reference to the relationship between the church and the library, and notes specific challenges and trends facing academic and seminary libraries today. Information on paper is being displaced, at least to some degree, in favor of electronic information. While the age of the book is not over, it is true that information is increasingly accessible in digital formats. Reflects on why Westminster still needs its library.
Spoerri, A. What is popular on Wikipedia and why? First Monday Volume 12 2007
This paper analyzes which pages and topics are the most popular on Wikipedia and why. For the period of September 2006 to January 2007, the 100 most visited Wikipedia pages in a month are identified and categorized in terms of the major topics of interest. The observed topics are compared with search behavior on the Web. Search queries, which are identical to the titles of the most popular Wikipedia pages, are submitted to major search engines and the positions of popular Wikipedia pages in the top 10 search results are determined. The presented data helps to explain how search engines, and Google in particular, fuel the growth and shape what is popular on Wikipedia.
Maehre, J. What It Means to Ban Wikipedia: an Exploration of the Pedagogical Principles at Stake College Teaching Volume 57 2009 [476]


Badke, W What to do with Wikipedia ONLINE Volume 32 2008 [477]
Hochstotter, Nadine & Lewandowski, Dirk What users see - Structures in search engine results pages Information Sciences Volume 179 2009 [478]
This paper investigates the composition of search engine results pages. We define what elements the most popular web search engines use on their results pages (e.g., organic results, advertisements, shortcuts) and to which degree they are used for popular vs. rare queries. Therefore, we send 500 queries of both types to the major search engines Google, Yahoo, Live.com and Ask. We count how often the different elements are used by the individual engines. In total, our study is based on 42,758 elements. Findings include that search engines use quite different approaches to results page composition and, therefore, the user gets to see quite different result sets depending on the search engine and search query used. Organic results still play the major role in the results pages, but different shortcuts are of some importance, too. Regarding the frequency of certain hosts within the result sets, we find that all search engines show Wikipedia results quite often, while other hosts shown depend on the search engine used. Both Google and Yahoo prefer results from their own offerings (such as YouTube or Yahoo Answers). Since we used the .com interfaces of the search engines, results may not be valid for other country-specific interfaces.
Royal, C. & Kapila, D. What's on Wikipedia, and what's not ... ? Assessing completeness of information Social Science Computer Review Volume 27 2009 [479]
The World Wide Web continues to grow closer to achieving the vision of becoming the repository of all human knowledge, as features and applications that support user-generated content become more prevalent. Wikipedia is fast becoming an important resource for news and information. It is an online information source that is increasingly used as the first, and sometimes only, stop for online encyclopedic information. Using a method employed by Tankard and Royal to judge completeness of Web content, completeness of information on Wikipedia is assessed. Some topics are covered more comprehensively than others, and the predictors of these biases include recency, importance, population, and financial wealth. Wikipedia is more a socially produced document than a value-free information source. It reflects the viewpoints, interests, and emphases of the people who use it.
Lacovara, Jane E When searching for the evidence, stop using Wikipedia! Medsurg Nursing: Official Journal of the Academy of Medical-Surgical Nurses Volume 17 Pages 153 2008 [480]
Purdy, James P When the Tenets of Composition Go Public: A Study of Writing in Wikipedia College Composition and Communication Volume 61 2009
Based on a study of observable changes author-users made to three Wikipedia articles, this article contends that Wikipedia supports notions of revision, collaboration, and authority that writing studies purports to value, while also extending our understanding of the production of knowledge in public spaces. It argues that Wikipedia asks us to reexamine our expectations for the stability of research materials and who should participate in public knowledge making. (Contains 2 tables and 8 notes.)
Huvila, I. Where does the information come from? Information source use patterns in Wikipedia Information Research Volume 15 2010
Introduction. Little is known about Wikipedia contributors' information behaviour and from where and how the information in the encyclopaedia originated. Even though a large number of texts in Wikipedia cite external sources according to the intentions of the verifiability policy, many articles lack references and in many others the references have been added afterwards. Method. This article reports the results of a Web survey of information source use patterns, answered by 108 Wikipedia contributors in spring 2008. Analysis. The qualitative questions were analysed using a close reading and grounded theory approach. The multiple-choice questions were analysed using descriptive statistics and bi-variate correlation analysis. Results. The results indicate that there are several distinct groups of contributors using different information sources. The results also indicate a preference for sources available online. However, in spite of the popularity of online material a significant proportion of the original information is based on printed literature, personal expertise and other non-digital sources of information. The information source use of Wikipedia contributors is also illustrative of the complexity and life-world scope of human information behaviour. Conclusions. Understanding the information source use of contributors helps us to understand how new Wikipedia articles emerge, how edits are motivated, where the information actually comes from and more generally, what kind of information may be expected to be found in Wikipedia.
Sampson, Fred Whither the web? Interactions Volume 13 2006 [481]
Some of the challenges and opportunities for web operators seeking effective communication, and the important role of Web 2.0 applications, are discussed. The developing Web involves expanding collaboration, universal sharing, malleable identities, and ubiquitous connectivity. Internet use and technology development are driven by people, by sharing and collaborating. Wikipedia is considered a valuable resource among users as it promotes sharing of information in near real time. Findability and delivery are important features from the perspective of an information developer. It is suggested that there is still a strong inclination for people to keep their creations under their control by posting to restricted Web sites or databases.
Waldman, Simon Who knows? Guardian 2004
It has no editors, no fact checkers and anyone can contribute an entry, or delete one. It should have been a recipe for disaster, but instead Wikipedia became one of the Internet's most inspiring success stories. Explains how it works. (Original abstract - amended)
Calkins, Susanna & Kelley, Matthew R Who Writes the Past? Student Perceptions of Wikipedia Knowledge and Credibility in a World History Classroom Journal on Excellence in College Teaching 2009 [482]
The authors describe an inquiry-based learning project that required students in a first-year world history course to reflect on and analyze critically the nature of the knowledge found in Wikipedia--the free, open-content, rapidly evolving, internet encyclopedia. Using a rubric, the authors explored students' perceptions of the collaborative and community nature of Wikipedia as well as Wikipedia's accuracy, reputability, ease, and accessibility. Furthermore, they examined students' reflections on issues of plagiarism, responsibility, and whether Wikipedia qualifies as a scholarly source. Student perceptions were closely related to their level of intellectual and ethical development as defined by Perry (1970, 1998).


Chakrabarti, Manali Why Did Indian Big Business Pursue a Policy of Economic Nationalism in the Interwar Years? A New Window to an Old Debate Modern Asian Studies Volume 43 Pages 979 2008 [483]
Demartini, G.; Firan, C.; Iofciu, T.; Krestel, R. & Nejdl, W. Why finding entities in Wikipedia is difficult, sometimes Information Retrieval Volume 13 Pages 534 2010 [484]
Entity Retrieval (ER), in comparison to classical search, aims at finding individual entities instead of relevant documents. Finding a list of entities therefore requires techniques different from those of classical search engines. In this paper, we present a model to describe entities more formally and show how an ER system can be built on top of it. We compare different approaches designed for finding entities in Wikipedia and report on results using standard test collections. An analysis of entity-centric queries reveals different aspects and problems related to ER and shows limitations of current systems performing ER with Wikipedia. It also indicates which approaches are suitable for which kinds of queries.
Masic, Izet; Dilic, Mirza; Solakovic, Emir; Rustempasic, Nedzad & Ridjanovic, Zoran Why historians of medicine called Ibn al-Nafis second Avicenna? Medicinski Arhiv Volume 62 2008 [485]
At the end of the IX and the beginning of the X century begins the development and renaissance of the medicine called Arabic, whose main representatives were: Ali at-Taberi, Ahmed at-Taberi, Ar-Razi (Rhazes), Ali ibn al-Abbas al-Magusi (Haly), ibn al-Baitar, ibn al-Qasim al-Zahrawi (Abulcasis), ibn Sina (Avicenna), ibn al-Haitam (Alhazen), ibn Abi al-Ala Zuhr (Avenzor), ibn Rushd (Averroes) and ibn al-Nafis. Doctors Taberi, Magusi and Razi were born as Persians. Each of the listed great doctors of Arab medicine in his own way left a legacy to medical science and the profession, and a lasting impression in the history of medicine. The majority of them are well known in the West and have their place in textbooks as donors of a significant medical treasure, without which medicine, especially that of the dark Middle Ages, would probably be pale and prosaic, insufficiently studied and misunderstood. Abdullah ibn Sina (Avicenna) remained unsurpassed among those listed above. Close to him comes only Alauddin ibn al-Nafis, who in the mid-XII century would rebut some of the theories of Avicenna and all his predecessors, from which he collected material for his great al-Kanun fit-tibb (Canon of Medicine). The Canon would be commended for centuries and enriched with new knowledge. One of the numerous and perhaps the best commentaries (excerpts) is Nafis's Mugaz al-Quanun, published as a reprint in wartime Sarajevo under siege in 1995 in the Bosnian language, translated from Arabic by professor Sacir Sikiric and chief physician Hamdija Karamehmedovic in 1961. Today it is at least 740 years since Alauddin ibn Nefis (1210-1288), professor from Cairo and director of the A-Mansuri Hospital in Cairo, described the small (pulmonary) blood circulatory system and coronary circulation in his paper on the pulse. His name can very often be found on the most popular search engines, especially in English.
The majority of quotations about al-Nafis are in Arabic or Turkish, although Ibn Nafis's discovery is of worldwide importance. The author of this article is among the few who have emphasized that event in indexed journals, and it has also been debated by some authors from Great Britain and the USA in the respected journal Annals of Internal Medicine. Most citations mention the other two "describers" or "discoverers" of pulmonary blood circulation: Miguel de Servet (1511-1553), physician and theologian, and William Harvey (1578-1657), who described the blood circulatory system in his paper "An Anatomical Exercise on the Motion of the Heart and Blood in Animals", published in 1628. Because of his scientific work, Ibn Nafis is called the "Second Avicenna". Over the centuries some of his papers were translated into Latin and some were published as reprints in Arabic. The significance of Nafis's epochal discovery lies in the fact that it is based solely on deduction, because his description of the small circulation did not come from direct observation of corpses during dissection. It is known that he did not pay attention to Galen's theories about blood circulation. His prophetic sentence says: "If I did not know that my works would last up to ten thousand years after me, I would not have written them". Sapienti sat. Searching the newest data about all three authors, Alauddin ibn Nafis (1210-1288), Michael Servetus (1511-1553) and William Harvey (1628), in the prestigious Wikipedia, I managed to link several of the most relevant facts, on the basis of which we can explain in more detail to which of these three authors belongs the glory and the right to be called the first describer of the pulmonary and cardiac circulation. About Servetus and Harvey there is much more data than about ibn Nafis, on whom Google gives mainly references in Arabic and Turkish and my four references in Bosnian with abstracts in English.
The language barrier was probably one of the key reasons that we know so little about Nafis and that so little has been written, although the respected professor Fuat Sezgin from Frankfurt published in 1997 a comprehensive monograph about this great physician, scientist and explorer, in whose papers we can clearly recognize a detailed description of the pulmonary and cardiac circulation. I have also personally published separate monographs about this scientist, which can be found on www.avicenapublisher.org.


Aycock, John & Aycock, Alan Why I Love/Hate Wikipedia: Reflections upon (Not Quite) Subjugated Knowledges Journal of the Scholarship of Teaching and Learning Volume 8 2008
Wikipedia is a well-known online encyclopedia, whose content is contributed and edited by volunteers. Its use by students for their research is, to be polite, controversial. Is Wikipedia really evil, or is it a teaching opportunity in disguise, a representation of some deeper cultural change? We present first-hand accounts from two different disciplines, computer science and anthropology, to illustrate how experiences with Wikipedia may be crossdisciplinary. We use these to reflect upon the nature of Wikipedia and its role in teaching.
Lee, Julian C. H. Why Isn't Panesar a Pommie Bastard? Multiculturalism and the Implications of Cricket Australia's Racial Abuse Policy Anthropology Today Volume 24 2008 [486]
Manthous, CA Why not physician-assisted death? CRITICAL CARE MEDICINE Volume 37 2009 [487]
Objective: The Hippocratic Oath states "... I will neither give a deadly drug to anybody who asked for it nor will I make a suggestion to this effect" (http://en.wikipedia.org/wiki/Hippocratic_Oath). Physician-assisted suicide and euthanasia are topics that engender a strong negative response on the part of many physicians and patients. This article explores contributions of religion, Western medical mores, law, and emerging concepts of moral neurocognition that may explain our inherent aversion to these ideas. Sources: Religious texts, legal opinions, manifestos of medical ethics, medical literature, and lay literature. Conclusion: Our collective repudiation of physician-assisted death in all its forms has complex origins that are not necessarily rational. If great care is taken to ensure that a request for physician-assisted death is persistent despite exhaustion of all available therapeutic modalities, then an argument can be made that our rejection constrains unnecessarily the liberty of a small number of patients. (Crit Care Med 2009; 37:1206-1209)
Munk, Timme Bisgaard Why wikipedia: Self-efficacy and self-esteem in a knowledge-political battle for an egalitarian epistemology Observatorio (OBS*) Volume 3 2009 [488]
What makes people contribute voluntarily to Wikipedia? A new qualitative empirical study uncovers new motives, publication strategies and social dynamics in Wikipedia. In addition to the motives treated in the existing scientific literature such as status through status play, altruism through ideological identification, identity through community, the analysis uncovers three other motives through theoretical probability-making and empirical demonstration. Consequently, the following three motives must be added to the repertoire of possible motives for contributing voluntarily to Wikipedia. Firstly, the contributors experience a unique and cheap feeling of self-efficacy. They feel that they are efficient and able to handle the tasks that they take upon themselves. This feeling is caused by the fact that many types of contributions may be experienced as a successful contribution, from small text corrections to authoring of complete lexicon articles. Secondly, the contributors get a unique and cheap experience of self-esteem: a feeling that their modest input has a great impact because they are contributing to the creation of a global knowledge good. Thirdly, they are motivated by the ideology that all people have something to bring to Wikipedia. This may be called an egalitarian epistemology. These three motives in combination with the motives described in the literature provide a better and more balanced answer to the above question. The case is the Danish version of Wikipedia and the qualitative survey consists of six qualitative interviews with six contributors. Keywords: wiki, wikipedia, web, open content


Farrelly, M. G Wiki What? Public Libraries Volume 47 2008 [489]
Bryant, Antony Wiki and the Agora: 'It's organising, Jim, but not as we know it' Development in Practice Volume 16 2006 [490]
This article argues that those keen to characterise and harness the empowering potential of Information and Communications Technology (ICT) for development projects must understand that the very existence of this technology opens up alternative models of co-operation and collaboration. These models themselves necessitate breaking away from 'traditional' command-and-control models of management. One alternative is to persuade participants, or potential participants, to co-ordinate their efforts along the lines exemplified by the open-source software movement and the contributors to Wikipedia: models of co-ordination that ought not to work but appear to do so. The article offers a summary of this argument, and then suggests ways in which NGOs in particular might try to incorporate these insights into their strategies. This is particularly critical for organisations that rely on increasingly pressurised funding opportunities, and which also seek to develop and engender participation and determination from within and among specific target groupings.
Arazy, Ofer; Gellatly, Ian; Jang, Soobaek & Patterson, Raymond Wiki Deployment in Corporate Settings. IEEE Technology & Society Magazine Volume 28 2009
The article explores the deployment of Wikipedia, an online encyclopedia, in corporate settings. The authors found that the medium emerged as a powerful collaborative technology. They determined that in corporate settings it was used for a variety of purposes, from portals to project management and knowledge-base creation. They also found that it regularly attracts users who are primarily motivated by making work easier and helping the organization achieve its goals, while social reputation did not seem to be a significant motivational aspect.
Techanamurthy, Umawathy; Mohamad, Baharom & Hashim, Mohamad Hisyam Mohd. Wiki Experience in a Statistics Classroom: A Case Study Global Learn Asia Pacific 2010 [491]
Hardy, Mat Wiki Goes to War AQ - Journal of Contemporary Analysis Volume 79 2007
Since launching nearly six years ago, Wikipedia has exhibited sustained growth as an Internet encyclopedic resource. Amongst the millions of pages, the 2006 Israel-Lebanon conflict is one of the most revised & popular topics of all, ranking even above the Second World War. Why is this & what do Wikipedia & its daughter project, Wikinews, have to offer history, academia & journalism in their coverage of the Middle East? Adapted from the source document.
Jancarik, A. & Jancarikova, K. Wiki Tools in the Preparation and Support of e-Learning Courses Electronic Journal of e-Learning Volume 8 2010
Wiki tools, which became known mainly thanks to the Wikipedia encyclopedia, represent quite a new phenomenon on the Internet. The work presented here deals with three areas connected to a possible use of wiki tools for the preparation of an e-learning course. To what extent does Wikipedia.com contain terms necessary for scientific lectures at the university level and to what extent are they localised into other languages? The second area covers the use of Wikipedia as a knowledge base for e-learning study materials. Our experience with Enviwiki which originated within the E-V Learn project and its use in e-learning courses is presented. The third area aims at the use of wiki tools for building a knowledge base and sharing experience of the participants of an e-learning course.
Endres, Joe Wiki websites wealth of information INFORM - International News on Fats, Oils and Related Materials Volume 17 Pages 312 2006
In 1995, the Wiki technology was developed to create encyclopedic entries on subjects that are the culmination of the knowledge and experience each contributor brings to the table. On 20 June 2003, the Wikimedia Foundation was created to manage Wiki projects. The goal of the Wikimedia Foundation is to develop and maintain open content, wiki-based projects and to provide the full contents of these projects to the public free of charge. Currently, the Foundation has seven active projects: Wikipedia, Wiktionary, Wikiquote, Wikisource, Wikibooks, Wikijunior, Wikimedia Commons, Wikinews, and Wikispecies.
Brunsell, Eric & Horejsi, Martin Wiki, Wiki! Science Teacher Volume 77 Pages 12 2010
The article focuses on the benefits of wikis to science students. The introduction of the crowd-source encyclopedia called Wikipedia in 2001 is credited for the increased wiki visibility. The value of wikis in education is attributed to their ease of use, accessibility and opportunity for students to collaborate on projects in and outside of class. Among the wiki projects ideal for students are creating online posters and virtual museum exhibits.
Dorroh, Jennifer Wiki: Don't Lose That Number. American Journalism Review Volume 27 2005
This article explores the advantages provided by wikis to news organizations. Some online media experts say news sites should not give up on the wiki form too quickly. According to Nora Paul, director of the Institute for Media Studies at the University of Minnesota, news outlets that ignore wikis may miss a rich opportunity to expand their influence and their brand. Furthermore, wikis let readers do something they have never done with the newspaper before: They can edit, on the fly, text that has already been put out there, and then track the kinds of changes or contributions that others have made. The focus of Wikipedia and Wikinews on reporting, rather than on the opinion writing that the Times attempted, provides a useful model for news sites that aim to draw more reader input. A group of investigators could also use a wiki as a collection point for the information they unearth.
Chaletzky, Aaron D. Wiki: The Collaborative Resource for Library Science and Information Technology Professionals. Slavic & East European Information Resources Volume 7 2006
This paper looks at the value of wikis as a collaborative resource for library science and digital libraries, briefly explores the history of wikis, cites examples of why wikis are viewed with hope and suspicion, and illustrates a wiki in use by the Digital Conversion Team at The Library of Congress.
Decker, Bjorn; Ras, Eric; Rech, Jorg; Jaubert, Pascal & Rieth, Marco Wiki-Based stakeholder participation in requirements engineering IEEE Software Volume 24 2007 [492]
Requirements elicitation and documentation are complex activities. The quality of their products can improve through stakeholders' participation, particularly in high-uncertainty projects. However, participative RE, especially in distributed environments, needs a platform that can support effective collaboration. The authors adapted the Wikipedia approach to collaboration in content creation to support active stakeholder participation in RE, including a document structure for wiki-based RE. They discuss challenges and solutions based on their experience.
Hu, B. WiKi'mantics: interpreting ontologies with WikipediA Knowledge and Information Systems Volume 25 Pages 445 2010
In the context of the Semantic Web, many ontology-related operations can be boiled down to one fundamental task: finding as accurately as possible the semantics hiding beneath the superficial representation of ontological entities. This, however, is not an easy task due to the ambiguous nature of semantics and the lack of a systematic engineering method to guide how we comprehend semantics. We acknowledge the gap between human cognition and knowledge representation formalisms: even though precise logic formulae can be used as the canonical representation of ontological entities, understanding of such formulae may vary. A feasible solution to juxtaposing semantics interpretation, therefore, is to reflect such cognitive variations. In this paper, we propose an approximation of semantics using sets of words/phrases, referred to as WKmantic vectors. These vectors emerge through a set of well-tuned methods that gradually surface the semantics that otherwise remain implicit. Given a concept, we first identify its conceptual niche amongst its neighbours in the graph representation of the ontology. We generate a natural language paraphrase of the isolated sub-graph and project this textual description upon a large document repository. WKmantic vectors are then drawn from the document repository. We evaluated each of the aforementioned steps by way of user study.


Hall, Gary Wikination: On Peace and Conflict in the Middle East Cultural Politics Volume 5 2009
This article begins by analyzing critically the usefulness of the recent political philosophy of Chantal Mouffe for reconceptualizing ideas of peace & conflict. It takes as its focus for doing so the situation of the Middle East. It proceeds to show how Mouffe's radical democratic politics is actually just another form of the liberalism of Habermas & Rawls that she positions her theory against. The article then explores the potential digital media hold for making affirmative, affective, hyperpolitical interventions in specific contents & singular situations. In particular it advocates using the wiki medium -- hence the piece's Wikipedia-like form -- to experiment with new ways of organizing institutions, cultures, communities, & countries which do not uncritically repeat the reductive adherence to democracy, hegemony, & Western, bourgeois, liberal humanism identified in Mouffe, but which can also be located in the institution of academic criticism more widely. 'WikiNation' is part of a series of 'performative media' projects. Performative media here stands for media that do not endeavor to represent the world so much as have an effect in or on it. They are media which produce the things of which they speak, in other words, & which are engaged primarily in their actual performance. Adapted from the source document.
Dijck, J Van & Nieborg, D Wikinomics and its discontents: a critical analysis of Web 2.0 business manifestos New Media & Society Volume 11 2009 [493]
'Collaborative culture', 'mass creativity' and 'co-creation' appear to be contagious buzzwords that are rapidly infecting economic and cultural discourse on Web 2.0. Allegedly, peer production models will replace opaque, top-down business models, yielding to transparent, democratic structures where power is in the shared hands of responsible companies and skilled, qualified users. Manifestos such as Wikinomics (Tapscott and Williams, 2006) and 'We-Think' (Leadbeater, 2007) argue collective culture to be the basis for digital commerce. This article analyzes the assumptions behind this Web 2.0 newspeak and unravels how business gurus try to argue the universal benefits of a democratized and collectivist digital space. They implicitly endorse a notion of public collectivism that functions entirely inside commodity culture. The logic of Wikinomics and 'We-Think' urgently begs for deconstruction, especially since it is increasingly steering mainstream cultural theory on digital culture.
Zelkowitz, Rachel WikiPathways Debuts. Science Volume 321 Pages 623 2008
The article reports on the launch of a site by GenMAPP, the popular online genetic data hub, for sharing findings on metabolic pathways in the U.S. According to Bruce Conklin, creator and cell biologist, WikiPathways was modeled on Wikipedia and offers a way to integrate information on these complex networks. It was formally opened with Conklin's colleagues at the University of Maastricht in the Netherlands. It has more than 300 registered users and contains information on 500 metabolic pathways in seven species, including humans.
Jacso, Peter Wikipedia Online (Wilton, Connecticut) Volume 26 Pages 79 2002
Childs, Sue Wikipedia He@lth Information on the Internet Volume 47 2005
G., Delia Juarez Wikipedia Nexos: Sociedad, Ciencia, Literatura 2008 [494]
Sabine Niederer & Jose van Dijck Wisdom of the crowd or technicity of content? Wikipedia as a sociotechnical system New Media & Society, Volume 12, No. 8, 1368-1387 2010



Bar-Ilan, J Wikipedia - A New Community of Practice? ONLINE INFORMATION REVIEW Volume 34 2010 [495]
Doughty, Howard A (REVIEWER) Wikipedia [en.wikipedia.org/wiki/Wikipedia ] College Quarterly Volume 8 2005


Black, Erik W. Wikipedia and academic peer review: Wikipedia as a recognised medium for scholarly publication? Online Information Review Volume 32 2008 [496]
Purpose - The purpose of this paper is to engage in a thought experiment, exploring the use of Wikipedia or similar content-malleable systems for the review and dissemination of academic knowledge. Design/methodology/approach - By looking at other sources, the paper considers the current state of the academic peer-review process, discusses Wikipedia and reflects on dynamic content creation and management applications currently in use in academia. Findings - The traditional peer review process must be updated to match the rapid creation and diffusion of knowledge that characterises the 21st century. The Wikipedia concept is a potential model for more rapid and reliable dissemination of scholarly knowledge. The implications of such a concept would have a dramatic effect on the academic community. Originality/value - This paper promotes a radical idea for changing the methods by which academic knowledge is both constructed and disseminated.
Berinstein, Paula Wikipedia and Britannica: The kid's all right (and so's the old man) Searcher (Medford Volume 14 2006
Can Wikipedia, the Web's community encyclopedia, be compared with the Encyclopaedia Britannica? Contributors, audience, mission, scope, editorial process, authority: everything sets them apart. Wikipedia is a fascinating example of the shift to a radically different model, that of collaborative publishing and provisional consensus. But despite its popularity, Wikipedia suffers from a lack of credibility as an authoritative source.
van Dijk, Z Wikipedia and lesser-resourced languages LANGUAGE PROBLEMS & LANGUAGE PLANNING Volume 33 2009 [497]
Wikipedia, the free encyclopedia, exists in more than 260 different language editions, some larger, some smaller. This article deals with difficulties in comparing them with each other and assessing their strength. Wikimedia Statistics can mislead if not interpreted with a knowledge about the ways Wikipedia editing works. Many language editions embellish the total number of articles by creating pseudo-articles with little or no encyclopedic value. The main question of the study presented by this article is what factors make a language edition grow, such as the existence of a standardized language, language status, Internet access for the average speaker, and the attitude of speakers to their language.
Mercer, Jean WIKIPEDIA AND 'OPEN SOURCE' MENTAL HEALTH INFORMATION. Scientific Review of Mental Health Practice Volume 5 2007
The article examines the function of the free online encyclopedia, Wikipedia, as an online information source on topics related to mental health. It outlines some issues about Wikipedia's handling of mental health topics. It provides some suggestions for solving certain problems concerning Wikipedia information as well as the current level of truthfulness of the Wikipedia. It recommends useful Web sites dedicated to preventing misinformation, such as QuackWatch.
Leithner, A; Maurer-Ertl, W; Glehr, M; Friesenbichler, J; Leithner, K & Windhager, R Wikipedia and osteosarcoma: a trustworthy patients' information? JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION Volume 17 2010 [498]
The English version of the online encyclopedia, Wikipedia, has been recently reported to be the prominent source of online health information. However, there is little information concerning the quality of information found in Wikipedia. Therefore, we created a questionnaire asking for scope, completeness, and accuracy of information found on osteosarcoma. Three independent observers tested the English version of Wikipedia, as well as the patient version and the health professional version of the US National Cancer Institute (NCI) website. Answers were verified with authoritative resources and international guidelines. The results of our study demonstrate that the quality of osteosarcoma-related information found in the English Wikipedia is good but inferior to the patient information provided by the NCI. Therefore, non-peer-reviewed commonly used websites offering health information, such as Wikipedia, should include links to more definitive sources, such as those maintained by the NCI and professional international organizations on healthcare treatments. Furthermore, frequent checks should make sure such external links are to the highest quality and to the best-maintained aggregate sites on a given healthcare topic.


Fister, B. Wikipedia and the challenge of read/write culture Library Issues Volume 27 2007
The author describes in vivid detail how Wikipedia and other social tools are affecting the behavior of students. These new tools also impact the curriculum of campus information literacy courses usually taught by libraries working in partnership with faculty.


Tollefsen, Deborah Perron WIKIPEDIA and the Epistemology of Testimony Episteme - Edinburgh Volume 6 2009


Garfinkel, Simson L. Wikipedia and the meaning of truth Technology Review Volume 111 2008
Some of the advancements and breakthroughs in materials are discussed. Researchers have fabricated a material for ultrahigh resolution microscopes that interacts with near-infrared light in a way that no naturally occurring material does. Devices made from the material could be used in microscopes to produce much sharper images. The material is made up of alternating layers of a metal, which conducts electricity, and an insulating material; both are punched with a grid of square holes. A new electrolyte developed for use in solid-oxide fuel cells has 100 million times the ionic conductivity of conventional electrolytes at room temperature. The researchers have combined nanometer-thick layers of the electrolyte, an yttria-stabilized zirconia, with 10-nanometer-thick layers of strontium titanate to make cool fuel cells.
Radtke, Philip J. & Munsell, John F. Wikipedia as a tool for forestry outreach Journal of Forestry Volume 108 2010
The goals of this work were to examine how the online, collaborative encyclopedia Wikipedia presents information related to forest management and the profession of forestry and to explore its potential as a vehicle for widespread public outreach, interaction, and communication regarding these topics. Issues concerning the accuracy of Wikipedia content were reviewed, and a survey of Wikipedia content related to forestry was performed, along with a project enlisting college students to generate content on Wikipedia related to forest measurements coursework. Forestry-related Wikipedia articles generated over one-half million page views during 1 month in 2008, with nearly 6,000 views of 25 student-generated pages. In the 18 months since they were first uploaded, student-generated articles were edited 784 times by 132 separate contributors. Developing new content, editing, and revision are essential parts of the Wikipedia collaborative model. As such, significant opportunities exist for individuals or groups of students, professionals, experts, and nonexperts alike to contribute collaboratively to forestry-related articles that will be viewed by relatively large audiences of the online public on Wikipedia. Copyright 2010 by the Society of American Foresters.
Page, Roderic Wikipedia as an encyclopaedia of life. Organisms Diversity & Evolution Volume 10 2010
In a 2003 essay E. O. Wilson outlined his vision for an "encyclopaedia of life" comprising "an electronic page for each species of organism on Earth", each page containing "the scientific name of the species, a pictorial or genomic presentation of the primary type specimen on which its name is based, and a summary of its diagnostic traits." Although biodiversity informatics has generated numerous online resources, including some directly inspired by Wilson's essay (e.g., iSpecies and EOL), we are still some way from the goal of having available online all relevant information about a species, such as its taxonomy, evolutionary history, genomics, morphology, ecology, and behaviour. While the biodiversity community has been developing a plethora of databases, some with overlapping goals and duplicated content, Wikipedia has been slowly growing to the point where it now has over 100,000 pages on biological taxa. My goal in this essay is to explore the idea that, largely independent of the aims of biodiversity informatics and well-funded international efforts, Wikipedia has emerged as potentially the best platform for fulfilling E. O. Wilson's vision.


Jimenez-Pelayo, J. Wikipedia as controlled vocabulary: has the traditional authority control been surpassed? El Profesional de la Informacion Volume 18 2009 [499]
Wikipedia, the free encyclopaedia, is the first project to have been born specifically from and for the Web and which has developed an authority control system for access to its information. Here, the different elements, procedures and principles which make up the Wikipedia authority control system are analysed, and a critical analogy is traced between them and those of the traditional authority control applied to bibliographical catalogues. From the critical comparison of both models, one wonders up to what point the authority control, constrained by the weight of tradition and by its hopeless adaptation to technology, has been surpassed by developments such as Wikipedia which are based on a philosophy of flexibility and common sense and where the rules are decided by and for the user. The enormous potential and reach of the Wikipedia authority model make it a true contender to become the normal system of access to the semantic Web.
Rush, EK & Tracy, SJ Wikipedia as Public Scholarship: Communicating Our Impact Online JOURNAL OF APPLIED COMMUNICATION RESEARCH Volume 38 2010 [500]
To contribute to the forum asking "Has Communication Research Made a Difference?," this essay examines whether communication scholarship makes a difference (a) to those who search for information online, (b) in the sense that a primary way our research can make a difference is through its accessibility, and (c) by using the criteria of its presence (or absence) on Wikipedia. In this essay, we reason that Wikipedia is a useful benchmark for online accessibility of public scholarship in that it provides immediate, freely available information to today's diverse global public seeking online answers to questions and relief from problems.
Yun, Li; yan, Huang Kai; ji, Ren Fu & xin, Zhong Yi Wikipedia based semantic related Chinese words exploring and relatedness computing Journal of Beijing University of Posts and Telecommunications Volume 32 2009
To find how to collect semantically related words and calculate semantic relatedness, an experiment is done to download about 50 thousand documents from the web site of Chinese Wikipedia and extract hyperlinks between lines which contain semantic information. By mining hyperlinked references in documents, about 400 thousand semantically related word pairs are collected. With more experiments on topic groups of related words, tightly related words are grouped into smaller sets with an average semantic relatedness calculated. Semantic relatedness is calculated using information of hyperlink positions and frequencies in documents. Comparing with the result by classic algorithms, the reliability of the new measures is analyzed.
Stillman-Lowe, C. Wikipedia comes second. British Dental Journal Volume 205 Pages 525 2008
A letter to the editor is presented in response to the article "Access to special care dentistry part 6. Special care dentistry services for young people" in the 2008 issue.
Shaw, Donna WIKIPEDIA IN THE NEWSROOM. American Journalism Review Volume 30 2008
The article discusses the use of the online encyclopedia Wikipedia by journalists. The author states that putting in print that one has used Wikipedia as a source is not up to professional journalism's standards. However, the author has found that some journalists are using it as a way to begin to gather information about a story. The author also presents a series of anecdotes from newspaper copy editors from around the United States about their use of Wikipedia. The author also discusses Wikipedia's own copy editing rules.
McKenna, Brian Wikipedia just as `wiki' as ever, says Wales Infosecurity Today Volume 3 Pages 6 2006 [501]
Langlois, G & Elmer, G Wikipedia leeches? The promotion of traffic through a collaborative web format NEW MEDIA & SOCIETY Volume 11 2009 [502]
This article investigates the circulation of Wikipedia entries on the web in an effort to determine the integration of its collaborative model into existing proprietary web formats. In particular it details the use of Wikipedia content as 'tags' or information that is used to increase traffic to webpages through search engine results. Consequently, the article discusses the need to develop theoretical models that provide for an understanding of both content and form on the web, particularly as formatted by open-source legal frameworks.
Wagner, A.L. Wikipedia made law? The federal judicial citation of Wikipedia John Marshall Journal of Computer & Information Law Volume 26 2008
Over three hundred federal judicial opinions have cited Wikipedia as a source. Most opinions cite Wikipedia in footnotes to define terms used in the opinion. Some judges, however, have used Wikipedia as a source on which to base decisions. Judicial use of Wikipedia as a source of evidence or a basis for making decisions is a serious problem, because the nature of Wikipedia undermines the common law system. Wikipedia is an online encyclopedia that contains articles that anyone can create, alter, or revise. Additionally, Wikipedia is not only merely a secondary source, but the articles are subject to change on a daily, sometimes hourly, basis. This paper explores federal judicial opinions that should not cite Wikipedia. Wikipedia may be a starting point for research, but this comment will discuss many of the reasons why federal judges and members of the federal bar should not cite Wikipedia as a source. Additionally, Wikipedia's reliability is questionable at best, and for this reason alone Wikipedia should not be cited as an authoritative source on any topic.
Nakayama, K.; Hara, T. & Nishio, S. Wikipedia mining to construct a thesaurus Transactions of the Information Processing Society of Japan Volume 47 2006
Thesauri have been widely used in many applications such as information retrieval, natural language processing (NLP), and interactive agents. However, several problems, such as morphological analysis and the treatment of synonymous and multisense words, still remain and degrade the accuracy of traditional NLP-based thesaurus construction methods. In addition, adding the latest or minor words is also a difficult issue in this research area. In this paper, to solve these problems, we propose a Web mining method to automatically construct a thesaurus by extracting relations between words from Wikipedia, a huge wiki-based encyclopedia on the WWW.
Lichtenstein, S. & Parker, C.M. Wikipedia model for collective intelligence: a review of information quality International Journal of Knowledge and Learning Volume 5 Pages 03/04/2011 2009 [503]
Online information seekers increasingly utilise the online encyclopaedia Wikipedia as a key reference source. Wikipedia's special feature is that it is based on the collective intelligence (CI) of lay citizens. Its consensus-building participatory knowledge-building processes replace traditional encyclopaedia processes founded on the knowledge of experts and gatekeeping practices. However there have been reports of concerns with the level of information quality provided by Wikipedia articles. This paper explores information quality for Wikipedia theoretically. First, it conceptualises the Wikipedia model of knowledge production and second, it analyses information quality for the model. Finally, the paper recommends some improvements for the model and discusses other implications for knowledge management theory and practice.
Huss, J.W. 3rd, et al. Wikipedia needs cell biologists. Journal of Cell Biology Volume 184 Pages 191 2009
The article announces that the Wikipedia Web site is in need of cell biologists to improve the quality of articles that require annotation by scientists working in related fields of study. The site is sponsored by a non-profit educational charity, the Wikimedia Foundation. The American Society for Cell Biology (ASCB) offered a workshop which provided an overview of the site. Domains within Wikipedia, called WikiProjects, coordinate article coverage for a particular field.
Petrilli, M. J Wikipedia or Wickedpedia? Assessing the online encyclopedia's impact on K-12 education EDUCATION NEXT Volume 8 Pages 87 2008 [504]
Miller, Nora Wikipedia revisited ETC.: A Review of General Semantics Volume 64 Pages 147 2007 [505]
Denning, Peter; Horning, Jim; Parnas, David & Weinstein, Lauren Wikipedia risks Communications of the ACM Volume 48 Pages 152 2005 [506]
Several risks related to Wikipedia, a venerable form of knowledge organization and dissemination, are presented. Wikipedia does not confirm the accuracy of the information it presents, and is unable to tell the motives of the contributors to an article. It is difficult to determine how qualified an article's contributors are; the revision histories often identify them by pseudonyms, making it difficult to check credentials and sources. Many articles in the Wikipedia do not cite independent sources. Wikipedia contains no formal peer review process for fact-checking, and the editors themselves might not be well versed in the topics they write about. The Wikipedia cannot attain the status of a true encyclopedia without more formal content-inclusion and expert review procedures.
Giles, Jim Wikipedia rival calls in the experts. Nature Volume 443 Pages 493 2006
The article reports the launch of an online encyclopaedia Citizendium, which is reportedly going to use all of Wikipedia's content but in another website (http://citizendium.org). According to Larry Sanger, the co-founder of Wikipedia, this would give scientists a new organizational framework to clean up and improve on the work started by Wikipedia. Reportedly many scientists have no desire to navigate the treacherous waters of Wikipedia's editorial system.
Andrews, S. Wikipedia uncovered: the best source of knowledge or broken beyond repair? PC Pro 2007
Some call it a miracle of the information age. Lauded by science journals, wealthy tycoons, national newspapers and government ministers, in the space of six years Wikipedia has become one of the most widely consulted knowledge resources in the world. Leapfrogging rivals such as Britannica, it has become the online encyclopedia. This paper uncovers how the net's highly controversial encyclopedia really works.
Shawkat, E. Wikipedia use. British Dental Journal Volume 206 Pages 117 2009
A letter to the editor is presented in response to the article "Wikipedia comes second" in the 2008 issue.
Hamjavar, Farid Wikipedia woes [2] DB2 Magazine Volume 10 Pages 6 2005
No abstract available


Stankus, Tony & Spiegel, Sarah E. Wikipedia, scholarpedia, and references to books in the brain and behavioral sciences: A comparison of cited sources and recommended readings in matching free online encyclopedia entries Science and Technology Libraries Volume 29 Pages 01/02/2011 2010 [507]
We provide a comparative analysis of the references to books in two free online encyclopedias that have very different philosophies about authorship and editorial oversight that may affect the nature and academic respectability of the books they list. These encyclopedias are the loosely edited, non-refereed Wikipedia, where anonymous authors, whose credentials are uncertain, compile the reference list and where many equally anonymous readers can later alter the reference lists, and its peer-reviewed companion Scholarpedia, which features signed articles by invited experts who control its reference lists. We compared 47 entries dealing with the brain or behavioral sciences that had exactly matching titles. We report relative number of book references overall, the age of these references, and those titles that were multiply cited, either through citations in both online encyclopedias or multiple entries in either one of them. We compare the percentages of book references allotted to matching subject categories. We note the distributions of references according to book publishers and compare propensities for citing high-level research volumes versus introductory textbooks and popularizations. Finally, we examine the credentials of the authors of the cited works, providing information on the universities and disciplines in which their authors or editors received their doctoral degrees and their most current academic or professional affiliation. We conclude that in this comparison of a small but carefully matched set of entries in the brain and behavioral sciences, both encyclopedias offer references to solid materials and that any differences in quality indicators represent matters of degree rather than any clear-cut advantage that is exclusive to one or the other. Finally, we provide as an annotated checklist for librarians serving the brain and behavioral sciences of the books multiply cited by these encyclopedias at the time of this study.
Muller-Seitz, Gordon & Reger, Guido Wikipedia, the free encyclopedia as a role model? Lessons for open innovation from an exploratory examination of the supposedly democratic-anarchic nature of Wikipedia International Journal of Technology Management Volume 52 Pages 03/04/2011 2010 [508]
Accounts of open source software (OSS) development projects frequently stress their democratic, sometimes even anarchic nature, in contrast to for-profit organisations. Given this observation, our research evaluates qualitative data from Wikipedia, a free online encyclopaedia whose development mechanism allegedly resembles that of OSS projects. Our research offers contributions to the field of open innovation research with three major findings. First, we shed light on Wikipedia as a phenomenon that has received scant attention from management scholars to date. Second, we show that OSS-related motivational mechanisms partially apply to Wikipedia participants. Third, our exploration of Wikipedia also reveals that its organisational mechanisms are often perceived as bureaucratic by contributors. This finding was unexpected since this type of problem is often associated with for-profit organisations. Such a situation risks attenuating the motivation of contributors and sheds a critical light on the nature of Wikipedia as a role model for open innovation processes. Copyright 2010 Inderscience Enterprises Ltd.
LeLoup, JW & Ponterio, R Wikipedia: A multilingual treasure trove LANGUAGE LEARNING & TECHNOLOGY Volume 10 2006 [509]
Hussey, Sandra R. Wikipedia: A New Community of Practice? Journal of Academic Librarianship Volume 36 Pages 177 2010
The article reviews the book "Wikipedia: A New Community of Practice?" by Dan O'Sullivan.
Mason, D Wikipedia: A New Community of Practice? ELECTRONIC LIBRARY Volume 28 2010 [510]
Morrison, Ian Wikipedia: A New Community of Practice? Australian Academic & Research Libraries Volume 41 2010
The article reviews the book "Wikipedia: A New Community of Practice?" by Dan O'Sullivan.
Tomaiuolo, Nicholas G. Wikipedia: A New Community of Practice? portal: Libraries & the Academy Volume 10 2010
The article reviews the book "Wikipedia: A New Community of Practice?" by Dan O'Sullivan.
Kuznetsov, Stacey Wikipedia: an informal survey of NYU students ACM SIGCAS Computers and Society Homepage Volume 36 Pages 2 2006 [511]
An abstract is not available.
West, K. & Williamson, J. Wikipedia: friend or foe? Reference Services Review Volume 37 2009 [512]
Purpose - The purpose of this paper is to report on a research study that entailed the rigorous evaluation of the quality of a large multidisciplinary sample of Wikipedia articles. The objective of the paper is to assess whether Wikipedia can be used and recommended as a credible reference or information tool. Design/methodology/approach - The 106 randomly generated Wikipedia articles are analyzed and evaluated on specific criteria (completeness, accuracy, presentation, objectivity, and overall quality). Articles are reviewed from a broad range of subject areas: arts, popular culture, entertainment, geography, history, science, technology, people, entities, and politics. Findings - The findings indicate that overall the articles are objective, clearly presented, reasonably accurate, and complete, although some are poorly written, contain unsubstantiated information, and/or provide shallow coverage of a topic. Research limitations/implications - Further research on evaluating Wikipedia entries should include reviewing outward links to more accurately assess overall quality. Practical implications - Wikipedia has a role as a reference and instruction tool. Originality/value - This paper provides empirical data on a large number of articles on a wide range of disciplines in Wikipedia, supporting its use as an acceptable encyclopedia.
Crovitz, Darren & Smoot, W. Scott Wikipedia: Friend, Not Foe English Journal Volume 98 2009
As online research has become an increasingly standard activity for middle school and high school students, Wikipedia (http://www.wikipedia.org) has simultaneously emerged as the bane of many teachers who include research-focused assignments in their courses. An online encyclopedia that allows anyone to edit its entries, Wikipedia has educators fed up with students using the site as a primary resource and citing its content in their essays. For some the site seems to represent the worst of how the Internet has dumbed down the research process, with its easily accessible but unsubstantiated (if not downright false) information on almost any topic, a student's citation of which amounts to a mockery of legitimate inquiry. After all, how can a site that allows "anyone" to add, change, or remove information be credible? Seen in a different light, Wikipedia provides a unique opportunity to get students involved in ongoing conversations about writing for a real audience, meeting genre expectations, establishing credibility, revising for clarity and purpose, and entering public discussions about the nature of truth, accuracy, and neutrality. In this article, the authors collaborate on successful ways to build Wikipedia assignments into English classes.
Risinger, C. Frederick Wikipedia: Historical Thinking. Social Education Volume 72 Pages 32 2008
The article highlights the Wikipedia entry on history entitled "Historical Thinking." It states that the entry has links to The National Center for History in the Schools at UCLA. It also includes a five-part definition of historical thinking: chronological comprehension, historical comprehension, historical research capabilities, historical analysis and interpretation, and historical issues-analysis and decision-making. Moreover, the entry has other links to good information resources for teachers and to methods for teaching history.
Nix, E. M Wikipedia: How it Works and How it Can Work for You The History Teacher Volume 43 2010 [513]
Wagstaff, Jeremy Wikipedia: it's wicked: here's an example of the Internet as it should be: a font of constantly updated knowledge--available for free Far Eastern Economic Review Volume 167 2004
Discusses online encyclopedia, probably the largest ever collaborative effort on the Internet, available at www.wikipedia.org; comparison with other online encyclopedias.
Sodhi, M. Wikipedia: No Country for Old Men OR-MS Today Volume 36 Pages 10 2009
Unlike a journal, on Wikipedia the editors can simply edit your article to make sure their comments ring true. Different editors/reviewers working independently means the price of their inconsistencies is paid by the hapless author. Wikipedia is an important addition to world knowledge and people all benefit from robust editorial processes. Indeed, Wikipedia has been blamed for inaccuracies and self-promotional material in the past. Their ways of manhandling authors of new entries suggests that creating entries there is something to avoid although the author did read a newspaper story about a high school dropout who created 400+ entries posing as a classics professor.
Walker, MA Wikipedia: Social revolution or information disaster? ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY Volume 231 2006 [514]
Anon Wikipedia: the book; from the free encyclopedia Wikipedia ZEITSCHRIFT FUR BIBLIOTHEKSWESEN UND BIBLIOGRAPHIE Volume 53 2006 [515]
Remy, Melanie Wikipedia: The Free Encyclopedia Online Information Review Volume 26 Pages 434 2002
Wikipedia: The Free Encyclopedia is reviewed.
Campbell, Charles Wikipedia: The Free Encyclopedia. TDR: The Drama Review Volume 53 2009
The article reviews the web site Wikipedia, located at http://en.wikipedia.org/wiki.
Reagle, Joseph Wikipedia: The happy accident Interactions Volume 16 2009 [516]
Some of the significant issues associated with the development and success of Wikipedia as an encyclopedia are discussed. Jimmy Wales and Larry Sanger were the professionals who were involved in the initial development of Wikipedia, the wiki-based encyclopedia. One of the most significant features of the encyclopedia was that it could be edited by any user without any problem. Some of the challenges associated with the web need to be understood in order to understand the success of Wikipedia as an effective encyclopedia. The two professionals overcame several challenges to develop the encyclopedia and ensure its success.
Swidler, L. Wikipedia Cities Interreligious Dialogue Journal of Ecumenical Studies Volume 44 2009 [517]
Gabrilovich, Evgeniy & Markovitch, Shaul Wikipedia-based semantic interpretation for natural language processing Journal of Artificial Intelligence Research Volume 34 2009
Adequate representation of natural language semantics requires access to vast amounts of common sense and domain-specific world knowledge. Prior work in the field was based on purely statistical techniques that did not make use of background knowledge, on limited lexicographic knowledge bases such as WordNet, or on huge manual efforts such as the CYC project. Here we propose a novel method, called Explicit Semantic Analysis (ESA), for fine-grained semantic interpretation of unrestricted natural language texts. Our method represents meaning in a high-dimensional space of concepts derived from Wikipedia, the largest encyclopedia in existence. We explicitly represent the meaning of any text in terms of Wikipedia-based concepts. We evaluate the effectiveness of our method on text categorization and on computing the degree of semantic relatedness between fragments of natural language text. Using ESA results in significant improvements over the previous state of the art in both tasks. Importantly, due to the use of natural concepts, the ESA model is easy to explain to human users. 2009 AI Access Foundation. All rights reserved.
Moldwin, Mark B.; Gross, N. & Miller, T. Wikipedia's role in science education and outreach Eos Volume 88 2007
Zlatic, V.; Bozicevic, M.; Stefancic, H. & Domazet, M. Wikipedias: Collaborative web-based encyclopedias as complex networks Physical Review E - Statistical, Nonlinear, and Soft Matter Physics Volume 74 2006 [518]
Wikipedia is a popular web-based encyclopedia edited freely and collaboratively by its users. In this paper we present an analysis of Wikipedias in several languages as complex networks. The hyperlinks pointing from one Wikipedia article to another are treated as directed links while the articles represent the nodes of the network. We show that many network characteristics are common to different language versions of Wikipedia, such as their degree distributions, growth, topology, reciprocity, clustering, assortativity, path lengths, and triad significance profiles. These regularities, found in the ensemble of Wikipedias in different languages and of different sizes, point to the existence of a unique growth process. We also compare Wikipedias to other previously studied networks. © 2006 The American Physical Society.
McPherson, Keith Wikis and literacy development Teacher Librarian Volume 34 Pages 67 2006
McPherson explores the question of whether wikis can be valuable resources for developing strong literacy links between the school library and the classroom. He finds that public wikis are valuable information sources that teacher-librarians can use to complement and further the width and breadth of literacy objectives developed in the classroom. Although readability and hardware issues create some limitations in using wikis as research and literacy development resources, many of these limitations can be overcome through creative solutions.
McPherson, Keith Wikis and student writing Teacher Librarian Volume 34 Pages 70 2006
McPherson explores wikis and the possible contributions that they offer teacher-librarians in developing student writing. Current articles and research exploring the educational use of wikis in the classroom and school library have uncovered many positive possibilities for developing students' writing skills. One is that wikis provide students with a variety of authentic audiences. Knowing that real people will be reading and possibly responding to their writing is often the impetus to motivate students to write with much more enthusiasm than they would when composing traditional research essays, in which the classroom teacher or teacher-librarian is the only audience.


Nordin, Norhisham Mohamad & Klobas, Jane Wikis as collaborative learning tools for knowledge sharing: Shifting the education landscape Global Learn Asia Pacific 2010 [519]
Bell, S. Wikis as legitimate research sources Online Volume 32 2008
Too many people equate the word "wiki" with Wikipedia and, based on that, view information found in public-facing wikis with suspicion. Others see wikis solely as knowledge-sharing tools employed within an enterprise to encourage team collaboration and enhance project management. The wiki format is no longer strange to people, although its utility as an information source can be questionable. This article looks at wikis from a content perspective rather than a purely technological one.
Anonymous Wikis in Medicine. Journal of Visual Communication in Medicine Volume 31 2008
The article reviews several web sites in medicine including Wikipedia Medicine Portal, OpenWetWare Wiki, and Clinfowiki: The Clinical Informatics Wiki.
Harsell, Dana Michael Wikis in the Classroom: Faculty and Student Perspective Journal of Political Science Education Volume 6 2010 [520]
In March 2009, a faculty member and four political science students led a forum entitled “Wikis in the Classroom: Student and Faculty Perspective.” The discussion centered on a number of benefits and concerns with the use of wikis as an instructional tool within the classroom. Based on student and faculty feedback, this article expands on four themes that emerged from the roundtable discussion: training, applicability of assignments, setting clear guidelines, and expectations and grading.
Roszkiewicz, R. Wikis that mean business Seybold Report Analyzing Publishing Technologies Volume 8 2008
Wikis might be one of the most misunderstood social networking technologies linked to Web 2.0. While MySpace, YouTube and FaceBook get the most visibility, the technology that has the greatest potential for transforming what we know as traditional publishing is wiki technology. Wiki has an identity problem, however. The underlying technology is available as open source, and without a strong company-backed marketing effort to tell the wiki story over and over, it is apt to be misunderstood. Another problem is Wikipedia and its overwhelming popularity. The Wikipedia application has co-opted the wiki technology and is strongly identified with it. As a result, the technology is not getting the widespread traction it deserves. The original wiki (wiki is Hawaiian for quick, as in quickly developed Web sites) was created by Howard G. (Ward) Cunningham around 1994. The intent of what was then called WikiWikiWeb was to make communication among programmers more efficient. Cunningham is also known for developing the programming methodology known as extreme programming (XP).
Markiewicz, D. Wikisafety is bound to grow ISHN Volume 43 Pages 20 2009 [521]
Wiki is the Hawaiian word for fast but now is used as a blend of words to describe varied aspects of mass collaboration. Wikipedia, the web-based encyclopedia, is the best example of this popular blend of words. Wikipedia (http://en.wikipedia.org/wiki/Wikipedia:About) is collaboratively written by volunteers. The content is free and anyone can edit the information.


Hoffman, David A. & Mehra, Salil K. WIKITRUTH THROUGH WIKIORDER. Emory Law Journal Volume 59 2009
How does large-scale social production coordinate individual behavior to produce public goods? In 1968, Hardin denied that the creation of public goods absent markets or the State is possible. Benkler, Shirky, Zittrain, and Lessig recently countered that the necessary coordination might emerge through social norms. However, scholars have not fully explained how this coordination is to occur. Focusing on Wikipedia, we argue that the site's dispute resolution process is an important force in promoting the public good it produces, i.e., a large number of relatively accurate public encyclopedia articles. We describe the development and shape of Wikipedia's existing dispute resolution system. Further, we present a statistical analysis based on coding of over 250 arbitration opinions from Wikipedia's arbitration system. The data show that Wikipedia's dispute resolution ignores the content of user disputes, and focuses on user conduct instead. Based on fairly formalized arbitration findings, we find a high correlation between the conduct found and the remedies ordered. In effect, the system functions not so much to resolve disputes and make peace between conflicting users, but to weed out problematic users while weeding potentially productive users back in to participate. Game theorists have modeled large-scale social production as a solution to the herder problem/multi-player Prisoner's Dilemma. But we demonstrate that the "weeding in" function reflects dynamics more accurately captured in coordination games. In this way dispute resolution can provide a constitutive function for the community.
Friesen, N. & Hopkins, J. Wikiversity; or education meets the free culture movement: an ethnographic investigation First Monday Volume 13 2008 [522]
Wikipedia, the free online encyclopedia, has challenged the way that reference works are used and understood, and even the way that the collective enterprise of knowledge construction and circulation is itself conceptualized. The article presents an ethnographic study of Wikiversity, an educationally-oriented sister project to Wikipedia. It begins by providing an overview of the orientations and aims of Wikiversity, which seeks to provide for participants both open educational contents and an open educational community. It then undertakes a detailed examination of this project's emerging, overlapping communities and cultures by providing descriptions produced through a combination of ethnographic techniques. These descriptions focus on the experiences of a participant-observer in the context of an 11-week course developed and delivered via Wikiversity, titled Composing Free and Open Online Educational Resources. These descriptions are discussed and interpreted through reference to qualitative studies of the more developed dynamics of the Wikipedia effort - allowing this study to trace the possible trajectories for the future development of the fledgling Wikiversity project. In this way, this paper investigates the communal and cultural dynamics of an undertaking that - should it meet only with a fraction of Wikipedia's success - will be of obvious significance to education generally.
Zesch, Torsten & Gurevych, Iryna Wisdom of crowds versus wisdom of linguists – measuring the semantic relatedness of words Natural Language Engineering Volume 16 Pages 25 2009 [523]
Sausner, R. Wouldn't wikis be wicked wonderful? [Web collaboration] US Banker Volume 117 2007
Wikis are an intriguing collaborative tool, and Wells Fargo is leading the charge in experimenting with them. But wiki use among financial institutions (FIs) may always be limited to internal use. Perhaps the most famous wiki on the Internet is Wikipedia, the online encyclopedia that hosts more than 1.7 million articles in English on topics from molecular electronics to the insurgency in Somalia. Bankers may have run across Investopedia, the ad-sponsored site that covers the financial markets
Forte, A. & Bruckman, A. Writing, Citing, and Participatory Media: Wikis as Learning Environments in the High School Classroom International Journal of Learning Volume 1 2009
Arshinoff, Bradley I.; Suen, Garret; Just, Eric M.; Merchant, Sohel M.; Kibbe, Warren A.; Chisholm, Rex L. & Welch, Roy D. Xanthusbase: adapting wikipedia principles to a model organism database Nucleic Acids Res. Volume 35 Pages suppl_1 2007 [524]
xanthusBase (http://www.xanthusbase.org) is the official model organism database (MOD) for the social bacterium Myxococcus xanthus. In many respects, M.xanthus represents the pioneer model organism (MO) for studying the genetic, biochemical, and mechanistic basis of prokaryotic multicellularity, a topic that has garnered considerable attention due to the significance of biofilms in both basic and applied microbiology research. To facilitate its utility, the design of xanthusBase incorporates open-source software, leveraging the cumulative experience made available through the Generic Model Organism Database (GMOD) project, MediaWiki (http://www.mediawiki.org), and dictyBase (http://www.dictybase.org), to create a MOD that is both highly useful and easily navigable. In addition, we have incorporated a unique Wikipedia-style curation model which exploits the internet's inherent interactivity, thus enabling M.xanthus and other myxobacterial researchers to contribute directly toward the ongoing genome annotation.


Luyt, B; Zainal, CZBC; Mayo, OVP & Yun, TS Young people's perceptions and usage of Wikipedia INFORMATION RESEARCH-AN INTERNATIONAL ELECTRONIC JOURNAL Volume 13 2008 [525]
Introduction. This exploratory study investigated the perception and usage of Wikipedia among young people. Method. Fifteen respondents aged thirteen to twenty-four were selected for the study. The respondents were composed of secondary and tertiary students, and recent tertiary level graduates. An interview schedule was designed to explore user experiences at three levels: the initial encounter with Wikipedia, the time when the user felt comfortable with Wikipedia, and the user's current state. Questions were open-ended and semi-structured to allow for probing. Interviews were conducted over a span of two weeks with each interview lasting 30-45 minutes. Follow-up questions were asked of some of the respondents for clarification purposes. Analysis. Interview data was used to test Wikipedia, viewed as a technology, against the model of technological appropriation developed by Carroll et al. for their own study of mobile phone use among young people. Results. We found that although Wikipedia is initially attractive for young people, it generally fails to become deeply integrated (appropriated) into the everyday lives of users, instead remaining an instrumental tool for the fulfilment of a narrow range of tasks. We also found that over time respondents do become aware of the problems of accuracy that Wikipedia poses. Conclusions. Given that Wikipedia has not assumed the role of a key technology in the lives of the young people studied here, concern over its use by educators may be overstated. Also, the fact that the respondents were aware of the drawbacks to its use should make the message of the need for checking alternative sources an easier one to impart to students. The key conclusion, however, is the need for those wishing to design more popular information systems to take into account the deeper needs of users to experiment with technology in order to make it fit their lives rather than the other way round. This is something that even Wikipedia, it seems, has been unable to achieve.
Litterst, G. F. Your Role in Music History American Music Teacher Volume 55 2005 [526]
Willinsky, John What open access research can do for Wikipedia First Monday volume 12, issue 3 2007 [527] "The open access references that we were able to locate for the smaller sample of twenty entries in the course of the study have now been added to the relevant Wikipedia articles and clearly marked with a link to the “open access copy” (by Sarah Munro)"
This study examines the degree to which Wikipedia entries cite or reference research and scholarship, and whether that research and scholarship is generally available to readers. Working on the assumption that where Wikipedia provides links to research and scholarship that readers can readily consult, it increases the authority, reliability, and educational quality of this popular encyclopedia, this study examines Wikipedia’s use of open access research and scholarship, that is, peer-reviewed journal articles that have been made freely available online. This study demonstrates that among a sample of 100 Wikipedia entries, which included 168 sources or references, only two percent of the entries provided links to open access research and scholarship. However, it proved possible to locate, using Google Scholar and other search engines, relevant examples of open access work for 60 percent of a sub-set of 20 Wikipedia entries. The results suggest that much more can be done to enrich and enhance this encyclopedia’s representation of the current state of knowledge. To assist in this process, the study provides a guide to help Wikipedia contributors locate and utilize open access research and scholarship in creating and editing encyclopedia entries.
Simone P. Ponzetto and Michael Strube Knowledge Derived from Wikipedia for Computing Semantic Relatedness Journal of Artificial Intelligence Research, 30: 181--212, 2007. 2007 [528]
Wikipedia provides a semantic network for computing semantic relatedness in a more structured fashion than a search engine and with more coverage than WordNet. We present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google counts, and we show that Wikipedia outperforms WordNet on some datasets. We also address the question whether and how Wikipedia can be integrated into NLP applications as a knowledge base. Including Wikipedia improves the performance of a machine learning based coreference resolution system, indicating that it represents a valuable resource for NLP applications. Finally, we show that our method can be easily used for languages other than English by computing semantic relatedness for a German dataset.
knowledge, knowledge-extraction, relatedness, semantic, semantic web, wikipedia
Firer-Blaess, S. Wikipédia: histoire, communauté, gouvernance homo-numericus.net 2007 [529]
Since its creation, Wikipedia has been a genuine subject of controversy, particularly within academic circles that feel threatened by the popularity of this open encyclopedia, no doubt because, editable and amendable by anyone, it calls into question what they regard as a legitimate monopoly. As evidence, consider the recent "study" circulated by several Sciences Po students, which sought to demonstrate the encyclopedia's fallibility on the basis of errors they had deliberately introduced into it. Beyond somewhat puerile questions about the quality, or lack of intrinsic quality, of this encyclopedia treated as a finished "product", it can be interesting to examine how this enterprise operates, considered this time as a social system, a place of coordination and cooperation among several thousand participants; amendable by all and indefinitely editable, Wikipedia is in fact never "finished", any more than knowledge itself, which is perpetually renewed. For this simple reason, it is far more pertinent to ask how the work of co-constructing knowledge is continually accomplished than to ask about the "truth" of any particular statement produced there. This is exactly what Sylvain Firer-Blaess does in this series of three articles, which he agreed to publish for Homo Numericus. Drawing on the work of Foucault, though not the work one might expect, he develops an incisive political analysis of Wikipedia as a place where a certain form of power is at once exercised and refuses to be exercised. For him, as he will explain in the second and third parts of this series, Wikipedia is shot through with a tension of its own, which he attempts to characterize by dissecting both the customs and the regulatory mechanisms of this very particular community. For now, he presents it to us in its technical and historical dimensions. This work stems from a final-year thesis presented at the IEP de Lyon.
Wikipedia, History, Governance, Power Structure
Nielsen, Finn Årup Scientific Citations in Wikipedia First Monday volume 12 issue 8 2007 [530]
The Internet-based encyclopædia Wikipedia has grown to become one of the most visited Web sites on the Internet, but critics have questioned the quality of entries. An empirical study of Wikipedia found errors in a 2005 sample of science entries. Biased coverage and lack of sources are among the “Wikipedia risks.”

The study here describes a simple assessment of these aspects by examining the outbound links from Wikipedia articles to articles in scientific journals with a comparison against journal statistics from Journal Citation Reports such as impact factors.

The results show an increasing use of structured citation markup and good agreement with citation patterns seen in the scientific literature though with a slight tendency to cite articles in high-impact journals such as Nature and Science. These results increase confidence in Wikipedia as a reliable information resource for science in general.
Wikipedia, Citations, Information Quality
Wilkinson, Dennis M. and Bernardo A. Huberman Assessing the value of cooperation in Wikipedia First Monday, volume 12, number 4 (March 2007) 2007 [531]
Since its inception six years ago, the online encyclopedia Wikipedia has accumulated 6.40 million articles and 250 million edits, contributed in a predominantly undirected and haphazard fashion by 5.77 million unvetted volunteers. Despite the apparent lack of order, the 50 million edits by 4.8 million contributors to the 1.5 million articles in the English–language Wikipedia follow certain strong overall regularities. We show that the accretion of edits to an article is described by a simple stochastic mechanism, resulting in a heavy tail of highly visible articles with a large number of edits. We also demonstrate a crucial correlation between article quality and number of edits, which validates Wikipedia as a successful collaborative effort.
cooperation, Wikipedia
Nicolas Auray, Céline Poudat, Pascal Pons Democratizing scientific vulgarization. The balance between cooperation and conflict in French Wikipedia Observatorio (OBS*), Vol 1, No 3 (2007) 2007 [532]
The free online encyclopedia project Wikipedia has become in less than six years one of the most prominent examples of commons-based peer production. The present study investigates the patterns of involvement and the patterns of cooperation within the French version of the encyclopaedia. In that respect, we consider different groups of users, highlighting the opposition between passerby contributors and core members, and we attempt to evaluate for each class of contributors the main motivations for their participation in the project. Then, we study the qualitative and quantitative patterns of cowriting and the correlation between size and quality of the production process.
Maria Ruiz-Casado, Enrique Alfonseca and Pablo Castells Automatising the Learning of Lexical Patterns: an Application to the Enrichment of WordNet by Extracting Semantic Relationships from Wikipedia Data & Knowledge Engineering Volume 61, Issue 3 (June 2007) Pages 484-499 2007 [533]
This paper describes Koru, a new search interface that offers effective domain-independent knowledge-based information retrieval. Koru exhibits an understanding of the topics of both queries and documents. This allows it to (a) expand queries automatically and (b) help guide the user as they evolve their queries interactively. Its understanding is mined from the vast investment of manual effort and judgment that is Wikipedia. We show how this open, constantly evolving encyclopedia can yield inexpensive knowledge structures that are specifically tailored to expose the topics, terminology and semantics of individual document collections. We conducted a detailed user study with 12 participants and 10 topics from the 2005 TREC HARD track, and found that Koru and its underlying knowledge base offer significant advantages over traditional keyword search. It was capable of lending assistance to almost every query issued to it, making their entry more efficient, improving the relevance of the documents they return, and narrowing the gap between expert and novice seekers.
Information extraction, Lexical patterns, Ontology and thesaurus acquisition, Relation extraction
Neil L Waters Why you can't cite Wikipedia in my class Communications of the ACM Volume 50, Issue 9 (September 2007) 2007 [534]
The online encyclopedia's method of adding information risks conflating facts with popular opinion.
education
Fabian M. Suchanek, Gjergji Kasneci and Gerhard Weikum Yago: A Large Ontology from Wikipedia and WordNet forthcoming in Elsevier Journal of Web Semantics (?) 2007 (?) [535]
This article presents YAGO, a large ontology with high coverage and precision. YAGO has been automatically derived from Wikipedia and WordNet. It comprises entities and relations, and currently contains more than 1.7 million entities and 15 million facts. These include the taxonomic Is-A hierarchy as well as semantic relations between entities. The facts for YAGO have been extracted from the category system and the infoboxes of Wikipedia and have been combined with taxonomic relations from WordNet. Type checking techniques help us keep YAGO’s precision at 95% – as proven by an extensive evaluation study. YAGO is based on a clean logical model with a decidable consistency. Furthermore, it allows representing n-ary relations in a natural way while maintaining compatibility with RDFS. A powerful query model facilitates access to YAGO’s data.
Pierpaolo Dondio and Stephen Barrett Computational Trust in Web Content Quality: A Comparative Evaluation on the Wikipedia Project Informatica 31 (2007) 151–160 2007 [536]
The problem of identifying useful and trustworthy information on the World Wide Web is becoming increasingly acute as new tools such as wikis and blogs simplify and democratize publication. It is not hard to predict that in the future the direct reliance on this material will expand and the problem of evaluating the trustworthiness of this kind of content will become crucial. The Wikipedia project represents the most successful and discussed example of such online resources. In this paper we present a method to predict the trustworthiness of Wikipedia articles based on computational trust techniques and a deep domain-specific analysis. Our assumption is that a deeper understanding of what in general defines high standards and expertise in domains related to Wikipedia – i.e. content quality in a collaborative environment – mapped onto Wikipedia elements would lead to a complete set of mechanisms to sustain trust in the Wikipedia context. We present a series of experiments. The first is a case study over a specific category of articles; the second is an evaluation over 8 000 articles representing 65% of the overall Wikipedia editing activity. We report encouraging results on the automated evaluation of Wikipedia content using our domain-specific expertise method. Finally, in order to appraise the value added by using domain-specific expertise, we compare our results with the ones obtained with a pre-processed cluster analysis, where complex expertise is mostly replaced by training and automatic classification of common features.
computational trust, Wikipedia, content-quality
Martin Hepp and Daniel Bachlechner and Katharina Siorpaes Harvesting Wiki Consensus: Using Wikipedia Entries as Vocabulary for Knowledge Management IEEE Internet Computing, Volume: 11, Issue: 5 Sept.-Oct. 2007 p. 54-65 2007 [537]
Vocabularies that provide unique identifiers for conceptual elements of a domain can improve precision and recall in knowledge-management applications. Although creating and maintaining such vocabularies is generally hard, wiki users easily manage to develop comprehensive, informal definitions of terms, each one identified by a URI. Here, the authors show that the URIs of Wikipedia entries are reliable identifiers for conceptual entities. They also demonstrate how Wikipedia entries can be used for annotating Web resources and knowledge assets and give precise estimates of the amount of Wikipedia URIs in terms of the popular Proton ontology's top-level concepts.
URIs, Wikipedia, knowledge management, ontologies, semantic knowledge management, wikis
Andrew Dalby Wikipedia(s) on the language map of the world English Today (2007), 23: 3-8 Cambridge University Press 2007 [538]
This article will not try to describe the whole Wikimedia galaxy. It will stick to Wikipedia in English, and that's ambitious enough. The English-language Wikipedia, by far the biggest of them, now (28th November 2006) contains 1,506,659 articles. The German Wikipedia reached 500,000 articles on 23rd November (note in passing: the English Wikipedia has added that many articles to its total in just six months), while the French Wikipedia reached the 400,000 milestone on 27th November. The newest and smallest Wikipedia, number 250, is in the Lak language of Dagestan, in the Caucasus, with one article and 20 users. One more statistical measure will show how much Wikipedia matters. People who Google already know that for a great many Google searches one or more Wikipedia entries will turn up high on the first page of the results. They don't all know that Wikipedia now comes eleventh in alexa.com's traffic ranking of world websites. For a strictly non-commercial site with relatively academic content, that is astonishing success; what's more, the trend is steadily upwards, though it will be hard to overtake the top four: yahoo.com, msn.com, google.com, and the highly popular Chinese search engine, baidu.com.
A Bhole, B Fortuna, M Grobelnik, D Mladenic Extracting Named Entities and Relating Them over Time Based on Wikipedia Informatica, 2007 2007 [539] Based on conference paper (Conference on Data Mining and Data Warehouses (SiKDD 2007)) "Mining Wikipedia and Relating Named Entities over Time" [540]
This paper presents an approach to mining information relating people, places, organizations and events extracted from Wikipedia and linking them on a time scale. The approach consists of two phases: (1) identifying relevant pages - categorizing the articles as containing people, places or organizations; (2) generating timeline - linking named entities and extracting events and their time frame. We illustrate the proposed approach on 1.7 million Wikipedia articles.
text mining, document categorization, information extraction
K Nakayama, T Hara, S Nishio Wikipedia: A New Frontier for AI Researches JOURNAL- JAPANESE SOCIETY FOR ARTIFICIAL INTELLIGENCE 2007, VOL 22; NUMB 5, pages 693-701 2007
Rubén Rosario Rodríguez Liberating Epistemology: Wikipedia and the Social Construction of Knowledge Religious Studies and Theology, Vol 26, No 2 (2007) 2007 [541]
This investigation contends that postfoundationalist models of rationality provide a constructive alternative to the positivist models of scientific rationality that once dominated academic discourse and still shape popular views on science and religion. Wikipedia, a free online encyclopedia, has evolved organically into a cross-cultural, cross-contextual, interdisciplinary conversation that can help liberate epistemology—especially theological epistemology—from the stranglehold of Enlightenment foundationalism. U.S. Latino/a theology provides an alternative to the dominant epistemological perspective within academic theology that is in many ways analogous to the organic, conversational epistemology embodied by the Wikipedia online community. Accordingly, this investigation argues that the work of human liberation is better served by liberating epistemology from the more authoritarian aspects of the Enlightenment scientific tradition—especially popular positivist conceptions of rationality.
BS Noveck Wikipedia and the Future of Legal Education JOURNAL OF LEGAL EDUCATION, 2007 2007 [542] peer reviewed?
L Devgan, N Powe, B Blakey, M Makary Wiki-Surgery? Internal validity of Wikipedia as a medical and surgical reference Journal of the American College of Surgeons, Volume 205, Issue 3, Supplement 1, September 2007, Pages S76-S77 2007 [543]
Oded Nov What motivates Wikipedians? Communications of the ACM Volume 50, Issue 11 (November 2007) Pages: 60 - 64 ISSN:0001-0782 2007 [544]
In order to increase and enhance user-generated content contributions, it is important to understand the factors that lead people to freely share their time and knowledge with others.
wikipedia


Muchnik, Lev; Royi Itzhack; Sorin Solomon; and Yoram Louzoun Self-emergence of knowledge trees: Extraction of the Wikipedia hierarchies Physical Review E 76, 016106 2007 [545]
The rapid accumulation of knowledge and the recent emergence of new dynamic and practically unmoderated information repositories have rendered the classical concept of the hierarchal knowledge structure irrelevant and impossible to impose manually. This led to modern methods of data location, such as browsing or searching, which conceal the underlying information structure. We here propose methods designed to automatically construct a hierarchy from a network of related terms. We apply these methods to Wikipedia and compare the hierarchy obtained from the article network to the complementary acyclic category layer of the Wikipedia and show an excellent fit. We verify our methods in two networks with no a priori hierarchy (the E. Coli genetic regulatory network and the C. Elegans neural network) and a network of function libraries of modern computer operating systems that are intrinsically hierarchical and reproduce a known functional order.
Konieczny, Piotr Wikis and Wikipedia as a Teaching Tool International Journal of Instructional Technology and Distance Learning, January 2007 2007 [546]
Wikis are a very versatile and easy-to-use tool that is finding increasing applications in teaching and learning. This paper will illustrate how teaching academics can join the wiki revolution. First, it will introduce the common wikis and then focus on Wikipedia, The Free Encyclopedia, which has become one of the most popular Internet sites and offers unique opportunities for teachers and learners. It will describe how wikis and Wikipedia are used as a teaching tool and how to develop them further. Wikipedia can be used for various assignments: for example, students can be asked to reference an unreferenced article or create a completely new one. In doing so, students will see that writing an article is not a 'tedious assignment' but an activity that millions do 'for fun'. By submitting their work to Wikipedia students will see their work benefiting – and being improved upon – by the entire world.
wikis, wikipedia, teaching, education, classroom
O'Donnell, Daniel Paul If I were "You": How Academics Can Stop Worrying and Learn to Love "the Encyclopedia that Anyone Can Edit" The Heroic Age: A Journal of Early Medieval Northwestern Europe, Issue 10, May 2007, ISSN 1526-1867 2007 [547]
"Electronic Medievalia" column in the Saints and Sanctity issue. Sections include: Time Magazine and the Participatory Web, Academic Resistance, Why the Participatory Web Works, Why Don't We Like It, Why We Can't Do Anything About It, and A New Model of Scholarship: The Wikipedia as Community Service
Pentzold, Christian, Sebastian Seidenglanz, Claudia Fraas, Peter Ohler Wikis. Bestandsaufnahme eines Forschungsfeldes und Skizzierung eines integrativen Analyserahmens. In: Medien und Kommunikationswissenschaft. 55(1), 61-79. 2007
Martin Ebner Wikipedia Hype oder Zukunftshoffnung für die Hochschullehre E-Learning: Strategische Implementierungen und Studiengang, Tagungsband zur 13. FNMA-Tagung, Verlag Forum Neue Medien Austria S. 139-146 2007 [548] German
Pfeil, Ulrike, Panayiotis Zaphiris, Chee Siang Ang Cultural Differences in Collaborative Authoring of Wikipedia Journal of Computer-Mediated Communication, 12(1), article 5 2006 [549]
This article explores the relationship between national culture and computer-mediated communication (CMC) in Wikipedia. The articles on the topic game from the French, German, Japanese, and Dutch Wikipedia websites were studied using content analysis methods. Correlations were investigated between patterns of contributions and the four dimensions of cultural influences proposed by Hofstede (Power Distance, Collectivism versus Individualism, Femininity versus Masculinity, and Uncertainty Avoidance). The analysis revealed cultural differences in the style of contributions across the cultures investigated, some of which are correlated with the dimensions identified by Hofstede. These findings suggest that cultural differences that are observed in the physical world also exist in the virtual world.
collaboration, cultural, differences, wikipedia


Andrew Gregorowicz and Mark A. Kramer Mining a Large-Scale Term-Concept Network from Wikipedia Mitre Technical Report 2006 [550]
Social tagging and information retrieval are challenged by the fact that the same item or idea can be expressed by different terms or words. To counteract the problem of variable terminology, researchers have proposed concept-based information retrieval. To date, however, most concept spaces have been either manually-produced taxonomies or special-purpose ontologies, too small for classifying arbitrary resources. To create a large set of concepts, and to facilitate term-to-concept mapping, we mine a network of concepts and terms from Wikipedia. Our algorithm results in a robust, extensible term-concept network for tagging and information retrieval, containing over 2,000,000 concepts with mappings to over 3,000,000 unique terms.
Information retrieval, concept search, Wikipedia, text mining.
Stacey Kuznetsov Motivations of contributors to Wikipedia SIGCAS Comput. Soc., Vol. 36, No. 2. (June 2006) 2006 [551]
This paper aims to explain why people are motivated to contribute to the Wikipedia project. A comprehensive analysis of the motivations of Wikipedians is conducted using the iterative methodology developed by Batya Friedman and Peter Kahn in Value Sensitive Design and Information Systems and co-developed by Nissenbaum and Friedman in Bias in Computer Systems. The Value Sensitive Design (VSD) approach consists of three stages: Empirical Investigation, Conceptual Investigation, and Technical Investigation. During the empirical phase, motivations of the contributors to Wikipedia are identified through analysis of data from two published surveys and a pilot survey conducted at New York University. The underlying values behind these motivations are then defined in the conceptual phase of the study. Finally, a technical investigation is conducted in order to determine how features of the Wiki technology support and facilitate these values.
Wikipedia, motivations, value sensitive design
Lorenzen, Michael Vandals, Administrators, and Sockpuppets, Oh My! An Ethnographic Study of Wikipedia’s Handling of Problem Behavior. MLA Forum 5, no. 2, 2006 [552]
Wikipedia is a 21st-century phenomenon which is forcing many to reconsider what is and what is not valid and authoritative online. Wikipedia is an online encyclopedia that anyone can edit. This creates many opportunities to expand knowledge but it also opens the project up to vandalism and abuse. Many writers have commented on this and determined that Wikipedia has a good defense against problematic behavior even if these same writers are unsure of the legitimacy of Wikipedia as a whole. Other writers have noted the need for identified authors for legitimacy to be attainable. This ethnographic study looks at a public system that Wikipedia uses to identify and correct problem behaviors from contributors. It concludes that Wikipedia does have a good system in place that can protect the integrity of articles in many instances. However, this study was limited in scope and was unable to determine if the system in place for abuse reporting is truly able to vouch for the status of Wikipedia as an authoritative resource.
Capocci A, Servedio VDP, Colaiori F, Buriol LS, Donato D, Leonardi S, Caldarelli G Preferential attachment in the growth of social networks: The internet encyclopedia Wikipedia Phys. Rev. E 74 (3): 036116 2006 [553]
We present an analysis of the statistical properties and growth of the free on-line encyclopedia Wikipedia. By describing topics by vertices and hyperlinks between them as edges, we can represent this encyclopedia as a directed graph. The topological properties of this graph are in close analogy with those of the World Wide Web, despite the very different growth mechanism. In particular, we measure a scale-invariant distribution of the in and out degree and we are able to reproduce these features by means of a simple statistical model. As a major consequence, Wikipedia growth can be described by local rules such as the preferential attachment mechanism, though users, who are responsible of its evolution, can act globally on the network.
Computer-supported collaborative work; Organizational Impacts; Information Systems; Systems and Software; Web-based services
Chesney, Thomas An empirical examination of Wikipedia's credibility First Monday. 11 (11) November 2006. 2006 [554]
Wikipedia is a free, online encyclopaedia; anyone can add content or edit existing content. The idea behind Wikipedia is that members of the general public can add their own personal knowledge, anonymously if they wish. Wikipedia then evolves over time into a comprehensive knowledge base on all things. Its popularity has never been questioned, although some have speculated about its authority. By its own admission, Wikipedia contains errors. A number of people have tested Wikipedia’s accuracy using destructive methods, i.e. deliberately inserting errors. This has been criticised by Wikipedia. This short study examines Wikipedia’s credibility by asking 258 research staff (with a response rate of 21 percent) to read an article and assess its credibility, the credibility of its author and the credibility of Wikipedia as a whole. Staff were either given an article in their own expert domain or a random article. No difference was found between the two groups in terms of their perceived credibility of Wikipedia or of the articles’ authors, but a difference was found in the credibility of the articles — the experts found Wikipedia’s articles to be more credible than the non–experts. This suggests that the accuracy of Wikipedia is high. However, the results should not be seen as support for Wikipedia as a totally reliable resource as, according to the experts, 13 percent of the articles contain mistakes.
Nikolaos Th. Korfiatis, Marios Poulos, George Bokos Evaluating authoritative sources using social networks: an insight from Wikipedia Online Information Review, Volume 30 Number 3 2006 pp. 252-262 2006 [555]
The purpose of this paper is to present an approach to evaluating contributions in collaborative authoring environments and in particular wikis using social network measures. A social network model for Wikipedia has been constructed and metrics of importance such as centrality have been defined. Data have been gathered from articles belonging to the same topic using a web crawler in order to evaluate the outcome of the social network measures in the articles. This work tries to develop a network approach to the evaluation of wiki contributions and approaches the problem of quality of Wikipedia content from a social network point of view. We believe that the approach presented here could be used to improve the authoritativeness of content found in Wikipedia and similar sources.
Encyclopaedias; Social networks
Rosenzweig, Roy Can History Be Open Source? Wikipedia and the Future of the Past Journal of American History 93 (1): 117-146 2006 [556]
History is a deeply individualistic craft. The singly authored work is the standard for the profession; only about 6 percent of the more than 32,000 scholarly works indexed since 2000 in this journal’s comprehensive bibliographic guide, “Recent Scholarship,” have more than one author. Works with several authors—common in the sciences—are even harder to find. Fewer than 500 (less than 2 percent) have three or more authors. Historical scholarship is also characterized by possessive individualism. Good professional practice (and avoiding charges of plagiarism) requires us to attribute ideas and words to specific historians—we are taught to speak of “Richard Hofstadter’s status anxiety interpretation of Progressivism.” And if we use more than a limited number of words from Hofstadter, we need to send a check to his estate. To mingle Hofstadter’s prose with your own and publish it would violate both copyright and professional norms. A historical work without owners and with multiple, anonymous authors is thus almost unimaginable in our professional culture. Yet, quite remarkably, that describes the online encyclopedia known as Wikipedia, which contains 3 million articles (1 million of them in English). History is probably the category encompassing the largest number of articles. Wikipedia is entirely free. And that freedom includes not just the ability of anyone to read it (a freedom denied by the scholarly journals in, say, JSTOR, which requires an expensive institutional subscription) but also—more remarkably—their freedom to use it. You can take Wikipedia’s entry on Franklin D. Roosevelt and put it on your own Web site, you can hand out copies to your students, and you can publish it in a book—all with only one restriction: You may not impose any more restrictions on subsequent readers and users than have been imposed on you. And it has no authors in any conventional sense.
Tens of thousands of people—who have not gotten even the glory of affixing their names to it—have written it collaboratively. The Roosevelt entry, for example, emerged over four years as five hundred authors made about one thousand edits. This extraordinary freedom and cooperation make Wikipedia the most important application of the principles of the free and open-source software movement to the world of cultural, rather than software, production.
Wikipedia, authorship, collaboration
Kolbitsch J, Maurer H The Transformation of the Web: How Emerging Communities Shape the Information We Consume Journal of Universal Computer Science 12 (2): 187-213. 2006 [557]
To date, one of the main aims of the World Wide Web has been to provide users with information. In addition to private homepages, large professional information providers, including news services, companies, and other organisations have set up web-sites. With the development and advance of recent technologies such as wikis, blogs, podcasting and file sharing this model is challenged and community-driven services are gaining influence rapidly. These new paradigms obliterate the clear distinction between information providers and consumers. The lines between producers and consumers are blurred even more by services such as Wikipedia, where every reader can become an author, instantly. This paper presents an overview of a broad selection of current technologies and services: blogs, wikis including Wikipedia and Wikinews, social networks such as Friendster and Orkut as well as related social services like del.icio.us, file sharing tools such as Flickr, and podcasting. These services enable user participation on the Web and manage to recruit a large number of users as authors of new content. It is argued that the transformations the Web is subject to are not driven by new technologies but by a fundamental mind shift that encourages individuals to take part in developing new structures and content. The evolving services and technologies encourage ordinary users to make their knowledge explicit and help a collective intelligence to develop.
blogs; collaborative work; community building; emergence; file sharing; information systems; podcasting; self-organisation; social networks; web-based applications; wikis
Kolbitsch J, Maurer H Community Building around Encyclopaedic Knowledge Journal of Computing and Information Technology 14 2006 [558] Despite not mentioning Wikipedia in title or abstract, the paper discusses it as one of the main examples.

This paper gives a brief overview of current technologies in systems handling encyclopaedic knowledge. Since most of the electronic encyclopaedias currently available are rather static and inflexible, greatly enhanced functionality is introduced that enables users to work more effectively and collaboratively. Users have the ability, for instance, to add annotations to every kind of object and can have private and shared workspaces. The techniques described employ user profiles in order to adapt to different users and involve statistical analysis to improve search results. Moreover, a tracking and navigation mechanism based on trails is presented. The second part of the paper details community building around encyclopaedic knowledge with the aim to involve “plain” users and experts in environments with largely editorial content. The foundations for building a user community are specified along with significant facets such as retaining the high quality of content, rating mechanisms and social aspects. A system that implements large portions of the community-related concepts in a heterogeneous environment of several largely independent data sources is proposed. Apart from online and DVD-based encyclopaedias, potential application areas are e-Learning, corporate documentation and knowledge management systems.
Digital Libraries, Electronic Encyclopaedias, Knowledge Brokering Systems, Active Documents, Annotations, Knowledge Management, Tracking, Adaptation, Community Building
Wagner, Christian Breaking the Knowledge Acquisition Bottleneck through Conversational Knowledge Management Information Resources Management Journal Vol. 19, Issue 1 2006 [559]
Much of today’s organizational knowledge still exists outside of formal information repositories and often only in people’s heads. While organizations are eager to capture this knowledge, existing acquisition methods are not up to the task. Neither traditional artificial intelligence-based approaches nor more recent, less-structured knowledge management techniques have overcome the knowledge acquisition challenges. This article investigates knowledge acquisition bottlenecks and proposes the use of collaborative, conversational knowledge management to remove them. The article demonstrates the opportunity for more effective knowledge acquisition through the application of the principles of Bazaar style, open-source development. The article introduces wikis as software that enables this type of knowledge acquisition. It empirically analyzes the Wikipedia to produce evidence for the feasibility and effectiveness of the proposed approach.
knowledge acquisition; knowledge artifacts; knowledge management; open source development; wiki
Quiggin, John Blogs, wikis and creative innovation International Journal of Cultural Studies Vol. 9, No. 4, 481-496 2006 [560]
In this article, recent developments in the creation of web content, such as blogs and wikis, are surveyed with a focus on their role in technological and social innovation. The innovations associated with blogs and wikis are important in themselves, and the process of creative collaboration they represent is becoming central to technological progress in general. The internet and the world wide web, which have driven much of the economic growth of the past decade, were produced in this way. Standard assumptions about the competitive nature of innovation are undersupported in the new environment. If governments want to encourage the maximum amount of innovation in social production, they need to de-emphasize competition and emphasize creativity and cooperation.
blogs, cooperation, creative commons, innovation, wikis
Altmann U Representation of Medical Informatics in the Wikipedia and its Perspectives Stud Health Technol Inform 116: 755-760 2005 [561]
A wiki is a technique for collaborative development of documents on the web. The Wikipedia is a comprehensive free online encyclopaedia based on this technique which has gained increasing popularity and quality. This paper's work explored the representation of Medical Informatics in the Wikipedia by a search of specific and less specific terms used in Medical Informatics and shows the potential uses of wikis and the Wikipedia for the specialty. Test entries into the Wikipedia showed that the practical use of the so-called WikiMedia software is convenient. Yet Medical Informatics is not represented sufficiently since a number of important topics are missing. The Medical Informatics communities should consider a more systematic use of these techniques for disseminating knowledge about the specialty for the public as well as for internal and educational purposes.
Wiki, Wikipedia, Encyclopaedia, Medical Informatics
Barton M D The future of rational-critical debate in online public spheres Computers and Composition 22 (2): 177-190 2005 [562] Despite not mentioning Wikipedia in title or abstract, the paper discusses it as one of the main examples.
This paper discusses the role of blogs, wikis, and online discussion boards in enabling rational-critical debate. I will use the work of Jürgen Habermas to explain why wikis, blogs, and online bulletin boards are all potentially valuable tools for the creation and maintenance of a critical public sphere. Habermas’ story ends on a sad note; the public writing environments he argues were so essential to the formation of a critical public sphere failed as commercialism and mass media diminished the role of the community and private persons. Unfortunately, the Internet will likely suffer a similar fate if we do not take action to preserve its inherently democratic and decentralized architecture. Here, I describe the integral role that blogs, wikis, and discussion boards play in fostering public discussion and ways they can be incorporated into college composition courses.
Habermas; Wikis; Blogs; Forums; Public spheres
McKiernan, Gerry WikimediaWorlds Part I: Wikipedia Library Hi Tech News. 22 (8) November 2005 2005 [563]
This article is part 1 of a two-part series on wikis. Part 1 focuses on Wikipedia. The article is prepared by a library professional and provides a summary of the main features. A wiki is a piece of server software that allows users to freely create and edit web page content using any web browser. Wiki supports hyperlinks and has a simple text syntax for creating new pages and crosslinks between internal pages on the fly. This article is a useful summary of a development of interest to library and information management professionals.
Communication technologies; Computer applications; Computer software
Miller, Nora Wikipedia and the Disappearing "Author" ETC.: A Review of General Semantics, Vol. 62, 2005 2005 [564] no open content
(summary) In this article, Nora Miller examines wikis in the light of authorship theories. She examines what authoring a text has meant over the course of history. Miller explains that wikis (and other forms of digital spaces) are redefining the notion of textual ownership through means of collaboration. She mentions copyright laws and the resultant belief that there exist "self-evident" rights for authors to control and own their texts. As Miller shows with her own contributions to an entry in Wikipedia, wikis disrupt these notions of authorial rights. Much of the discussion about wikis and theory is limited to collaboration; I was happy to find one discussing wikis through the lens of authorship theory.
Wikis, Wikipedia, collaboration
Holloway, Todd, Miran Bozicevic, Katy Börner Analyzing and Visualizing the Semantic Coverage of Wikipedia and Its Authors arXiv.org cs. IR/0512085 / Submitted to Complexity, Special issue on Understanding Complex Systems. 2005 [565]
This paper presents a novel analysis and visualization of English Wikipedia data. Our specific interest is the analysis of basic statistics, the identification of the semantic structure and age of the categories in this free online encyclopedia, and the content coverage of its highly productive authors. The paper starts with an introduction of Wikipedia and a review of related work. We then introduce a suite of measures and approaches to analyze and map the semantic structure of Wikipedia. The results show that co-occurrences of categories within individual articles have a power-law distribution, and when mapped reveal the nicely clustered semantic structure of Wikipedia. The results also reveal the content coverage of the article's authors, although the roles these authors play are as varied as the authors themselves. We conclude with a discussion of major results and planned future work.
digital libraries, information storage, information retrieval
Ebersbach, Anja & Glaser, Markus Towards Emancipatory Use of a Medium: The Wiki. International Journal of Information Ethics, 11 2004 [566] Despite not mentioning Wikipedia in title or abstract, the paper discusses it as one of the main examples.
With the rapid growth of the Internet in the 1990s due to the WWW, many people’s hopes were raised that the spirit of egality, the emancipatory power of the medium then, would be brought to the masses. With the increasing commercialization, the net became and is becoming more and more a one-way medium for advertising. Against this development, a new form of web pages has emerged and is becoming increasingly popular: the Wiki. Its distinctive feature is that any web page can be edited by anyone. Participants attribute the success to this openness and to the resulting collective production of content. In his 1970 article “Constituents of a theory of the media”, Enzensberger developed a list of seven criteria that qualify, in his opinion, the use of a medium as emancipatory. These are used to investigate the question: Can wikis be thought of as a new form of emancipatory use of the medium?
Natural language, User Interfaces, Hypertext, Hypermedia, Theory and models; Computer-supported cooperative work; Asynchronous interaction; Web-based interaction
Wagner, Christian Wiki: A Technology for Conversational Knowledge Management and Group Collaboration. Communications of the Association of Information Systems Vol 13 March 2004 2004 [567] Despite not mentioning Wikipedia in title or abstract, the paper discusses it as one of the main examples.
Wikis (from wikiwiki, meaning “fast” in Hawaiian) are a promising new technology that supports “conversational” knowledge creation and sharing. A Wiki is a collaboratively created and iteratively improved set of web pages, together with the software that manages the web pages. Because of their unique way of creating and managing knowledge, Wikis combine the best elements of earlier conversational knowledge management technologies, while avoiding many of their disadvantages. This article introduces Wiki technology, the behavioral and organizational implications of Wiki use, and Wiki applicability as groupware and help system software. The article concludes that organizations willing to embrace the “Wiki way” with collaborative, conversational knowledge management systems, may enjoy better than linear knowledge growth while being able to satisfy ad-hoc, distributed knowledge needs.
wiki, knowledge management, conversational knowledge management, weblog, groupware, group decision support system
Ciffolilli, Andrea Phantom authority, self–selective recruitment and retention of members in virtual communities: The case of Wikipedia. First Monday. 8 (12) December 2003 2003 [568]
Virtual communities constitute a building block of the information society. These organizations appear capable to guarantee unique outcomes in voluntary association since they cancel physical distance and ease the process of searching for like–minded individuals. In particular, open source communities, devoted to the collective production of public goods, show efficiency properties far superior to the traditional institutional solutions to the public goods issue (e.g. property rights enforcement and secrecy). This paper employs team and club good theory as well as transaction cost economics to analyse the Wikipedia online community, which is devoted to the creation of a free encyclopaedia. An interpretative framework explains the outstanding success of Wikipedia thanks to a novel solution to the problem of graffiti attacks — the submission of undesirable pieces of information. Indeed, Wiki technology reduces the transaction cost of erasing graffiti and therefore prevents attackers from posting unwanted contributions. The issue of the sporadic intervention of the highest authority in the system is examined, and the relatively more frequent local interaction between users is emphasized. The constellation of different motivations that participants may have is discussed, and the barriers–free recruitment process analysed. A few suggestions, meant to encourage long term sustainability of knowledge assemblages, such as Wikipedia, are provided. Open issues and possible directions for future research are also discussed.
Cedergren, Magnus Open content and value creation First Monday. 8 (8) August 2003 2003 [569] Despite not mentioning Wikipedia in title or abstract, the paper discusses it as one of the main examples.
The borderline between production and consumption of media content is not so clear as it used to be. For example on the Internet, many people put a lot of effort into producing personal homepages in the absence of personal compensation. They publish everything from holiday pictures to complete Web directories. Illegal exchange of media material is another important trend that has a negative impact on the media industry. In this paper, I consider open content as an important development track in the media landscape of tomorrow. I define open content as content possible for others to improve and redistribute and/or content that is produced without any consideration of immediate financial reward — often collectively within a virtual community. The open content phenomenon can to some extent be compared to the phenomenon of open source. Production within a virtual community is one possible source of open content. Another possible source is content in the public domain. This could be sound, pictures, movies or texts that have no copyright, in legal terms. Which are the driving forces for the cooperation between players that work with open content? This knowledge could be essential in order to understand the dynamics of business development, technical design and legal aspects in this field. In this paper I focus on these driving forces and the relationships between these players. I have studied three major open content projects. In my analysis, I have used Gordijn’s (2002) value modeling method "e3value", modified for open content value creation and value chains. Open content value chains look much the same as commercial value chains, but there are also some major differences. In a commercial value chain, the consumers’ needs trigger the entire chain of value creation. My studies indicate that an open content value chain is often triggered by what the creators and producers wish to make available as open content. 
Motivations in non-monetary forms play a crucial role in the creation of open content value chains and value. My study of these aspects is based on Feller and Fitzgerald’s (2002) three perspectives on motivations underlying participation in the creation of open source software.
Benkler, Yochai Coase's penguin, or, Linux and The Nature of the Firm The Yale Law Journal. v.112, n.3, pp.369-446. 2002 [570] Despite not mentioning Wikipedia in title or abstract, the paper discusses it as one of the main examples.
Commons-based peer production (e.g., free software) has emerged in the pervasively networked digital information economy as a third method of production which, for some projects, has productivity gains, in the form of information and allocation gains, over market and firm-based production.
property rights, peer production
Stalder, Felix and Hirsh, Jesse Open Source Intelligence First Monday. 7 (6) Jun 2002 2002 [571]
The Open Source movement has established over the last decade a new collaborative approach, uniquely adapted to the Internet, to developing high-quality informational products. Initially, its exclusive application was the development of software (GNU/Linux and Apache are among the most prominent projects), but increasingly we can observe this collaborative approach being applied to areas beyond the coding of software. One such area is the collaborative gathering and analysis of information, a practice we term "Open Source Intelligence". In this article, we use three case studies - the nettime mailing list, the Wikipedia project and the NoLogo Web site - to show some of the breadth of contexts and analyze the variety of socio-technical approaches that make up this emerging phenomenon.
James M Heilman, MD CCFP(EM); Eckhard Kemmann, MD FACOG; Michael Bonert, MD MASc; Anwesh Chatterjee, MRCP; Brent Ragar, MD; Graham M Beards, DSc; David J Iberri; Matthew Harvey, BMed; Brendan Thomas, MD; Wouter Stomp, MD; Michael F Martone; Daniel J Lodge, MD; Andrea Vondracek, PhD; Jacob F de Wolff, MRCP; Casimir Liber, MBBS FRANZCP; Samir C Grover, MD MEd FRCPC; Tim J Vickers, PhD; Bertalan Meskó, MD; Michaël R Laurent, MD Wikipedia: A Key Tool for Global Public Health Promotion J Med Internet Res 2011 [572]
The Internet allows unprecedented opportunities for patients and the general public to retrieve health information from across the globe. Surveys have shown that online health information retrieval is both common and increasing [1-4]. Population-based studies have shown that 61% of American and 52% of European citizens have consulted the Internet for health-related information on at least one occasion [1,4]. Similarly, numerous cross-sectional surveys in patient populations have shown variable but considerable rates of eHealth activities [5-10]. Physicians frequently report that patients have searched the Internet regarding health issues [11,12], although patients do not always discuss these online activities with their doctors [13,14]. Among American e-patients, 44% said this information had a minor impact and 13% said it had a major impact on their decisions about health care [4]. Websites offering medical information differ widely in their quality [15]. While physicians should reasonably view trustworthy information as useful, some have voiced concerns that Internet information may undermine their authority and lead to self-treatment [13]. Furthermore, incorrect medical information could result in patient harm. Indeed, about 3% of users of health care information feel that they or someone they know has been seriously harmed by Web-based information [4]. A potential solution for these drawbacks is that physicians direct online health information seekers to quality resources. This so-called Internet prescription has been evaluated in a few randomized trials, which showed that it increases use of the recommended websites [16-18]. Despite concerns over the quality of health websites, the 2005 Health On the Net survey found that medical Internet users value information availability and ease-of-finding more than accuracy and trustworthiness [13]. 
General search engines, of which Google is the market leader in Western countries, appear to be the most common starting point for laypeople seeking health information, despite the existence of eHealth quality labels and special search engines to explore health information [4,10,13,19,20]. Search engines commonly lead seekers to Wikipedia [21]. In the 2009 Pew Internet survey on health information, 53% of e-patients had consulted Wikipedia (not necessarily related to health information) [4]. This paper examines the role of Wikipedia as a provider of online health information.
Internet; Wikipedia; public health; health information; knowledge dissemination; patient education; medical education
Haigh, Carol A. Wikipedia as an evidence source for nursing and healthcare students Nurse Education Today 2011 [573]
Technology; Evaluation; Evidence; Internet
Gardner PP, Daub J, Tate J, Moore BL, Osuch IH, Griffiths-Jones S, Finn RD, Nawrocki EP, Kolbe DL, Eddy SR, Bateman A Rfam: Wikipedia, clans and the "decimal" release Nucleic Acids Research (Database issue):D141-5 2011 [574]
The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at http://rfam.sanger.ac.uk.
Sylvain Firer-Blaess Wikipedia: an Example for Electronic Democracy? Decision, Discipline and Discourse in the Collaborative Encyclopedia Studies in Social and Political Thought, Volume 18 2010 [575]
Wikipedia and e-democracy projects have in common the establishment of a mass-scale decision process. The Wikipedian method to discuss and reach consensus is described in this article by Sylvain Firer-Blaess, using the theoretical frame of Michel Foucault and Jurgen Habermas. Can this method be applied to various e-democracy projects? In part, provided that building a free encyclopedia is not the same as living the life of the city.
Wikipedia ; Social Theory ; Organisation ; Discipline ; Discourse Ethics ; Foucault ; Habermas
Shu-Mei Tseng, Jiao-Sheng Huang The correlation between Wikipedia and knowledge sharing on job performance Expert Systems with Applications 2010 [576] Wikipedia; knowledge management; Knowledge sharing; Job performance
Brendan M. Thomas, MD, Michaël R. Laurent, MD, and Michael Martone Development of Dermatology Resources in Wikipedia Skin & Aging, Volume 18, Issue 9, September 2010 [577] Discusses the high search-engine ranking of Wikipedia's medicine-related articles, focusing on dermatology articles. Dermatology, Wikipedia, collaborative editing, dermatology task force, article quality and accuracy
Noriko Hara, Pnina Shachaf & Khe Foon Hew Cross-cultural analysis of the Wikipedia community Journal of the American Society for Information Science and Technology, Volume 61, No. 10, 2097–2108 2010
Wikipedia, communities of practice, cross-cultural study
Sylvain Firer-Blaess Wikipedia : exemple pour une future démocratie électronique? Homo-numericus, septembre 2010 2010 [578] organisation of Wikipedia
Dan Wielsch Governance of Massive Multiauthor Collaboration — Linux, Wikipedia, and Other Networks: Governed by Bilateral Contracts, Partnerships, or Something in Between? Jipitec, Volume 1, No. 2 (2010) 96 2010 [579]
Open collaborative projects are moving to the foreground of knowledge production. Some online user communities develop into long-term projects that generate a highly valuable and at the same time freely accessible output. Traditional copyright law that is organized around the idea of a single creative entity is not well equipped to accommodate the needs of these forms of collaboration. In order to enable a peculiar network-type of interaction, participants instead draw on public licensing models that determine the freedoms to use individual contributions. With the help of these access rules the operational logic of the project can be implemented successfully. However, as the case of the Wikipedia GFDL-CC license transition demonstrates, the adaptation of access rules in networks to new circumstances raises collective action problems and suffers from pitfalls caused by the fact that public licensing is grounded in individual copyright. Legal governance of open collaboration projects is a largely unexplored field. The article argues that the license steward of a public license assumes the position of a fiduciary of the knowledge commons generated under the license regime. Ultimately, the governance of decentralized networks translates into a composite of organizational and contractual elements. It is concluded that the production of global knowledge commons relies on rules of transnational private law.
wikis as decentralized networks, Wikipedia licensing update
Alison J. Head, Michael B. Eisenberg. How today’s college students use Wikipedia for course-related research First Monday, Volume 15, No. 3 (March 2010) 2010 [580]
Findings are reported from student focus groups and a large-scale survey about how and why students (enrolled at six different U.S. colleges) use Wikipedia during the course-related research process. A majority of respondents frequently used Wikipedia for background information, but less often than they used other common resources, such as course readings and Google. Architecture, engineering, and science majors were more likely to use Wikipedia for course-related research than respondents in other majors. The findings suggest Wikipedia is used in combination with other information resources. Wikipedia meets the needs of college students because it offers a mixture of coverage, currency, convenience, and comprehensibility in a world where credibility is less of a given or an expectation from today’s students.
student use of Wikipedia
Normann Witzleb Engaging with the World: Students of Comparative Law Write for Wikipedia Legal Education Review (2009) 9, 83-97 2009 [581]
Improving students’ computer literacy, instilling a critical approach to Internet resources and preparing them for collaborative work are important educational aims today. This practice article examines how a writing exercise in the style of a Wikipedia article can be used to develop these skills. Students in an elective unit in Comparative Law were asked to create, and review, a Wikipedia entry on an issue, concept or scholar in this field. This article will describe the rationale for adopting this writing task, how it was integrated into the teaching and assessment structure of the unit, and how students responded to the exercise. In addition to critically evaluating the potential of this novel teaching tool, the article aims to provide some practical guidance on when Wikipedia assignments might be usefully employed.
Wikipedia, eLearning, Student Use of Wikipedia, Comparative Law
Sook Lim How and why do college students use Wikipedia? Journal of the American Society for Information Science and Technology, Volume 60, No. 11, 2189-2202 2009 [582]
A web survey was used to collect data in the spring of 2008. The study sample consisted of students from an introductory undergraduate course at a large public university in the mid-western United States. A total of 134 students participated in the study, resulting in a 32.8% response rate. The major findings of the study include the following: approximately one-third of the students reported using Wikipedia for academic purposes. The students tended to use Wikipedia for checking quick facts and finding background information. They had positive past experiences with Wikipedia; however, interestingly, their perceptions of its information quality were not comparably high. The level of their confidence in evaluating its information quality was, at most, moderate. Respondents’ past experience with Wikipedia, their positive emotional state, their disposition to believe information in Wikipedia, and information utility were positively related to their outcome expectations of Wikipedia. However, when all of the independent variables, including the mediator, outcome expectations, were considered, only the variable, information utility was related to Wikipedia use. Finally, this study supports the knowledge value of Wikipedia (Fallis, 2008), despite students’ cautious attitudes toward Wikipedia. The study suggests that educators and librarians need to provide better guidelines for using Wikipedia, rather than prohibiting Wikipedia use altogether.
student use of Wikipedia, perception of information quality, motivation


Besiki Stvilia, Abdullah Al-Faraj, & Yong Jeong Yi. Issues of cross-contextual information quality evaluation—The case of Arabic, English, and Korean Wikipedias Library & Information Science Research, 31(4), 232-239 2009 [583]

Objective: An initial exploration into the issue of information quality evaluation across different cultural and community contexts based on data collected from the Arabic, English, and Korean Wikipedias showed that different Wikipedia communities may have different understandings of and models for quality. It also showed the feasibility of using some article edit-based metrics for automated quality measurement across different Wikipedia contexts. A model for measuring context similarity was developed and used to evaluate the relationship between similarities in sociocultural factors and the understanding of information quality by the three Wikipedia communities.
Quality, Quality Models, Context translation
Đorđe Stakić Wiki technology - origin, development and importance Infotheca, No. 1-2, Vol. X, June 2009. 2009 [584] Origin, development and importance of Wikipedia, Wiki software (MediaWiki) and Wiki technology.

Objective: The late 20th and early 21st centuries are marked by the emergence and expansion of Wiki technology in the field of information technologies. The largest encyclopedia ever compiled, Wikipedia, emerged from Wiki technology. The compilation of Wikipedia, as the most successful project based on Wiki technology, showed the true potential of Wiki software. This software is now widely used and imposes itself as a new standard. In addition to Wikipedia, local Wiki Web sites are important as well.
Wiki technology, Wikipedia, MediaWiki, free software, history of computer science
Michaël R. Laurent and Tim J. Vickers Seeking Health Information Online: Does Wikipedia Matter? Journal of the American Medical Informatics Association 16:471-479 2009 [585] English Wikipedia as a prominent source of online health information compared to other health information providers studied in this paper, based on its search engine ranking and page view statistics
Health informatics, Health education, Internet
Kristine L. Callis, Lindsey R. Christ, Julian Resasco, David W. Armitage, Jeremy D. Ash, Timothy T. Caughlin, Sharon F. Clemmensen, Stella M. Copeland, Timothy J. Fullman, Ryan L. Lynch, Charley Olson, Raya A. Pruner, Ernane H.M. Vieira-Neto, Raneve West-Singh, Emilio M. Bruna Improving Wikipedia: educational opportunity and professional responsibility Trends in Ecology & Evolution 24(4):177-179 2009 PDF Notes that Wikipedia is a science-society interface, which makes improving its content a professional responsibility for scientists Ecology
Guido Urdaneta, Guillaume Pierre, Maarten van Steen Wikipedia Workload Analysis for Decentralized Hosting Elsevier Computer Networks 53(11), pp. 1830-1845. 2009 [586]
We study an access trace containing a sample of Wikipedia’s traffic over a 107-day period aiming to identify appropriate replication and distribution strategies in a fully decentralized hosting environment. We perform a global analysis of the whole trace, and a detailed analysis of the requests directed to the English edition of Wikipedia. In our study, we classify client requests and examine aspects such as the number of read and save operations, significant load variations and requests for nonexisting pages. We also review proposed decentralized wiki architectures and discuss how they would handle Wikipedia’s workload. We conclude that decentralized architectures must focus on applying techniques to efficiently handle read operations while maintaining consistency and dealing with typical issues on decentralized systems such as churn, unbalanced loads and malicious participating nodes.
Workload analysis; Wikipedia; Decentralized hosting; P2P
Deborah Perron Tollefsen WIKIPEDIA and the Epistemology of Testimony Episteme, volume 6, number 1, pp. 8-24 2009 [587] Summarized in the Wikipedia Signpost
In “Group Testimony” (2007) I argued that the testimony of a group cannot be understood (or at least cannot always be understood) in a summative fashion; as the testimony of some or all of the group members. In some cases, it is the group itself that testifies. I also argued that one could extend standard reductionist accounts of the justification of testimonial belief to the case of testimonial belief formed on the basis of group testimony. In this paper, I explore the issue of group testimony in greater detail by focusing on one putative source of testimony, that of Wikipedia. My aim is to answer the following questions: Is Wikipedia a source of testimony? And if so, what is the nature of that source? Are we to understand Wikipedia entries as a collection of testimonial statements made by individuals, some subset of individuals, or is Wikipedia itself (the organization or the Wikipedia community) the entity that testifies? If Wikipedia itself is a source of testimony, what resources do we have for assessing the trustworthiness of such an unusual epistemic source? In answering these questions I hope to further elucidate the nature of collective epistemic agency (Tollefsen 2006), of which group testimony is a paradigm example.
K. Brad Wray The Epistemic Cultures of Science and WIKIPEDIA: A Comparison Episteme, volume 6, number 1, pp. 38-51 2009 [588] Summarized in the Wikipedia Signpost
I compare the epistemic culture of Wikipedia with the epistemic culture of science, with special attention to the culture of collaborative research in science. The two cultures differ markedly with respect to (1) the knowledge produced, (2) who produces the knowledge, and (3) the processes by which knowledge is produced. Wikipedia has created a community of inquirers that are governed by norms very different from those that govern scientists. Those who contribute to Wikipedia do not ground their claims on their reputations as knowers, for they stand to lose nothing if and when their contributions are found to be misleading or false. And the immediacy of the medium encourages gossip and jokes. Hence, though we have some reason to believe that an invisible hand aids scientists in realizing their epistemic goals, we cannot ground our confidence in what is reported on Wikipedia on the fact that an invisible hand ensures quality. Nor is the information on Wikipedia aptly justified in a manner similar to the way testimony can be justified.
Lawrence M. Sanger The Fate of Expertise after WIKIPEDIA Episteme, volume 6, number 1, pp. 52-73 2009 [589] Summarized in the Wikipedia Signpost
Wikipedia has challenged traditional notions about the roles of experts in the Internet Age. Section 1 sets up a paradox. Wikipedia is a striking popular success, and yet its success can be attributed to the fact that it is wide open and bottom-up. How can such a successful knowledge project disdain expertise? Section 2 discusses the thesis that if Wikipedia could be shown by an excellent survey of experts to be fantastically reliable, then experts would not need to be granted positions of special authority. But, among other problems, this thesis is self-stultifying. Section 3 explores a couple ways in which egalitarian online communities might challenge the occupational roles or the epistemic leadership roles of experts. There is little support for the notion that the distinctive occupations that require expertise are being undermined. It is also implausible that Wikipedia and its like might take over the epistemic leadership roles of experts. Section 4 argues that a main reason that Wikipedia’s articles are as good as they are is that they are edited by knowledgeable people to whom deference is paid, although voluntarily. But some Wikipedia articles suffer because so many aggressive people drive off people more knowledgeable than they are; so there is no reason to think that Wikipedia’s articles will continually improve. Moreover, Wikipedia’s commitment to anonymity further drives off good contributors. Generally, some decisionmaking role for experts is not just consistent with online knowledge communities being open and bottom-up, it is recommended as well.
P. D. Magnus On Trusting WIKIPEDIA Episteme, volume 6, number 1, pp. 74-90 2009 [590] Summarized in the Wikipedia Signpost
Given the fact that many people use Wikipedia, we should ask: Can we trust it? The empirical evidence suggests that Wikipedia articles are sometimes quite good but that they vary a great deal. As such, it is wrong to ask for a monolithic verdict on Wikipedia. Interacting with Wikipedia involves assessing where it is likely to be reliable and where not. I identify five strategies that we use to assess claims from other sources and argue that, to a greater or lesser degree, Wikipedia frustrates all of them. Interacting responsibly with something like Wikipedia requires new epistemic methods and strategies.
Piotr Konieczny Governance, Organization, and Democracy on the Internet: The Iron Law and the Evolution of Wikipedia Sociological Forum, Volume 24, Issue 1, Pages 162-192, 31 Jan 2009 2009 [591]
This study examines whether the Iron Law of Oligarchy exists in Wikipedia by analyzing how a key policy of the website regarding verifiability evolved into its current form. The study describes the decision-making processes of Wikipedia and shows that there are many factors preventing or slowing the development of oligarchy on Wikipedia. The study provides data advancing theoretical concepts related to the Iron Law of Oligarchy and the evolution of virtual communities and organizations; results and knowledge gained can also improve Wikipedia policies related to verifiability. Michels wrote: "who says organization, says oligarchy." I argue that we should follow this with a caveat: "who says wiki-organization, says no to oligarchy."
community, democracy, Internet, oligarchy, iron law, organization, Wikipedia
Ryan McGrady Gaming against the greater good First Monday, Volume 14, Number 2 - 2 February 2009 2009 [592]
Wikipedia has grown to be one of the most visited Web sites in the world. Despite its influence on popular culture and the way we think about knowledge production and consumption, the conversation about why and how it works — or whether it’s credible at all — is ongoing. This paper began as an examination of what the concept of “authority” means in Wikipedia and what role rhetoric might play in manufacturing this authority. But Wikipedia’s editors have functioned well as a community, having collaboratively developed a comprehensive set of social norms designed to place the project before any individual. Hence ideas like authority and rhetoric have only marginal roles in day-to-day activities. This paper takes an in-depth look at these norms and how they work, paying particular attention to a relatively new guideline that exemplifies the spirit of the Wikipedia community — “Gaming the system.”


Sean Hansen, Nicholas Berente, Kalle Lyytinen Wikipedia, Critical Social Theory, and the Possibility of Rational Discourse The Information Society, Volume 25, Number 1, January 2009 , pp. 38-59 2009 [593]
Information systems researchers that apply critical social perspectives frequently emphasize the potential for information technology to serve as a mechanism for increased rationalization, domination, and control. Such theorists often overlook or discount the liberating aspects of information systems. In this study, we apply the ideal of rational discourse developed by Jurgen Habermas to the phenomenon of Wikipedia in an effort to explore empirically the emancipatory potential of information systems. We contend that Wikipedia embodies an approximation of the necessary conditions for rational discourse. While several challenges persist, the example of Wikipedia illustrates the positive potential of information systems in supporting the emergence of more emancipatory forms of communication. The corresponding implications for researchers and design professionals alike are discussed.
communicative action; critical social theory; discursive action; Habermas; rational discourse; social computing; Wikipedia
Don Fallis Toward an Epistemology of Wikipedia Journal of the American Society for Information Science and Technology, Vol. 59, No. 10, pp. 1662-1674 2008 [594]
Wikipedia (the "free online encyclopedia that anyone can edit") is having a huge impact on how a great many people gather information about the world. So, it is important for epistemologists and information scientists to ask whether or not people are likely to acquire knowledge as a result of having access to this information source. In other words, is Wikipedia having good epistemic consequences? After surveying the various concerns that have been raised about the reliability of Wikipedia, this paper argues that the epistemic consequences of people using Wikipedia as a source of information are likely to be quite good. According to several empirical studies, the reliability of Wikipedia compares favorably to the reliability of traditional encyclopedias. Furthermore, the reliability of Wikipedia compares even more favorably to the reliability of those information sources that people would be likely to use if Wikipedia did not exist (viz., websites that are as freely and easily accessible as Wikipedia). In addition, Wikipedia has a number of other epistemic virtues (e.g., power, speed, and fecundity) that arguably outweigh any deficiency in terms of reliability. Even so, epistemologists and information scientists should certainly be trying to identify changes (or alternatives) to Wikipedia that will bring about even better epistemic consequences. This paper suggests that, in order to improve Wikipedia, we need to clarify what our epistemic values are and we need a better understanding of why Wikipedia works as well as it does.
Encyclopedias, Epistemic Values, Mass Collaboration, Reliability, Social Epistemology, Verifiability, Wikipedia, Wisdom of Crowds
Jon W. Huss III, Camilo Orozco, James Goodale, Chunlei Wu, Serge Batalov, Tim J. Vickers, Faramarz Valafar, Andrew I. Su A Gene Wiki for Community Annotation of Gene Function PLoS Biology 2008 [595] Gene Wiki, biology, gene annotation
Yair Amichai-Hamburger, Naama Lamdan, Rinat Madiel, Tsahi Hayat Personality Characteristics of Wikipedia Members CyberPsychology & Behavior 2008 [596] Full article PDF
Wikipedia is an online, free access, volunteer-contributed encyclopedia. This article focuses on the Wikipedians’ (Wikipedia users) personality characteristics, studying Wikipedians’ conceptions of Real-Me and BFI dimensions. To survey these aspects, we posted links to two online web questionnaires; one was targeted at Wikipedians and the second to non-Wikipedia users. One hundred and thirty-nine subjects participated in the study, of which 69 were active Wikipedia members. It was found that Wikipedia members locate their real me on the Internet more frequently as compared to non-Wikipedia members. Variance analysis revealed significant differences between Wikipedia members and non-Wikipedia members in agreeableness, openness, and conscientiousness, which were lower for the Wikipedia members. An interaction was found between Wikipedia membership and gender: introverted women were more likely to be Wikipedia members as compared with extroverted women. The results of this study are discussed with special emphasis on the understanding of the motivators of Wikipedia members.
personality, Big Five Questionnaire
Clauson, Kevin A; Hyla H Polen, Maged N Kamel Boulos & Joan H Dzenowagis Scope, Completeness, and Accuracy of Drug Information in Wikipedia The Annals of Pharmacotherapy 2008 [597]
drug information, eHealth, Wikipedia


Katherine Ehmann, Andrew Large, and Jamshid Beheshti Collaboration in context: Comparing article evolution among subject disciplines in Wikipedia First Monday, volume 13, issue 10. 2008 [598]
This exploratory study examines the relationships between article and Talk page contributions and their effect on article quality in Wikipedia. The sample consisted of three articles each from the hard sciences, soft sciences, and humanities, whose talk page and article edit histories were observed over a five-month period and coded for contribution types. Richness and neutrality criteria were then used to assess article quality and results were compared within and among subject disciplines. This study reveals variability in article quality across subject disciplines and a relationship between Talk page discussion and article editing activity. Overall, results indicate the initial article creator’s critical role in providing a framework for future editing as well as a remarkable stability in article content over time.
Wikipedia, open source, encyclopedias, reference materials, information sources, article quality, article development
Joachim Schroer and Guido Hertel Voluntary engagement in an open web-based encyclopedia: Wikipedians, and why they do it. Media Psychology, volume 12, issue 1, 96-120 2009 [599] [600]
The online encyclopedia Wikipedia is a highly successful “Open Content” project, written and maintained completely by volunteers. Little is known, however, about the motivation of these volunteers. Results from an online survey among 106 contributors to the German Wikipedia project are presented. Both motives derived from social sciences (perceived benefits, identification with Wikipedia, etc.) as well as perceived task characteristics (autonomy, skill variety, etc.) were assessed as potential predictors of contributors’ satisfaction and self-reported engagement. Satisfaction ratings were particularly determined by perceived benefits, identification with the Wikipedia community, and task characteristics. Engagement was particularly determined by high tolerance for opportunity costs and by task characteristics, the latter effect being partially mediated by intrinsic motivation. Relevant task characteristics for contributors’ engagement and satisfaction were perceived autonomy, task significance, skill variety, and feedback. Models from social sciences and work psychology complemented each other by suggesting that favorable task experiences might counter perceived opportunity costs in Wikipedia contributors. Moreover, additional data reported by Wikipedia authors indicate the importance of generativity motives.
Volunteerism, Wikipedia, Open Content, Open Source, Intrinsic Motivation, Task Characteristics, Generativity


Sheizaf Rafaeli and Yaron Ariel Online motivational factors: Incentives for participation and contribution in Wikipedia. Psychological aspects of cyberspace: Theory, research, applications pp. 243-267 2008 [601] Cambridge, UK: Cambridge University Press. motivations, users, user generated content UGC
N.J. Schweitzer Wikipedia and psychology: Coverage of concepts and its use by undergraduate students. Teaching of Psychology, Vol. 35, p.81-85 2008 [602] Routledge/Taylor & Francis on behalf of Division 2 of the American Psychological Association Use of WP by undergraduate college students; Survey of WP coverage of psychology-related concepts.
Matthijs den Besten and Jean-Michel Dalle Keep it Simple: A Companion for Simple Wikipedia? Industry & Innovation 15(2):169-178 2008 [603]
In this paper, we inquire about some of the ways in which the community around Simple Wikipedia—an offspring of Wikipedia, the notorious free online encyclopedia—manages the online collaborative production of reliable knowledge. We focus on how it keeps its collection of articles “simple” and easy to read. We find that the labeling of pages as “unsimple” by core members of the community plays a significant but seemingly insufficient role. We suggest that the nature of this mode of decentralized knowledge production and the structure of Wiki-technology might call for the implementation of an editorial companion to the community.
readability, companions, quality assurance
Diomidis Spinellis and Panagiotis Louridas The Collaborative Organization of Knowledge Communications of the ACM 51(8):68–73 2008 [604]
Wikipedia is an ongoing endeavor to create a free encyclopedia through an open computer-mediated collaborative effort. A longitudinal study of Wikipedia's evolution shows that although Wikipedia's scope is increasing, its coverage is not deteriorating. This can be explained by the fact that referring to a non-existing entry typically leads to the establishment of an article for it. Wikipedia's evolution also demonstrates the creation of a large real world scale-free graph through a combination of incremental growth and preferential attachment.
Wikipedia, references, growth, coverage, scale-free graph
Müller, C., Meuthrath, B., Baumgraß, A. Analyzing Wiki-based Networks to Improve Knowledge Processes in Organizations Journal of Universal Computer Science, 14(4) 2008 [605]
Increasingly wikis are used to support existing corporate knowledge exchange processes. They are an appropriate software solution to support knowledge processes. However, it is not yet proven whether wikis are an adequate knowledge management tool or not. This paper presents a new approach to analyze existing knowledge exchange processes in wikis based on network analysis. Because of their dynamic characteristics four perspectives on wiki networks are introduced to investigate the interrelationships between people, information, and events in a wiki information space. As an analysis method the Social Network Analysis (SNA) is applied to uncover existing structures and temporal changes. A scenario data set of an analysis conducted with a corporate wiki is presented. The outcomes of this analysis were utilized to improve the existing corporate knowledge processes.
collaboration network, knowledge work, network analysis, social software, wiki
Stvilia, B., Gasser, L. An activity theoretic model for information quality change First Monday, 13(4) 2008 [606]
To manage information quality (IQ) effectively, one needs to know how IQ changes over time, what causes it to change, and whether the changes can be predicted. In this paper we analyze the structure of IQ change in Wikipedia, an open, collaborative general encyclopedia. We found several patterns in Wikipedia’s IQ process trajectories and linked them to article types. Drawing on the results of our analysis, we develop a general model of IQ change that can be used for reasoning about IQ dynamics in many different settings, including traditional databases and information repositories.
Wikipedia, Activity Theory, Information Quality
Stvilia, B., Twidale, M., Smith, L. C., Gasser, L. Information quality work organization in Wikipedia JASIST, 59(6), 983–1001 2008 [607]
The classic problem within the information quality (IQ) research and practice community has been the problem of defining IQ. It has been found repeatedly that IQ is context sensitive and cannot be described, measured, and assured with a single model. There is a need for empirical case studies of IQ work in different systems to develop a systematic knowledge that can then inform and guide the construction of context-specific IQ models. This article analyzes the organization of IQ assurance work in a large-scale, open, collaborative encyclopedia—Wikipedia. What is special about Wikipedia as a resource is that the quality discussions and processes are strongly connected to the data itself and are accessible to the general public. This openness makes it particularly easy for researchers to study a particular kind of collaborative work that is highly distributed and that has a particularly substantial focus, not just on error detection, but also on error correction. We believe that the study of those evolving debates and processes and of the IQ assurance model as a whole has useful implications for the improvement of quality in other more conventional databases.
Collaborative Quality Control, Collaborative Content Creation, Information Quality, Distributed Collective Practices
Marek Meyer, Christoph Rensing and Ralf Steinmetz Using community-generated contents as a substitute corpus for metadata generation. International Journal of Advanced Media and Communications, Vol. 2, No. 1, 2008 2008 [608]
Metadata is crucial for reuse of Learning Resources. However, in the area of e-Learning, suitable training corpora for automatic classification methods are hardly available. This paper proposes the use of community-generated substitute corpora for classification methods. As an example for such a substitute corpus, the free online Encyclopaedia Wikipedia is used as a training corpus for domain-independent classification and keyword extraction of Learning Resources.
e-learning, classification, categorization, metadata generation, Wikipedia, substitute corpus, online learning, learning resources, reuse
Shaul Oreg and Oded Nov Exploring motivations for contributing to open source initiatives: The roles of contribution context and personal values. Computers in Human Behavior, volume 24, issue 5, 2055-2073 2008 [609]
We explore contextual and dispositional correlates of the motivation to contribute to open source initiatives. We examine how the context of the open source project, and the personal values of contributors, are related to the types of motivations for contributing. A web-based survey was administered to 300 contributors in two prominent open source contexts: software and content. As hypothesized, software contributors placed a greater emphasis on reputation-gaining and self-development motivations, compared with content contributors, who placed a greater emphasis on altruistic motives. Furthermore, the hypothesized relationships were found between contributors' personal values and their motivations for contributing.
Personal values; Motivations; Open source; Wikipedia
Alexander Halavais, Derek Lackaff An Analysis of Topical Coverage of Wikipedia Journal of Computer-Mediated Communication, Vol. 13, No. 2. (2008), pp. 429-440. 2008 [610]
Many have questioned the reliability and accuracy of Wikipedia. Here a different issue, but one closely related: how broad is the coverage of Wikipedia? Differences in the interests and attention of Wikipedia’s editors mean that some areas, in the traditional sciences, for example, are better covered than others. Two approaches to measuring this coverage are presented. The first maps the distribution of topics on Wikipedia to the distribution of books published. The second compares the distribution of topics in three established, field-specific academic encyclopedias to the articles found in Wikipedia. Unlike the top-down construction of traditional encyclopedias, Wikipedia’s topical coverage is driven by the interests of its users, and as a result, the reliability and completeness of Wikipedia is likely to be different depending on the subject-area of the article.
collaboration, measurement, wiki, wikipedia
Beate Elvebakk Philosophy democratized? A comparison between Wikipedia and two other Web–based philosophy resources First Monday, volume 13, issue 2 2008 [611]
This article compares the individuals categorized as twentieth century philosophers in Wikipedia with the selection found in two major edited and widely used online philosophy resources, The Stanford Encyclopaedia of Philosophy (http://plato.stanford.edu), and the Internet Encyclopedia of Philosophy (http://www.iep.utm.edu). These are both free online resources, but unlike Wikipedia, they are written and edited by members of the academic community, and thus sanctioned by the established communities. The individuals presented as twentieth century philosophers are compared along the parameters of year of birth, gender, and national and disciplinary backgrounds. The results show that although the types of academics listed in Wikipedia are generally similar to those in the other encyclopaedias, their relative youth and their very numbers may still serve to give the user a very different impression on philosophy as a field.
Luyt, Brendan, Tay Chee Hsien, Aaron, Lim Hai Thian, Cheng Kian Hong Improving Wikipedia's accuracy: Is edit age a solution? Journal of the American Society for Information Science and Technology, volume 59, issue 2 2008 [612]
Wikipedia is fast becoming a key information source for many despite criticism that it is unreliable and inaccurate. A number of recommendations have been made to sort the chaff from the wheat in Wikipedia, among which is the idea of color-coding article segment edits according to age (Cross, 2006). Using data collected as part of a wider study published in Nature, this article examines the distribution of errors throughout the life of a select group of Wikipedia articles. The survival time of each error edit in terms of the edit counts and days was calculated and the hypothesis that surviving material added by older edits is more trustworthy was tested. Surprisingly, we find that roughly 20% of errors can be attributed to surviving text added by the first edit, which confirmed the existence of a first-mover effect (Viegas, Wattenberg, & Kushal, 2004) whereby material added by early edits are less likely to be removed. We suggest that the sizable number of errors added by early edits is simply a result of more material being added near the beginning of the life of the article. Overall, the results do not provide support for the idea of trusting surviving segments attributed to older edits because such edits tend to add more material and hence contain more errors which do not seem to be offset by greater opportunities for error correction by later edits.
error correction; Internet information resources; editing; accuracy; temporal currency
Nielsen, Finn Årup Scientific citations in Wikipedia First Monday, volume 12, issue 8 2007 [613]
The Internet–based encyclopædia Wikipedia has grown to become one of the most visited Web sites on the Internet, but critics have questioned the quality of entries. An empirical study of Wikipedia found errors in a 2005 sample of science entries. Biased coverage and lack of sources are among the “Wikipedia risks.” This paper describes a simple assessment of these aspects by examining the outbound links from Wikipedia articles to articles in scientific journals with a comparison against journal statistics from Journal Citation Reports such as impact factors. The results show an increasing use of structured citation markup and good agreement with citation patterns seen in the scientific literature though with a slight tendency to cite articles in high–impact journals such as Nature and Science. These results increase confidence in Wikipedia as a reliable information resource for science in general.
Wikipedia; impact factor; citation

Not peer reviewed[edit]

Reviews[edit]

  • Remy, Melanie (2002). Wikipedia: The Free Encyclopedia. Online Information Review 26(6):434. Emerald
  • Levack, Kinley (2003). If Two Heads Are Better than One, Try 7,000 with Wikipedia. Econtent Magazine 26(4):12–13, April 2003. [614]
  • Crawford, Walt; Wikipedia and Worth. Cites & Insights, Oct 2004[615].
  • Crawford, Walt; Wikipedia and Worth [Revisited]. Cites & Insights, Feb 2005[616].
  • Denning, Peter; Jim Horning; David Parnas; and Lauren Weinstein (2005). Wikipedia risks. Communications of the ACM 48(12):152, December 2005. doi:10.1145/1101779.1101804
  • Giles, Jim (2005). Internet encyclopaedias go head to head. Nature 438, 900-901 (15 Dec 2005) [617]
  • Lipczynska, Sonya (2005). Power to the people: the case for Wikipedia. Reference Reviews 19(2):6–7. Emerald Ingenta (abstract)
  • Lawler, Cormac. A ‘resource review’ of Wikipedia. Counselling and Psychotherapy Research. 1473-3145 (Print) 1746-1405 (Online). Volume 6, Number 3/September 2006
  • Clauson, Kevin A; Hyla H Polen, Maged N Kamel Boulos & Joan H Dzenowagis (2008). Scope, Completeness, and Accuracy of Drug Information in Wikipedia. The Annals of Pharmacotherapy. Vol. 42, No. 12, pp. 1814-1821

Books and book chapters[edit]

See Wikipedia:Wikipedia in books

Editorials[edit]

Magazine articles[edit]

Theses[edit]

Author Title Type Institution Year Language Notes Abstract
Gehl, R. A cultural and political economy of Web 2.0 [621] Ph.D George Mason University, Virginia 2010 English
In this dissertation, I explore Web 2.0, an umbrella term for Web-based software and services such as blogs, wikis, social networking, and media sharing sites. This range of Web sites is complex, but is tied together by one key feature: the users of these sites and services are expected to produce the content included in them. That is, users write and comment upon blogs, produce the material in wikis, make connections with one another in social networks, and produce videos in media sharing sites. This has two implications. First, the increase of user-led media production has led to proclamations that mass media, hierarchy, and authority are dead, and that we are entering into a time of democratic media production. Second, this mode of media production relies on users to supply what was traditionally paid labor. To illuminate this, I explore the popular media discourses which have defined Web 2.0 as a progressive, democratic development in media production. I consider the pleasures that users derive from these sites. I then examine the technical structure of Web 2.0. Despite the arguments that present Web 2.0 as a mass appropriation of the means of media production, I have found that Web 2.0 site owners have been able to exploit users' desires to create content and control media production. Site owners do this by deploying a dichotomous structure. In a typical Web 2.0 site, there is a surface, where users are free to produce content and make affective connections, and there is a hidden depth, where new media capitalists convert user-generated content into exchange-values. Web 2.0 sites seek to hide exploitation of free user labor by limiting access to this depth. This dichotomous structure is made clearer if it is compared to the one Web 2.0 site where users have largely taken control of the products of their labor: Wikipedia. Unlike many other sites, Wikipedia allows users to see into and determine the legal, technical, and cultural depths of that site. 
I conclude by pointing to the different cultural formations made possible by eliminating the barrier between surface and depth in Web software architecture.
Jordan, Christopher Contextual retrieval of single Wikipedia articles to support the reading of academic abstracts Ph.D Dalhousie University (Canada) 2009 English
Google style search engines are currently some of the most popular tools that people use when they are looking for information. There are a variety of reasons that people can have for conducting a search, although, these reasons can generally be distilled down to users being engaged in a task and developing an information need that impedes them from completing that task at a level which is satisfactory to them. The Google style search engine, however, is not always the most appropriate tool for every user task. In this thesis, our approach to search differs from the traditional search engine as we focus on providing support to users who are reading academic abstracts. When people do not understand a passage in the abstract they are reading, they often look for more detailed information or a definition. Presenting them with a list of possibly relevant search results, as a Google style search would, may not immediately meet this information need. In the case of reading, it is logical to hypothesize that users would prefer to receive a single document containing the information that they need. Developed in this thesis are retrieval algorithms that use the abstract being read along with the passage that the user is interested in to retrieve a single highly related article from Wikipedia. The top performing algorithm from the experiments conducted in this thesis is able to retrieve an appropriate article 77% of the time. This algorithm was deployed in a prototype reading support tool, LiteraryMark, in order to investigate the usefulness of such a tool. The results from the user experiment conducted in this thesis indicate that LiteraryMark is able to significantly improve the understanding and confidence levels of people reading abstracts.
Purdy, J. Digital archives and the turn to design [622] Ph.D University of Illinois at Urbana-Champaign 2006 English
Much existing archival work productively examines the contents of archives and their role in historical research; this dissertation offers a fresh perspective on archives by adding to studies of archival texts research on archival technologies. This dissertation argues that digital archives are technologies that shape writing and research practices through their design. Rather than being neutral spaces, they are built on claims about what constitutes appropriate writing and research behaviors in the new media age. In their designs, these technologies situate print as the standard by which to evaluate their effectiveness, illustrating anxiety about the reliability and integrity of the digital. They, moreover, consistently privilege linguistic text, a challenge to embracing multimodality as a frame for composing. While the idea that archives are dynamic spaces is not new, much of the anxiety regarding digital archives continues to be that they do not fix texts---and that singular, stable processes for engaging with them are not knowable. Yet rather than distrust digital archives, I argue for viewing them as spaces that can help us understand composing and researching as dynamic, multimodal processes. The argument proceeds through case studies of three different digital archive technologies: digital document repositories (web sites that store and provide access to archival collections online), wikis (dynamic, collaboratively authored web sites that anyone can add to or change), and plagiarism detection services (web sites that test uploaded papers to determine if they include language copied directly from other sources). Specifically, my primary objects of analysis are JSTOR (Journal Storage, the Scholarly Journal Archive), Wikipedia, and Turnitin, respectively.
Because technologies are both discursive and material constructions, I study the discourse surrounding and the functionality of each technology using a design approach that builds on Gunther Kress' notion of design but extends it beyond the visual to the structural. As increasing numbers of texts take digital form, the problems and promise of digital archives will demand thoughtful responses. The ways in which these spaces are designed will determine the kinds of texts that will be produced and valued in the future.
Hwang, H. Dynamic link-based ranking over large-scale graph-structured data [623] Ph.D University of California, San Diego 2010 English
Information Retrieval techniques have been the primary means of keyword search in document collections. However, as the amount and the diversity of available semantic connections between objects increase, link-based ranking methods including ObjectRank have been proposed to provide high-recall semantic keyword search over graph-structured data. Since a wide variety of data sources can be modeled as data graphs, supporting keyword search over graph-structured data greatly improves the usability of such data sources. However, it is challenging in both online performance and result quality. We first address the performance issue of dynamic authority-based ranking methods such as personalized PageRank and ObjectRank. Since they dynamically rank nodes in a data graph using an expensive matrix-multiplication method, the online execution time rapidly increases as the size of data graph grows. Over the English Wikipedia dataset of 2007, ObjectRank spends 20-40 seconds to compute query-specific relevance scores, which is unacceptable. We introduce a novel approach, BinRank, that approximates dynamic link-based ranking scores efficiently. BinRank partitions a dictionary into bins of relevant keywords and then constructs materialized subgraphs (MSGs) per bin in preprocessing stage. In query time, to produce highly accurate top-K results efficiently, BinRank uses the MSG corresponding to the given keyword, instead of the original data graph. PageRank and ObjectRank calculate the global importance score and the query-specific authority score of each node respectively by exploiting the link structure of a given data graph. However, both measures favor nodes with high in-degree that may contain popular yet generic content, and thus those nodes are frequently included in top-K lists, regardless of given query.
We propose a novel ranking measure, Inverse ObjectRank, which measures the content-specificity of each node by traversing the semantic links in the data graph in the reverse direction. Then, we allow users to adjust the importance of the three ranking measures (global importance, query-relevance, and content-specificity) to improve the quality of search results.
Zhu, F. Dynamics of platform-based markets [624] Ph.D Harvard University, Massachusetts 2008 English
Platform-based markets are prevalent in today's economy. Understanding the driver of platform success is of critical importance for platform providers. In this dissertation, I first develop a dynamic model to characterize conditions under which different factors drive the success of a platform, and then use the theoretical framework to analyze market-level data from the video game industry. I find that game players' marginal utility decreases rapidly with additional games after the number of games reaches a certain point, and quality is more influential than indirect network effects in driving the success of video game consoles. I also use individual-level data from Chinese Wikipedia to examine contributors' incentives to contribute. I take advantage of China's block of Chinese Wikipedia in mainland China in 2005 as a natural experiment to establish the causal relationship between contributors' incentives to contribute and the number of the beneficiaries of their contributions. I find that while on average contributors' incentives to contribute drop significantly after the block, the contribution levels of those contributors with small collaboration networks do not decrease after the block. In addition, these contributors join Wikipedia significantly earlier than the average contributor. The results suggest that other market factors such as altruism could be more influential than indirect network effects in encouraging user participation in the early stage of Chinese Wikipedia. The overall research casts doubt on the popular belief that indirect network effects are the primary force driving platform success and suggests that in many cases, other market forces could be dominant. Late movers could therefore take over market leaderships by exploiting these market forces.
Rahman, Mohammad M. Essays analyzing blogs and Wikipedia [625] Ph.D The University of Kansas 2006 English
Zhang, X. Exploiting external/domain knowledge to enhance traditional text mining using graph-based methods [626] Ph.D Drexel University, Pennsylvania 2009 English
Finding the best way to utilize external/domain knowledge to enhance traditional text mining has been a challenging task. The difficulty centers on the lack of means in representing a document with external/domain knowledge integrated. Graphs are powerful and versatile tools, useful in various subfields of science and engineering for their simple illustration of complicated problems. However, the graph-based approach on knowledge representation and discovery remains relatively unexplored. In this thesis, I propose a graph-based text mining system to incorporate semantic knowledge, document section knowledge, document linkage knowledge, and document category knowledge into the tasks of text clustering and topic analysis. I design a novel term-level graph knowledge representation and a graph-based clustering algorithm to incorporate semantic and document section knowledge for biomedical literature clustering and topic analysis. I present a Markov Random Field (MRF) with a Relaxation Labeling (RL) algorithm to incorporate document linkage knowledge. I evaluate different types of linkage among documents, including explicit linkage such as hyperlink and citation link, implicit linkage such as coauthor link and co-citation link, and pseudo linkage such as similarity link. I develop a novel semantic-based method to integrate Wikipedia concepts and categories as external knowledge into traditional document clustering. In order to support these new approaches, I develop two automated algorithms to extract multiword phrases and ontological concepts, respectively. The evaluations of news collection, web dataset, and biomedical literature prove the effectiveness of the proposed methods. In the experiment of document clustering, the proposed term-level graph-based method not only outperforms the baseline k-means algorithm in all configurations but also is superior in terms of efficiency.
The MRF-based algorithm significantly improves spherical k-means and model-based k-means clustering on the datasets containing explicit or implicit linkage; the Wikipedia knowledge-based clustering also improves the document-content-only-based clustering. On the task of topic analysis, the proposed graph presentation, sub graph detection, and graph ranking algorithm can effectively identify corpus-level topic terms and cluster-level topic terms.
Thom-Santelli, J. Expressing territoriality in online collaborative environments [627] Ph.D Cornell University, New York 2010 English
Territoriality, the expression of ownership towards an object, can emerge when social actors occupy a shared social space. In this research, I extend the study of territoriality beyond previous work in physical space in two key ways: (1) the object in question is non-physical and (2) the social context is an online collaborative activity. To do this, I observe the emergence of characteristic territorial behaviors (e.g. marking, control, defense) in 3 studies of social software systems. Study 1 describes a qualitative interview study observing the behaviors of 15 Maintainers, a small group of lead users on Wikipedia. Findings suggest that the Maintainers communicate their feelings of ownership to other editors by appropriating features of the system, such as user templates and activity monitoring, to preserve control over the articles they maintain and communicate their knowledge of the article editing process to potential contributors. Study 2 describes a qualitative interview study observing the behaviors of 33 users of social tagging systems deployed within a large enterprise organization. Findings suggest that self-designated experts express territoriality regarding their knowledge and their status within the organization through their tagging strategies. Study 3 describes a field study of expert and novice users of a mobile social tagging system deployed within an art museum. Findings suggest that compared to novices, experts feel more personal ownership towards the museum and their tags and express territoriality regarding their expertise through higher levels of participation and are more likely to vote down novice-generated tags in a defensive manner. My dissertation draws from observations from these three studies to construct a theoretical framework for online territoriality to provide researchers and designers of groupware with guidelines with which to encourage ownership expression when appropriate.
Topics for discussion and future work include clarifying the characteristics of non-physical territories, closer study of the possible reactions to territoriality, and describing the potential of territoriality as design resource for motivating experts to contribute.
Gabrilovich, Evgeniy Feature Generation for Textual Information Retrieval Using World Knowledge [628] Ph.D Technion – Israel Institute of Technology 2006 English
Imagine an automatic news filtering system that tracks company news. Given the news item "FDA approves ciprofloxacin for victims of anthrax inhalation", how can the system know that the drug mentioned is an antibiotic produced by Bayer? Or consider an information professional searching for data on RFID technology - how can a computer understand that the item "Wal-Mart supply chain goes real time" is relevant for the search? Algorithms we present can do just that.
Morell, Mayo Fuster Governance of online creation communities: Provision of infrastructure for the building of digital commons [629] Ph.D European University Institute, Florence 2010 English
This doctoral research is framed by the notion of a transition in which distinct commons organizational forms are gaining in importance at a time when the institutional principles of the nation state are in a state of profound crisis, and those of the private market are undergoing dramatic change. Additionally, the transformation of industrial society into a knowledge-based one is raising the importance of knowledge management, regulation and creation. This doctoral research addresses collective action for knowledge-making in the digital era from a double perspective of organizational and political conflict through the case of global online creation communities. From the organizational perspective, it provides an empirically grounded description of the organizational characteristics of emerging collective action. The research challenges previous literature by questioning the neutrality of infrastructure for collective action and demonstrating that infrastructure governance shapes collective action. Importantly, the research provides an empirical explanation of the organizational strategies most likely to succeed in creating large-scale collective action in terms of the size of participation and complexity of collaboration. From the political conflict perspective, this research maps the diverse models of governance of knowledge-making processes, addresses how these are embedded in each model of governance, and suggests a set of dimensions of democratic quality adapted to these forms. Importantly, it provides an empirically grounded characterization of two conflicting logics present in the conditions for collective action in the digital era: a commons versus a corporate logic of collective action. Additionally, the research sheds light on the emerging free culture and access to knowledge movement as a sign of this conflict.
In hypothesizing that the emerging forms of collective action are able to increase in terms of both participation and complexity while maintaining democratic principles, this research challenges Olson's assertion that formal organizations tend to overcome collective action dilemmas more easily, and challenges the classical statements of Weber and Michels that as organizations grow in size and complexity, they tend to create bureaucratic forms and oligarchies. This research concludes that online creation communities are able to increase in complexity while maintaining democratic principles. Additionally, in the light of this research, the emerging collective action forms are better characterized as hybrid ecosystems which succeed by networking and combining several components, each with different degrees of formalization and organizational and democratic logics.
Cosley, Daniel Regis Helping hands: Design for member-maintained online communities [630] Ph.D University of Minnesota 2006 English
Online communities provide millions of people every day with information, companionship, support, and fun. These communities need regular maintenance to function. Tasks such as welcoming new members, reviewing contributions, and building community-specific databases typically fall to a few dedicated members. Concentrating responsibility in the hands of a few valuable leaders makes communities vulnerable to leaders' leaving and limits communities' ability to grow and provide value. We study the design of member-maintained online communities, systems where many members help perform upkeep. A key design challenge is motivating members to contribute toward maintenance. Social science theories help to explain why people contribute to groups. We use these theories to design two general mechanisms for increasing people's motivation to contribute. The collective effort model from social psychology suggests people are more likely to contribute to a group if they believe their contributions matter. Editorial review can foster this belief by promoting good content and suppressing bad content. We build review systems that involve the whole community, where review is performed by peers, experts, or no one. Peer review performs about as well as expert review in both motivating contributions and providing effective review, but no review does very poorly. We also explore whether contributions must be reviewed before being made available to the community. Mathematical models suggest that making contributions available right away increases value more quickly, and does just as well in the long run, as requiring prior review. These models can inform the design of review systems. Public goods theory from economics suggests people will contribute more to group resources if the cost of contributing drops. We use intelligent task routing---matching people with tasks they are likely to do---to reduce contribution costs. We develop a number of generally useful task routing algorithms. 
Experiments in a movie database and in Wikipedia show these algorithms are very effective at increasing people's motivation to contribute. By using theory to support our designs, testing them in multiple domains, and distilling our results into usable artifacts such as guidelines, models, and algorithms, we hope to help designers build better systems and better communities.
Liu, Shuang Improve text retrieval effectiveness and robustness [631] Ph.D University of Illinois at Chicago 2006 English
Retrieval effectiveness and robustness are two of the most important criteria of text retrieval. Over the past decades, numerous techniques have been introduced to enhance text retrieval performance including those using phrases, passages, general dictionaries such as WordNet, word sense disambiguation, automatic query expansion, pseudo-relevance feedback, and external sources assisted feedback. This Ph.D. dissertation study focuses on improving the text retrieval effectiveness and robustness by extending existing retrieval model and providing new techniques which include: (1) Designing and implementing a new retrieval model. (2) Utilizing concept in text retrieval. (3) Designing and implementing a highly accurate word sense disambiguation algorithm and incorporating it to our information retrieval system. (4) Expanding queries by using multiple dictionaries such as WordNet and Wikipedia. (5) Employing different pseudo relevance feedback into the retrieval system including local, web-assisted, and Wikipedia-assisted feedback and adopting semantic information to pseudo relevance feedback. In this Ph.D. study, our design decisions are verified through experiments in the retrieval system. Results are evaluated by standard evaluation metrics: precision, recall, mean average precision (MAP), and geometric mean average precision (GMAP).
Reagle, J. In good faith: Wikipedia collaboration and the pursuit of the universal encyclopedia [632] Ph.D New York University 2008 English
Csomai, A. Keywords in the mist: Automated keyword extraction for very large documents and back of the book indexing [633] Ph.D University of North Texas 2008 English
This research addresses the problem of automatic keyphrase extraction from large documents and back of the book indexing. The potential benefits of automating this process are far reaching, from improving information retrieval in digital libraries, to saving countless man-hours by helping professional indexers create back of the book indexes. The dissertation introduces a new methodology to evaluate automated systems, which allows for a detailed, comparative analysis of several techniques for keyphrase extraction. We introduce and evaluate both supervised and unsupervised techniques, designed to balance the resource requirements of an automated system and the best achievable performance. Additionally, a number of novel features are proposed, including a statistical informativeness measure based on chi statistics; an encyclopedic feature that taps into the vast knowledge base of Wikipedia to establish the likelihood of a phrase referring to an informative concept; and a linguistic feature based on sophisticated semantic analysis of the text using current theories of discourse comprehension. The resulting keyphrase extraction system is shown to outperform the current state of the art in supervised keyphrase extraction by a large margin. Moreover, a fully automated back of the book indexing system based on the keyphrase extraction system was shown to lead to back of the book indexes closely resembling those created by human experts.
Bunescu, R. Learning for information extraction: From named entity recognition and disambiguation to relation extraction [634] Ph.D The University of Texas at Austin 2007 English
Information Extraction, the task of locating textual mentions of specific types of entities and their relationships, aims at representing the information contained in text documents in a structured format that is more amenable to applications in data mining, question answering, or the semantic web. The goal of our research is to design information extraction models that obtain improved performance by exploiting types of evidence that have not been explored in previous approaches. Since designing an extraction system through introspection by a domain expert is a laborious and time consuming process, the focus of this thesis will be on methods that automatically induce an extraction model by training on a dataset of manually labeled examples. Named Entity Recognition is an information extraction task that is concerned with finding textual mentions of entities that belong to a predefined set of categories. We approach this task as a phrase classification problem, in which candidate phrases from the same document are collectively classified. Global correlations between candidate entities are captured in a model built using the expressive framework of Relational Markov Networks. Additionally, we propose a novel tractable approach to phrase classification for named entity recognition based on a special Junction Tree representation. Classifying entity mentions into a predefined set of categories achieves only a partial disambiguation of the names. This is further refined in the task of Named Entity Disambiguation, where names need to be linked to their actual denotations. In our research, we use Wikipedia as a repository of named entities and propose a ranking approach to disambiguation that exploits learned correlations between words from the name context and categories from the Wikipedia taxonomy. Relation Extraction refers to finding relevant relationships between entities mentioned in text documents. 
Our approaches to this information extraction task differ in the type and the amount of supervision required. We first propose two relation extraction methods that are trained on documents in which sentences are manually annotated for the required relationships. In the first method, the extraction patterns correspond to sequences of words and word classes anchored at two entity names occurring in the same sentence. These are used as implicit features in a generalized subsequence kernel, with weights computed through training of Support Vector Machines. In the second approach, the implicit extraction features are focused on the shortest path between the two entities in the word-word dependency graph of the sentence. Finally, in a significant departure from previous learning approaches to relation extraction, we propose reducing the amount of required supervision to only a handful of pairs of entities known to exhibit or not exhibit the desired relationship. Each pair is associated with a bag of sentences extracted automatically from a very large corpus. We extend the subsequence kernel to handle this weaker form of supervision, and describe a method for weighting features in order to focus on those correlated with the target relation rather than with the individual entities. The resulting Multiple Instance Learning approach offers a competitive alternative to previous relation extraction methods, at a significantly reduced cost in human supervision.
Forte, A. Learning in public: Information literacy and participatory media [635] Ph.D Georgia Institute of Technology, Georgia 2009 English
Simma, A. Modeling events in time using cascades of Poisson processes [636] Ph.D University of California, Berkeley 2010 English
For many applications, the data of interest can be best thought of as events--entities that occur at a particular moment in time, have features and may in turn trigger the occurrence of other events. This thesis presents techniques for modeling the temporal dynamics of events by making each event induce an inhomogeneous Poisson process of others following it. The collection of all events observed is taken to be a draw from the superposition of the induced Poisson processes, as well as a baseline process for some of the initial triggers. The magnitude and shape of the induced Poisson processes controls the number, timing and features of the triggered events. We provide techniques for parameterizing these processes and present efficient, scalable techniques for inference. The framework is then applied to three different domains that demonstrate the power of the approach. First, we consider the problem of identifying dependencies in a computer network through passive observation and provide a technique based on hypothesis testing for accurately discovering interactions between machines. Then, we look at the relationships between Twitter messages about stocks, using the application as a test-bed to experiment with different parameterizations of induced processes. Finally, we apply these tools to build a model of the revision history of Wikipedia, identifying how the community propagates edits from a page to its neighbors and demonstrating the scalability of our approach to very large datasets.
Lin, M. Sharing knowledge and building communities: A narrative of the formation, development and sustainability of OOPS [637] Ph.D University of Houston, Texas 2006 English
Antin, J. Social operational information, competence, and participation in online collective action [638] Ph.D University of California, Berkeley 2010 English
Recent advances in interactive web technologies, combined with widespread broadband and mobile device adoption, have made online collective action commonplace. Millions of individuals work together to aggregate, annotate, and share digital text, audio, images, and video. Given the prevalence and importance of online collective action systems, researchers have increasingly devoted attention to questions about how individuals interact with and participate in them. I investigate these questions with the understanding that an individual's behaviors and attitudes depend in part on what they know and believe about how the online collaborative system operates (the nuts and bolts, so to speak). In this dissertation I examine how social operational information (information and beliefs about the other people who act in online collective action systems) can influence individuals' attitudes, assumptions, behaviors, and motivations with respect to those systems. I examine the role of social operational information from two distinct but related perspectives. First, I employed a social psychological laboratory study to examine the influence of a specific type of social operational information: relative competence feedback. Experimental findings demonstrate that individuals who received information that they were of low relative competence compared to others contributed less to a collective good compared to those who received either average or high relative competence feedback. Two key attitudes about abilities and responsibilities in inter-dependent situations, self-efficacy and social responsibility, mediated the competence-contribution relationship. Furthermore, individual participants' stable preferences about the distribution of rewards for themselves and other people (social value orientation) moderated the observed changes in contribution rates across experimental conditions. Secondly, I conducted a qualitative interview study of Wikipedia's infrequent editors and readers.
The study focused on documenting and understanding participants' attitudes, beliefs, and assumptions about Wikipedia's social system and the other individuals who contribute to it. Interviews focused on questions about the nature of Wikipedia and its user-generated system, the characteristics of the people who write Wikipedia, and the motivations that encourage their participation. Qualitative analysis revealed a variety of tensions around the nature of Wikipedia as an open, user-generated system, as well as between widespread negative stereotypes of contributors as geeks, nerds, and hackers and equally prevalent positive assumptions about their pro-social motivations for contributing to Wikipedia. I argue that these tensions reveal a transition towards a view of online collaborative work as open, creative, and focused on collaboration, dominated by intrinsic motivations such as passion, interest, and a desire to contribute something to the world. This emerging view of work on Wikipedia is captured by Himanen's notion of The Hacker Ethic. Finally, I explore how qualitative and experimental findings can speak to each other, and discuss some methodological challenges and best practices for combining experimental and qualitative methods. I argue that triangulating qualitative and experimental results in the context of this study facilitates: (1) lending detail and nuance to our understanding of complex attitudes such as social responsibility, and (2) improving the ecological validity of experimental findings by vetting assumptions about competence and social roles/responsibilities in a real-world context.
Chu, E. Sparse relational data sets: Issues and an application [639] Ph.D The University of Wisconsin - Madison 2008 English
Kennedy, K. Textual curators and writing machines: Authorial agency in encyclopedias, print to digital [640] Ph.D University of Minnesota 2009 English
Wikipedia is often discussed as the first of its kind: the first massively collaborative, Web-based encyclopedia that belongs to the public domain. While it's true that wiki technology enables large-scale, distributed collaborations in revolutionary ways, the concept of a collaborative encyclopedia is not new, and neither is the idea that private ownership might not apply to such documents. More than 275 years ago, in the preface to the 1728 edition of his Cyclopædia, Ephraim Chambers mused on the intensely collaborative nature of the volumes he was about to publish. His thoughts were remarkably similar to contemporary intellectual property arguments for Wikipedia, and while the composition processes involved in producing these texts are influenced by the available technologies, they are also unexpectedly similar. This dissertation examines issues of authorial agency in these two texts and shows that the "Author Construct" is not static across eras, genres, or textual technologies. In contrast to traditional considerations of the poetic author, the encyclopedic author demonstrates a different form of authorial agency that operates within strict genre conventions and does not place a premium on originality. This and related variations challenge contemporary ideas concerning the divide between print and digital authorship, as well as the notion that new media intellectual property arguments are without historical precedent.
Banchuen, T. The geographical analog engine: Hybrid numeric and semantic similarity measures for U.S. cities [641] Ph.D The Pennsylvania State University 2008 English
This dissertation began with the goal to develop a methodology for locating climate change analogs, and quickly turned into a quest for computational means of locating geographical analogs in general. Previous work in geographical analogs either only computed on numeric information, or manually considered qualitative information. Current and emerging technologies, such as electronic document collections, the Internet, and the Semantic Web, make it possible for people and organizations to store millions of books and articles, share them with the world, or even author some themselves. The amount of electronic and online content is expanding at an exponential speed, such that analysts are increasingly overwhelmed by the sheer volumes of accessible information. The dissertation explores techniques from knowledge engineering, artificial intelligence, information sciences, linguistics and cognitive science, and proposes a novel, automatic methodology that computes similarity within online/offline textual information, and graphically and statistically combines the results with those of numeric methods. U.S. cities with populations larger than 25,000 people are selected as a test case. Places are evaluated based on their numeric characteristics in the County and City Data Book and qualitative characteristics from Wikipedia entries. The dissertation recommends a way to convert Wikipedia entries into the Web Ontology Language (OWL) ontologies, which computer algorithms can read, understand and compute. The dissertation initially experiments with Mitra and Wiederhold's semantic measure to quantify similarity between places in the qualitative space. Many shortfalls are identified, and a series of experimental enhancements are explored. The experiments demonstrate that good semantic measures should employ a comprehensive stop-words list and a complete, but succinct vocabulary.
A semantic measure that can recognize synonyms must understand the intended senses of words in a place description. Furthermore, analysts need to be careful with two styles of descriptions: descriptions of places that are (1) created by following a template, or (2) laden with statistical statements can result in falsely high similarity between the places. It is illustrated that scatter plots of numeric similarity scores versus semantic similarity scores can effectively help analysts consider similarity between places in two-space. Analysts can visually observe whether the numeric ranks of places agree with the semantic ranks. The dissertation also shows that the Spearman's rank correlation test and the Kruskal-Wallis test of means can provide statistical confirmation for visual observations. The proposed hybrid methodology enables analysts to automatically discover geographical analogs in ways that strictly numeric methods or manual semantic analysis cannot offer.
Langlois, Ganaele The TechnoCultural dimensions of meaning: towards a mixed semiotics of the World Wide Web Ph.D York University, Canada 2008 English
This dissertation project argues that the study of meaning-making practices on the Web, and particularly the analysis of the power relations that organize communicational practices, needs to involve an acknowledgement of the importance of communication technologies. This project assesses the technocultural impact of software that automatically produces and dynamically adapts content to user input through a case study analysis of amazon.com and of the MediaWiki software package. It offers an interdisciplinary theoretical framework that borrows from communication studies (discourse analysis, medium theory, cultural studies of technology), from new media studies (software criticism) and from Actor-network theory and Felix Guattari's mixed semiotics. In so doing, the research defines a new methodological framework through which the question of semiotics and discourse can be analyzed thanks to an exploration of the technocultural conditions that create communicative possibilities. The analysis of amazon.com examines how the deployment of tools to track, shape and predict the cultural desires of users raises questions related to the imposition of specific modes of interpretation. In particular, I highlight the process through which user-produced meanings are incorporated within software-produced semiotic systems so as to embed cultural processes within a commercial imperative. While amazon.com is an instance of the commercial use of dynamic content production techniques on the Web, Wikipedia stands as a symbol of non-commercial knowledge production. The Wikipedia model is not only cultural, but also technical as mass collaborative knowledge production depends on a suite of software tools - the MediaWiki architecture - that enables new discursive practices.
The Wikipedia model is the result of a set of articulations between technical and cultural processes, and the case study examines how this model is captured, modified and challenged by other websites using the same wiki architecture as Wikipedia. In particular, I examine how legal and technical processes on the Web appropriate discursive practices by capitalizing on user-produced content as a source of revenue.
Coursey, K. The value of everything: Ranking and association with encyclopedic knowledge [642] Ph.D University of North Texas 2009 English
This dissertation describes WikiRank, an unsupervised method of assigning relative values to elements of a broad coverage encyclopedic information source in order to identify those entries that may be relevant to a given piece of text. The valuation given to an entry is based not on textual similarity but instead on the links that associate entries, and an estimation of the expected frequency of visitation that would be given to each entry based on those associations in context. This estimation of relative frequency of visitation is embodied in modifications to the random walk interpretation of the PageRank algorithm. WikiRank is an effective algorithm to support natural language processing applications. It is shown to exceed the performance of previous machine learning algorithms for the task of automatic topic identification, providing results comparable to that of human annotators. Second, WikiRank is found useful for the task of recognizing text-based paraphrases on a semantic level, by comparing the distribution of attention generated by two pieces of text using the encyclopedic resource as a common reference. Finally, WikiRank is shown to have the ability to use its base of encyclopedic knowledge to recognize terms from different ontologies as describing the same thing, and thus allowing for the automatic generation of mapping links between ontologies. The conclusion of this thesis is that the "knowledge access heuristic" is valuable and that a ranking process based on a large encyclopedic resource can form the basis for an extendable general purpose mechanism capable of identifying relevant concepts by association, which in turn can be effectively utilized for enumeration and comparison at a semantic level.
Priedhorsky, R. The value of geographic wikis [643] Ph.D University of Minnesota 2010 English
This thesis responds to the dual rising trends of geographic content and open content, where the core value of an information system is derived from the work of users. We define the essential properties of an emerging technology, the geographic wiki or geowiki, as well as two variations we invented: the computational geowiki, where user wiki input feeds an algorithm, and the personalized geowiki, where the system provides a personalized interpretation. We focus on two systems to develop these ideas. First, Cyclopath, a research geowiki we founded, serves the bicycle navigation needs of cyclists. We also present analysis in the context of Wikipedia, the well-known and highly successful wiki encyclopedia, using its size and maturity to draw lessons for smaller, younger systems which are far more numerous but hope to grow. We ask three questions with respect to this new technology. First, can it be built? Yes. This thesis describes the design and implementation of Cyclopath, which has grown to be a production system with thousands of users. Second, is it useful? Yes. We identified a representative geographic community, bicyclists, and they both tell us that the information in the Cyclopath geowiki is useful and show us by using the system in great numbers. We also present new ways to measure value in wikis, introducing new techniques for doing so from the perspective of information consumers. In particular, user work in Cyclopath has shortened the average route by 1 km. Also, we present techniques for obtaining more contributions (familiarity matters - sometimes - and users do work beyond what they are asked to) and algorithms for increasing the value of geowiki content by personalizing it, showing that traditional rating prediction algorithms (collaborative filtering) are not effective but simple algorithms based on clustering are. Finally, who cares? Many people. 
There are numerous communities with great interest in geographic information but limited, incomplete, or awkward access because the relevant knowledge is distributed among members of the community and otherwise unavailable. As our results demonstrate, geowikis are an effective way of gathering and disseminating geographic information, more so than previous techniques. Thus, this research has broad value.
Santos, M. Toward another rhetoric: Web 2.0, Levinas, and taking responsibility for response ability [644] Ph.D Purdue University, Indiana 2009 English
This dissertation explores the relationship between public considerations of the impact of contemporary dynamic technologies and the metaphysical ethics of Emmanuel Levinas. Both share an interest in interactivity, plurality, transience, and risk. This shared interest rejects the fundamental values of literacy and print identified by media theorists such as Walter J Ong, Eric Havelock, and Marshall McLuhan--autonomy, singularity, permanence, and security. The values of these mediums deeply impacted the development of Platonic Idealism and the Modern Enlightenment. My concluding argument suggests that, in the wake of these new mediums, the discipline of rhetoric and composition, in addition to the entire research University that houses it, should pay attention to how digital communities such as Wikipedia balance the Modern desire for ontological knowledge alongside the postmodern and digital emphasis on ethics. Such a balancing suggests that the primary values of literacy and print, and the institutions they helped to engender, are not ideally suited for a digital world.
Ortega, Felix Wikipedia. A quantitative analysis [645] Ph.D Universidad Rey Juan Carlos, Spain 2009 English
In this doctoral thesis, we undertake a quantitative analysis of the top-ten language editions of Wikipedia, from different perspectives. Our main goal has been to trace the evolution in time of key descriptive and organizational parameters of Wikipedia and its community of authors. The analysis has focused on logged authors (those editors who created a personal account to participate in the project). Among the distinct metrics included, we can find the monthly evolution of general metrics (number of revisions, active editors, active pages); the distribution of pages and their length; and the evolution of participation in discussion pages. We also present a detailed analysis of the inner social structure and stratification of the Wikipedia community of logged authors, fitting appropriate distributions to the most relevant metrics. We also examine the inequality level of contributions from logged authors, showing that there exists a core of very active authors who undertake most of the editorial work. Regarding articles, the inequality analysis also shows that there exists a reduced group of popular articles, though the distribution of revisions is not as skewed as in the previous case. The analysis continues with an in-depth demographic study of the community of authors, focusing on the evolution of the core of very active contributors (applying a statistical technique known as survival analysis). We also explore some basic metrics to analyze the quality of Wikipedia articles and the trustworthiness level of individual authors. This work concludes with an extended analysis of the evolution of the most influential parameters and metrics previously presented. Based on these metrics, we infer important conclusions about the future sustainability of Wikipedia. According to these results, the Wikipedia community of authors has ceased to grow, remaining stable from Summer 2006 until the end of 2007.
As a result, the monthly number of revisions has remained stable over the same period, restricting the number of articles that can be reviewed by the community. On the other side, whilst the number of revisions in talk pages has stabilized over the same period, as well, the number of active talk pages follows a steady growing rate, for all versions. This suggests that the community of authors is shifting its focus to broaden the coverage of discussion pages, which has a direct impact on the final quality of content, as previous research has shown. Regarding the inner social structure of the Wikipedia community of logged authors, we find Pareto-like distributions that fit all relevant metrics pertaining to authors (number of revisions per author, number of different articles edited per author), while measurements on articles (number of revisions per article, number of different authors per article) follow lognormal shapes. The analysis of the inequality level of revisions performed by authors, and revisions received by articles shows highly unequal distributions. The results of our survival analysis on Wikipedia authors present very high mortality percentages on young authors, revealing an endemic problem of Wikipedias in keeping young editors collaborating with the project for a long period of time. In the same way, from our survival analysis we obtain that the mean lifetime of Wikipedia authors in the core (until they abandon the group of top editors) is situated between 200 and 400 days, for all versions, while the median value is lower than 120 days in all cases. Moreover the analysis of the monthly number of births and deaths in the community of logged authors reveals that the shift in the monthly trend of active authors is caused by a higher number of deaths from Summer 2006 in all versions, surpassing the monthly number of births from then on.
The analysis of the inequality level of contributions over time, and the evolution of additional key features identified in this thesis, reveals a worrying trend towards progressive increase of the effort spent by core authors, as time elapses. This trend may eventually cause these authors to reach their upper limit in the number of revisions they can perform each month, thus starting a decreasing trend in the number of monthly revisions, and an overall recession of the content creation and reviewing process in Wikipedia. To prevent this probable future scenario, the number of monthly new editors should be improved again, perhaps through the adoption of specific policies and campaigns for attracting new editors to Wikipedia, and for recovering older top contributors. Finally, another important contribution for the research community is WikiXRay, the software tool we have developed to perform the statistical analyses included in this thesis. This tool completely automates the process of retrieving the database dumps from the Wikimedia public repositories, processing them to obtain key metrics and descriptive parameters, and loading them in a local database, ready to be used in empirical analyses. As far as we know, this is the first research work implementing a comparative analysis, from a quantitative point of view, of the top-ten language editions of Wikipedia, presenting results from many different scientific perspectives. Therefore, we expect that this contribution will help the scientific community to enhance their understanding of the rich, complex and fascinating working mechanisms and behavioral patterns of the Wikipedia project and its community of authors. Likewise, we hope that WikiXRay will facilitate the hard task of developing empirical analyses on any language version of the encyclopedia, boosting in this way the number of comparative studies like this one in many other scientific disciplines.
Syed, Z. Wikitology: A novel hybrid knowledge base derived from wikipedia [646] Ph.D University of Maryland, Baltimore County 2010 English
World knowledge may be available in different forms such as relational databases, triple stores, link graphs, meta-data and free text. Human minds are capable of understanding and reasoning over knowledge represented in different ways and are influenced by different social, contextual and environmental factors. By following a similar model, we have integrated a variety of knowledge sources in a novel way to produce a single hybrid knowledge base i.e., Wikitology, enabling applications to better access and exploit knowledge hidden in different forms. Wikipedia proves to be an invaluable resource for generating a hybrid knowledge base due to the availability and interlinking of structured, semi-structured and un-structured encyclopedic information. However, Wikipedia is designed in a way that facilitates human understanding and contribution by providing interlinking of articles and categories for better browsing and search of information, making the content easily understandable to humans but requiring intelligent approaches for being exploited by applications directly. Research projects like Cyc [61] have resulted in the development of a complex broad coverage knowledge base, however, relatively few applications have been built that really exploit it. In contrast, the design and development of Wikitology {KB} has been incremental and has been driven and guided by a variety of applications and approaches that exploit the knowledge available in Wikipedia in different ways. This evolution has resulted in the development of a hybrid knowledge base that not only incorporates and integrates a variety of knowledge resources but also a variety of data structures, and exposes the knowledge hidden in different forms to applications through a single integrated query interface. 
We demonstrate the value of the derived knowledge base by developing problem specific intelligent approaches that exploit Wikitology for a diverse set of use cases, namely, document concept prediction, cross document co-reference resolution defined as a task in Automatic Content Extraction (ACE) [1], Entity Linking to KB entities defined as a part of Text Analysis Conference - Knowledge Base Population Track 2009 [65] and interpreting tables [94]. These use cases directly serve to evaluate the utility of the knowledge base for different applications and also demonstrate how the knowledge base could be exploited in different ways. Based on our work we have also developed a Wikitology API that applications can use to exploit this unique hybrid knowledge resource for solving real world problems. The different use cases that exploit Wikitology for solving real world problems also contribute to enriching the knowledge base automatically. The document concept prediction approach can predict inter-article and category-links for new Wikipedia articles. Cross document co-reference resolution and entity linking provide a way for specifically linking entity mentions in Wikipedia articles or external articles to the entity articles in Wikipedia and also help in suggesting redirects. In addition to that we have also developed specific approaches aimed at automatically enriching the Wikitology KB by unsupervised discovery of ontology elements using the inter-article links, generating disambiguation trees for entities and estimating the page rank of Wikipedia concepts to serve as a measure of popularity. The set of approaches combined together can contribute to a number of steps in a broader unified framework for automatically adding new concepts to the Wikitology knowledge base.
Darren Hardy Volunteered geographic information in Wikipedia PhD University of California, Santa Barbara 2010 English Includes data [647], presentation [648], and source code [649]

Volunteered geographic information (VGI) refers to the geographic subset of user-generated content. My research focuses on spatial behaviors in VGI production and collects empirical data from 32 million Wikipedia contributions. My results find that geographic proximity is a factor in contributions, and that its influence decays exponentially and varies categorically.
Markus Fuchs Aufbau eines wissenschaftlichen Textcorpus auf der Basis der Daten der englischsprachigen Wikipedia [650] Master's University of Regensburg 2009 German
With the growth in popularity over the last eight years, Wikipedia has become a very promising resource in academic studies. Some of its properties make it attractive for a wide range of research fields (information retrieval, information extraction, natural language processing, ...), e.g. free availability and up to date content. However, efficient and structured access to this information is not easy, as most of Wikipedia's contents are encoded in its own markup language (wikitext). And, unfortunately, there is no formal definition of wikitext, which makes parsing very difficult and burdensome. In this thesis, we present a system that lets the researcher automatically build a richly annotated corpus containing the information most commonly used in research projects. To this end, we built our own wikitext parser based on the original converter used by Wikipedia itself to convert wikitext into HTML. The system stores all data in a relational database, which allows for efficient access and extensive retrieval functionality.


Daniel Hasan Dalip Um Método Automático para Estimativa da Qualidade de Enciclopédias Colaborativas On-Line: Um Estudo de Caso Sobre a Wikipédia [651] Master's Universidade Federal de Minas Gerais 2009 Portuguese
The old dream of a universal repository containing all the human knowledge and culture is becoming possible through the Internet and the Web. Moreover, this is happening with the direct, collaborative participation of people. Wikipedia is a great example. It is an enormous repository of information with free access and editing, created by the community in a collaborative manner. However, this large amount of information, made available democratically and virtually without any control, raises questions about its relative quality. In this work we explore a significant number of quality indicators, some of them proposed by us and used here for the first time, and study their capability to assess the quality of Wikipedia articles. Furthermore, we explore machine learning techniques to combine these quality indicators into one single assessment judgment. Through experiments, we show that the most important quality indicators are the easiest ones to extract in an open digital library, namely, textual features related to length, structure and style. We were also able to determine which indicators did not contribute significantly to the quality assessment. These were, coincidentally, the most complex features, such as those based on link analysis. Finally, we compare our combination method with state-of-the-art solutions and show significant improvements in terms of effective quality prediction.
A. Belani Vandalism Detection in Wikipedia: a Bag-of-Words Classifier Approach Master's Cornell University 2009 English
A bag-of-words based probabilistic classifier is trained using regularized logistic regression to detect vandalism in the English Wikipedia. Isotonic regression is used to calibrate the class membership probabilities. Learning curve, reliability, ROC, and cost analysis are performed.
Niels Møller Christensen Wiki Culture: En analyse af organisatorisk samarbejde på Wikipedia Master's IT University of Copenhagen 2009 Danish

This thesis investigates the organisational structures behind the collaborative work on Wikipedia. The objectives are to identify formal and informal group dynamics, the relation between the individual writer and the community and furthermore how culture and norms develop in relation to the collaborative article writing. In addition the thesis discusses the power structures of Wikipedia, and whether or not informal hierarchical structures emerge despite the open and relatively flat structure of Wikipedia. This study is made through a discourse analysis of a Wikipedia article and the underlying communication between the writers of the article. The analysis is supplemented visually by discourse mapping and quantitatively by the use of statistical material obtained by data gathering and processing. The analysis is based on both sociological and media theories – especially concerning social media. The primary sociological theories are Jean Lave and Etienne Wenger's theory of community of practice and Michel Maffesoli's theories concerning neotribalism. The primary media theories are Adrian Mackenzie's theory of pre-individualism and Pierre Levy's theories of anthropological spaces and collective intelligence. The analysis shows that informal and hierarchical group structures emerge through the collaboration on Wikipedia articles despite the open and flat principles of the encyclopedia. Furthermore the analysis indicates that these group structures are upheld through both practical work and development of mutual norms, culture and language and that an interplay between individual and collective identity of the writers is necessary for the collaboration on articles.

Mohammed N M Abu-Shaaban Motivational voluntary knowledge sharing among users in the open source encyclopedia; case study (the Arabic Wikipedia) [652] Master's Edinburgh Napier University 2009 English

This research aims to examine the motives behind participation in the Arabic version of Wikipedia. Building on previous research, it argues that there are four main motivational factors: egoism, altruism, principalism and collectivism. Data is collected to examine the importance of each factor. Data collection occurs in two stages; first, Wikipedia content is researched and examined for statements that denote motivational behaviour and second, interviews are carried out with a sample of Arabic Wikipedians to discuss the results of the first stage and reach more definitive conclusions.

The research also finds a correlation between the various contents of the articles (religious, political or country-related) and the motivational factor behind contributing users in a given article.
Amir Hossein Jadidinejad Mining Structured Information from Unstructured Texts using Wikipedia-based Semantic Analysis Master's Islamic Azad University 2009 Persian
Benjamin Grassineau La dynamique des réseaux coopératifs : l'exemple des logiciels libres et du projet d'encyclopédie libre et ouverte Wikipédia PhD Université Paris Dauphine 2009 French
Liam Wyatt The Academic Lineage of Wikipedia: Connections and Disconnections in the Theory and Practice of History Bachelor's (Hons) University of New South Wales 2008 English Written in the field of History. Awarded 1st class, the history prize and the university medal. Not publicly available due to potential for publication. Please ask for a copy.

The theory and practice of Wikipedia has a common heritage with professional history. In spite of the project being very new, the number and variety of its authors and the ambivalence of academia towards it, Wikipedians have created an encyclopedia that upholds high standards of scholarship and encyclopedism. Simultaneously it provides universal easy access to knowledge. The policies and practices enacted by Wikipedia to achieve these standards are rarely unique. Facing the same challenges that encyclopedists, lexicographers, translators, librarians and archivists have before, it does not achieve a uniformly high standard but it is a new chapter in a very old book.

This thesis divides the relevant fields of historiography into three parts. The first discusses how the idea of “free” is related to history production and dissemination by looking at methods by which it is curtailed, through copyright; censorship; destruction; price and language. Wikipedia is the latest in a long line of defenders of the ideal of free knowledge. The second part looks at the concept of the “author” over time to argue that it has never been static nor is Wikipedia unique. Rather, it is a new form. Specifically discussed are ideas of readership; of mass authorship; the authority of knowledge; cultures of reading and the universalist ideal. The third part deconstructs “truth” to show that Wikipedia is not undermining the importance of this complex idea. Elements examined are the value of professionalism as opposed to amateurism; the fixity of knowledge; and concepts of verifiability, neutrality and objectivity.

Having demonstrated the relationship of Wikipedia’s theory and practice to the discipline of history, the final chapter uses Wikipedia’s articles to highlight practical means by which historians might engage with the project as a historical source and still maintain professional standards. Discussion pages, several associated paratexts and the statistics demonstrating article popularity are considered. Finally, there is a discussion about how historians can be directly involved in the Wikipedia project—by editing it.



Dennis Hoppe Automatic Edit-War Detection in Wikipedia Bachelor's Bauhaus-University Weimar 2008 German Winner of the second prize of a thesis competition in Middle Germany, sponsored by the Society for the Promotion of Open Source Systems, GAOS e.V.
Daniel Kinzler Automatischer Aufbau eines multilingualen Thesaurus durch Extraktion semantischer und lexikalischer Relationen aus der Wikipedia Diploma Universität Leipzig 2008 German More info: http://brightbyte.de/page/WikiWord - Paper for Wikimania 2009 (English): commons:File:Wikimania2009-WikiWord-Paper.pdf
Robert Gerling Automatic Vandalism Detection in Wikipedia Diploma Bauhaus-University Weimar 2008 German
Joel Nothman Learning Named Entity Recognition from Wikipedia Honours University of Sydney 2008
We present a method to produce free, enormous corpora to train taggers for Named Entity Recognition (NER), the task of identifying and classifying names in text, often solved by statistical learning systems. Our approach utilises the text of Wikipedia, a free online encyclopedia, transforming links between Wikipedia articles into entity annotations. Having derived a baseline corpus, we found that altering Wikipedia’s links and identifying classes of capitalised non-entity terms would enable the corpus to conform more closely to gold-standard annotations, increasing performance by up to 32% F score. The evaluation of our method is novel since the training corpus is not usually a variable in NER experimentation. We therefore develop a number of methods for analysing and comparing training corpora. Gold-standard training corpora for NER perform poorly (F score up to 32% lower) when evaluated on test data from a different gold-standard corpus. Our Wikipedia-derived data can outperform manually-annotated corpora on this cross-corpus evaluation task by up to 7% on held-out test data. These experimental results show that Wikipedia is viable as a source of automatically-annotated training corpora, which have wide domain coverage applicable to a broad range of NLP applications.
Joachim Schroer Wikipedia: Auslösende und aufrechterhaltende Faktoren der freiwilligen Mitarbeit an einem Web-2.0-Project PhD University of Würzburg 2008 German
Andreas Möllenkamp Wer schreibt die Wikipedia? Die Online-Enzyklopädie in der Vorstellungs- und Lebenswelt ihrer aktivsten Autoren. Master's University of Leipzig 2007 German
Mark W. Bell The transformation of the encyclopedia : a textual analysis and comparison of the Encyclopædia Britannica and Wikipedia Master's Ball State University 2007


Sylvain Firer-Blaess Wikipédia: le Refus du Pouvoir Master's Institut d'études politiques de Lyon 2007 French
Seah Ru Hong Knowledge contribution in Wikipedia Honors National University of Singapore 2007
Benjamin Keith Johnson Wikipedia as Collective Action: Personal incentives and enabling structures Master's Michigan State University 2007

Wikipedia is an online encyclopedia created by volunteers, and is an example of how developments in software platforms and the low cost of sharing and coordinating on the Internet are leading to a new paradigm of creative collaboration on a massive scale. This thesis addresses the questions of why individuals choose to give away their time and effort and how the challenges associated with collective action are addressed by Wikipedia’s technologies, organization, and community. Interviews with editors of the encyclopedia were used to identify what personal gains and other motivations compel contributors, what challenges to collaboration exist, and what technological and social structures aid their ability to create a freely available repository of human knowledge.

Julian Madej Wolnosc i Wiedza: Aksjonormatywny Wymiar Wikipedii (Freedom and Knowledge: The Axiomatic Dimension of Wikipedia) Master's Warsaw University 2007
Maik Anderka Methoden zur sprachübergreifenden Plagiaterkennung Master's University of Paderborn 2007 German
Sylvain Firer-Blaess Wikipedia : Governance, Mode of Production, Ethics Master of Arts University of Sussex 2008 English
This dissertation is an inquiry on the functioning of the free web-based encyclopaedia Wikipedia, using different theoretical perspectives. The first perspective shall bring the analysis of the governance of Wikipedia, with the application of Michel Foucault’s theoretical framework about governance and discipline. The second perspective shall focus on the mode of production of Wikipedia with the use of Marxian political economy. The third and last perspective shall be the construction of an ethics from a Hegelian framework, its application to the previous results, and the possible extension to other network practices.


Chun-yu Huang A Study of Phenomena of Knowledge Sharing in Wikipedia Master's National Central University, Taiwan 2006 Chinese

Wikipedia is an encyclopedia on the Internet. It provides a lot of knowledge for the user. The first Wikipedia appeared in 2001 and was only in English. After six years of development, there are now various versions in more than 250 languages. Contents in Wikipedia were contributed and edited not by authorities, but by users of Wikipedia. As long as one wants, one can contribute to the contents of Wikipedia. Many users spent their time and energy to devote themselves to Wikipedia. Wikipedia gives no monetary reward to its contributors, but there are more and more users sharing their knowledge on Wikipedia. Does this reveal a massive pro-social phenomenon? This study thus attempts to look into factors that affect the knowledge sharing of these individuals. A web based questionnaire was designed, and known Wikipedia users were invited as informants. 156 valid samples were tallied out of a total of 181 returns. Empirical results reveal that reputation and altruism have positive effects on attitude of knowledge sharing, while expected reward has a significant but negative effect on attitude of knowledge sharing. External control and community identification have a moderating effect on the relationship between attitude of knowledge sharing and behavior of knowledge sharing. However, we failed to find evidence that supports the effect of attitude of knowledge sharing on behavior of knowledge sharing. This is an issue that calls for more studies.

Natalia Kozlova Automatic Ontology Extraction for Document Classification Master's Saarland University 2006

The amount of information in the world is enormous. Millions of documents sit in electronic libraries, and thousands more on each personal computer, waiting for an expert to organize them and assign them to appropriate categories. Automatic classification can help. However, problems of synonymy, polysemy and word usage patterns usually arise. Modern knowledge representation mechanisms such as ontologies can be used as a solution to these issues. Ontology-driven classification is a powerful technique which combines the advantages of modern classification methods with the semantic specificity of ontologies. One of the key issues here is the cost and difficulty of the ontology building process, especially if we do not want to stick to any specific field. Creating a generally applicable but simple ontology is a challenging task. Even manually compiled thesauri such as WordNet can be overcrowded and noisy. We propose a flexible framework for efficient ontology extraction for document classification purposes. In this work we developed a set of ontology extraction rules. Our framework was tested on the manually created corpus of Wikipedia, the free encyclopedia. We present a software tool, developed with regard to the claimed principles. Its architecture is open to embedding new features. The ontology-driven document classification experiments were performed on the Reuters collection. We study the behavior of different classifiers on different ontologies, varying our experimental setup. Experiments show that the performance of our system is better in comparison to other approaches. In this work we observe and state the potential of automatic ontology extraction techniques and highlight directions for further investigation.


Lectures

Survey and poll results

Reminder: this is for academic or semi-academic surveys and polls of Wikipedia aimed at increasing our understanding of Wikipedia. For Wikipedia's own surveys used for determining consensus, policy making and dispute resolution, see Wikipedia:Requests for comment and Wikipedia:Straw polls.
See also: Category:Wikipedia surveys and polls, Wikipedia:Centralized discussion, MediaWiki talk:Sitenotice, meta:Category:Surveys, meta:Category:Polls and meta:CentralNotice.

To do: parse and analyze this

Unpublished (non-thesis)

Data sets

Bibliographies of Wikipedia Research

External links