Wikipedia:Wikipedia Signpost/2014-07-30/Recent research

From Wikipedia, the free encyclopedia
Recent research

Shifting values in the paid content debate; cross-language vandalism detection; translations from 53 Wiktionaries

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Understanding shifting values underlying the paid content debate on the English Wikipedia

Related articles
Does Wikipedia pay?

How paid editors squeeze you dry
31 January 2024

"Wikipedia and the assault on history"
4 December 2023

The "largest con in corporate history"?
20 February 2023

Truth or consequences? A tough month for truth
31 August 2022

The oligarchs' socks
27 March 2022

Fuzzy-headed government editing
30 January 2022

Denial: climate change, mass killings and pornography
29 November 2021

Paid promotional paragraphs in German parliamentary pages
26 September 2021

Enough time left to vote! IP ban
29 August 2021

Paid editing by a former head of state's business enterprise
25 April 2021

A "billionaire battle" on Wikipedia: Sex, lies, and video
28 February 2021

Concealment, data journalism, a non-pig farmer, and some Bluetick Hounds
28 December 2020

How billionaires rewrite Wikipedia
29 November 2020

Ban on IPs on ptwiki, paid editing for Tatarstan, IP masking
1 November 2020

Paid editing with political connections
27 September 2020

WIPO, Seigenthaler incident 15 years later
27 September 2020

Wikipedia for promotional purposes?
30 August 2020

Dog days gone bad
2 August 2020

Fox News, a flight of RfAs, and banning policy
2 August 2020

Some strange people edit Wikipedia for money
2 August 2020

Trying to find COI or paid editors? Just read the news
28 June 2020

Automatic detection of covert paid editing; Wiki Workshop 2020
31 May 2020

2019 Picture of the Year, 200 French paid editing accounts blocked, 10 years of Guild Copyediting
31 May 2020

English Wikipedia community's conclusions on talk pages
30 April 2019

Women's history month
31 March 2019

Court-ordered article redaction, paid editing, and rock stars
1 December 2018

Kalanick's nipples; Episode #138 of Drama on the Hill
23 June 2017

Massive paid editing network unearthed on the English Wikipedia
2 September 2015

Orangemoody sockpuppet case sparks widespread coverage
2 September 2015

Paid editing; traffic drop; Nicki Minaj
12 August 2015

Community voices on paid editing
12 August 2015

On paid editing and advocacy: when the Bright Line fails to shine, and what we can do about it
15 July 2015

Turkish Wikipedia censorship; "Can Wikipedia survive?"; PR editing
24 June 2015

A quick way of becoming an admin
17 June 2015

Meet a paid editor
4 March 2015

Is Wikipedia for sale?
4 February 2015

Shifting values in the paid content debate; cross-language bot detection
30 July 2014

With paid advocacy in its sights, the Wikimedia Foundation amends their terms of use
18 June 2014

Does Wikipedia Pay? The Moderator: William Beutler
11 June 2014

PR agencies commit to ethical interactions with Wikipedia
11 June 2014

Should Wikimedia modify its terms of use to require disclosure?
26 February 2014

Foundation takes aim at undisclosed paid editing; Greek Wikipedia editor faces down legal challenge
19 February 2014

Special report: Contesting contests
29 January 2014

WMF employee forced out over "paid advocacy editing"
8 January 2014

Foundation to Wiki-PR: cease and desist; Arbitration Committee elections starting
20 November 2013

More discussion of paid advocacy, upcoming arbitrator elections, research hackathon, and more
23 October 2013

Vice on Wiki-PR's paid advocacy; Featured list elections begin
16 October 2013

Ada Lovelace Day, paid advocacy on Wikipedia, sidebar update, and more
16 October 2013

Wiki-PR's extensive network of clandestine paid advocacy exposed
9 October 2013

Q&A on Public Relations and Wikipedia
25 September 2013

PR firm accused of editing Wikipedia for government clients; can Wikipedia predict the stock market?
13 May 2013

Court ruling complicates the paid-editing debate
12 November 2012

Does Wikipedia Pay? The Founder: Jimmy Wales
1 October 2012

Does Wikipedia pay? The skeptic: Orange Mike
23 July 2012

Does Wikipedia Pay? The Communicator: Phil Gomes
7 May 2012

Does Wikipedia Pay? The Consultant: Pete Forsyth
30 April 2012

Showdown as featured article writer openly solicits commercial opportunities
30 April 2012

Does Wikipedia Pay? The Facilitator: Silver seren
16 April 2012

Wikimedia announcements, Wikipedia advertising, and more!
26 April 2010

License update, Google Translate, GLAM conference, Paid editing
15 June 2009

Report of diploma mill offering pay for edits
12 March 2007

AstroTurf PR firm discovered astroturfing
5 February 2007

Account used to create paid corporate entries shut down
9 October 2006

Editing for hire leads to intervention
14 August 2006

Proposal to pay editors for contributions
24 April 2006

German Wikipedia introduces incentive scheme
18 July 2005


See related Signpost content: "Extensive network of clandestine paid advocacy exposed", "With paid advocacy in its sights, the Wikimedia Foundation amends their terms of use"
Reviewed by Heather Ford

Kim Osman has performed a fascinating study[1] of the three failed 2013 proposals to ban paid advocacy editing on the English-language Wikipedia. Using a Constructivist Grounded Theory approach, Osman analyzed 573 posts from the three main votes on paid editing conducted in the community in November 2013. She found that editors who opposed the ban felt that Wikipedia's existing policies on neutrality and notability already covered the issues raised by paid advocacy editing, and that a fair and accurate encyclopedia article could be achieved by addressing the quality of the edits, not the people contributing the content. She also found that a significant challenge to any future policy is that the community 'is still not clear about what constitutes paid editing'.

Osman uses these results to argue that the values of the English-language Wikipedia editorial community have shifted: from seeing commercial involvement as being in direct opposition to Wikipedia's core values (a view repeated at the institutional level by the Wikimedia Foundation and Jimmy Wales, who see a bright line between paid and unpaid editing) to accepting paid professionals and being resigned to their presence.

Osman argues that the romantic view of Wikipedia as a system somehow apart from the commercial market that characterized earlier depictions (such as those by Yochai Benkler) has been diluted in recent years and that sustainability in the current environment is linked to a platform's ability to integrate content across multiple places and spaces on the web. Osman also argues that these shifts reflect wider changes in assumptions about commerciality in digital media and that the boundaries between commercial and non-profit in the context of peer production are sometimes fuzzy, overlapping and not clearly defined.

Osman's close analysis of 573 posts is a valuable contribution to the ongoing policy debate about the role of paid editing in Wikipedia and will hopefully be used to inform future debates.

"Pivot-based multilingual dictionary building using Wiktionary"

Reviewed by Maximilian Klein (talk)
Straight edges represent translation pairs extracted directly from the Wiktionaries. The pair guild–breasla was found via triangulating.

Building multilingual dictionaries to and from every language is, combinatorially, a lot of work. With triangulation (if A means B, and B means C, then A means C; see figure), much of that work can be done by machine. A large closed-source effort did this in 2009[supp 1], but a new paper by Ács[2] argues that "while our methods are inferior in data size, the dictionaries are available on our website"[supp 2]. The approach used the translation tables from 53 Wiktionaries to infer 19 million translations in addition to the 4 million already occurring in Wiktionary. The researchers avoided several classical problems, such as polysemy (one word having multiple meanings), by using a machine learning classifier. The features used in the classifier were based on the graph-theoretic attributes of each possible word pair. For instance, the existence of two or more languages that can serve as an intermediate "pivot" language for a translation turned out to be a good indicator of a valid match. To test the precision of these translations, manual spot checks were done; they found a precision of 47.9% for newly found word pairs versus 88.4% for random translations taken from Wiktionary. As for recall, measured as coverage of a collection of 3,500 common words, 83.7% of words were accounted for by automatic triangulation in the top 40 languages. That means that if we tried right now to make a 40-language pocket phrasebook for travelling around most of the world using only Wiktionary, a translation would exist about 85% of the time, and it would be correct between 50% and 85% of the time.
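As a rough illustration of the triangulation idea, the following minimal Python sketch infers candidate pairs from known translation pairs and records which pivot languages support each candidate. It is not the paper's implementation: the function name, the data layout, and the example edges are assumptions made up for illustration, and the simple "two or more pivot languages" filter stands in for just one of the graph features that the paper feeds into its classifier.

```python
from collections import defaultdict
from itertools import combinations


def triangulate(pairs):
    """Infer candidate translations via shared pivot words.

    `pairs` holds ((lang, word), (lang, word)) tuples, treated as
    undirected edges. Returns {frozenset({a, b}): pivot_languages} for
    word pairs in different languages that are not linked directly but
    share at least one neighbour (the pivot).
    """
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    candidates = defaultdict(set)
    for pivot, linked in neighbours.items():
        for a, b in combinations(sorted(linked), 2):
            if a[0] == b[0]:        # same language: not a translation pair
                continue
            if b in neighbours[a]:  # already a known direct translation
                continue
            candidates[frozenset((a, b))].add(pivot[0])
    return dict(candidates)


# Illustrative edges only (assumed for this sketch, not taken from the paper).
known_pairs = [
    (("en", "guild"), ("hu", "céh")),
    (("hu", "céh"), ("ro", "breasla")),
    (("en", "guild"), ("de", "Zunft")),
    (("de", "Zunft"), ("ro", "breasla")),
]

candidates = triangulate(known_pairs)

# Keep only candidates supported by two or more pivot languages.
strong = {pair: pivots for pair, pivots in candidates.items() if len(pivots) >= 2}
for pair, pivots in strong.items():
    print(sorted(pair), "via pivots", sorted(pivots))
```

A real pipeline would still need a classifier or manual checking on top of this, since polysemous pivot words produce false candidates.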

This performance would likely need to improve before any results could be operationalized and contributed back into Wiktionary. However, given that the code used to parse and compare 43 different Wiktionaries was also released on GitHub[supp 3], that goal is a possibility. It's yet another testament to the open ecosystem to see a Wikimedia project along with Open Researcher efforts make a resource to rival a closed standard. While Ács' research isn't the holy grail of translation between arbitrary languages, it cleverly mixes established theory and open data, and then contributes it back to the community.

"Cross Language Learning from Bots and Users to detect Vandalism on Wikipedia"

Reviewed by Han-Teng Liao (talk)

A new study[3] by Tran and Christen is the latest example of academic research on vandalism detection, which has developed over the years[supp 4] in the context of the PAN workshop[supp 5], where researchers develop both corpus data and tools to uncover plagiarism, authorship, and the misuse of social media/software. This work should be of interest to both researchers and Wikipedians because of (a) the need to detect vandalism and (b) the interesting question of whether such vandalism-fighting data and tools are transferable from one language version to another. The vandalism-fighting corpus and tools have both practical and theoretical implications for understanding cross-lingual transfer in knowledge and bots.

In 2010 and 2011, PAN included Wikipedia vandalism detection competitions as workshops. This started with Martin Potthast's work on building the free-of-charge PAN Wikipedia vandalism corpus for research, PAN-WVC-10, which compiled 32,452 edits on 28,468 Wikipedia articles, among which 2,391 vandalism instances were identified by human coders recruited from Amazon's Mechanical Turk[supp 6]. In 2011, a larger crowdsourced corpus of 30,000+ Wikipedia edits was released in three languages (English, German, and Spanish)[supp 7], with 65 features to capture vandalism.

Based on even larger datasets of over 500 million revisions across five languages (en:English, de:German, es:Spanish, fr:French, and ru:Russian), Tran & Christen's latest work adds to the efforts by applying several supervised machine learning algorithms from the Scikit-learn toolkit[supp 8], including Decision Tree (DT), Random Forest (RF), Gradient Tree Boosting (GTB), Stochastic Gradient Descent (SGD) and Nearest Neighbour (NN).

What Tran & Christen confirm from their findings is that "distinguishing the vandalism identified by bots and users show statistically significant differences in recognizing vandalism identified by users across languages, but there are no differences in recognizing the vandalism identified by bots" (p. 13). This suggests that human beings can recognize a much wider spectrum of vandalism than bots, although bots are shown to be trainable to capture more and more non-obvious cases of vandalism.

Tran & Christen further try to make the case for the benefits of cross-language learning of vandalism. They argue that the detection models are generalizable, based on the positive results of transferring the machine-learned capacity from English to other, smaller Wikipedia language versions. While they are optimistic, they acknowledge that such generalization has at best been shown for some of the languages they studied (all of which use the Roman alphabet except Russian), and they note the poor performance of the Russian-language model. Thus, Tran & Christen rightly point out the need for research on non-English and especially non-European language versions. They also recognize that many word-based features are no longer useful for some languages such as Mandarin Chinese, because of tokenization and other language-specific issues.
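The cross-language setup can be pictured as training a classifier on labelled edits from one language version and applying it unchanged to another. The sketch below is only a toy under stated assumptions: the two features and all numbers are invented, and a hand-written one-nearest-neighbour rule replaces the scikit-learn classifiers used in the study so that the example runs with the Python standard library alone.

```python
import math


def nearest_neighbour_predict(train, query):
    """Return the label of the training example closest to `query` (1-NN)."""
    _, label = min(train, key=lambda example: math.dist(example[0], query))
    return label


# Hypothetical language-independent features per edit (assumed for this sketch):
# (fraction of upper-case characters added, fraction of article text removed)
english_train = [
    ((0.02, 0.00), "regular"),
    ((0.05, 0.01), "regular"),
    ((0.90, 0.00), "vandalism"),  # text added in capitals
    ((0.03, 0.95), "vandalism"),  # page blanking
]

# Labelled edits from a second language version, used only for evaluation.
german_test = [
    ((0.04, 0.02), "regular"),
    ((0.85, 0.05), "vandalism"),
    ((0.01, 0.90), "vandalism"),
]

correct = sum(
    nearest_neighbour_predict(english_train, features) == label
    for features, label in german_test
)
accuracy = correct / len(german_test)
print(f"accuracy of the English-trained model on German edits: {accuracy:.2f}")
```

Features that depend on word lists or tokenization would not carry over this easily, which is the limitation the authors raise for languages such as Mandarin Chinese.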

Tran & Christen call for future research projects to include languages such as Arabic and Mandarin Chinese to complete the United Nations working set of languages. It will be interesting to see how such research projects can be executed and how the greater Wikipedia research and editor community can help and/or use such research efforts.

Readers' interests differ from editors' preferences

Reviewed by Piotr Konieczny.

A conference paper titled "Reader Preferences and Behavior on Wikipedia"[4] deals with the under-studied population of Wikipedia readers. The paper provides a useful literature review of the few studies about the reading preferences of that group. The researchers used publicly available page view data and, more interestingly, were able to obtain browsing data (such as time spent by a reader on a given page). Since such data is unfortunately not collected by Wikipedia, the researchers obtained this data through volunteers using a Yahoo! toolbar. The authors used Wikipedia:Assessment classes to gauge article quality.

The paper offers valuable findings, including important insights for the Wikipedia community, namely that "the most read articles do not necessarily correspond to those frequently edited, suggesting some degree of non-alignment between user reading preferences and author editing preference". This finding should not come as much of a surprise, considering for example the high percentage of quality military history articles produced by WikiProject Military History, one of the most active WikiProjects in existence, if not the most active, and how little importance this topic has for the general population. Statistics on topic popularity and the quality of the corresponding articles can be seen in Table 1, page 3 of the article. Figure 1 on page 4 is also of interest, presenting a matrix of articles grouped by popularity and length. For example, the authors identify the area of "technology" as the 4th most popular, but the quality of its articles lags behind many other fields, placing it around 9th place. It would be a worthwhile exercise for the Wikipedia community to identify popular articles that are in need of more attention (through revitalizing tools like Wikipedia:Popular pages, perhaps using code that makes WikiProject popular pages listing work?) and direct more attention towards what our readers want to read about (rather than what we want to write about). Finally, the authors also identify different reading patterns, and suggest how those can be used to analyze article popularity in more detail.
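One simple way to look for topics that are read more than they are maintained is to compare a popularity ranking with a quality ranking. The Python sketch below does this for a handful of topics; the view counts, the numeric quality scores (imagined as averages of assessment classes mapped to numbers), and the function name are all invented for illustration and are not taken from the paper.

```python
def attention_gaps(topics):
    """Return quality rank minus popularity rank for each topic.

    Rank 1 is best in both orderings, so a positive gap means the topic
    ranks better on popularity than on quality.
    """
    by_views = sorted(topics, key=lambda t: topics[t]["views"], reverse=True)
    by_quality = sorted(topics, key=lambda t: topics[t]["quality"], reverse=True)
    popularity_rank = {topic: i + 1 for i, topic in enumerate(by_views)}
    quality_rank = {topic: i + 1 for i, topic in enumerate(by_quality)}
    return {t: quality_rank[t] - popularity_rank[t] for t in topics}


# Made-up numbers: monthly views (thousands) and a mean quality score.
topics = {
    "entertainment": {"views": 900, "quality": 2.1},
    "technology": {"views": 700, "quality": 1.5},
    "science": {"views": 500, "quality": 2.8},
    "military history": {"views": 200, "quality": 3.4},
}

gaps = attention_gaps(topics)
for topic, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{topic}: gap {gap:+d}")
```

With real page view and assessment data, the topics with the largest positive gaps would be the first candidates for extra editorial attention.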

Overall, this article seems like a very valuable piece of research for the Wikipedia community and the WMF, and it underscores why we should reconsider collecting more data on our readers' behavior. In order to serve our readers as best as we can, more information on their browsing habits on Wikipedia could help to produce more valuable research like this project.

Wikipedia from the perspective of PR and marketing

Reviewed by Piotr Konieczny.

An article[5] in "Business Horizons", written in very friendly prose (not a common finding among academic works), looks at Wikipedia (as well as some other forms of collaborative, Web 2.0 media) from the business perspective of public relations and marketing studies. Of particular interest to the Wikipedia community is the authors' goal of presenting "the three bases of getting your entry into Wikipedia, as well as a set of guidelines that help manage the potential Wikipedia crisis that might happen one day." The authors correctly recognize that Wikipedia has policies that all contributors must adhere to, though a weakness of the paper is that while it discusses Wikipedia concepts such as neutrality, notability, verifiability, and conflict of interest, it does not link to them. The paper provides practical advice on how to get one's business entry on Wikipedia, or how to improve it. While the paper does not suggest anything outright unethical, it is frank to the point of raising some eyebrows. Nobody can disagree with advice such as "as a rule of thumb, try to remain as objective and neutral as possible" and "when in doubt, check with others on the talk page to determine whether proposed changes are appropriate". However, given the lack of consensus among Wikipedia's community on how to deal with for-profit and PR editors, other advice might be seen as more controversial, falling into the gaming-the-system gray area: "maximize mentions in other Wikipedia entries" (i.e. gaming WP:RED), "be associated with serious contributors...leverage the reputation of an employee who is already a highly active contributor... [befriend Wikipedians in real life]", "When correcting negative information is not possible, try counterbalancing it by adding more positive elements about your firm, as long as the facts are interesting and verifiable", and "...you might edit the negative section by replacing numerals (99) with words (ninety-nine), since this is also less likely to be read. Add pictures to draw focus away from the negative content". The "Third, get help from friends and family" section in particular seems to fall foul of meatpuppetry.

In the end, this is an article worth reading in detail by all interested in the PR/COI topics, though for better or worse, the fact that it is closed access will likely reduce its impact significantly. On an ending note, one of the article's two co-authors has a page on Wikipedia at Andreas Kaplan, which was restored by a newbie editor in 2012, two years after its deletion, and has been maintained by throw-away SPAs; this reviewer cannot help but notice that it still seems to fail Wikipedia:Notability (academics)...

"No praise without effort: experimental evidence on how rewards affect Wikipedia's contributor community"

Reviewed by Piotr Konieczny.

In 2012, the authors of this paper[6] gave out over a hundred barnstars to the top 1% most active Wikipedians, and concluded that such awards improve editors' productivity. This time they repeated the experiment while broadening their sample to the top 10% most active editors. After excluding administrators and recently inactive editors, they handed out 300 barnstars "with a generic positive text that expressed community appreciation for their contributions", divided between the 91st–95th, 96th–99th, and 100th percentiles of the most active editors (corresponding to averages of 22, 62 and 282 edits per month, respectively), and then tracked the activity of those editors, as well as of a corresponding control sample which did not receive any award. The experiment was designed to test the hypothesis that less active contributors will be responsive to rewards, similar to the most highly active contributors from the prior research.

The authors found, however, that rewarding less productive editors did not stimulate higher subsequent productivity. They note that while the top 1% group responded to an award with an increase in productivity (measured at a rather high 60% increase), less productive subjects did not change their behavior significantly. The researchers also noted that while some of the top 1% editors received an additional award from other Wikipedians, not a single subject from the less active group was a recipient of another award.

The researchers conclude that "this supports the notion that peer production’s incentive structure is broadly meritocratic; we did not observe contributors receiving praise or recognition without having first demonstrated significant and substantial effort." While this will come as little surprise to the Wikipedia community, their other observation - that outside the top 1% of editors, awards such as barnstars have little meaningful impact - is more interesting.

Further, the authors found that while rewarding the most active editors tends to increase their retention ratio, it may counter-intuitively decrease the retention ratio of the less active editors. The authors propose the following explanation: "Premature recognition of their work may convey a different meaning to these contributors; instead of signaling recognition and status in the eyes of the community, these individuals may perceive being rewarded as a signal that their contributions are sufficient, for the time being, or come to expect being rewarded for their contributions." They suggest that this could be better understood through future research. For the community in general, it raises an interesting question: how should we recognize less active editors, to make sure that thanking them will not be taken as "you did enough, now you can leave"?

Briefly

  • Wikipedia assignments improve students' research skills: It is refreshing to see a continuing and growing stream of academic works endorsing various aspects of the teaching-with-Wikipedia paradigm. A study[7] of eleven students "enrolled in a semester-long academic literacy course in a preparatory program for study at an Australian university... showed an educationally statistical improvement in the students’ research skills, while qualitative comments revealed that despite some technical difficulties in using the Wikipedia site, many students valued the opportunity to write for a ‘real’ audience and not just for a lecturer."
  • A split in the growing field of Chinese-language Wikipedia research: A blog post[8] by Han-Teng Liao (廖漢騰) presents an interesting exploratory overview of Chinese-language research on Wikipedia. The findings suggest that Chinese-language scholars and academic publication outlets are increasingly doing research in the field of Wikipedia studies; however, there's "a divide between mainland Chinese academic sources/search results on one hand, and Hong Kong/Taiwanese ones on the other." The reason for this seems to be primarily technical, as scholars from different regions seem to publish in different outlets, which in turn are not indexed in the academic search engines preferred by those from other regions.

Other recent publications

A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.

  • "Uneven Openness: Barriers to MENA [Middle East/North Africa] Representation on Wikipedia"[9] (blog post)
  • "Detecting epidemics using Wikipedia article views: A demonstration of feasibility with language as location proxy"[10]
  • "The Reasons of People Continue Editing Wikipedia Content - Task Value Confirmation Perspective"[11]
  • "Circling the Infinite Loop, One Edit at a Time: Seriality in Wikipedia and the Encyclopedic Urge"[12]
  • "Identifying Duplicate and Contradictory Information in Wikipedia"[13]
  • "The impact of elite vs. non-elite contributor groups in online social production communities: The case of Wikipedia"[14]
  • "What do we Think an Encyclopaedia is?"[15] From the abstract: "Based on survey and interview research carried out with publishers, librarians and higher education students, [this article] demonstrates that certain physical features and qualities are associated with the encyclopaedia and continue to be valued by them. Having identified these qualities, the article then explores whether they apply to three incidences of electronic encyclopaedias, Britannica Online, The Stanford Encyclopedia of Philosophy and Wikipedia."
  • "Crowdsourcing Knowledge Interdiscursive Flows from Wikipedia into Scholarly Research"[16]. From the abstract: "using a dataset collected from the Scopus research database, which is processed with a combination of bibliometric techniques and qualitative analysis [this article finds] that there has been a significant increase in the use of Wikipedia as a reference within all areas of science and scholarship. Wikipedia is used to a larger extent within areas like Computer Science, Mathematics, Social Sciences and Arts and Humanities, than in Natural Sciences, Medicine and Psychology."
  • "How Readers Shape the Content of an Encyclopedia: A Case Study Comparing the German Meyers Konversationslexikon (1885-1890) with Wikipedia (2002-2013)"[17]

References

  1. ^ Osman, Kim (2014-06-17). "The Free Encyclopaedia that Anyone can Edit: The Shifting Values of Wikipedia Editors". Culture Unbound: Journal of Current Cultural Research. 6 (3): 593–607. doi:10.3384/cu.2000.1525.146593.
  2. ^ Ács, Judit (May 26–31, 2014). "Pivot-based multilingual dictionary building using Wiktionary" (PDF).
  3. ^ Tran, Khoi-Nguyen; Christen, P. (2014). "Cross Language Learning from Bots and Users to detect Vandalism on Wikipedia". IEEE Transactions on Knowledge and Data Engineering. 27 (3): 673–685. doi:10.1109/TKDE.2014.2339844.
  4. ^ Reader Preferences and Behavior on Wikipedia. HT’14, September 1–4, 2014, Santiago, Chile. http://www.dcs.gla.ac.uk/~mounia/Papers/wiki.pdf
  5. ^ Kaplan, Andreas; Haenlein, Michael (2014). "Collaborative projects (social media application): About Wikipedia, the free encyclopedia". Business Horizons. 57 (5): 617–626. doi:10.1016/j.bushor.2014.05.004. (closed access)
  6. ^ Restivo, Michael; van de Rijt, Arnout (2014). "No praise without effort: experimental evidence on how rewards affect Wikipedia's contributor community". Information, Communication & Society. 17 (4): 451–462. doi:10.1080/1369118X.2014.888459.
  7. ^ Miller, Julia (2014-06-13). "Building academic literacy and research skills by contributing to Wikipedia: A case study at an Australian university". Journal of Academic Language and Learning. 8 (2): A72–A86.
  8. ^ Liao, Han-Teng (2014-06-20). "Chinese-language literature about Wikipedia: a meta-analysis of academic search engine result pages".
  9. ^ Graham, Mark; Hogan, Bernie (2014-04-29). "Uneven Openness: Barriers to MENA Representation on Wikipedia". SSRN 2430912.
  10. ^ Generous, Nicholas; Fairchild, Geoffrey; Deshpande, Alina; Del Valle, Sara Y.; Priedhorsky, Reid. "Detecting epidemics using Wikipedia article views: A demonstration of feasibility with language as location proxy". arXiv:1405.3612v1 [cs.SI].
    Revised and published as Generous, Nicholas; Fairchild, Geoffrey; Deshpande, Alina; Del Valle, Sara Y.; Priedhorsky, Reid (2014). "Global Disease Monitoring and Forecasting with Wikipedia". PLOS Computational Biology. 10 (11): e1003892. arXiv:1405.3612. Bibcode:2014PLSCB..10E3892G. doi:10.1371/journal.pcbi.1003892. PMC 4231164. PMID 25392913.
  11. ^ Lai, Cheng-Yu; Heng-Li Yang (2014). "The Reasons of People Continue Editing Wikipedia Content - Task Value Confirmation Perspective". Behaviour & Information Technology. 33 (12): 1371–1382. doi:10.1080/0144929X.2014.929744.
  12. ^ Salor, E.: Circling the Infinite Loop, One Edit at a Time: Seriality in Wikipedia and the Encyclopedic Urge. In Allen, R. and van den Berg, T. (eds.) Serialization in Popular Culture. London: Routledge p.170 ff.
  13. ^ Weissman, Sarah; Ayhan, Samet; Bradley, Joshua; Lin, Jimmy (2014-06-04). "Identifying Duplicate and Contradictory Information in Wikipedia". arXiv:1406.1143 [cs.IR].
  14. ^ Mihai Grigore, Bernadetta Tarigan, Juliana Sutanto and Chris Dellarocas: "The impact of elite vs. non-elite contributor groups in online social production communities: The case of Wikipedia". SCECR 2014 PDF
  15. ^ Schopflin, Katharine (2014-06-17). "What do we Think an Encyclopaedia is?". Culture Unbound: Journal of Current Cultural Research. 6 (3): 483–503. doi:10.3384/cu.2000.1525.146483.
  16. ^ Lindgren, Simon (2014-06-17). "Crowdsourcing Knowledge Interdiscursive Flows from Wikipedia into Scholarly Research". Culture Unbound: Journal of Current Cultural Research. 6 (3): 609–627. doi:10.3384/cu.2000.1525.146609.
  17. ^ Spree, Ulrike (2014-06-17). "How Readers Shape the Content of an Encyclopedia: A Case Study Comparing the German Meyers Konversationslexikon (1885-1890) with Wikipedia (2002-2013)". Culture Unbound: Journal of Current Cultural Research. 6 (3): 569–591. doi:10.3384/cu.2000.1525.146569.
Supplementary references and notes:
  1. ^ Mausam; Soderland, Stephen; Etzioni, Oren; Weld, Daniel S.; Skinner, Michael; Bilmes, Jeff (2009). Compiling a Massive, Multilingual Dictionary via Probabilistic Inference. pp. 262–270. ISBN 978-1-932432-45-9.
  2. ^ "Hungarian Front Page".
  3. ^ "wiki2dict github". GitHub.
  4. ^ For example, in 2013 only two languages were studied [1], in contrast to the five languages reported in this 2014 journal article.
  5. ^ http://pan.webis.de/
  6. ^ See [2]
  7. ^ See [3]
  8. ^ scikit-learn is an open-source Python project for machine learning.