(S3 E4) The open knowledge revolution: contributing to the global commons with Wikimedian Dr Martin Poulter
Episode 4 • 29th March 2023 • Research Culture Uncovered • Research Culturosity, University of Leeds


Shownotes

In our weekly Research Culture Uncovered conversations we are asking what is Research Culture and why does it matter? This episode is part of Season 3, hosted by Nick Sheppard who will be speaking to colleagues from both the University of Leeds and from other universities and organizations about open research, what it is, how it's practiced in different disciplines, and how it relates to research culture. In this episode Nick is joined by Dr Martin Poulter from the University of Bristol where he is website editor and technical developer for the School of Economics, Finance and Management.

Martin is former Wikimedian in Residence at the Bodleian Library, University of Oxford, and in addition to his day job at Bristol, works with academics, librarians, and other experts with Wikimedia to help institutions get the maximum benefit in terms of reach, engaging the crowd and creating educational resources. He is currently Wikimedian for the Khalili Collections, a set of private art collections including the world's largest collection of Islamic art.

With Martin's help we have been exploring the use of Wikimedia platforms at the University of Leeds, including a Wikimedia Champions project you can hear all about in next week's instalment.

In this episode we talk about:

  • how his academic background in Philosophy and Psychology led him to become a Wikimedian
  • how Wikimedia is more than just Wikipedia, and comprises 16 different platforms including Wikimedia Commons, Wikidata and Wikisource
  • opportunities for education and research for universities, including sharing openly licensed research and student assignments
  • barriers to engaging with the Wikimedia platforms as an institution and where to go for support
  • some of the misconceptions about how Wikipedia actually works
  • the issue of bias in Wikipedia, how it reflects a primarily Western white male cultural outlook, and the importance of diversifying Wikipedia through initiatives like Women in Red and translation projects

Be sure to check out the other episodes in this season!

Links:

Follow us on twitter: @ResDevLeeds, @OpenResLeeds, @ResCultureLeeds

If you would like to contribute to a podcast episode get in touch: academicdev@leeds.ac.uk

Transcripts

[:

[00:00:24] Nick Sheppard: Hi, it's Nick, and for those who don't know me, I'm Open Research Advisor based in the library here at the University of Leeds. You're joining us in season three of the Research Culture Uncovered podcast where we'll be speaking to colleagues from both the University of Leeds and from other universities and organizations about open research, what it is, how it's practiced in different disciplines, and how it relates to research culture.

s from the REDS Conference of:

But now I'd like to introduce my guest for today, Dr. Martin Poulter from the University of Bristol, where he is website editor and technical developer for the School of Economics, Finance and Management. He is former Wikimedian in Residence at the Bodleian Library, University of Oxford, and in addition to his day job at Bristol works with academics, librarians, and other experts with Wikimedia to - I'm quoting your website, uh, Infobomb here Martin, "demystify geek stuff and help institutions get the maximum benefit in terms of reach, engaging the crowd and creating educational resources".

So, hello Martin and welcome to the podcast.

[:

[00:01:39] Nick Sheppard: Now, as you were just saying, your affiliations seem quite complex, you've got quite...how many jobs have you actually got, or how many people do you work for?

[:

[00:02:42] Nick Sheppard: So I mean, we'll get onto some of the Wikimedia stuff, um, in a while...we've worked with you in the library here at Leeds, um, quite a lot. Thank you for your, you know...and I've certainly learned a lot from you.

[:

[00:03:14] Nick Sheppard: But I suppose I'm just interested before we get into sort of Wikimedia, and you've already mentioned Wikidata et cetera, what's your, um, sort of academic background? Philosophy, I think, is that right?

[:

[00:05:05] Nick Sheppard: Right and, you, you're taking credit for that through your Wikipedia article?

[:

[00:05:50] Nick Sheppard: Yeah, I mean that, that's fascinating because, I believe...I don't think you've told me that story before, but you organized a Wikipedia science conference in 2015, I think, for which you were awarded Wikimedian of the year? I suppose I'm interested because, I mean, the received wisdom, and perhaps people listening would think, well, Wikipedia is unreliable, isn't it? Anyone can edit it, so, you know, why should we engage with it from a research perspective? And you've obviously given a bit of background to your interest, but so, you know, can we trust Wikipedia more than perhaps the received wisdom would suggest?

[:

So it actually being bad and incomplete and unreliable is the opportunity. If it was perfect and finished, there'd be no point engaging with it. Um, and it's kind of joining up, not just joining up what we're doing in universities or cultural institutions with Wikipedia...joining up across different research outputs, or different cultural institutions. That's kind of a theme, uh, yeah, kind of a theme of my career in that some of my work is with educational objects and some is with culture, and some is with opening up science. So I wanted to create this conference to showcase the ways science can learn from Wikipedia and, uh, examples of projects that are really using it: there's a protein families database and an RNA families database, and I think there's Gene Wiki. And these are standard databases used by molecular biologists that use Wikidata and Wikipedia as a platform, so they're actually sharing their content on Wikipedia, harvesting it back, people can change it, and in principle, they can monitor for changes. They can monitor for vandalism, but they don't get vandalism. Vandals and trolls don't want to put out misinformation about proteins. That doesn't excite them, they want to, they want to put something in a celebrity's biography. Uh, so actually letting people edit these crucial databases has led to improvements. It's not led to vandalism, it's led to people who otherwise wouldn't have contributed having this opportunity to write something, and then...but the changes can be checked by experts before they're imported into the formally published version.

[:

[00:09:48] Martin Poulter: Yeah, so there's two aspects to that. It's not just Wikipedia, as you say, um, there's a family of projects. So Wikipedia is where you have narrative articles about topics. Um, Wikidata is where you have secondary data, so you have like the findings of research expressed in a database. Um, so, whereas you might have some text saying the Taj Mahal was built by Shah Jahan, like, a computer doesn't naturally know how to parse that text, but you could put that relationship in as a, um, a line in a database, and Wikidata has billions of those. Um, there's Commons, which is the digital media archive, so this has tens of millions of freely reusable photos, diagrams, maps, uh, images, video clips. Wikisource I'm a big fan of. So this is where people are transcribing out of source texts, and these might be poems or political documents or reference works, um, or novels. Um, so that's kind of filling in gaps in knowledge and, uh, it doesn't take too much, it's not so demanding because you're not thinking, what's the proper way to phrase this, or what's the proper way to summarize knowledge about this. You've basically got a scan of an old book and you've got electronic text and you're fixing the electronic text so it's, it's an authentic transcript of that book.

And so there's loads of ways if you are a kind of an open knowledge activist, if you're someone who wants to share. I've seen the term knowledge philanthropist. So a knowledge philanthropist is someone who learns and shares their learning so other people can benefit. So I'm studying psychology and I write a Wikipedia article about what I'm studying. That's kind of knowledge philanthropy. But I read a book and I check it in Wikisource so that other people can read that book and they can get it in an e-book format. That's another kind of knowledge philanthropy. So there's different activities we can do and some of it, some of us get very, um, activist about it. I was, uh, sitting in a conference with, uh, Mike Peel, a friend who's very much a Wikipedian, and he's a very dedicated photographer, and while the speaker was setting up, Mike looked up the speaker in Wikipedia and he didn't have a photo in his article. So Mike brings out his camera, cos he is a dedicated photographer, he has really good equipment, photographs the speaker who's setting up, transfers it from his camera to his laptop, uploads to Wikipedia and embeds the image in the Wikipedia article in time for the guy to start speaking. And in this time, I'd opened my laptop, I'd opened my notes, but I thought, oh, right, there are Wikipedia activists more hardcore than me. Uh, but yeah, it kind of transforms how you look at...how I look at my books. I go back to my books. Oh, this has got interesting stuff about the Nobel Prize in, could I put this in the Nobel Prize article? So you kind of dig through stuff you have access to. Uh, somebody showed me like a book with an old portrait painting scanned in it and, oh, this is really old, so it's out of copyright, so I could scan this and put this in Commons and make this available for educational content or Wikipedia articles and so on.

[:

[00:14:11] Martin Poulter: Yeah. I'm really glad about what Leeds is doing, uh, and the different ways you are kind of interfacing with the Wikimedia projects. And I think just about everything we do in universities can connect in some way...so it might be, you've got the example of making a map or a diagram, or publishing a paper which has figures in it, maps and diagrams, which could be suitably licensed and reused and used in the Wikipedia article or other educational materials, just made freely available.

It relates to education, it relates to kind of, uh, data science and data visualization. There's projects that can be done, um, and I think they're building skills in the relevant staff and students. They're giving another layer of exposure to the work we're already doing, and, uh, yeah, connect... there's a really important point about creating content, not platforms, that I think I've learned from my career, and I think a lot of people are getting the same message, that in my 25 years working in universities, there's been lots of projects which have been funded, and they're to create educational material or they're to create reference material or to create some research resources for a particular subject. And they get a server and they get a brand name and there could be a name for the project and it's probably gonna be named after an animal or something. And loads and loads of these spring up, and there's a loss because, uh, the funding runs out and the server isn't updated or maintained, or eventually somebody switches it off, or maybe this is an academic's project, to create this stuff, and that academic retires or goes to another university or goes to live in the woods. Uh, and there's this loss of these things that there was a big song and dance about and taxpayers' money was put into, but they don't survive till now, and so one element is things weren't remixable, they weren't created with...how will people adapt these in the future, how will future generations adapt them and keep them going. Um, uh, and can we, having created content, put it on some platform which links it up, like with a web of other knowledge? How can we add to the existing web of interconnected knowledge about the world, about history, about culture?

[:

[00:16:45] Martin Poulter: Which is...well, all of these projects but, yeah, Wikidata is the one that's truly a web of interconnected knowledge because it's multilingual. So Wikipedia has 300 different language versions, which each have their own rules and their own differences in coverage and so on. Wikidata is one site with all of these different language communities working on it, and with that, it can answer questions about, where is a particular kind of thing? Like where, where are all the objects in British museums related to Shah Jahan? Where are all the paintings that are self-portraits by women? Where are the poems by 18th century people in French and so on? So these are all things that you could go to a hundred different sites and search for...or, or you could just Google search and it would give you the thing you're looking for in amongst millions of other things that you're not looking for. Uh, so with Wikidata you can ask "show me an image gallery of self-portraits by women" and get a growing set of results for that. You type that in as a search into Google or onto different museum websites, you're not guaranteed to get anything because of the way websites work, or a search engine. So we should be think...anything we produce, um, special collections in the library or a research project, we can be thinking about how does this join up? How do we describe what we've published or what else we've created? Data sets, um, images and so on. How do we not just put it online, cause it's not enough for it to be online now, it's gotta be findable in the places where people are looking for it. How can we put this, uh, in a web where it connects to what the rest of the world are doing and, and, is findable where people are looking?
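
[Editor's illustration] The idea Martin describes, facts stored as individual database lines that can then be combined to answer a question such as "self-portraits by women", can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how Wikidata is actually built or queried: the `Statement` type, the `match` helper, the plain-English property names, and the placeholder items "Painting A", "Artist X" and so on are all hypothetical. Only the Taj Mahal statement comes from the conversation.

```python
from typing import List, NamedTuple, Optional


class Statement(NamedTuple):
    """One 'line in a database': a subject, a property, and a value."""
    subject: str
    prop: str
    value: str


# Hypothetical sample data. Only the first statement comes from the
# conversation; the paintings and artists are placeholders.
STATEMENTS = [
    Statement("Taj Mahal", "built by", "Shah Jahan"),
    Statement("Painting A", "genre", "self-portrait"),
    Statement("Painting A", "creator", "Artist X"),
    Statement("Artist X", "gender", "female"),
    Statement("Painting B", "genre", "self-portrait"),
    Statement("Painting B", "creator", "Artist Y"),
    Statement("Artist Y", "gender", "male"),
]


def match(statements: List[Statement],
          subject: Optional[str] = None,
          prop: Optional[str] = None,
          value: Optional[str] = None) -> List[Statement]:
    """Return statements matching every field that is not None."""
    return [
        s for s in statements
        if (subject is None or s.subject == subject)
        and (prop is None or s.prop == prop)
        and (value is None or s.value == value)
    ]


def self_portraits_by_women(statements: List[Statement]) -> List[str]:
    """Combine three kinds of statement to answer one question."""
    women = {s.subject for s in match(statements, prop="gender", value="female")}
    portraits = {s.subject for s in match(statements, prop="genre", value="self-portrait")}
    return sorted(
        s.subject
        for s in match(statements, prop="creator")
        if s.subject in portraits and s.value in women
    )


if __name__ == "__main__":
    print(match(STATEMENTS, subject="Taj Mahal"))
    print(self_portraits_by_women(STATEMENTS))
```

The point of the sketch is that no single statement says "self-portrait by a woman"; the answer comes from joining separate statements, which is why describing material as structured data makes it findable in ways a page of text is not.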

[:

[00:19:14] Martin Poulter: Yeah, there is this huge problem of kind of a bias of emphasis and it's not even like a bias in the individual articles, it's what the articles are about. So I did a research project about the coverage of visual art in Wikipedia, and, uh, there's so much coverage of, uh, say John Constable or John Ruskin or the sort of...say figures from British history who are quite minor in the global sense. And then there'll be people from Chinese art, Persian art, African art who are really celebrated in their cultures, but there's almost nothing about them on Wikipedia. So the content of Wikipedia really reflects a, yeah, primarily North European, white, male sort of Western culture. Um, that's partly the outcome of scholarly publishing in general, that...the sources that we use are published by, uh, publishers in the north of Europe, where so much scholarly research is done. So there's, um, the sources available, but there's also the interests of the volunteer contributors and what they're interested in writing about and what their hobbies are.

So improving Wikipedia is often about diversifying Wikipedia, and that's a lot of the theme of what I've been doing. At Oxford, it was Women in Science editathons or editathons, uh, yeah, to put more women into Wikipedia...presently working on putting like Islamic art and Japanese art into Commons and Wikipedia. Um, and yeah, reaching out to more people beyond those who have the, um, yeah, the privilege. It does come from privilege, uh, being able to...having the head space and spare time and access to sources to be able to improve Wikipedia.

[:

[00:21:31] Martin Poulter: They're great.

[:

[00:21:44] Martin Poulter: Yeah. So Wikimedia needs translators. It needs Wikipedia articles to be translated, labels in Wikidata to be translated, image captions in Commons, and they're using that, and other universities worldwide are using this, as an opportunity. Again, it's incomplete, that's the opportunity. If you want, um, samples of business Chinese or a Chinese article about a scientific topic, well then you get that from Chinese Wikipedia and have a student on a translation course translate that. And, and you've got a big choice of, um, source and target languages and, um, and different topics. So, um, uh, they have, yeah, focused events on particular topics, editathons, uh, translation...

[:

[00:22:52] Martin Poulter: And the witches project is just a great example. Again, I was talking about deadweight loss, and another reason for the loss of resources that lots of effort's gone into is technology. And there's so many databases that have been created that...this is a database in Access, and Access isn't really a thing anymore. And I've certainly known of educational materials created in Flash, and Flash isn't a thing anymore. So there's a huge opportunity to find this stuff that's in outdated formats and convert it over to open formats, standards... and Wikidata and...they're kind of a standard, um, and there's no kind of intellectual property restrictions on them, and so they're open legally and technically, so that, um, that gives longevity to that content, whatever it is, a data set or a set of photographs or whatever.

[:

[00:24:23] Martin Poulter: Yeah, there's barriers on kind of the Wikimedia side and there's barriers that are kind of legal and organisational within a university. So wikis are a technology from 1997, and you go on Wikipedia and it's kind of visible that it is a very old, in internet terms, technology that hasn't updated much, and the usability is admittedly terrible. And uh, maybe this is why a lot of academics think you don't get attribution for writing on Wikipedia. Actually, you do, there's a couple of clicks you can do and it'll show you all of the usernames, and it actually has what's called micro attribution in it. An individual letter or punctuation mark, you can track back and find who added that. So it's great for kind of attributing shared authorship. Uh, so there's lots kind of under the bonnet...people don't realise you can get an article and you can find out so much about it like, when was this written? By whom was this written? Uh, what's the quality rating? People are often unaware there's actually a quality scale on Wikipedia, the Wikipedians care...

[:

[00:25:48] Martin Poulter: Yeah, the first, um, the first half decade of Wikipedia's existence, it was all about 'create as many articles as possible'. And this was the time when it started appearing in search results a lot, but had this terrible reputation for hoaxes and so on, and a terrible reputation for unreliability, but since then there was a transition, yeah, kind of 2005, 2006 I think, like when I started to get involved, to focus on quality, and there is a quality scale. Uh, there's a lot of this informal quality review, but there are formal review processes where you can put something up and someone who hasn't been involved in writing the article but is interested in the subject will check it against a lot of criteria. So it's a very open review process. It's not like submitting to a journal and hearing months later, or an anonymous, uh, reviewer's opinion. It's actually a dialogue and it takes place in public, it takes place on the talk page about the article. In getting the featured article badge on what I'd written, I had to go through three review processes and I think there were 10 reviewers involved in that, so that's really demanding. That subjected every sentence I'd written to some examination and critique and that...is this the best way to word this? Does it exactly match, you know, what is the fact you're stating here, the fact that's in the source that you are coming from? So I found it more demanding than publishing in peer reviewed journals in my experience. I mean, it's a different kind of thing because it's not reviewing the quality of the research, that's already been done when the research is published in a peer reviewed journal, so on Wikipedia we're reviewing the quality of the writing and the style of the writing. And is this a good, fair, comprehensive summary?

Uh, so there's a lot of review going on, and Wikipedians are really quite scathing about the content of Wikipedia. So there's only half a percent of Wikipedia articles in English Wikipedia that have one of these formal quality badges, and the great majority have stub quality or start quality, which are the two lowest on this, I think, seven point scale. So you might look up a Wikipedia article, think, oh, this is really not very good, there's not much content here, it's not very systematic, it's not very well written. If you look at the talk page, you'll probably find Wikipedians agreeing with that and saying, yeah, this needs more...or there may be a tag on it, like, this needs more references or this needs more diverse points of view.

Um, so that is...that's a barrier to researchers and lecturers engaging: the perception that it's gonna...it's like the wild west. You throw it in, but then trolls and vandals are making their edits as well, and who knows what way it'll end up. Whereas it's more like an evolutionary system. It's...there's changes made, but it's preserving the changes that are made that are beneficial. And then the other kind of barrier to sharing, to improving, is the barrier that we have in our own institutions, which is maybe the desire for control and the desire for like full copyright and let's put a license on this which says that no one can alter it. And, um, I mean this is an issue for libraries and museums in particular, that they often have income generation targets and they have income generation from image licensing. So even though they're kind of public institutions preserving culture for future generations, they've got to, to reserve or, or reserve the uses of some things, and, uh, yeah, they're reluctant to share, and people are concerned about, yeah, what I've written or the images I've made being taken by someone else and used for some other purpose and maybe altered. That does scare people. But that's where the advantage comes in, that you can't know all of the useful uses to which your content will be put. Somebody making a figure for a research paper can't anticipate all the ways that could be used to illustrate other research or in educational materials about the topic or illustrating a dictionary or something about the topic. So, um, we have to like, feel the fear, but still upload the stuff, and just see, it's like, I'll give this to you, I'm keen to see what you'll do with it, I don't know what you will. And, and there are images that are uploaded that people think...maybe there's several different uses for, in Wikipedia, that get hundreds of uses. So, Van Gogh's Starry Night, that illustrates the article about Starry Night, it illustrates the article about Van Gogh and maybe the Van Gogh Museum, but also illustrates lots of mental health articles because of the state Van Gogh was in when he painted it.

And also...so I'm finding what I'm doing, I'm uploading images and they're used in ways or they're used in articles I never would've thought of. They're used in languages I haven't heard of. So there's an Indonesian language, Bahasa Indonesia, that I didn't know about the existence of until I saw one of my articles being translated into it. And I could look at this and see kind of the sentence structure and how it was definitely based on my English text. Um, and if I'd been trying to do that with money, get my article translated into Indonesian as well as French and German, and so on, I don't know how I would do that, but that just organically happened.

[:

[00:32:46] Martin Poulter: Yeah, how the wider public, how prospective students see your subject, um, will be shaped by this because it's so convenient and it's free, and maybe there's something much, much better and written entirely by experts, properly reviewed, but it might be behind a paywall or on an obscure site or not joined up. It really helps that Wikipedia is encyclopedic in that there's not a Wikipedia of sport and Wikipedia of art and Wikipedia of science, that it's all one thing. So, everyone I talk to has this experience where they're looking up something, maybe they're looking up a TV show they're watching, but the Wikipedia article has all of these links to other related things. So I've started reading about this, um, thing on TV, but now I'm reading about the speed of light controversy and now I'm...how have I got here? You've got 30 tabs open with different topics, and you're not gonna do that with kind of a PDF that's been published properly in a journal by a research group. There's not so many ways into it. And uh, so there's a lot to be said for, yes, stumbling upon, stumbling upon content like a Wikipedia article, stumbling upon uses for things.

Like I said, finding a book with a portrait painting in it, and oh, well that could go into the Wikipedia article of the guy in the portrait.

[:

[00:35:06] Martin Poulter: Yeah, and that was something you provide...you did the research for, you provided the meat, and I did some kind of stylistic and formatting...

[:

[00:35:28] Martin Poulter: But my point is you don't have to get the style exactly right for conformity with the huge, expansive manual of style that Wikipedia runs on, and you don't have to get all of the code right to format it correctly. Um, if you get just the basic text right and cited, other people will come along and fix things. So it solves the division of labor problem really well, and if you think, if we were to do this by a traditional publishing process, uh, oh, so someone would have to start with a draft of the text and then someone else would have to kind of copyedit that and then someone else proofreads that. And we'd have to like move drafts between us in email or something. And in Wikipedia things happen, but just in whatever order. So just put in an article and, uh, yeah, the categorisation or the way a table's formatted isn't proper, but someone else, there's some pedant out there who likes fixing exactly that problem or likes formatting citations the exactly right way. And they will go through a bunch of articles and fix that aspect of articles. Um, and so people who are, who have a subject focus, like you had, you had interest in this particular bit of research, and other people whose interest is in kind of the page layout or particular grammar things, we all work on making that one article better in a short space of time. Yeah.

[:

[00:37:35] Martin Poulter: There's not many, um, there are kind of independent consultants like me, and I'm, like I said, someone who'd be brought in, different universities have brought me in for a day to run an event and talk to different people in the institution and talk about what they can do with Wikimedia. It's Wikimedia UK, which is the charity, which is a good first point of contact, and they have contacts with lots of lecturers who are leading courses with some kind of Wikipedia aspect. They have contacts with researchers, they have contacts with cultural institutions, and, um, they know the field, they know who is there and who has skills and, um, yeah, can, can introduce you to...or point you in the direction of who might have the skills to help with a particular problem. But yeah, it is a market. Yeah. The Radio Times...other Wikimedians are available.

[:

[00:38:58] Martin Poulter: There's a load to be said for skill sharing days, and another university I visited, they had lots of people in different parts of the organization who had some kind of Wikimedia interest or interest in improving Wikipedia with their work. But they were in different roles, they were lecturers and librarians and so on, but they weren't connected up in any way. So just having people meeting and then sharing what they're doing and then having me come along and give a training workshop that particular...things that happen in other universities or what they can do, that was an advantageous thing. So it's kind of a grassroots...it's usually the grassroots effort to bring people together, share what you can do, and then sometimes it's top down as well, and you get somebody high up the organization backing it, and then you can have like a formal, um, some formal involvement like a Wikimedian in Residence with the university.

And then once you have that, there's loads more that can happen because that person can help out a lot of different projects and parts of the university.

[:

But I guess, you know, thinking beyond, I mean, we hope to continue working with you, you know, on some of the projects that we've already started, but what's the big picture, I suppose, you know, where we are in 2022, you know, and the Global Commons. What's the Wikimedia community hoping to achieve in the future? Is it more of the same? Are there any big initiatives on the cards or...?

[:

[00:42:56] Nick Sheppard: And is it, uh, I suppose just to, to finish on the question of sustainability, I mean, obviously I think everybody, including Jimmy Wales himself, was surprised at the success of Wikipedia, and it's how many years old now?

[:

[00:43:14] Nick Sheppard: But it is a charity. It's, it's supported through...as a, as a charity I think?

[:

[00:44:42] Nick Sheppard: Well, that's a good note to finish on. I mean, it truly is the Global Commons and, you know, obviously thanks for your time and we would, um, encourage people to contribute to it. I mean, there's no reason, if people haven't done so, they could just sign up for a Wikipedia account, can't they, and start editing?

[:

[00:45:14] Nick Sheppard: Or drop us a line on this podcast. So thank you very much Martin, thanks for your time. Really interesting to go in a bit of a deeper dive with you with some of the stuff around Wikipedia and Wikimedia and we'll speak again. Thanks very much.

[:

Email us at academicdev@leeds.ac.uk. Thanks for listening, and here's to you and your research culture.
