Marcus Bingenheimer: AI and Total Translation
Episode 5 • 1st August 2025 • The Ho Center for Buddhist Studies at Stanford

Shownotes

Marcus Bingenheimer talks about why new tools in the Digital Humanities demand new genres of scholarship, what network analysis reveals about the transmission of religious ideas in medieval China, and how AI’s large language models will help arcane texts reach a new global readership.

Marcus Bingenheimer is Associate Professor in the Department of Religion at Temple University in Philadelphia. He taught Buddhism and Digital Humanities in Taiwan at Dharma Drum (2005 to 2011) and held visiting positions and fellowships at universities in Korea, Japan, Thailand, Singapore, and France. Since 2001 he has supervised various projects concerning the digitization of Buddhist culture. His main research interests are the history and historiography of Buddhism, early sūtra literature, and how to apply computational approaches to research in the Humanities. He has published some sixty peer-reviewed articles and a handful of books.

Interview by Miles Osgood.

Transcripts

MILES OSGOOD:

Welcome to "The Ho Center for Buddhist Studies at Stanford" podcast. Come join us by the tree.

00:17:

If Buddhist Studies is a field that takes a unique interest in the workings of the mind, through the study of Abhidharma texts and meditation practices, and one that prizes the work of the philologist, who can understand the subtleties of ancient languages, what should we make of intelligence that is "artificial" and language models that are "large"? Are AIs and LLMs simply new tools for transcription and translation, like other writing technologies that Buddhist scribes and scholars have adopted in the past?

MARCUS BINGENHEIMER:

It seems to me that Buddhists have always been early adopters of information technologies, right? In India, the earliest Indian epigraphy mentions Buddhism and Buddhist texts, the "Aśokan Inscriptions." The earliest Indian fragments of written material are Buddhist texts, and then the "Diamond Sūtra" is the first dated printed book from the ninth century—and now in the British Library.

MILES OSGOOD:

Another possibility: is the AI chatbot instead a kind of interlocutor for the researcher, a software update on the ancient prophets?

MARCUS BINGENHEIMER:

They're a vast improvement over the old oracles. I mean, the old oracles: you had to take some tokens from a feature space, and then you have these gnomic instructions. Or you have people in trance who sing some songs. And now you have to do a lot of hermeneutical reading into these gnomic messages. And now you get much easier answers from these machines.

MILES OSGOOD:

Today: peering into the future of Buddhist Studies and the Digital Humanities. I'm your host, Miles Osgood, recording from The Ho Center for Buddhist Studies at Stanford.

02:02:

My guest today is Marcus Bingenheimer, Associate Professor in the Department of Religion at Temple University in Philadelphia. Professor Bingenheimer received his DPhil from Würzburg University in his native Germany and an MA in Communications from Nagoya University in Japan before teaching Buddhism and Digital Humanities at Dharma Drum in Taiwan for six years. With other academic appointments around the world—in Korea, Japan, Thailand, France, and Singapore—it is fitting that Professor Bingenheimer joined us at Stanford to talk about mapping Buddhism's geographies and networks.

02:36:

This is ordinarily the point where I dive into the articles and monographs that our guest has published, and there are some great books to mention, including "Island of Guanyin: Mount Putuo and its Gazetteers" from Oxford University Press in 2016, "Studies in Āgama Literature" from the Dharma Drum Buddhist College Series in 2006, and an array of Chinese Buddhist translation projects.

02:58:

But insofar as Professor Bingenheimer is an authority in the Digital Humanities and the digitization of Buddhist culture, there is also an entire second library of scholarship that we need to attend to: one that is virtual, online, and interactive. This side of Bingenheimer's scholarship includes searchable maps of the Chinese Buddhist temples that have spread across Southeast Asia, digital editions of medieval Chinese texts and multi-layered visualizations of Buddhist biographies. You can interact with these tools and databases yourself at mbingenheimer.net.

03:33:

There's one other piece of Marcus's recent work that I should share with you, and that's a slide deck. Over the last few years with the rise of ChatGPT and other generative large language models, Marcus has been presenting to fellow scholars on possible outcomes of the AI revolution in Buddhist studies. And before our interview, he shared those slides with me. In the next 25 years or so, Marcus predicts, "All ancient texts for which there is enough training data can and will be translated."

04:03:

This is what Marcus calls a future of "total translation," where the most obscure texts in the most obscure languages will become available to readers all around the world: not just translated into English or Chinese, but in hundreds of living languages too.

04:18:

Marcus has noticed three kinds of reactions to this prediction, two of which are skeptical, and the last one, a bit more sanguine. The first and most common is the defensive response on the part of Buddhist scholars. Something along the lines of, "There goes my superpower." The languages that they trained for decades to read will now no longer stand after all as the same barrier to others. But Marcus argues knowledge of those languages will still matter. Scholars will shift from translators to evaluators as curators of the new digital corpus, or as Marcus puts it, as "gleaners and cleaners."

04:51:

A second, more general criticism questions the unregulated acceleration of AI more fundamentally, asking why the current control of this technology seems to serve the greed of corporations or the interest of surveillance states, stripping down existing intellectual property and limited energy resources for private or political gain. And that is a more substantial worry. As you'll see, Marcus adds other concerns of his own. Nevertheless, he insists that humanities scholars, and especially younger scholars starting out, can't merely close their eyes and ears and ignore a technology that is already here.

05:23:

Finally, there is a rare third more optimistic view. For scholars devoted to Buddhist ideas, there can be something exciting about a new age of scriptural circulation. As Marcus puts it, "The 'buddhavacana' is going to be available in all languages in different registers. Everybody can now read and listen to the Dharma in her own language."

05:46:

So with that, let's turn to the interview and you can decide for yourself where you land.

05:50:

(bell dinging)

05:55:

Well, thank you very much, Marcus, for joining us on this episode.

MARCUS BINGENHEIMER:

Thanks for having me.

MILES OSGOOD:

Yeah. I thought I'd start by going back a little bit to some of your more traditional research, your monograph, "Island of Guanyin," because I think one thing we're going to talk about today is the shape that research and scholarship takes and the structure that you give to a reader or an audience in different forms. And so when I was looking back at your work, I was really struck that in this book you're also thinking consciously and carefully about how to give form to a book that is itself about a very particular form of an object that you're studying: these gazettes in Guanyin. So could you tell us a little bit about that project, about how you came to the idea of how to go about your own research and your own articulation of what you were finding in this archive?

MARCUS BINGENHEIMER:

Yeah, sure. So these gazetteers—local chronicle collections of texts on a particular location—they are very important in Chinese historiography and they're a bit of a weird format. They're a little bit like a "TAR" file in computing and the old Unix systems, where you combine different files into one file. So actually it's not a genre in itself. It's more of a container format for other formats, for other genres like poetry or annals or lists of places or lists of biographies, lists of people. So when I understood I wanted to write about gazetteers, I thought, "How do I explain the structure to readers?" And the best way to do it I felt was to write a book, which in a way recreates the experience of the traditional readers of gazetteers where they would have parts on poetry and then there would be the biographies and then there would be illustrations and so on.
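To make the "TAR" comparison concrete, here is a minimal sketch in Python of a container that bundles several unrelated genres into one file. The section names and their contents are invented placeholders, not taken from any actual gazetteer; only the standard-library `tarfile` module is assumed.

```python
import io
import tarfile

# Hypothetical section names and contents, invented purely for illustration.
SECTIONS = {
    "prefaces.txt": "placeholder preface",
    "maps.txt": "placeholder map description",
    "biographies.txt": "placeholder biography",
    "poetry.txt": "placeholder poem",
}

def pack(sections):
    """Bundle every section into a single in-memory TAR archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, text in sections.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # the header must carry the member's size
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()

def list_members(blob):
    """List the members of the archive without reading their contents."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as archive:
        return archive.getnames()

def read_member(blob, name):
    """Read one member directly, ignoring the others."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as archive:
        return archive.extractfile(name).read().decode("utf-8")

blob = pack(SECTIONS)
print(list_members(blob))
print(read_member(blob, "poetry.txt"))
```

The archive itself has no genre: it only records which members it holds, and a reader can go straight to the poetry without passing through the prefaces or the maps.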

MILES OSGOOD:

Did you end up learning something about the logic of the gazette in the process of kind of structuring your own table of contents that way? Because I would imagine it might be frustrating at times to say, "Oh, I want to say this thing about the poems, but they come at the end of the gazette, so they're going to come at the end of my book, and the reader won't know that they're there until they arrive." Does that then tell you something like, "Oh, well that must be the experience of people in—"

MARCUS BINGENHEIMER:

Yeah, no, the order is flexible, right? I mean these gazetteers are not meant to be read in order in general. I mean there is a sort of logic in the sense that the first couple of chapters will have the maps or so—but or... and some introductory... there will be prefaces and there might be some text by the emperor or so. But generally whether the biographies come first or the poetry comes first, that is negotiable.

MILES OSGOOD:

I see.

MARCUS BINGENHEIMER:

And what I've learned, when I compared a lot of different gazetteers, was about the average size of these blocks... So for example, poetry and biographies are a relatively large part of these gazetteers, more so than I had expected. And then topographical descriptions are comparatively shorter, more contained. So these are these minimal descriptions of places for people to look around. For us today they're very interesting because you can extract all kinds of data. So when you datify that, then you have lots of data points with very short descriptions, which is good for digitization really, for digitization purposes.

MILES OSGOOD:

Yeah, it's interesting the number of ways that now you're conceiving of this project—and maybe you conceived of this project at the time—as having these kind of digital analogues, as it were: that you're not only thinking about it in terms of a kind of compiled file structure, but also that you're thinking, "Well actually the way that readers of the scholarship are going to approach this is they are going to skip around from chapter to chapter according to their preferences," which of course now I'm realizing is how people read monographs anyway. So why not lean into that? Why not give it a form of organization that makes that easy? Is that something that then kind of informed your work moving into the Digital Humanities? Because I think this just lends an opening to me to think about form in that space, right, how you do scholarship in a novel way such that it can be interactive, maybe up to the reader or viewer in certain ways...

MARCUS BINGENHEIMER:

Yeah. The monograph is a bit of a dinosaur, really, right? I mean, it's a genre that is from an age of information scarcity in a way. And it's a difficult form, because I think it was actually meant to be read from cover to cover, right? From what you said, you read it like I do it. I read the introduction, then I read the chapters I'm interested in and not necessarily in order, and then perhaps I read the conclusion, but very few. I mean they have to be really right on the thing I'm just researching for me to read a monograph from cover to cover. But this is how they were designed originally. And in a way for the humanities, I see the humanities struggling, to move beyond the monograph and find a way to... It's still the coin of the realm, right? You have to publish a monograph in order to get tenure, get hired, get promoted, and so on—at least in some fields.

11:26:

And I think we see sort of, this does not mesh very well with the way people will want to read these days, where we generally read for information, because you don't need to give the whole background story of a certain topic anymore within a chapter. You don't need to fill in people on everything like you used to have in the 19th century when you didn't have Wikipedia, right? You couldn't look things up. So it's really... Now when I tend to consume research and output research, I try to focus on the question, "Okay, so this is where we are. Here's my research question. Now what have we found out? What new things have we found out about this thing?" And this goes a little bit against the grain of publishing in the humanities where often overviews are given over things that are known already to contextualize the new.

MILES OSGOOD:

Yeah. Specifically within Buddhist Studies, what are the potential opportunities, challenges of doing digital work? I think about the prologue that you and your team wrote for the "Sutra2DNA" project, which I thought was... I mean just generally the research project was so fun and fascinating, but specifically the way that you framed it with this sort of self-aware irony, that there was something funny about taking an object, the "Diamond Sūtra," that had this kind of long history such that we think of it perhaps even as the first book, and then thinking, "Okay, can we reestablish that kind of permanence in some new form of encoding in particular by encoding it into DNA, something that seems like it could be reliably replicable and preservable?" and yet to think about the fact that the "Diamond Sūtra" itself is telling you about how all concepts and phenomena are momentary, not to be relied upon, the concept of "apratiṣṭha." There's a kind of fun irony there. Is there something about the content of Buddhist Studies that feels like it offers particular modes of reflection on this moment of changing scholarship?

MARCUS BINGENHEIMER:

It seems to me that Buddhists have always been early adopters of information technologies, right? In India, the earliest Indian epigraphy mentions Buddhism and Buddhist texts, the "Aśokan Inscriptions," the earliest Indian fragments of written material are Buddhist texts. And then as you said, the "Diamond Sūtra" is the first dated printed book from the ninth century—and now in the British Library. And also within the humanities, when I remember in the 90s and early 2000s, it was also Buddhists who in East Asia for example, were among the first who digitized large corpora of texts. So they were very early on concerned with the task of say, "Okay, so we have our Buddhist canon, so how do we get this in the computer? And how do..." At that time it was very difficult because East Asian, Chinese character encoding was a difficult problem until Unicode then came out as the solution for it.

14:49:

But in the 80s and 90s before Unicode was widely used, this was tricky, but people tried it and there were various communities who digitized the Buddhist canon as soon as they could basically. And so within Buddhism, I think this is just a big great openness. And also traditionally we're not bound... Like for example in Islam, if I understand this right, it's really important to have the "Quran," the sacred book in the original, in Arabic. And I think for Buddhism in the Buddhist tradition, this is not a major concern. You can translate, you can add to the canon, right? The canon is fairly open in various ways, at least the Tibetan and the Chinese canon. And yeah, so this is an easy in for Buddhism.

MILES OSGOOD:

Yeah, that's great. And so in your work in particular, you've done a lot with Buddhist biographies, right? Like one of the things you are speaking about at Stanford has to do with what we can learn about social networks of particular Buddhist monks and the like. And it seems like you're pushing there the opportunities of what different kind of media or different kind of layers of visualization of textual biographical material could offer us. Could you tell us about how you went about that project, what you saw as being your motivation for finding these different digital forms and maybe what the precedent was in that particular thread?

MARCUS BINGENHEIMER:

Yeah, I'm interested in computational methods, right? I try to find out what kind of questions can we ask now that much of our textual data is digitized, what kind of questions can we now ask and answer that we couldn't before? Network analysis, it's a field of information science that is very widely applicable. So it's used from physics to chemistry to biology and math and graph theory or so—graph theory is an underlying paradigm, which is very important.

17:15:

And then you basically try to find how can these formal networks—where I just try to abstract from reality and say all people are nodes and when they meet and they have a connection between them—if I model my reality in that way, then is there anything I see when I visualize it or is there any metric, network analysis metric that I can calculate, which then tells me something about the reality which is recognizable to me first? I mean to check whether that actually works, right? I mean, does it actually look like I expect it to look? And then when everything sort of looks like I think it looks, the question is then what new is there? What's the thing that I haven't seen before, I didn't know before, which now I can see and wasn't there before if you talk about visualization, and that's the challenge.

MILES OSGOOD:

Yeah, could we take those two principles kind of one by one? So first, this is kind of an early digital humanities dictum, right? That's sort of like, "Yes, maybe sometimes the digital analysis of a text or of a network shows us something that we kind of already knew intuitively about a history or about an author or what have you, but now we truly know it," would be the argument. Were there moments like that with regard to this corpus?

MARCUS BINGENHEIMER:

Yes, for sure. I mean, if it doesn't look like anything you recognize, then something must be wrong, right? Something must have gone wrong. You must have made a mistake in digitization or in just in the visualization process or whatever. I mean it must reflect... We're not going to totally rewrite the history of Buddhist, of Chinese Buddhism, right, just because you have data abstracted from texts on which these previous histories are being based. So we're still using the same kind of text. We just add a layer of abstraction to it and then we can do new inquiries and then these new inquiries sometimes turn up interesting things. I mean, for example, in the case of Chinese Buddhism, we knew that Dao'an, Huiyuan, and Kumārajīva were important players in their time, but it was not clear to me how absolutely important they were in the social transmission of Chinese Buddhism.

19:29:

The network view turned out to really strongly emphasize, to depict, how little before this triangle actually connects to the later parts of the network without going through the triangle. So basically everything before 400 or so somehow has to move through Dao'an, Huiyuan, Kumārajīva, and their students to then sort of, as an information flow, to flow into the fifth century then. And that was something which I didn't expect in the same way before I saw it. And then there are other things like for example, in network analysis, then you can ask questions. Okay, you have these three important players, tell me the people who knew all three of them, right? And that's a question that if you are in a computational environment, you can ask that question and answer it in five minutes. And if you want to do this in a text, if you want to, you have to read all these biographies and then you have to make notes and it takes you three months, right?
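The "who knew all three of them" question can be sketched in a few lines of Python once people are modeled as nodes and meetings as edges. This is only an illustration under stated assumptions: the three hub names come from the interview, but "Monk A", "Monk B", "Monk C", and every edge in the list are invented toy data, not drawn from the actual biographical dataset.

```python
from collections import defaultdict

# Hypothetical toy data: each pair means "these two people are recorded as having met".
# The placeholder monks and all edges are invented for illustration.
EDGES = [
    ("Dao'an", "Huiyuan"),
    ("Dao'an", "Monk A"),
    ("Huiyuan", "Monk A"),
    ("Kumārajīva", "Monk A"),
    ("Kumārajīva", "Monk B"),
    ("Huiyuan", "Monk B"),
    ("Dao'an", "Monk C"),
]

def build_graph(edges):
    """Build an undirected graph as a mapping from person to set of contacts."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def knew_all(graph, people):
    """Return everyone who is directly connected to every person in `people`."""
    neighbor_sets = [graph[person] for person in people]
    common = set.intersection(*neighbor_sets)
    return common - set(people)  # the hubs themselves are not part of the answer

graph = build_graph(EDGES)
hubs = ["Dao'an", "Huiyuan", "Kumārajīva"]
print(sorted(knew_all(graph, hubs)))  # prints ['Monk A']
```

The query is a set intersection over three neighbor lists, which is why it takes minutes on digitized data and months on paper.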

MILES OSGOOD:

Yeah.

MARCUS BINGENHEIMER:

So there are questions which you can ask quickly which are meaningful to telling the story about what happened at that time, which datafication and particular toolsets of computational methods then afford you.

MILES OSGOOD:

That's a sort of natural segue for me to ask a little bit about... Yeah, a number of your projects have had to do with translation, and I think as you look a little bit at AI and give presentations on the potential of this new technology, it seems like this is one of the major areas of interest, that this field has always cared a great deal about producing and analyzing translations of various kinds. It seems like a real opportunity now to be able to go into more languages to digitize and render accessible more texts and to do so through large language model AIs. Right. And it sounds like when you've been taking this idea around and trying to get people excited about the prospects of it, you sometimes encounter some resistance, right? I mean, this is a new technology, it's hard for folks to understand.

21:36:

In some cases people feel like we are handing over the keys to a particular castle that we have guarded as scholars, in terms of having the authority and expertise to be the translators. What kinds of concerns there do you take seriously? Which ones do you think are misguided? How do you address folks who are nervous or otherwise maybe resistant to where scholarship might be going with AI?

MARCUS BINGENHEIMER:

It's interesting to see. So for Buddhist studies, the first thing that comes to mind, basically in every article I wrote, in every book I wrote, basically there's always a piece of translation in there, something or the other has been translated and I quote something and I translate it. And now machines can translate much faster than I can and often they're accurate to a surprisingly high degree. So it's quite extraordinary how in the last two, three years these machines have been evolved, have been trained, to work with even low-resource languages like Chinese and Sanskrit and Pali and so on.

22:49:

So this is quite extraordinary and we did not expect this 10 years ago. So here we are, and now the question is, "What are we going to do with this?" And you can say, "Oh, I don't care. I run out the clock." But if you are young, like say less than 60 or so, then you can't really sort of say, "I'm never going to use LLMs," right? It's not happening because everybody, all the students are using it. And you have to make yourself at home in a world where LLMs are part of the knowledge landscape.

MILES OSGOOD:

Yeah. That makes sense. So when it comes to translation in particular, I was looking at examples of maybe how you would translate a 12th-century classical Chinese text or how they've been typically translated, and then what the LLM, what ChatGPT in particular gives you. I thought this was really fun as an overture to thinking about what might be possible with translation, but I also wonder if it raises new concerns about reliability and trust. So one example was... So I won't read the Chinese, but I'll read your translation first, which was—and I'm picking a row that felt particularly about Buddhist philosophy here... Your translation had said, "We worry about so many things, but when the last hour comes, all will be lost and gone. Even our own body turns to waste, not to mention everything else." And then the GPT version was, "All our concerns are rooted in our attachment to material things. However, when our time comes to an end, we must leave everything behind, including our bodies. All these possessions that we worked so hard to acquire are ultimately just temporary and fleeting."

24:19:

And you underlined that first line, "All our concerns are rooted in our attachment to material things." And I was pondering that and of course I'm curious as to what you think about it. It felt a little like, there the GPT was almost not just doing a pure translation but was also drawing on a commentarial tradition that followed afterward by having a phrase like "attachment to material things," where your translation was much more neutral than that and wasn't assuming a whole phraseology of Buddhist Studies. Is that what's happening there? Is it because it's amalgamating different things or eating its own tail on other things it has said about Buddhism?

MARCUS BINGENHEIMER:

Yeah. So one thing is that it's really too early to say, so these translations are still getting better. Right. We haven't fully plateaued out. We are plateauing now, but we haven't plateaued out, so we don't really know where we end up with in that space. It's clear... One of the things about... One of the sort of "Translation Theory 101" or so, what you learn, is the indeterminability of translation. So there are an unfixed number, an undetermined number of possible correct translations of any sentence. So there is not "the" correct translation. And then there are other mistaken translations, things which are just wrong. And then there are things which people who know both languages can both be happy with, but then some people will have preferences for one or the other. So this I think will be replicated with these machines very easily. You can also ask an LLM to give you three different translations, right, and then you choose one of them or, so it's just what is impressive is the speed of it and the variability, right?

26:10:

You can sort of say, "Okay, translate it like in an 18th century British idiom," or so, right? Or, "Translate it in a way that it is easy to understand for a 14-year-old," or so. So you can ask it to produce translations on a much broader spectrum than any individual translator, human translator could do it. The real issue is accuracy and how much you want to believe it and where the liability lies. So people who work on machine translation for modern texts, they say, "Well, there are, like, there are legal contracts which have to be translated, right, medical issues, diagnosis," and so on. And there you run on... So this is how these translations are used in the social environment. Who is going to sign off on them?

27:08:

And for Buddhist Studies, and for scholarship I think in general, there is a similar problem. So who assumes responsibility for that translation, right? I don't read Tibetan, for example, if I ask it to translate a Tibetan text for me, which someone has identified as useful, or if I see in a machine-translated translation a passage which I think I could use in my article or so, then what do I do with it, right? Do I just say, "Okay, I don't know, Claude translated that for me and I don't actually know whether it's right or not." I mean, I can't: this doesn't sound right. Right. I have to somehow find a way to at least ask another specialist to check that or so and make sure that it's accurate. Whether it's stylistically nice or not is another issue, but I must somehow guarantee the accuracy of what I put into my scholarship.

28:02:

So this is the part that is important when we talk about machine translation. Machine translation is basically there and it has a certain degree of accuracy like every human translator, right? There's no perfect human translator either. So the question is now how do we evaluate the output of different machines, for example? That's a really difficult question because... And machine translation, the field of computer science that does these machine translation studies: that's a vast subfield of machine translation, is evaluation.

28:35:

So there are all kinds of interesting metrics which are really very ingeniously crafted, that have evolved over the last 20, 25 years or so where you try to... Using, sometimes using human reference translation to check whether the output is—how accurate it is, how good it is. And that's really... that's a field where I think we as Buddhist studies people can also come in because for that you need this domain knowledge. Otherwise, you can't contribute much to... I mean, we're not going to contribute to the development of LLMs, right, in Buddhist studies. We are not trained for that. This is not what we do in graduate school. But every field of scholarship I think can see how to evaluate what machines do on their domain and how we can encourage it or nudge it to help us producing new knowledge, which is what we are supposed to do, right?
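As a rough illustration of what a reference-based metric does, the sketch below scores a machine translation by how many words it shares with a human reference, using the two renderings quoted earlier in this conversation. It is a deliberately crude word-overlap F1 chosen for brevity, not one of the established metrics (such as BLEU or chrF) that the evaluation literature actually uses; the function names are hypothetical.

```python
import re
from collections import Counter

def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def overlap_f1(candidate, reference):
    """Word-overlap F1 between a candidate translation and a human reference."""
    cand = Counter(tokens(candidate))
    ref = Counter(tokens(reference))
    overlap = sum((cand & ref).values())  # shared words, counted with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# The two translations quoted earlier in the interview.
human = ("We worry about so many things, but when the last hour comes, all will "
         "be lost and gone. Even our own body turns to waste, not to mention "
         "everything else.")
machine = ("All our concerns are rooted in our attachment to material things. "
           "However, when our time comes to an end, we must leave everything "
           "behind, including our bodies. All these possessions that we worked "
           "so hard to acquire are ultimately just temporary and fleeting.")

print(round(overlap_f1(machine, human), 2))
```

A score like this cannot tell whether "attachment to material things" is a defensible reading or an intrusion from later commentary, which is exactly where domain knowledge has to come in.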

MILES OSGOOD:

Yeah, because I was going to say, it sounds like maybe for a new generation of scholars there was going to be a period of transition of moving from the expectation of "I am the translator" to "I am merely the editor." But it sounds like you're arguing there's more continuity than that, that we have always been evaluators of certain kinds of texts, which requires a certain kind of conceptual knowledge of what the language means and what precise words ought to be rendered as. And actually, yes, certainly some work is now shifting from the human to the machine, but nevertheless, this kind of scholarly knowledge that allows us to select or edit the correct words is one that's always been there and it's just being applied in a different way. Is that right?

MARCUS BINGENHEIMER:

Yes. I think of it very much as a continuum of interests. I mean, what we get through using LLMs to translate is—and then also it's not only about translations, it's also about summary, right? So I can ask, what is this unknown text in Sanskrit about? Tell me in five minutes, right? And then you can decide whether you want to zoom in and then say, "Okay, give me an English translation." And then you have, "Okay, this is the interesting part for me, this is where it talks about what I'm researching." And then I say, "Okay, now show me the Sanskrit."

30:58:

And then I'll see, "Okay, does this English and the Sanskrit... How does that work, right? Is this actually translated accurately?" This kind of back and forth, the dialogue between these models and our data is I think where we will be. This is I think how it's going to look like in the next 10, 20 years or so, that you keep asking these language models to show you what is already known, and then you try to push into areas where, yeah, they can't help you because they're not trained for that and they don't have the agency and the volition to sort of... the curiosity to do anything about it, right?

MILES OSGOOD:

Yeah.

MARCUS BINGENHEIMER:

So far we are providing the curiosity, right, and they provide knowledge to us.

MILES OSGOOD:

But it's like working with an all-powerful librarian who can take you to the one bit of microfilm that you need or to the bookshelf that you need and show you at different levels of scope what might be useful to you. And then you still have to do the work of the analysis between the lines, the translation of the particular word, what have you.

MARCUS BINGENHEIMER:

So far. Yeah. Yeah. I mean, I often find myself surprised over the last year of how much knowledge there is encoded in these machines. Right. So they do sound a little bit, I mean, they're a vast improvement over the old oracles. I mean the old oracles, you had to take some tokens from a feature space, and then you have these gnomic instructions, and so. Or you have people in trance who sing some songs or so. And now you have to do a lot of hermeneutical reading into these gnomic messages. And now you get much easier answers from these machines. So I think we have much improved over divination now.

MILES OSGOOD:

(laughing) I really like that analogy. Well, so I've talked in various ways about things that might provoke resistance, or ways we might be skeptical, or ways we might need to build trust in these machines, but there's an optimism to the presentations that you've given. And one way that you phrase that is the potential of "total translation." Can you say a little bit about that phrase and what you see as the possible future of the accessibility of Buddhist texts using these tools?

MARCUS BINGENHEIMER:

So when I started to go into Buddhist studies, the landscape was that you had to learn a lot of ancient difficult languages and then you were up against a vast ocean of untranslated texts. There were a certain number of texts which were translated, and a very few of them translated many times. But very generally, you were lucky if you had one translation and then much of the interesting stuff was not translated at all. And that will be probably gone within the next 10 years or so. You can always generate, you can quickly generate on the fly an okay translation of an ancient Sanskrit or Chinese text or so, as long as you have it in digital format. I mean things might take longer for Tangut or so. I mean, there are, like, Tocharian or Khotanese. So there are super small niche languages which haven't been fully digitized yet, for which there might not be a full Unicode encoding. Tangut has one, but there are a few languages for which there is no proper encoding. And of course they then can't be part of the training data for these machines, and so they can't deal with them very well.

34:46:

But everything... I mean, talking about the larger languages that we are likely to encounter during our research, we will be able to have very decent translations on the fly from different machines and you can compare them. And if you get really, really excited about a part, a passage, then you have to ascertain it somehow. You have to corroborate it by either consulting a specialist or by learning the language yourself. And that is "total translation."

35:21:

So basically all the literary textual heritage of humanity is available in basically all languages. I mean, all of the large languages, like 99% of humans speak right now. I mean, of course I know there are 7,000 languages and some of them have 200 speakers. And it's important to preserve them and it's important to create spaces for those low-resource languages. It's very important. And actually the LLMs... I have seen papers now where people try to use LLMs to, say, model certain dialects which are on the verge of extinction of a language. And so there is indeed a hope that you will have virtual speakers of languages which have died out because they have been captured as an LLM in the 21st century...

MILES OSGOOD:

Yeah.

MARCUS BINGENHEIMER:

... but no human speaks that anymore. So this is a bit like the recordings that were made in the early 20th century of Indian languages or so, for which we now have no speakers anymore. So this is something that is a part of the research, but "total translation," what I want to raise awareness of is that... Yeah, it used to be that if you couldn't read this book and there was no translation, you just couldn't read it. But now you can take your phone and run it over it and it appears in the language of your liking. And the great thing also is that we get away from an Anglocentric understanding of translation. English for 200 years or 150 years perhaps was the language into which everything was translated first and then from there into Arabic and Hindi and German or whatever.

37:20:

And now you can actually directly go from one to the other, meaning you can, if you are a 14-year-old girl in Kenya, you can read the "Diamond Sūtra" in your own language of today on your phone if you choose to do so. And that is a big change globally. I think that people, wherever they are, as long as they have access to digital tools, have now access to all of humanity's textual heritage. Big deal, really. It hasn't quite sunk in yet, I think, but this is what we'll see over the next 10, 20 years.

MILES OSGOOD:

Yeah. And that's a wonderful note for us to end on. I think that what we have here is not just an opportunity for the future of scholarship, which needs to be disseminated and made accessible and made translatable, but for the original primary texts themselves, right? That this was always kind of the purpose of these scriptures, right, was to be able to be spread, that they were open to translations, and open to new media over the centuries. And so an opportunity yet again to encourage that anew.

MARCUS BINGENHEIMER:

Yeah, well I mean, I don't want to sound too optimistic. I mean, it is quite possible that we are all going to die because of AI. Right. I mean, these are serious concerns, but I don't see how I, as a Buddhist scholar, can do anything about it. So as long as I'm here, I'm going to play.

MILES OSGOOD:

As long as we fundamentally acknowledge the ultimate impermanence of things and... [laughing]

MARCUS BINGENHEIMER:

Yes, I mean, there are serious concerns. The alignment problem is... I mean a lot of people dream about AGI, about what happens if these machines become even more human-like intelligent, and what will happen then if they take off, kind of if they become basically a different conscious species, that... What would they do and would their interest be aligned with ours, or so? So these are science-fiction-like dreams. We don't know whether they would come true. A lot of people seem very concerned about it, and I sort of understand that, but I don't see how I, as a—I mean, this is about Buddhist studies and Buddhist scholarship, or so. We have to take these things as they appear on the horizon of humanity and then deal with it and try to...

MILES OSGOOD:

Yeah. It sounds like there's a whole separate conversation to have here about what it would be for AIs to gain consciousness specifically on a corpus of Buddhist texts, what that would mean for their understanding of themselves. [Laughing] But maybe we'll leave that.

MARCUS BINGENHEIMER:

Yeah, I hope the ideas will somehow be useful to whatever comes after humans. I mean, I think of Buddhism as a supremely human thing to do, and I'm not sure how much whatever comes after homo sapiens on this planet is going to do with it, but if there is intelligence after us, after our civilization or so, I hope they can use the Buddhadharma as some kind of an inspiration or some of the things can be somehow salvaged.

MILES OSGOOD:

All right. Well, we've had a note of hope, a note of fear, and maybe a note of mystery there. But, well, just thank you so much, Marcus, for taking us through all of that, your own research, the future of the field as you see it, the opportunities and challenges of these tools. Really interesting and I think going to be very important clearly for scholars to reckon with if they haven't. So thanks again for taking the time to talk with us and yeah, have a lovely...

MARCUS BINGENHEIMER:

Thanks for having me on this program. It was fun. Take care, Miles.

Music:

Oṃ maṇi padme hūm. Oṃ maṇi padme hūm. Oṃ maṇi padme hūm. Oṃ maṇi padme hūm. Oṃ maṇi padme hūm…

MILES OSGOOD:

Thanks one more time to Marcus Bingenheimer for coming on the show. There's no video for this episode, but you can check out other interviews and podcasts on our website at buddhiststudies.stanford.edu. Again, if you want to interact more with Marcus's research, head to mbingenheimer.net. There you'll find linked "Tools for Buddhist Studies," including some of the GIS maps, social network datasets, and digitized translations referenced in the interview, and a personal bibliography of publications both traditional and digital. I'll plug again the "Sutra2DNA" project, which bridges both forms of scholarship. This collaboration, funded by a Temple University Presidential Humanities and Arts grant, starts out by describing the unique material history of the "Diamond Sūtra" and then gives the text a new future, encoding it into DNA molecules. To bring tradition and innovation together once more and to find an appropriate home for the molecular scripture, the Loretta C. Duckworth Scholars Studio at Temple Libraries organized a miniature stūpa 3D printing contest. The winner, Tom Leighton of Glenview, Illinois, crafted 20 gold stūpas, each topped with a printed double-helix and diamond, linked to one another via an online repository and distributed out into the world of Buddhism.

Music:

Oṃ maṇi padme hūm.

MILES OSGOOD:

As always, the music for this episode is a recording from Ani Choying Drolma's "Buddhist Chants and Songs," performed at Stanford's Memorial Church in 2017. Until next time, this has been The Ho Center for Buddhist Studies at Stanford Podcast.

Music:

Oṃ maṇi padme hūm.

