#83 Multilevel Regression, Post-Stratification & Electoral Dynamics, with Tarmo Jüristo
Episode 83 • 25th May 2023 • Learning Bayesian Statistics • Alexandre ANDORRA
Duration: 01:17:20



Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

One of the greatest features of this podcast, and my work in general, is that I keep getting surprised. Along the way, I keep learning, and I meet fascinating people, like Tarmo Jüristo.

Tarmo is hard to describe. These days, he’s heading an NGO called Salk, in the Baltic state called Estonia. Among other things, they are studying and forecasting elections, which is how we met and ended up collaborating with PyMC Labs, our Bayesian consultancy.

But Tarmo is much more than that. Born in 1971 in what was still the Soviet Union, he graduated in finance from Tartu University. He worked in finance and investment banking until the 2009 crisis, when he quit and started a doctorate in… cultural studies. He then went on to write for theater and TV, teaching literature, anthropology and philosophy. An avid world traveler, he also teaches kendo and Brazilian jiu-jitsu.

As you’ll hear in the episode, after lots of adventures, he established Salk, and they just used a Bayesian hierarchical model with post-stratification to forecast the results of the 2023 Estonian parliamentary elections and target the campaign efforts to specific demographics.

Oh, and one last thing: Tarmo is a fan of the show — I told you he was a great guy ;)

Our theme music is « Good Bayesian », by Baba Brinkman (feat. MC Lars and Mega Ran). Check out his awesome work!

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Nathaniel Neitzke, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Raul Maldonado, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Trey Causey, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh and Grant Pezzolesi.

Visit to unlock exclusive Bayesian swag ;)

Links from the show:


Please note that the transcript was generated automatically and may therefore contain errors. Feel free to reach out if you're willing to correct them.

This podcast uses the following third-party services for analysis:

Podcorn -


Welcome to Learning Bayesian Statistics.


Nice to be here.


Yeah, thanks for taking the time. And how did I do with my Estonian pronunciation of your name?


I'm kind of used to this. So you're actually doing pretty good.


So, you had the Estonian elections this year, in 2023.


So, I was born in Estonia in 1971.


Yeah, thanks for that short summary of your story; I know the full one is longer. And I also want to say that you did some screenwriting, so you've had a very, very diversified career. That's really interesting. We talked about that when I visited, and it was super interesting. So, to dive a bit more into Bayes: do you remember when you first got introduced to Bayesian statistics?


Now, this is a little difficult to pin down precisely. I think the first time I really took significant notice of Bayesian statistics was probably 10 to 15 years ago, during a brief period of my life when I was playing poker quite seriously. That was a very interesting time in poker as well: the way the game was approached was changing very rapidly, from the old days of, you know, smoky card rooms in Las Vegas to internet poker, where lots of people were starting to use statistical tools to analyze the game and figure out their own leaks. A lot of interesting theory was being produced in a pretty short span of time. People started thinking about poker hands not in terms of the particular hand you have, but in terms of ranges of hands, or, as you would say in statistics, the distribution of hands rather than a point value. So you would balance your distribution against those of your opponents, and think in terms of the expected value not of your single hand, but of the distribution of hands you would play in any particular spot. And then you had things like game theory optimal play and all of that. At the time I was playing, these things of course helped you play well and make money, or not lose quite as much money as you otherwise would. But this was something I also found really fascinating in abstract terms, the way it changed your thinking. And then there was one really seminal book that came out many years ago, The Mathematics of Poker by Bill Chen and Jerrod Ankenman, which was very Bayesian; they were explicitly Bayesian in their approach.
And I guess this must have been the first time I actually really got thinking about statistics in specifically Bayesian terms, because, as I was saying, back when I was in university 30 years ago, everything we were taught, as far as I can remember, was frequentist. There were no Bayesian stats.
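The "ranges, not hands" idea Tarmo describes can be sketched in a few lines. All the numbers below (the opponent's range, the equities, the pot size) are invented for illustration; they are not from the episode or from the book.

```python
# "Thinking in ranges" means averaging our equity over a distribution of
# opponent holdings instead of guessing one single hand.
# opponent range: {holding: (probability, our_equity_vs_it)} -- made up.
opponent_range = {
    "overpair":   (0.20, 0.18),
    "top_pair":   (0.45, 0.55),
    "flush_draw": (0.25, 0.65),
    "bluff":      (0.10, 0.90),
}

pot, call = 100.0, 50.0  # pot size and the bet we face

# equity against the whole range = sum over holdings of p(h) * equity(h)
equity = sum(p * eq for p, eq in opponent_range.values())

# EV of calling: win (pot + bet) with prob `equity`, lose `call` otherwise
ev_call = equity * (pot + call) - (1 - equity) * call

print(round(equity, 3))   # → 0.536
print(round(ev_call, 1))  # → 57.2, so calling is profitable here
```

The same averaging over a distribution is what carries over from poker to forecasting: you reason about the whole range of outcomes, not a single point value.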


I guess so. I mean, I'm not surprised that diving into poker helped you then discover Bayes. Did that way of thinking help you outside of poker as well?


Well, yeah, obviously it does. Of course, people are imperfect in terms of applying statistical principles, and people are really bad randomizers and all of that. But I think the most valuable thing from that period of my life was not really just related to poker; it was the way this kind of approach changed my thinking outside of poker. The way you think about randomness, the way you think about chance, the way you think about, in many ways, life in general. Rather than thinking about the things that happened in terms of point estimates or point values, you think of them as ranges of different things that could have happened. And that loosens up the way you look at a wide range of things, not just playing cards.


Yeah, I completely agree. The cool thing about that framework is that it's a tool you can use in any endeavor where you need to think, which is a lot of cases.


By the way, I even use this Bayesian kind of thinking when I'm doing martial arts nowadays, in Brazilian jiu-jitsu and submission grappling. I teach it as well, and I don't necessarily teach it this way to beginners, but in more advanced groups you can think of a fight as a stochastic process. You don't know what your opponent is going to do, but you have a range of things they can do, and some of them are more likely or less likely. So you have your priors, you have an idea of things that could happen, and then you incorporate the information you gain over the encounter to narrow these things down. And in a way, what you try to do when you're fighting an opponent, or when you're playing poker or chess or whatever board game, is narrow your opponent's options while keeping yours as flexible as possible. I could go on about this for a long, long time, but it's just an example that you can use the Bayesian approach to randomness, or to unknown things, in a much broader way than people would usually recognize.
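The "priors plus incorporated information" picture Tarmo describes maps onto a standard conjugate Bayesian update. The action categories and the counts below are invented for illustration, not anything he teaches.

```python
# Dirichlet-multinomial sketch: a prior belief over an opponent's next
# move, updated as we observe what they actually do during the encounter.
actions = ["takedown", "guard_pull", "strike"]
prior = [4.0, 4.0, 2.0]          # pseudo-counts encoding our prior belief

observed = ["takedown", "takedown", "guard_pull", "takedown"]

# conjugate update: posterior pseudo-counts = prior counts + observed counts
posterior = prior[:]
for a in observed:
    posterior[actions.index(a)] += 1

total = sum(posterior)
probs = [c / total for c in posterior]  # posterior mean probabilities
print([round(p, 2) for p in probs])     # → [0.5, 0.36, 0.14]
```

After four observations the belief has shifted toward "takedown", exactly the narrowing-down of the opponent's options that the transcript describes.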


So let's address, you know, that. How did Salk get started, and why?


Well, as I was saying in the introduction, we set up the organization, the foundation, for a particular reason: fighting the referendum. And for this we were really short of time. We figured we had to move really quickly, try to get an idea of what public opinion was on this issue, and then see what we could do to tilt it towards the outcome we were looking for. Now, this is a general problem, of course, pertaining not only to referendums but also to elections: the outcome of a referendum or an election is actually a function of a number of different variables. The most important ones, in the case of an election, would be people's party preference, and the other would be turnout. Because whether you agree or disagree with something is of no consequence if you do not show up and cast your vote. So this is what we got into first. We figured that since there were less than six months to the planned date of the referendum, there was not going to be a whole lot of time to change people's opinions. Based on all the available literature, you can change people's minds, but it's hard to do overnight; it takes time. However, it is oftentimes much easier to change people's behavior. So you can try to motivate people to get out and cast their vote, or you can try to give them good enough reasons to stay at home and not vote. This was what we were first trying to look into. But as I said before, the referendum never materialized. We had already gotten started with this idea, though, and we said, okay, the problem is still there.
So we still had the far-right party in the government, and even though the government fell apart, we still have them in the parliament, and they could easily be in the government again. And if you look at what's been going on in Europe, this is a very general thing: the far right is gaining strength throughout the continent. Now, when we started our monthly survey streams, getting the data in and building the time series, pretty much immediately we noticed something which is actually quite obvious once you see it, and which is true throughout all of Europe. Contrary to the UK and US, where you basically have two-party systems, most of Europe has multi-party systems with coalition governments, and this presents you with a particular kind of problem. It is also there in the case of the UK and US, but in a slightly different, perhaps a little less pronounced, form. What I'm getting to is that if you look at the setup of the political landscape in all of these Western European countries, you can easily see that the liberal side of politics is highly fragmented, and has been for a long time. However, the far right in most European countries tends to be pretty unified, or at least not split. Oftentimes it's just one party, like the Northern League in Italy, the Finns Party in Finland, or the Sweden Democrats in Sweden. And it also carries over to the side of the voters. What I have in mind is that if you look at the way voters' opinions cluster, the far right tends to have no competition for their core voters, while the liberal parties tend to share their core voters. Not by party affiliation, in the way people express how they tend to vote.
But in terms of people's political opinions, their preferences on issues like immigration, women's rights, the environment or the climate crisis, the liberal cluster is split between a number of different parties. And this got me, and got us, with the team that we have, thinking that what we're facing here is actually a pretty basic coordination problem. The liberal parties compete with each other, but the far right doesn't really; the far right only competes with the liberal parties. In a way this is inevitable in parliamentary politics, because politicians see politics as a zero-sum game. And they actually have a pretty good justification for this, because the number of seats in the parliament, in whichever country, is a set number; it's not flexible. This means that any seat someone else takes is a seat you do not get, and this disincentivizes cooperation, or even coordination, between the parties. And this is a handicap for liberal parties going into elections and running their campaigns. It can take a number of different guises, different types of games or situations. You can have Battle of the Sexes or Tragedy of the Commons type situations, where people would like to coordinate for a certain kind of result, say the possibility of a coalition, but their own selfish interests drive them against this optimal result, so they arrive at a suboptimal one. So this was the issue that slowly, or actually pretty quickly, emerged when we started looking at the data. And then we started trying to figure out what we could do about it. And the rest is history, as they say; this got us embarked on a very, very interesting process, during which our roads also crossed.
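The Battle-of-the-Sexes-style coordination problem mentioned above can be made concrete with a toy payoff matrix. The payoffs below are invented; the point is only that both coordinated outcomes are stable, yet the two players disagree about which one they prefer, which is exactly what makes coordination hard.

```python
# Toy 2x2 coordination game: two liberal parties choose whether to rally
# around party A's issue or party B's issue. Each prefers its own issue,
# but both prefer coordinating to splitting the vote (payoff 0).
# payoffs[(row_choice, col_choice)] = (row_payoff, col_payoff) -- made up.
payoffs = {
    ("A", "A"): (3, 2),
    ("A", "B"): (0, 0),
    ("B", "A"): (0, 0),
    ("B", "B"): (2, 3),
}
choices = ["A", "B"]

def is_nash(r, c):
    """True if neither player gains by unilaterally deviating."""
    ru, cu = payoffs[(r, c)]
    best_r = all(payoffs[(r2, c)][0] <= ru for r2 in choices)
    best_c = all(payoffs[(r, c2)][1] <= cu for c2 in choices)
    return best_r and best_c

equilibria = [(r, c) for r in choices for c in choices if is_nash(r, c)]
print(equilibria)  # → [('A', 'A'), ('B', 'B')]
```

Both coordinated outcomes are pure-strategy Nash equilibria, so without an outside signal neither party can be sure which one the other will play.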


Yeah, very interesting story. And I think it's a pretty good segue to start talking about what you did recently, and how the model was used. So, for the listeners, can you tell us what happened? Which elections just took place in Estonia, and how did your work with the NGO Salk feed into that? And then we'll get into the Bayesian model part of the project.


So we had the parliamentary elections scheduled for 2023.


Yeah, and that's definitely something we're going to get back to when we talk more about the usage of the model. I find it super interesting, this idea that just having a reliable and trustworthy outside source of data and modeling helps you solve the prisoner's-dilemma type of problem you were talking about a few minutes ago. Instead of fighting over whether there is a problem, parties can coalesce and say, okay, there is a problem, let's agree on how to solve it. Which is, of course, way more efficient: they collaborate on the solution instead of fighting in the first place over whether there is a problem or not. So definitely, let's come back to that. But first, let's look at the model. And before that, even: you were saying there was quite a substantial polling error. The polls ended up being statistically biased towards the right-wing parties, which means the left-wing parties were underestimated. So I'm wondering, what was the magnitude of that error, and did the model help cope with it? Was it an error the model had actually anticipated, in the sense that this kind of polling error was already taken into account in the uncertainties it was calculating, so that having a Bayesian model with uncertainties made your predictions way more robust than just taking an average of polls?


Yeah. Now, this is of course a huge subject, and we could easily talk for an hour about the nuances here. But let's try to put a finger on a few of the more important things. First of all, it was a really strange situation in Estonia in the last weeks leading up to the election, because Estonia is a small country; we do not have a huge number of pollsters covering the elections like you would have in the United States, where there are literally dozens of them running different surveys all the time. I don't know how many there are in France, for instance?


It depends on the elections, but somewhere in between; not like the US.


Predicted constantly between:


Yeah, that's interesting. I mean, from the modeler's standpoint, if you want to convince people of the importance of a model, you seem to have had the perfect circumstance. I've been dreaming of that circumstance in France for a long time, to convince the French, at least the journalists, that just taking an average of polls is not the best, and that's why it's different to make a model. But yeah, basically, the model ended up being way closer to the election result than the conventional wisdom and the polls. That really helps drive the point home that you need some serious modeling, because these are extremely complicated events to forecast, and intuition alone is usually going to fall short, even if you are an extremely smart person.


But you know, as a proper Bayesian, you would also obviously recognize that we might have just been lucky, and the results just fell this way.


Yep. Yeah, I'm just talking more from the marketing standpoint here, even the political standpoint. But yeah, for sure, this was just the first election, so I'm really looking forward to the next elections where you're going to try that type of model. And for sure, if you try to go into other countries and do the same, that will increase your sample size of elections, even though these will be different countries. And yeah, the first thing I would do as the modeler here is try to understand if there is a good reason why the model actually differed from the polls and the conventional wisdom. To me, that would be the most interesting part, because maybe the model was just lucky. Maybe it was biased in some way; like in the bias-variance tradeoff, it was more biased than variable, and in this case that made it lucky, but maybe next time it won't. We navigate this pendulum basically all the time, trying to place the slider between overfitting and underfitting, and especially when you don't have a lot of sample size, as you were saying, that is extremely, extremely important. But yeah, I'm quite happy to hear about all these successes of Bayesian and data science modeling. That's absolutely awesome. And as you were saying, we could continue talking about that, but I think now is a good point to actually dive deeper into the model, because that will help listeners understand what the model was doing, and why it could have been more efficient than the other methods. And I mean, I do have a bias, I worked on the model, but I do think these kinds of methods are actually better at trading off between overfitting and underfitting, and so in the long term this kind of method will usually give you better predictions than other methods that are either too biased or too variable.
So basically: can you tell us a bit about the structure of the model? First the Bayesian structure, and then we'll talk a bit more about how we made that even better with MRP.


So even before we dive deep into the model itself, I would like to set one thing straight, and say that, you know, this was definitely the case with us, but I think it also applies in a more universal, broader way. We did not use the model for predicting the elections; I actually suggested against using it for that. This is something which, again, would take a lot more looking into, but to give an example: if it was a really tightly contested election, basically a coin flip, and you built a model that gave you a correct prediction of who wins, then I would say that if it really is a coin-flip type of situation, your model's prediction is pretty useless. Because if the mean of the distribution is right in the center of the outcomes, then if you were right, it was just luck. Predicting coin flips is not something you should use a Bayesian model for. However, what you can use a Bayesian model for is determining whether the situation you're facing is indeed a coin flip, or whether it's a lopsided situation. This is where the Bayesian model can give you lots of really good input, and this is where we get to the importance you were also referring to before: the model can give you what a simple aggregation of the survey results wouldn't. Because if you average the survey results, you end up with a point estimate, and this can be either right or wrong, but that's not in itself a hugely useful piece of information. However, if you also get the uncertainty estimate with it, then you can make much more informed choices.
Whether this is, you know, the right place where you actually want to send your resources, whether this is a hill to die on, or whether this is something you should just leave aside because there's not a snowball's chance in hell of getting a mandate from that district. So this is one of the important things to keep in mind. And now, the other thing we tried to do with the model, and that's...
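The "coin flip versus lopsided" distinction Tarmo draws is easy to read off posterior draws. A minimal sketch with simulated draws (these are illustrative numbers, not the actual model's output):

```python
import numpy as np

rng = np.random.default_rng(42)

# pretend these are posterior draws of our candidate's vote share in two
# districts: one genuinely tight race, one lopsided race
tight = rng.normal(0.50, 0.04, 10_000)     # centered on 50%
lopsided = rng.normal(0.58, 0.04, 10_000)  # centered on 58%

for name, draws in [("tight", tight), ("lopsided", lopsided)]:
    p_win = (draws > 0.5).mean()             # posterior P(winning the seat)
    lo, hi = np.percentile(draws, [5, 95])   # 90% credible interval
    print(f"{name}: P(win)={p_win:.2f}, 90% interval=({lo:.2f}, {hi:.2f})")
```

The point estimates differ by only eight points, but the posterior win probabilities tell you which district is a coin flip to walk away from and which one is worth sending resources to.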


Nothing new here, but the model has access to previous elections. That's the difference with just averaging: you train the model on previous elections. So if you structure it in a way where the model can actually learn from history, that's something you can't do with just a simple average.


were with the sample size was:


Yeah. So that's where the Bayesian structure comes into play, right? Do you want to talk a bit about that, or should I give the rundown of how that kind of modeling could work?


Well, I can try to give, from my side, an overview of the structure. So basically, obviously, we did not reinvent the wheel. This is a type of model structure which has been used for quite a long number of years. It's called MRP, multilevel regression with post-stratification, and let's take it one letter at a time. The multilevel part basically refers to the hierarchical structure of the model; let's leave that aside for a moment and get back to it. The R is regression, and this refers to the point I was making before: we have a model that learns the relationships between those four factors I mentioned, age, gender, education and ethnicity. The model learns how these things affect, tilt, or somehow influence people's political preferences and opinions, and it also looks at the way these different factors interact in those influences. Then it has a pretty good idea of what each and every one of these four components does; it's kind of like levers that you can slide to one side or the other. It's a very multi-dimensional data space that this thing unfolds in, but basically this is how it works. And then the post-stratification part comes once you put the multilevel and the regression parts together. The multilevel part is what I was referring to before with the pooling: you can borrow signal across gender, ethnicity, education and the like. So we had what you yourself, Alex, were referring to as a Russian-doll type of structure. It's a nested structure where we had small geographic units, kind of local districts, that were grouped into electoral districts, and those were then grouped into the whole population.
And so the model keeps them separate, but allows you to learn across these geographical divisions. And then post-stratification is the final bit, which I think was also pretty important for the end results being the way they were. We were kind of lucky that Estonia had its full census about a year ago, half a year before we started working with the model. So we had fresh, high-quality census data that we could just get from the statistics office for, you know, a couple of bucks, like 20 bucks, and then have the model de-bias the inputs from the survey and scale them to the population. And this does a number of things. First of all, yes, it de-biases the estimates, but it also allows you to simulate the population. So instead of working with, like the example I was giving before, you know, Russian-speaking males, you can simulate the actual number of Russian-speaking males in a given district, and then slice and dice them whichever way you like, group them, isolate a further, narrower category of age groups within this broader demographic, and make inferences about that. And you get the inferences together with your uncertainty estimates, which again is hugely important and useful. So this is the overview of how the MRP model is set up. But I guess we can get into a lot more detail there; also, like you were saying before, we added the GP part, which was a crucial, crucial component.
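The de-biasing step of post-stratification can be sketched in a few lines: the poll over-represents some demographic cells, and post-stratification re-weights the cell-level estimates by census shares instead of sample shares. All numbers below are invented for illustration; they are not from the Estonian model or census.

```python
import numpy as np

# cells: (share of poll respondents, share of census population,
#         modeled support for some party in that cell) -- all made up
cells = {
    "young_urban":  (0.45, 0.25, 0.60),  # over-represented in the poll
    "older_rural":  (0.15, 0.35, 0.30),  # under-represented in the poll
    "middle_mixed": (0.40, 0.40, 0.45),
}

poll_w = np.array([c[0] for c in cells.values()])
census_w = np.array([c[1] for c in cells.values()])
support = np.array([c[2] for c in cells.values()])

# a raw poll average implicitly weights cells by how often they answered
raw_poll_avg = float(poll_w @ support)
# post-stratification weights the same estimates by the true population
poststratified = float(census_w @ support)

print(round(raw_poll_avg, 3), round(poststratified, 3))  # → 0.495 0.435
```

Because the high-support cell was over-sampled, the raw poll average overstates support by six points; weighting by the census shares corrects it. In the full Bayesian version, each cell estimate is a posterior distribution, so the re-weighted total carries its uncertainty along.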


Yeah. We'll talk about the GPs in a minute, and then we'll dive into how you concretely used the model during the campaign, because that will also help people understand how powerful these kinds of models can be. So, to summarize what you just said about the model structure: what you observe are polls, raw polls that you collect with your partners in the NGO. And getting those raw polls is extremely valuable; for me, that was the first time I got the opportunity to work on really raw polls like that. It was not polls reported by, you know, newspapers online, which is what I usually use for French electoral forecasting. Here you get access to the raw polls. And so that goes into your regression part, where the model is basically doing a multinomial regression of party choice, and based on that, well, Bayes' formula comes in. We observed polls, and the model says: well, based on the data I've observed, and on the priors and the structure of the model, which reflect your domain knowledge, I think the true latent popularity of the parties in the population is this. And you get a distribution for each party. But, as you were saying, we're observing polls, and even though we're also doing a regression using the socio-demographic factors you talked about, and we trained the model on previous elections, this is still a biased sample of the population, because it's a poll. And so afterwards comes the post-stratification part that you talked about, which was invented by Andrew Gelman and other very smart people. Basically, this is kind of a magic thing that is, in a way, so easy to do in the Bayesian framework. Once we've fit the model, you tell the model: well, now imagine we observed these data, which are extremely reliable, because they are census data.
And in Estonia you have incredibly detailed census data, which you can then use to tell the model: okay, based on these data that we trust, make the predictions using what you learned previously from the polling data. And now you have the de-biased estimates you were talking about, and you are able to make predictions even for very small slices of the population. So, as you were saying, Russian-speaking males, or maybe low-education Russian-speaking males; you can slice your population however you want, and since you have your census data, you're able to make predictions which also make sense to you as the domain expert, which was the amazing thing. And the uncertainty is actually workable, right? It's not an uncertainty like, oh yeah, we think the Estonian government should invest more in education, with a probability somewhere between 10 and 70%. It was actually an actionable probability. And to me, that was magic. I know how this works mathematically, but doing it and then seeing it in action, how de-biasing your estimates allows you to make predictions on very small subsets of data, it feels like magic. That was incredible. And on top of all that structure, you then add the Gaussian processes. So yeah, I'll give you the floor again if you want to add anything to that, and then talk a bit more about the Gaussian processes, before we go into the practical use of the model.


I just want to underline what I said before about the uncertainty estimates being important. In some cases, if you slice the data long enough, you eventually get to the point where the uncertainty is going to explode. That is just the nature of how statistics works. But even in these cases it can be immensely useful, because when we were showing the results to our, you know, clients, the parties we worked with, we were drawing their attention to this and saying: don't just look at the distribution means, don't disregard the long tails of the distributions. If the distribution is really wide, then you shouldn't be using a stopwatch or a ruler to figure out which option is better; in that case you would say they are roughly equal. But in some cases, even though the tails are long, the overlap of the distributions is pretty small, and then you can say that there is actually a significant difference between these two options, and say it with pretty high confidence, even though the uncertainty can be very high. And I absolutely agree with what you said before: it looks and feels like magic. It takes a while to get used to, especially if you come from working with just raw survey data, running reports on it, and then figuring, you know, I think I have a signal, but I have no idea how sure I can be about that. So this is a very different world. And now about the GP, the Gaussian process part. That was a really important addition. We did discuss it with you first, Alex, but we figured we would leave it out of the main version of the model that you shipped. The reason why it's important is that if you show a model, let's say, two years' worth of data that's been collected in monthly intervals...
If you just feed it to the model, the model really has no way of knowing that what it's seeing is actually a time series. So it will take the whole variance over the two-year period as contemporaneous variance within the same moment. And as everyone knows, party popularity can have wild swings, especially at the time of the COVID pandemic and everything like that. It's been a roller coaster, so there's a huge variance in the data. However, if you can tell the model that this is a time-segmented time series, not just one moment we're talking about, then the Gaussian process allows the model to keep the time periods separate, the same way the hierarchical structure helps it keep the geographic units separate, and still learn from the data without mixing everything up. Keep the important distinctions but pick up the useful signal: that is what the Gaussian process gave us. Once we added it, the uncertainty came down. We could even, if we wanted to, make predictions into the future, although there, as you well know, the uncertainty is going to explode very quickly. But it lets the model learn much more precisely.
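As a toy illustration of this point (not the production model), here is how an exponentiated-quadratic Gaussian-process kernel makes nearby months strongly correlated while keeping distant months nearly independent; all numbers are invented:

```python
import numpy as np

def exp_quad_cov(t, amplitude=1.0, lengthscale=3.0):
    """Exponentiated-quadratic covariance over time points t (in months)."""
    d = t[:, None] - t[None, :]
    return amplitude**2 * np.exp(-0.5 * (d / lengthscale) ** 2)

months = np.arange(24.0)   # two years of monthly polls
K = exp_quad_cov(months)

# Adjacent months are strongly tied together...
print(K[0, 1])    # high correlation, about 0.95
# ...while months two years apart are effectively independent.
print(K[0, 23])   # essentially zero

# One draw from the GP prior: a smooth latent "party popularity" trajectory.
# (A small jitter on the diagonal keeps the covariance numerically stable.)
rng = np.random.default_rng(42)
trajectory = rng.multivariate_normal(
    np.zeros(len(months)), K + 1e-9 * np.eye(len(months))
)
```

This is what lets the model treat the polls as a time series: variance within a month stays distinct from drift across two years, instead of everything being lumped into one contemporaneous blob.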


Yeah, exactly. And I do the same thing for French elections, and it's definitely something you want, because if you naively give the whole time series to the model, it will be both overconfident and estimate larger variance than needed, which is a weird combination. The model is like: wait, that party can go from 5% popularity to 25%, which is a fivefold increase, and treating all of that as if it happened at the same time is weird, because the model doesn't know about time series; it's not conscious of time. And at the same time, the model has a huge load of data. If you give it, I don't know, five elections' worth, that's actually a lot of polls. So the model will think: well, surely I have a lot of data, I shouldn't be very uncertain. So it ends up very certain that the variance is very high, exactly what you were saying. If you're conscious of that, you can still work with a model that doesn't have a time-series component; it can already be a good model. But then definitely the next installment in your modeling workflow should be: okay, how do we make the model time-conscious? Because it needs to know that, yes, one party can go from 5% to 25%, but it usually happens over years and not during one election campaign. That's where all the work you did on the Gaussian processes, I guess, was very useful. Okay, I think now listeners have a very good background for everything you did, and hopefully the more astute technical listeners will feel fulfilled by the previous segment. Now, before we close up the show, because it's already late for you and I don't want to take too much of your time, can you dive a bit into how you used the model, how you used it to focus the campaigning efforts, and the kind of insights you got from it in practice?
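Tarmo's earlier point, that two options can each carry wide uncertainty and yet differ decisively, can be sketched numerically. The two Gaussians below are invented stand-ins for posterior distributions, not real estimates:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two wide but largely non-overlapping "posteriors" (illustrative numbers)
option_a = rng.normal(0.30, 0.08, size=100_000)
option_b = rng.normal(0.55, 0.08, size=100_000)

# Don't just compare means with a ruler: compare the whole distributions.
prob_b_better = (option_b > option_a).mean()
print(prob_b_better)  # high, close to 0.99, despite the wide uncertainty on each option
```

Each option alone looks very uncertain, but the probability that one beats the other is still a confident, actionable number, which is the kind of statement the parties were given.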


I guess the most interesting and important contribution, what we could offer the parties, was something where we used the model we were just describing, the MRP model, as a platform for working with a different kind of data. The MRP model was initially built to discover the latent support for parties in the population. But what we ended up doing was running a different kind of survey, set up as MaxDiff, or best-worst scaling. Very briefly: we had 18 different policy questions that we showed people. We had, again, 1,200 respondents, and every one of them saw, 10 times, a random sample of five out of the 18. And every time they had to indicate the one that is most important for making their decision, and the one that is least important. This gives you a whole bunch of data points: 1,200 people giving you 10 screens each, so that's already 12,000 data points, and in each of those there are five options. So this is a lot of data that the model can dig into. And what it discovers is not just the latent support for certain policy proposals; it also gives you heterogeneous effects over the different socio-demographic groups. It works out, for each and every person, a sort of latent, how to say it, rank ordering of the importance of these topics. And this was combined with the same Bayesian model that we used as a platform. We built a couple of things on top of it: we still used the GP, and we had a slightly different regression part, where it learns people's latent preferences and then allows us to post-stratify those across the whole population. And the reason this was really important is that, as I mentioned before, it lets you dig into the heterogeneous effects.
So you can be very precise and figure out that, you know, in this part of the country this topic is in general important, but not in this group. Let's say you may want to talk about education, but not to lower-educated people. Or, for instance, one of the topics that was very strongly stratified, and this is true elsewhere in the world as well, is attitudes towards nuclear energy. There's a big gender gap there: men tend to view it much more favorably than women, and also tend to think it's much more important than women do. This was really interesting because we had a Europe-wide energy crisis last winter, so this was an important topic to figure out, and we could tell parties: if you want to talk about this, talk to men, not women. And we could say: don't go and talk about this if people's political preferences are leaning that way, because then they are probably not receptive to this idea. We could do that on all those 18 different topics. We could give very precise ideas of which topics to stress and which topics to avoid, and where, because raising them would generate a very strong response from the wrong kind of people, from the campaigning point of view. And that is something the parties were later telling us was hugely useful for calibrating their campaign messages, for figuring out where to go with them and what to do with them.
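The survey design Tarmo describes can be mocked up in plain Python. The respondents below act on invented "true" importances, and the simple count-based score stands in for (and is much cruder than) the hierarchical Bayesian analysis Salk actually ran on top of the MRP platform:

```python
import random

# Toy best-worst (MaxDiff) setup with the dimensions from the episode:
# 1,200 respondents, 10 screens each, 5 of 18 topics per screen.
random.seed(0)
N_TOPICS, N_RESP, N_SCREENS, PER_SCREEN = 18, 1200, 10, 5

# Hidden "true" importances the simulated respondents act on (illustrative)
true_utility = [i / N_TOPICS for i in range(N_TOPICS)]

best_counts = [0] * N_TOPICS
worst_counts = [0] * N_TOPICS
shown_counts = [0] * N_TOPICS

for _ in range(N_RESP):
    for _ in range(N_SCREENS):
        screen = random.sample(range(N_TOPICS), PER_SCREEN)
        for t in screen:
            shown_counts[t] += 1
        # Each respondent marks the most and least important topic shown
        best_counts[max(screen, key=lambda t: true_utility[t])] += 1
        worst_counts[min(screen, key=lambda t: true_utility[t])] += 1

# Count-based score per topic: (times best - times worst) / times shown
scores = [(best_counts[t] - worst_counts[t]) / shown_counts[t]
          for t in range(N_TOPICS)]

# 1,200 respondents x 10 screens = 12,000 best/worst judgements
print(sum(best_counts))  # 12000
# The most important topic is recovered from the counts
print(max(range(N_TOPICS), key=lambda t: scores[t]))  # 17
```

Even this crude scoring recovers the importance ordering; the real model additionally learns how those orderings vary across socio-demographic groups and post-stratifies them over the census.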


Yeah, it's so interesting. So basically, the insights you get from the post-stratified estimates really inform which demographics you should focus on, depending on the issue you're interested in talking about.


Or the issues you would want to avoid, which is equally important in elections: the issues that you do not raise. For instance, in the US it's a very well-known thing that if the salience of the immigration issue starts trending, it's beneficial for Republicans, because the median voter tends to think that Republicans have more convincing answers to immigration if it's framed as a problem. So Democrats should really avoid touching the immigration issue. And it was the same thing in Estonia, by the way. Because of the war in Ukraine, there are lots of Ukrainian refugees in the country, so we could very confidently tell parties that it's fine to talk about providing military help to Ukraine, it's fine to talk about assisting and helping, but you do not want to openly debate, or pay a lot of attention to, the issue of Ukrainian refugees, especially outside of the big cities, because this was a contentious issue for many people in the smaller parts of Estonia. So, better to avoid it. And they did, and it seems to have worked out really well.


And basically, this is due to the fact that here, people are not really receptive to anything that could change their view, right? So basically just avoid that topic, because you're not going to be able to change their views; for now, the views are way too entrenched in their identity. It would basically be a waste of your political capital to try and do that. Come with something else, or try to tackle the issue from another standpoint, another way of coming at it, instead of going at it head-on and just talking about refugees, for instance.


Yeah, I could bring other examples of this as well, but this is something which is immensely important in campaigning, and we were using it quite extensively. You know, coming back to the start of our discussion, when you mentioned this nerdy world of Bayesian statistics and modeling and everything like that, I have been thinking back to my childhood. One of the nerdy things I was doing was reading lots of science fiction; I was really a big fan of all the classical science fiction. I don't know if you have read the very famous Foundation series by Isaac Asimov, but I'm sure some of the listeners have.


Yes, I've heard of it, I've just not tried it.


Yeah, the basic premise of the books is that there's this one man called Hari Seldon, who discovers, or comes up with, a whole new discipline called psychohistory, which allows you to predict the future of societies by looking at the interactions of the people. It's something that has come back to me every now and then. In a way, of course, the Foundation series misrepresents very fundamentally the nature of stochastic processes and the randomness of all this, but the ethics of the books, what Hari Seldon is trying to do, is not unlike Bayesian modeling in a political context. You're trying to gain insight into how people would act in a certain situation, and it's like an anthill: it's impossible to predict the trajectory of a single ant, but the totality of the anthill follows a small number of very basic, fundamental heuristics. And this allows you to predict the behavior of the anthill as a whole with surprisingly high precision. That's what's so fascinating about seeing this thing unfold.


Right, yeah, for sure. So what's the name of the series, so I can put it in the show notes? Because that sounds like a good read.


Yeah, I think there are four or five books in it, but it's the Foundation series.


Oh yeah, okay. Alright, so I will put that into the show notes. The Foundation series, there's even a Wikipedia page. Perfect, that sounds like fun, I'm probably going to read it. Right, so, one of the main questions I would have about that kind of guidance on what parties can and cannot say is: okay, that's good, but then what do you do if you really have to talk about refugees? What if you really want to, and you think it's a problem that you actually cannot say, well, I think Estonia should take in more Ukrainian refugees? What if you want to do that? Isn't it also the role of politics to bring some hot topics up for public debate? And if we're not able to debate these kinds of very hot topics, doesn't that mean that, in the end, the incentives of our democratic institutions maybe need to be updated in some way? So yeah, that's the main question I would ask based on this.


Well, now we're getting into politics-podcast territory, away from the statistics podcast, and I would be happy to have that discussion as well. But hopefully,


we're almost at the end of the discussion, so it's going to have a natural ending point.


I think the important thing to note here is that you are free to talk about whatever you like; this kind of model just gives you an honest estimate of what the likely outcome, or the expected cost, of that could be, so you can make an informed decision. If you think this is an important thing to bring forth and discuss, then by all means go and do that. That's politics. The statistics part cannot tell you what your values should be. Statistics cannot tell you whether you should be in favor of accepting Ukrainian refugees or draw the line somewhere. That's a different thing; that's politics, where people have to figure things out and arrive at some kind of consensus, some kind of working arrangement, in the end. That's not something statistics can provide. Statistics can tell you what is likely to happen if you go down this way rather than the other.


Yeah, yeah. So to make it clear: the models here reveal the problems, they are not the problems themselves. That's something I often have to remind people of. It's like what stand-up comedians often say: the joke about the horrible thing is not the horrible thing itself. The model is not the problem itself; it just reveals the problems we may have, and that we may then, collectively, need to do something about. But at least the modeling can tell you: here, there is kind of a problem. You could get an optimized solution, but maybe that's a local optimum, and you might want to find another optimum which is more global.


And an important thing to underline at the end of the episode is that I don't think politics should, or could, be modeled statistically from start to end. I think that would be a terrible idea. In that sense, if you read the Foundation series, there are these darker tones there as well, which make you think about the downsides of such things. That being said, statistics and modeling, and Bayesian modeling, can be an immensely useful tool, also for doing the right thing. I just want it to be clear that what we have spoken about today, in terms of modeling people's preferences and modeling election outcomes and all of that, is just a technical way of figuring out: if you look at the election as a kind of game, and you say you want to maximize your results, to optimize, to find this local maximum, then this is what you should do. But you should always keep in mind that there's a broader world beyond the rules that you're


working with. Under the current rules of the game, here is how to optimize your game, but that doesn't mean you shouldn't change the world. Exactly. Okay, cool, so let's maybe close the show. I added the Foundation series to the show notes. Also, we referenced a lot of concepts that we didn't really explain in this episode, and that's kind of normal, because I already had episodes about all those topics. I put them in the show notes: you'll find episodes about hierarchical models, Gaussian processes, nonparametric models (which Gaussian processes also are), and also MRP and missing data. You'll find all those episodes in the show notes if you want to dig deeper, which I recommend, because these are very interesting topics. And maybe, before the final two questions I ask every guest at the end of the show, a kind of quick question with probably a quick answer: I'm just curious, what is the thing that surprised you the most in this whole project, this whole endeavor we talked about?


I don't know if there's a short answer. I should think about the short one; the longer one would be that I was pretty much constantly surprised by how much there is to learn and how many fun things you can do. You mentioned, for instance, nonparametric models, which is something we were considering at one point and will probably go down that route and try. It's all tinkering: finding the different bits and pieces, then trying, and most of the time you fail in one form or another. But sometimes you strike gold, and those are the great moments. When you run your MCMC sampler and the trace plots come out perfect, and you suddenly see something you didn't see before, it's a wonderful feeling, and I'm sure you know it well too.


Yeah, for sure, I understand. That's definitely an answer I could have given. Okay, so let's close up the show now by asking you the last two questions. First one: if you had unlimited time and resources, which problem would you try to solve?


Well, right now, given my current knowledge and leanings, I would probably dedicate basically all my time and resources to trying to figure out how to avoid the climate problems, because I think this is really a fundamental thing we're facing in the world. And it's something we're also thinking of actually working on in Estonia, with our models and with our other capacities. So I guess that would be the answer.


You're definitely in good company with that answer; it's a popular one. And the second question at the end of the show: if you could have dinner with any great scientific mind, dead, alive, or fictional, who would it be?


A scientific mind. I don't know, I think I would actually go back quite a long time into the past. I think Aristotle would be a great guy to talk to, to get really to the wellsprings of the Western scientific tradition. So that would be a good choice.


That sounds like fun, and it would probably be in Greece, which is a good choice. Awesome. Well, Tarmo, let's do a fake goodbye here, and then we'll stop the recording and I'll tell you what to do. Okay. Well, Tarmo, thank you very much, that was a fascinating conversation. I hope we struck a good balance between going very detailed and nerdy and giving people background about European politics, especially Eastern Europe and Estonia, helping people learn about the country and the wonderful Bayesian statistics you folks are doing as well. As usual, I will put everything in the show notes, with links to your websites and things like that for people who want to dig deeper. Thank you again, Tarmo, for taking the time and being on the show.


And I also want to thank you, because I'm just thinking that life takes strange turns. When I started listening to your podcast two years ago, or around that time, I would have never guessed that I'd end up being a guest on one of the episodes. So it's been great working with you, great knowing you, and thanks for


everything. Yeah, you bet, thank you very much. I appreciate it, and appreciate your loyalty to the show. And for sure, when I started the show almost four years ago, I never thought that I would be a full-time Bayesian modeler, because I was actually just starting to learn Bayesian statistics; that's why I started the show. So yeah, for sure, life is always full of surprises. On that note, thank you very much, Tarmo, and see you very soon in Estonia. Okay, see you, bye. Okay, so now you can stop the recording.


So hit the spacebar.


Yep, yep. And then you go to File, Export, and export as a WAV.


Yeah, as a WAV format.


Yeah, and make sure it's 24-bit in the format, 24-bit PCM, and then you can save it wherever you want. It will ask if you want to keep the default metadata; keep the default and export. It's going to be a big file, but whenever you've got time you can send it to me, via Google Drive or Dropbox or WeTransfer, whatever you prefer, and then we'll send it to editing.


It's 651 megabytes. I'll drop you a link.


Yeah. Other than that, I didn't check my Discord yet, so I don't know if you already answered or not. Okay, so I will add whatever you send me to the show notes. If you want to add anything to the show notes, I shared a Google Doc with you during the episode; you'll see I already put a lot into it. Maybe send me your bio as well, and I'll just add it to the Google Doc. Anything you want to add, you can add there until we release the episode. I'm trying to think if I'm forgetting something, but I think we're good. Just send me the file and off you go.


Okay. Anyway, thanks again for having me. And let's see each other when you're over in Estonia.


Yeah, for sure, I will let you know when I'm next in Estonia. Thanks again for taking the time and staying up this late. Now it's time to go back to your newborn child.


Yeah, I had a phone call from home, so I need to hurry. Okay, take care. Bye.