Decentralising AI: Algovera - Richard Blythman
Episode 529th March 2022 • Ocean Missions Campfire • Scott Milat

Shownotes

Richard is the co-founder of Algovera.ai, a community that's building the tools necessary to enable decentralised AI.

Algovera Website: https://www.algovera.ai/

Algovera Discord: https://discord.com/invite/e65RuHSDS5

Reference Links:

Tristan Harris & Daniel Schmachtenberger (The Joe Rogan Experience) https://open.spotify.com/episode/2LNwwgJqOMKHOqdvwmLxqd?si=67hM4wkDSMOrNb_KWkAg8A

Transcripts

[:

And there's definitely some interesting use cases that can be explored around this model. I was wondering if you could introduce yourself, Richard, and maybe just tell us a little bit about this concept of distributed AI for people who are most likely hearing this for the first time.

[:

And so I did a lot of visualization, experimental techniques and lots of, you know, mathematical modeling and stuff like that. And that kind of enabled me to move into the field of machine learning towards the end of my PhD. And after that, I worked in some tech companies as a machine learning R&D engineer.

So yes, that's my background. And then how Algovera came to be was, you know, I went down the Web3 rabbit hole and became really interested in, you know, data ownership and things like that. And so basically I'm trying to combine my background in machine learning with my new interest in Web3.

And so in terms of distributed AI, I think the best way to describe it is by, you know, how AI is done at the moment. At the moment we have these AI apps that are made by centralized tech companies who keep all of your data on their servers, and all of the ideas that their machine learning engineers come up with are owned by the tech company.

And so basically everything is owned by this centralized entity at the moment. And the other side of it is that developing AI is quite a complex process, even compared to traditional software development, because you have a lot more stakeholders involved in the process. Stakeholders that you wouldn't have in traditional software development would be, for example, domain experts.

So maybe someone who is, I dunno, an orthopedic surgeon who knows the data really well, and they need to work with, you know, the data scientists and other members of the team. And so basically, if we want to, you know, decentralize AI or create distributed AI, we need to decentralize lots of parts of the stack.

And so one of those is, you know, having data ownership, and that's what Ocean enables. But there's also other things that we need to develop distributed AI, like, for example, empowering data science teams to work for themselves and keep ownership of what they create instead of giving it to the centralized tech company.

Yeah, I guess we're trying to take lots of different components in the decentralized world. Like, for example, the decentralized marketplace of Ocean, maybe some decentralized storage like Filecoin, decentralized compute, you know, and then decentralized coordination tools like DAOs. So how can we coordinate without the tech company?

So we kind of consider the tech company like the middleman, and now we think, with Web3, we have the coordination tools to potentially coordinate without this middleman.

[:

I've always sort of had this, it's not necessarily a saying, but kind of just a thought experiment to challenge people when they think about how we currently set ourselves up. Traditionally, if you have an office in a city, then by default you're kind of limited to working with the people who live within, you know, a half an hour drive from that office, or whatever.

But that is a really random filtering criterion when it comes to solving more complex problems. And if you can have, you know, the best people in the domain working together to solve some of these problems, irrespective of where they are around the world, then presumably we can have a lot more effective teams and solve, you know, much more complicated problems.

I can imagine that with this kind of concept of people working around the world and collaborating on AI projects, I suppose, just to begin with, AI for a lot of people is scary. Probably not for many people that will be listening to this, that's for sure. But for those who maybe are a little bit, you know, not quite sure about this kind of idea of distributed AI, what would you sort of say to them? And how do you think something like this could, or maybe should, potentially be regulated? Or, you know, what can we do to try and ensure that these tools and methods are used to solve some of our more desirable problems, rather than ones that are going to lead to worse outcomes for us generally or for specific groups?

[:

Like, one of the concerns I had when I was working in the centralized AI world was the problems that people were working on. So, you know, I joined a company and I didn't know what team I was going to be on. And then I got put on the surveillance team, and in the first week I was given a dataset of, you know, millions of images from surveillance cameras, and asked to track people across different views.

So this was quite a scary prospect for me, you know? And I was looking around at the team and thinking, is this okay? And we actually had discussions with the team, and, like, no one felt that comfortable with it, but for some reason people still work on it and, you know, compartmentalize it and things like that.

And I think that happens with a lot of people in the world. You know, I think people are generally well-meaning, but can kind of work on things that aren't necessarily that good for the world. And that includes things like surveillance or, you know, clicking on ads or attention engineering, like all of the stuff that we've seen in Netflix documentaries and things like that.

And so one of the things that I like about decentralized AI, and one of the things we're trying to do with Algovera, is empower people to work on the projects that they want to work on rather than projects that they're given top down. And our hypothesis is that by giving people the power to choose what they work on, they'll work on things that are better for humanity.

You know, all of these massive problems that we have with respect to climate: doing machine learning for climate, machine learning for democratized finance, machine learning models to help in developing economies, and things like that. And so that's one of the aspects that I like about decentralized AI, and also the fact that, you know, what's built isn't owned by a centralized entity.

You know, for all of these machine learning engineers that are working in tech companies and also for universities, those institutions take ownership of all of the ideas that are created. And that means that a single entity or a few entities are in charge of all of these machine learning and AI innovations. And so I think that's also a dangerous aspect of centralized AI.

Of course, there's dangerous aspects of decentralized AI as well. And actually, I've been doing a lot of research into this through people like Daniel Schmachtenberger. So there's a really good Joe Rogan podcast on the dangers of decentralized tech. And that isn't just things like AI, but, you know, also CRISPR technology and stuff like that.

And, you know, tech is growing so rapidly that we almost have the power of gods in the hands of, you know, lots of people. So now I can create a language model that mimics a human, and I can do that really easily, and everyone can do it. So having decentralized tech in the hands of everyone introduces a lot of dangers, not just in AI.

But I think there's lots of things that we can do to help with that. For example, you know, the blockchain is a massive one, in my opinion: the fact that you can track the provenance of what happens in AI development. And so, for example, I kind of think of it as this shared operating system for AI, where...

...every action that happens in AI, like training a model or using a dataset, is auditable from the blockchain, and that doesn't happen at the moment. And so what happens is that licenses get broken. You know, someone might put up some code for non-commercial use on GitHub, but those licenses get broken all the time.

And I've seen that in the companies that I worked for. But what we can do with blockchain is actually, you know, make the rules so that you can't do that. If a license is for a certain use, we can bake it into the platform that you can't use it outside of that license, for example.

And so, yeah, I think having increased auditability and accountability via the blockchain is one huge thing that can help to prevent the scary aspects of AI. And so, I guess, just generally, I think decentralized AI improves on centralized AI in some aspects, but we also need to be really careful and design with ethical principles, and that's what we're trying to do at Algovera: you know, consider all of these ethical considerations. And so, yeah, I think it's really important to always consider these dangers. But, you know, I'm confident we can get away from these scary outcomes.
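
The provenance and license idea described above can be illustrated with a toy sketch. This is not Ocean's or Algovera's implementation and it does not use a real blockchain: `ProvenanceLog`, its method names and the license labels are all hypothetical, and the "chain" is just a list in which each entry commits to the hash of the previous one, so that edits to history become detectable.

```python
import hashlib
import json


class ProvenanceLog:
    """Toy append-only log; every entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(body):
        # Hash a canonical JSON encoding of the entry body.
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def record(self, actor, action, asset, allowed_uses, intended_use):
        # Enforce the license up front: refuse any use the asset was not published for.
        if intended_use not in allowed_uses:
            raise PermissionError(f"{asset} is not licensed for {intended_use} use")
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = {"actor": actor, "action": action, "asset": asset,
                "use": intended_use, "prev": prev}
        self.entries.append({**body, "hash": self._digest(body)})

    def verify(self):
        # Recompute every hash and check each link back to the genesis value.
        prev = self.GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or self._digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


log = ProvenanceLog()
log.record("alice", "train_model", "dataset-1", {"non-commercial"}, "non-commercial")
print(log.verify())
```

In this sketch, a request for "commercial" use of the same dataset raises `PermissionError` before anything is logged, and changing any stored field afterwards makes `verify()` return `False`.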

[:

He's a very interesting guy with a lot of very, very fascinating points of view and thinking around a lot of this stuff. So I will actually add a link to that in the show notes as well, so if people want to dive deeper into that, they can.

And like a lot of these things, I mean, we are dealing with many complex adaptive systems. And so, you know, there are second-, third- and fourth-order consequences that are hard to predict. But one thing we do know is that they are coming, and they are coming in thick and fast. And I'm not sure that currently we are adequately set up to kind of navigate them. But that is probably a topic for an entirely different episode.

So you've also been building a community around this, through the OceanDAO, called Algovera. Can you maybe just tell us a little bit about, you know, what's been happening inside this community? How long have you guys been doing it for? And, you know, what's actually going on?

Yeah, sure. So when I first found Ocean as a data scientist, I tried to find, you know, other data scientists in the community and also tools for doing data science in Ocean.

And there wasn't a huge amount. And the other thing with Ocean at the moment is that there's quite a few datasets on the Ocean marketplace, but they're not getting consumed. And so, you know, our hypothesis is that the reason they're not getting consumed is because there's not enough data scientists in the ecosystem, and data scientists are natural consumers of data.

And so if we can get more data scientists into the ecosystem, we think that can solve the data consume problem. And so, yeah, basically we're just building what, you know, I looked for when I first found Ocean. So, for example, we started off by running these weekly hacking sessions where, you know, we just started coding on data science problems.

Like, live on Zoom, and figured it out with everyone else there. And then we put the recordings up on YouTube, and people found those really useful. Like, a lot of people say, this is exactly what I was looking for in the Ocean ecosystem. We've heard that from, like, two or three people, which is fantastic.

We've just basically been trying to grow our community of data scientists, with the end goal that, you know, data scientists can sustain themselves in Web3. And so we want to show people that there's opportunities available in the Ocean ecosystem. So we started a grants program to help people kickstart their ideas in data science.

We're starting to partner up with other projects in the Ocean ecosystem, like insights, to do freelance work and tournaments for them. We're also planning to run a hackathon with links on brain computer interface. Robin from DataUnion is in our ecosystem. We just started some hacking sessions to create a face anonymization algorithm, which we'll all share ownership of.

So we work closely with lots of different projects in the Ocean ecosystem who have data but, you know, need help from data scientists to extract insights and process this data. And so, yeah, there's not a huge amount of data scientists in Ocean, like I said, and in Web3 in general.

And so we want to be one of the go-to places where data scientists can come, you know, learn from each other, get trained up on all of these new resources, get funding opportunities, and work on different projects. And so that's the community aspect of Algovera.

It's awesome to see how much the community has grown, and the level of engagement and enthusiasm around, you know, the grants program.

And I think it was just last week you shared a number of projects that were active, and, you know, there was a lot of really exciting stuff coming out of the community. From a sort of technical perspective, I can assume that, you know, building AI and tools surrounding AI is already going to be relatively complicated, even more so when it comes to building distributed AI.

I mean, what are some of the missing pieces? The assumption being that this is really where it's kind of starting to happen, and that there's no real preexisting tech stack that has predated it.

[:

But also, they haven't used, you know, DAO tools to coordinate. They've never shared a treasury. They've never, you know, voted online. And also, working with private data is a very different workflow. Data scientists are used to having access to the full dataset locally, so you can inspect, you know, every single result and things like that.

Whereas with compute-to-data, for example, it's a very different workflow, where maybe you have some sample data, like a few images or something like that. And so you need to make sure your model runs on the sample data and, you know, make sure everything works before you send it off to the full dataset, because you probably have to pay for the compute job as well.

Right. So, yeah, it's a very different workflow that we've tried to map out, and we're working to convert it into, like, a course that can train people. At the moment we send data scientists to our hacking sessions, which are, you know, eight hours to watch. So we're trying to condense that into a course that can start to train people up in this new workflow.
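
The sample-first workflow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Ocean compute-to-data API: `dry_run`, `submit_if_ready`, `mean_age` and `fake_submit` are hypothetical names, and `fake_submit` merely stands in for whatever call would start a paid remote compute job on the full private dataset.

```python
def dry_run(algorithm, sample_rows):
    """Run the algorithm locally on the small public sample."""
    try:
        return True, algorithm(sample_rows)
    except Exception as exc:  # any failure means the job is not ready to send
        return False, exc


def submit_if_ready(algorithm, sample_rows, submit_job):
    """Call the (paid) remote job only after the local dry run succeeds."""
    ok, outcome = dry_run(algorithm, sample_rows)
    if not ok:
        raise RuntimeError(f"fix this locally before paying for compute: {outcome!r}")
    return submit_job(algorithm)


def mean_age(rows):
    # Example algorithm: average of the "age" column.
    return sum(row["age"] for row in rows) / len(rows)


submitted = []


def fake_submit(algorithm):
    # Hypothetical stand-in for starting a remote job on the full dataset.
    submitted.append(algorithm)
    return "job-1"


sample = [{"age": 30}, {"age": 50}]
job_id = submit_if_ready(mean_age, sample, fake_submit)
print(job_id)
```

The design choice is simply that the expensive, unobservable step sits behind a cheap, observable one: an algorithm that crashes on the sample never reaches `submit_job`.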

And so that's from the data science side, but from the tech stack side, it's also very different. So, you know, basically the whole stack is owned by the big AI companies like Google and Amazon. And so if you're a data scientist, you store your data on AWS, and, you know, you probably use compute on AWS.

Like, maybe you even work for AWS. You depend on them. They pay your salary and, you know, keep ownership of everything that you create. And so basically the whole stack is run by the big tech companies at the moment. And so decentralizing that is a massive problem, and we've basically been trying to do it one step at a time.

Like I said, the decentralized marketplace is one component of the decentralized AI tech stack. And we've been building almost exclusively in Ocean so far, so we have a good idea of how that works, but now we're starting to, you know, play around with decentralized storage. There's no good decentralized storage solutions for data scientists at the moment.

For example, data scientists use Python pretty much exclusively, and there's no Python API for people to use, you know, Filecoin or IPFS easily. There's no easy way for data scientists to use decentralized compute providers as an alternative to AWS, for example. And there's also no good way for data scientists to coordinate through a DAO framework.

So that's another thing we've been building: this DAO framework that's made especially for data scientists. Some of you might've heard of, like, DAOhaus or Aragon or, you know, Gnosis even. These are all DAO frameworks that allow people to coordinate together and take the actions that they need to take as decentralized teams, like, with Gnosis, for example, sharing a treasury. But there's lots of actions that data scientists need to take that aren't currently possible.

For example, you can't publish an asset on the Ocean marketplace through a DAO, so that's what we've been working on. Some of the first smart contracts that we want to write are, you know, maybe a layer on top of Gnosis or a layer on top of Aragon that allows a group of people to publish and share ownership of a dataset or algorithm on Ocean, for example.

And so, yeah, this is how I would sum up Algovera so far. We've been going for about six months. We'd basically been focusing on the community more, up until maybe two months ago, and, you know, mapping out the problem: what does the stack look like? Because it's such a hard problem.

And in the last two months or so, we've really just started focusing on building. And so, you know, we're starting to build our Algovera model library. We hope to build a dataset library. We're building apps on Hugging Face Spaces. We're starting to build our DAO framework for data scientists, and we're starting to build libraries for decentralized storage for data scientists.

We're playing around with decentralized compute for data scientists. And so all of these are, like, libraries that just make each aspect of the problem easier. In future, that's hopefully going to come together into one giant data science platform that has a bunch of tools and libraries and gives you everything that you need as a data scientist in Web3 to easily build and monetize what you create.

[:

[:

You know, we want to work in an ecosystem of decentralized AI projects. We can't do it all by ourselves. And this is one of the aspects of our grants program. For example, we would love the grants that we fund to spin out of Algovera and start their own projects in the decentralized AI space.

This is already starting to happen for some. You know, FELToken, who came to Ocean, for example, joined our community. And that was actually before our grants. We pushed them in the direction of Ocean and encouraged them to build out their decentralized federated learning solution.

And so we're hoping that can happen more. And especially through our grants program, you know, we can kickstart projects. They can maybe move on to bigger funding from Ocean or elsewhere in the ecosystem. And we can really just start to build this ecosystem of projects, all working together to solve this problem of decentralized AI.

[:

[:

Honestly, that's probably the biggest issue: that people are just really skeptical.

[:

[:

And I think the attitudes of machine learning engineers and data scientists are going to be later to change, because, bear in mind, you know, we've all worked for these tech companies who look after you really well and give you access to amazing resources, you know, massive datasets.

And so I think there's a little bit of worship that goes on with these big tech companies. You know, people hang on every new academic paper that's released and stuff like that. So I think the reaction we get is because, you know, we're kind of hating on something that they love so much or that they're so fond of.

So I haven't heard any really good critiques, and, you know, I'd love to hear some great critiques, because that means that we can solve the problems that they're talking about. I suppose most of the reactions we've got haven't been structured arguments, more like reactionary, I would say. It's interesting.

Yeah.

[:

[:

We've got bad comments before. We even got told that our website was too combative. You know, the message was too strong against big tech and stuff, even though I don't think we even mentioned big tech on our website. But we actually had to go back and look at our website and think, is the message we're saying too combative? And how should our message be? Should we be combative against big tech, or should we, you know, not give out about big tech and just build something that's better?

But the way that I feel personally is that I'm kind of disillusioned with the way the world works at the moment.

And that includes big tech. And so the message that we're trying to send across resonates with me, and I'm hoping that it resonates with others. But maybe it's also not the best message for people who, you know, really enjoy their job in big tech and aren't aware of some of the dangers and stuff.

So yeah, it's tough to figure out what the message should be.

[:

[:

Just to give you one of them, for example, and this ties in with what DataUnion are doing: how AI works at the moment is, if you want to develop a new algorithm, you collect a dataset. That could be, you know, in-house or across different branches of your organization. You send that to a service like Amazon Mechanical Turk, who get humans to label this data for you.

And then maybe you use that data and you train a model, and then maybe the model isn't good enough. So maybe you need to go back to Amazon Mechanical Turk and get some more data labeled. And this process has always been really manual and time consuming. But with this distributed, decentralized network of Web3, we now have the ability and the incentive system to crowdsource data from anywhere in the world.

So, for example, rather than going to Amazon Mechanical Turk, we could be constantly monitoring which new data we need and then paying people to supply that data, you know, almost in real time. And, sorry, another issue with the way that it's done at the moment is that, as I mentioned, when the company collects the data in-house, that's not representative of the real world.

So what you see is that after you develop the algorithm on your in-house data, it works well. And then when you deploy it in the real world, you see this performance drop, and this is well known in AI. And so if we do this by crowdsourcing in real time, you know, using Web3, that means that we're crowdsourcing data that represents the real world.

And so we won't see this deployment issue that we see in traditional AI. So that's kind of a complicated message. But you can kind of imagine these dynamic AI systems that are constantly improving their models and constantly crowdsourcing the data that it takes to improve them. That's a use case that has never been possible before on that kind of timeframe.
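
One way to picture the monitoring step of such a loop is a function that looks at where the deployed model is weakest and decides where to spend a data-collection budget. This is only a sketch under assumptions: `plan_bounties`, the accuracy target, the budget and the proportional split are all hypothetical, and nothing here posts a real bounty or talks to any Web3 service.

```python
def plan_bounties(per_class_accuracy, target=0.9, budget=100.0):
    """Split a data-collection budget across classes the model handles poorly.

    Each class below the target accuracy receives a share of the budget
    proportional to how far it falls short of the target.
    """
    shortfalls = {
        label: target - accuracy
        for label, accuracy in per_class_accuracy.items()
        if accuracy < target
    }
    total = sum(shortfalls.values())
    if total == 0:
        return {}  # every class already meets the target
    return {label: budget * gap / total for label, gap in shortfalls.items()}


# Hypothetical accuracies measured on data coming back from deployment.
live_accuracy = {"cat": 0.95, "dog": 0.80, "bird": 0.70}
bounties = plan_bounties(live_accuracy)
for label, amount in sorted(bounties.items()):
    print(label, round(amount, 2))
```

Run repeatedly as new real-world data arrives, a step like this would keep shifting payment toward whatever the model currently gets wrong, which is the "constantly crowdsourcing" behaviour described above.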

[:

Structural models on which they can be built. And, well, we know that the centralized one is already up and running. And, you know, it looks like the decentralized alternative may be coming up soon to sort of chase

[:

One with the walled gardens of AI and the other one with open gardens, with interoperable AI. It kind of sums up the difference between Web3 and Web2, or centralized and decentralized, in my opinion. It's the difference between zero-sum and positive-sum. And so in the Web2 world, we have these walled gardens with zero-sum, where everyone is trying to take, you know, services from the other person.

And, like, another example of zero-sum is the war games that we play, you know, this military-industrial complex, where everyone tries to build the biggest army because they're playing zero-sum games, right? And that's just going to lead to our destruction somehow, whether it's through AI or, you know, climate or something else.

But if we can start to play positive-sum games, where the mindset is that we work with each other and all ships will rise... It's not me taking some service from you. It's, you know, both of us improving our services and both of us benefiting. I think if we live in that world, then we're much less likely to end up in one of these scary outcomes.

[:

World wars as such since then, but there have sort of been many, many proxies. But ultimately, you know, there is no winner to that game; that is lose-lose. So if you can have win-win embedded at the root, then surely that is, as you say, a positive-sum game.

[:

It would be so cool. Maybe he could come on this podcast and, you know, we could chat about this stuff. I'd love to get him involved. He's mentioned Web3 a few times in the podcasts and YouTube videos I've watched, and so he seems interested. And because they talk about AI so much in the podcast...

...I'm sure the concept of decentralized and distributed AI will be of interest to him. So, yeah, hopefully we can get thinkers like him in the space to help develop our thinking. And, yeah, I'm just really excited about where all of that could go. There you

[:

if you're listening, please reach out. On that note, for people listening who want to find out more, we will have links to all of your socials, but what would you recommend for someone interested in learning more? We should have.

[:

We also have our YouTube channel, with all of our playlists for all of the hacking sessions. We put up quite a lot of material there. We're starting to build out our docs and stuff as well. And we've lots of other socials, like Twitter and stuff like that. But I'd say, yeah, the main place where things happen is our Discord.

So just drop in and join in any conversations that you find interesting. The community is really helpful, so we'll make you feel at home and point you in any direction that you want to learn more about. So, yeah, just drop into the Discord and reach out.
