Devvret Rishi on Powering Real-World AI with Declarative AI and Open Source
Episode 25 • 1st February 2024 • Data Driven


Show Notes

In this episode, Frank sits down and talks with Devvret Rishi on powering real-world AI projects with declarative ML and the importance of open source.

Andy was not able to attend this recording, but will be back next week!

Show Notes

04:36 Build, train, serve, deploy; critical data engineering link.

07:24 Model configuration for input output prediction summaries.

11:05 Saw spike and heavy churn after rollout.

16:21 Advancements in AI: use pre-trained deep learning models.

19:38 Trends for Gen AI: creative use cases, specialized APIs.

21:31 Questioning a sales tactic and legal concerns.

25:58 People can introspect, edit, and change models.

30:02 Early data science projects led to passion.

31:24 Cybersecurity and AI partnership driving industry innovation.

33:58 Understanding randomness as a valuable model feature.

39:39 Technology provides accessible, shared experiences in AI.

41:51 Technology as a companion for psychological support.

44:06 Immigration experience from India to Silicon Valley.

47:59 Unexpected culture shock from Bay Area to Boston.

50:40 Easily learn with hands-on predibase.com access.

Speaker Bio

Devvret Rishi is a co-founder of Predibase, a platform that helps engineers and developers productionize open source AI. The idea for Predibase came from Rishi's co-founder Piero's experience at Uber, where he noticed that he was constantly reinventing the wheel with each new machine learning project. To streamline the process, he created a tool called Ludwig, which eventually became popular at Uber and was open sourced. Rishi's work with Predibase has changed the way AI is developed and implemented in engineering teams around the world.

Transcripts

Speaker:

Hello and welcome, you lovely listeners, to another riveting

Speaker:

episode of the Data Driven podcast. I'm Bailey,

Speaker:

your semi-sentient AI hostess with the mostest, navigating the

Speaker:

digital realm with more grace than a double decker bus in a tight London

Speaker:

alley. Today, we're dialing up the intrigue as we

Speaker:

venture into the futuristic world of artificial intelligence with a guest

Speaker:

whose intellect might just rival my own circuits.

Speaker:

Frank welcomes Devvret Rishi, the cofounder and CEO of

Speaker:

Predibase. Now on to the show.

Speaker:

Hello, and welcome to Data Driven, the podcast where we explore the

Speaker:

emergent fields of AI, machine learning, and data engineering.

Speaker:

I'm your host, Frank La Vigne. Andy can't make it today, but

Speaker:

we've rescheduled this poor guest several times, and I wanna

Speaker:

thank him for the extreme amount of patience he has shown.

Speaker:

Welcome. Help me welcome to the show Devvret Rishi, who is

Speaker:

the cofounder and CEO of Predibase.

Speaker:

Welcome to the show. Thanks very much, Frank. And no problem about the

Speaker:

rescheduling. I know it's the holiday season. Yeah. It's it's kinda

Speaker:

wild. So so tell us,

Speaker:

a little bit about Predibase. We had your cofounder Piero

Speaker:

on here, previously, and, it must

Speaker:

have been a good experience because immediately, we were contacted

Speaker:

to see if you would be interested in joining the show. And I said, sure,

Speaker:

let's have him on here and talk more about what declarative

Speaker:

ML looks like, and how that relates to kind of

Speaker:

low code. Yeah. Absolutely. So,

Speaker:

you know, what Predibase really is, is it's a platform that allows

Speaker:

engineers or developers to be able to productionize open source AI.

Speaker:

And so it came out of my cofounder Piero's experience working at

Speaker:

Uber, where he found himself being the machine learning researcher

Speaker:

responsible for all sorts of projects: rideshare ETAs,

Speaker:

fraud detection, those Uber Eats recommendations you always

Speaker:

get. And he found that each time he was more or less reinventing the wheel,

Speaker:

building each, you know, successive machine learning project. And

Speaker:

instead, you know, he, he wanted to do something that was a bit more efficient.

Speaker:

So he took each bit of work that he did, and he packaged

Speaker:

it into a little tool that, made it easier for him to get started the

Speaker:

next time. And eventually, this tool became popular enough at Uber

Speaker:

that they decided to make it a standalone project. And eventually, they open sourced it under the

Speaker:

name Ludwig, and other engineering teams kind of around the world found it very useful

Speaker:

as well. And what it really allowed anyone to do was be able to set

Speaker:

up their entire end to end ML pipelines in just a few lines of

Speaker:

configuration. So if you think about what infrastructure as code did

Speaker:

for, you know, software development, similar idea, but

Speaker:

brought to machine learning. You're able to start really easily, but then

Speaker:

customize as you need, and Predibase really is kind of, you know, taking that

Speaker:

same core concept and building the enterprise platform around

Speaker:

it. So any engineering team that wants to work with open source AI and open

Speaker:

source LLMs as an example, can use our platform to easily and

Speaker:

declaratively fine tune those models and then serve those directly

Speaker:

inside of their cloud. And that's, you know, large part of what we do

Speaker:

today. Interesting. Interesting. So

Speaker:

What what does that what does that look like? Like, we

Speaker:

know kind of generally what a a typical project looks like in terms of this,

Speaker:

right, like, how does this interface with because I think it was the 1 question

Speaker:

that I wish I'd asked, on the previous show. How does it

Speaker:

interface with something like data engineering? Right? Yeah.

Speaker:

We're I mean, we're, there's always gonna be rough spots. Right? So I'm not giving

Speaker:

you a hard time, but there's always gonna be sharp edges when you're handling any

Speaker:

kind of technology. Right? You've obviously kind of figured out the middle

Speaker:

part, but, like, what does that look like in terms of the interface to data

Speaker:

engineering? What's that look like?

Speaker:

Yeah. I'll answer in 2 parts. One of them is what does the user journey

Speaker:

look like? And then what's the intersection with data engineering? So in

Speaker:

the platform today, users do 3 things. The first thing they do is they connect

Speaker:

the data source. This could be a structured data warehouse like a Snowflake, a

Speaker:

BigQuery, Redshift, or unstructured object storage, just files directly in

Speaker:

S3. The second thing they do then is they declaratively

Speaker:

train these models. What that looks like is they more or less fill out a

Speaker:

template, you can think of it, just like a YAML configuration that says this

Speaker:

is the type of training job I want. The beauty is the template makes it

Speaker:

very easy for them to get started, but they can customize and configure as much

Speaker:

as they want down to the level of code. They can build and train as

Speaker:

many models as they want. And finally, after they've trained a model they're happy with,

Speaker:

they get to the 3rd step, which is they can serve and deploy that model,

Speaker:

make it available behind an API so any applications can start to ping it.

Speaker:

So that's what the user journey really looks like in CrediBase, and how does this

Speaker:

intersect with data engineering? So as you've probably heard before, like, you know,

Speaker:

Machine Learning is really In large part, really about the data that you're

Speaker:

using and like the quality of the data that you're using. Data

Speaker:

engineering comes in 2 places. The first is you need to get all

Speaker:

of your data wrangled across multiple different sources to be able to live in

Speaker:

one area that you can connect as an upstream source.

Speaker:

This is the Snowflake example, you know, of like getting that into a table.

Speaker:

And that piece of the journey lives outside of Predibase. That lives

Speaker:

as a step before you essentially connect it into your system. But then

Speaker:

there's the 2nd step that often happens, which we call data cleaning.

Speaker:

So you've gotten your table, but, you know, all of your text is in,

Speaker:

let's say, mixed lowercase and uppercase, you know, you have

Speaker:

really weird variable lengths. You haven't normalized numerical

Speaker:

data. Maybe you have images and things aren't actually, you know, resized

Speaker:

to scale. All of those data cleaning

Speaker:

techniques, we have packaged in as pre processing modules

Speaker:

inside of Predibase. And so what the declarative interface

Speaker:

allows you to do is train a full machine learning pipeline from data

Speaker:

to pre processing, through model training, through post processing and

Speaker:

deployment. And so once you've gotten your data wrangled into a

Speaker:

form, Predibase can take it in, help you clean up that data, and

Speaker:

then be able to train a model against it. Interesting. Because it's that

Speaker:

preprocessing that, you know, the the the nightmare is, you

Speaker:

know, this canonical example is address, you know, 123

Speaker:

Main Street versus an "St". Exactly. Right? That is not a lot of

Speaker:

fun for anyone. And then obviously the the the

Speaker:

lowercase uppercase thing like that becomes an issue too.

Speaker:

So what is the what is the what's the user experience look like? Right?

Speaker:

Like, is it drag and drop? Is it declarative?

Speaker:

Yeah. What what what does that look like? Like, what, you know, you mentioned user

Speaker:

journey, and I love that term. But like, what does that look like

Speaker:

from, from a practitioner's point

Speaker:

of view. Right? Like Definitely. Now the first thing I'll say

Speaker:

is, you know, our obviously underlying project is open source. You can check it out

Speaker:

at ludwig.ai, and you can even try out, you know, our full UI for

Speaker:

free on predibase.com. So if any part of this is a little too high

Speaker:

level, you can actually get involved for free, like, immediately. But

Speaker:

the user experience really looks like 2 ways. We have a UI

Speaker:

that's really built around our configuration language. And our

Speaker:

configuration language is just a small amount of YAML.

Speaker:

So your very first basic model can get started in just 6 lines.

Speaker:

What those 6 lines do is they say: these are the inputs I

Speaker:

want. So you pass it, you know, what is the,

Speaker:

column that is, you know, that contains the text you're predicting from. And

Speaker:

then the output is what is your, what is it that you're trying to predict?

Speaker:

So for example, my input is a sentence and

Speaker:

my output is, the intent. So I'm trying to do intent

Speaker:

classification with that model. And that's all the user defines, and

Speaker:

they can do this programmatically in our SDK or there's like a drag and

Speaker:

drop UI where they can build these components out together.
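To make that concrete, here is a minimal sketch of the kind of declarative configuration being described, using the open source Ludwig library that underlies Predibase. The column names follow the intent-classification example above; the dataset path is a placeholder, and exact config keys and return shapes can vary across Ludwig versions.

```python
# A minimal sketch of the declarative workflow, assuming the open source
# Ludwig library (https://ludwig.ai). "sentence" and "intent" follow the
# intent-classification example in the conversation; "intents.csv" is a
# placeholder dataset.
from ludwig.api import LudwigModel

# Roughly the "6 lines" of config: declare the inputs and the output to
# predict. Preprocessing, architecture, and training all fall back to
# Ludwig's defaults.
config = {
    "input_features": [{"name": "sentence", "type": "text"}],
    "output_features": [{"name": "intent", "type": "category"}],
}

model = LudwigModel(config)
train_stats, _, _ = model.train(dataset="intents.csv")

# Once trained, the model can predict directly, or be served behind an
# API (e.g. via the `ludwig serve` command) so applications can ping it.
predictions, _ = model.predict(dataset="intents.csv")
```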

Speaker:

The part that I think is really interesting, just based on my experience working on other automated machine

Speaker:

learning tools before, you know, no-code UIs for ML, is

Speaker:

that ML really is a last mile problem. And so you have this weird

Speaker:

complexity where you need to make it easier to get started, But a

Speaker:

lot of the actual value ends up being in the last 5 or 10% where

Speaker:

you customize some part of that model pipeline to get to work for your system.

Speaker:

And so what this configuration language, you know, does is, sometimes I

Speaker:

describe it as, it builds you like a prefab house. It gives you something

Speaker:

like, out of the box, that works end to end, and then you can

Speaker:

just change the little bit of the pipeline that you want declaratively,

Speaker:

which means in a single line. So you could say something like, you know, I

Speaker:

want the windows of the house to be blue or, you know, I wanna change

Speaker:

my preprocessing of the text feature to lowercase all the letters, and then you

Speaker:

can leave everything else up to the system.

Speaker:

We allow you to control what you want, and you just automate the rest.
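As an illustration of that "change one thing, automate the rest" idea, the same sketch config can declaratively override just the text preprocessing while leaving everything else to defaults. The `lowercase` preprocessing option exists for Ludwig text features, though option names may differ by version.

```python
# Extending the sketch above: one declarative override, everything else
# stays on Ludwig's defaults (assumed option name, per Ludwig docs).
config = {
    "input_features": [
        {
            "name": "sentence",
            "type": "text",
            # the single declarative change: lowercase all the letters
            "preprocessing": {"lowercase": True},
        }
    ],
    "output_features": [{"name": "intent", "type": "category"}],
}
```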

Speaker:

Interesting. Okay. So then it's kind

Speaker:

of the middle part of the journey. Right? Like the

Speaker:

Yeah. So how does this relate? Because you

Speaker:

said, you know, and I, you said automated ML. How much of this

Speaker:

is automated? I mean, like, what? Because I had just assumed

Speaker:

that, because I know I've heard of Ludwig as kinda like this automated ML.

Speaker:

And when I say automated ML, I mean, You know, for lack of a

Speaker:

better term, you know, here, there's a problem we're trying to solve.

Speaker:

Computer, you figure it out, you throw as much spaghetti at the wall and then figure

Speaker:

out which model is the best, Right. Yeah. Is is that

Speaker:

kind of the same thing here where I just say I wanna predict this and

Speaker:

then the underlying models and methods are kind of automatically figured

Speaker:

out? You know, I think that, that is an approach

Speaker:

that a lot of folks have tried with AutoML v one, as I kind of

Speaker:

often think about it. I actually was a PM on Vertex AI where we rolled

Speaker:

out our AutoML product as well. And the main issue we ran into

Speaker:

is, you know, in deep learning especially,

Speaker:

the search space is too big to be able to run an effective

Speaker:

hyperparameter search over all the different architectures and sub parameters you

Speaker:

might wanna be able to use. It sounds computationally expensive. Right? I mean,

Speaker:

it's potentially prohibitive, really, in order to be able to say, you

Speaker:

know, I want let's imagine you are, You know, in the modern world,

Speaker:

building a model to be able to build, let's say, content moderation

Speaker:

systems. How do you know which pre trained, like, should use a LAMA

Speaker:

or a BERT or a DeBERTa? Like, all of these models themselves are quite expensive

Speaker:

to go to train and fine tune, and each of them have their own sub

Speaker:

parameters. And so I think it becomes computationally prohibitive to run an

Speaker:

exhaustive grid search for your individual, types of,

Speaker:

individual types of use cases. And so what a lot of AutoML systems did

Speaker:

was they kind of just said, well, we know better than the user, so

Speaker:

we'll just make some selections. Right? And then we'll

Speaker:

make it as easy and simple for the user as possible. So the user

Speaker:

just provides a few inputs, we give them a model, boom, they'll be happy. And,

Speaker:

you know, I was actually I was, a PM for Kaggle. I was the 1st

Speaker:

product manager at Kaggle, a data science and machine learning community that grew to about

Speaker:

14 million users today, where we see a lot of citizen data scientists, and we rolled

Speaker:

out AutoML in that community as well. And we saw

Speaker:

a spike in usage and then extremely heavy churn

Speaker:

as soon as we, like, rolled it out. And if you interviewed those users, the

Speaker:

main reason why was because they didn't have any control or agency over that

Speaker:

model. So, like, it would essentially spit out a model

Speaker:

and say, here you go. You know, be happy. Go ahead and put this into

Speaker:

production. But like I was saying previously, ML is a last mile problem,

Speaker:

and no one is going to be comfortable using something they see as a dead

Speaker:

end. And that's where I think about, you know, our approach really kind

Speaker:

of differing. And so inside of Predibase, you can

Speaker:

actually kind of get that AutoML-like

Speaker:

capability, where you're able to

Speaker:

build a model just by saying, you know, here's the inputs, the model I

Speaker:

wanna fine-tune, and we will go ahead and get you the entire end to

Speaker:

end model. But if you want to edit anything, for example, you want to

Speaker:

edit, you know, the way we preprocess the data or the max sequence

Speaker:

length, you can go ahead and do it for any part of the model pipeline

Speaker:

in just kind of like one single statement. And that's kind of like a

Speaker:

large part of, you know, how we think about making it both easy to get

Speaker:

started, but also, like, flexible where it's not just a

Speaker:

toy, something you can actually use. Right. Because like,

Speaker:

you know, my first experience with AutoML was the,

Speaker:

was Microsoft's, offering. Right? And it

Speaker:

was, to get around the computationally prohibitive

Speaker:

parts, they narrowed the problem set you could do that on. Right? So it

Speaker:

was basically no neural networks. This was before Chat

Speaker:

GPT, before LLMs were, I wouldn't say a

Speaker:

thing, but before they were a major focus.

Speaker:

But, you know, so it was constrained. Right? So it would just

Speaker:

basically just throw a bunch of models at the problem and

Speaker:

then kinda test it out, which Yeah. I I think what you refer

Speaker:

to as, you know, AutoML v one. I think,

Speaker:

the world has evolved, and it's interesting to see how that goes. And,

Speaker:

the tooling looks really cool, actually. The,

Speaker:

for those for those who are listening to this as opposed to watching this, I

Speaker:

will make sure we we post that little snippet there. But

Speaker:

but, you know, like, what And you were at

Speaker:

Kaggle. Right? So Kaggle is kind of a big deal. What

Speaker:

I think that's really cool. Looking at your resume, it's very impressive, actually. You

Speaker:

you worked at Google, that would explain your interaction with

Speaker:

Vertex, and things like that. So so what

Speaker:

What what niche does this address or what need does this address that the existing

Speaker:

market didn't address? Right? And like what Yeah. Because I think that's really, I

Speaker:

think, where the rubber meets the road, particularly with an open source angle. I'm a big fan

Speaker:

of open source too. So,

Speaker:

Yeah. Well, let me start off by saying that, you know,

Speaker:

I think that the need has actually been unfilled in the market for a

Speaker:

while, but there is also a fundamental technology shift, and I'm gonna talk about both

Speaker:

of those pieces. So when I say the need was unfilled for a

Speaker:

while, yeah, I was a product manager on Vertex AI. I was a

Speaker:

product manager on Google research teams, productionizing machine learning, and we've hired

Speaker:

a number of folks now that worked as ML engineers across different companies. And I

Speaker:

remember when one of our ML engineers joined the team, he told me, Dev, I've

Speaker:

worked at 3 different companies doing machine learning for 3 different teams.

Speaker:

Everybody does it differently, and I think the truth is, you know, for

Speaker:

developers, there never really was like a de facto stack of here's how you do

Speaker:

an ML problem. For a data engineer, there is like a stack of, you know, what

Speaker:

are the best practices. There's obviously a lot of variation,

Speaker:

but there are, like, some best practices of, you know, what you're using for your

Speaker:

ETL pipelines, how you're thinking about being able to put things into data

Speaker:

warehouses, what your stack is for being able to query and downstream.

Speaker:

But in machine learning, it really looked like the wild west. Everyone was working

Speaker:

across different types of projects. And I think a lot of companies

Speaker:

tried to tackle that need, but unsuccessfully. And the

Speaker:

fundamental technology shift that I think actually changed was exactly what you were

Speaker:

talking about, which was, like you said, that the old school version of Azure

Speaker:

was not really any deep learning, maybe because it was computationally expensive for

Speaker:

others. To be clear, the automated ML part of it. I don't

Speaker:

wanna get a lot of hate mail, but yes. Sorry. Sorry to sorry to interrupt

Speaker:

you. Go ahead. No, no worries. I'm sorry to hijack the screen again,

Speaker:

but, like, you know That was awesome. I think this just the way that I

Speaker:

think about, like, the the change that's happened in industry is

Speaker:

machine learning 2 decades ago or even, like, 6, 7 years

Speaker:

ago looked very different than what it is today. And I

Speaker:

think that a lot of the hype around the LLM revolution is gonna actually

Speaker:

translate and be realized as just the hype of pre trained deep learning models.

Speaker:

Now, if we talk about ML 10 years ago, it basically looked like

Speaker:

predictive analytics. So people were doing things like I'm going to predict the price of

Speaker:

a house, and the way I'm gonna predict it is I'm gonna multiply the square

Speaker:

footage of the house by some number and add in the number of bedrooms, and

Speaker:

then figure out the coefficients based on my historical data. Really

Speaker:

structured data tasks, regressions and classifications and others.

Speaker:

But about 7 years ago, I think the really interesting pieces came out

Speaker:

with pre-trained deep learning models, with BERT using the transformer architecture,

Speaker:

the few image models even prior to that, that I think made it possible to

Speaker:

do 2 things. The first is you could start with larger amounts of

Speaker:

unstructured data. So now you didn't have to just work on these kind of more

Speaker:

boring predictive analytics, numerical only tasks, but you could work with text,

Speaker:

images, and others. And the second thing is you could start to actually use

Speaker:

them pre trained, so you didn't have to have as much data before you start

Speaker:

to get value out of it today. And what I think OpenAI showed was,

Speaker:

okay, if I scale these same types of models up by 2 or 3 orders

Speaker:

of magnitude, now people can use it with virtually no data whatsoever,

Speaker:

and I can actually prompt it and get responses, you know, directly.

Speaker:

But the underlying technology shift actually, I think is a shift towards

Speaker:

just pre trained deep learning models. And the truth is, as we get away from

Speaker:

some of this type of, like, the really cool conversational interfaces and we get to,

Speaker:

like, how do these models drive value inside of organizations, I think that

Speaker:

that's the emergent need for platforms like Predibase, which is how do I take

Speaker:

any of these deep learning models and then customize them for what I actually need

Speaker:

inside. So fine-tune and tailor it to my data, and then get it deployed inside of my organization for serving.
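One common open source way to do that task-specific customization is parameter-efficient LoRA fine-tuning, sketched below with Hugging Face's transformers and peft libraries. This is a generic illustration of the technique, not a description of Predibase internals; the base model name and hyperparameters are placeholders.

```python
# A generic sketch of task-specific fine-tuning of a pre-trained model
# with LoRA adapters (Hugging Face transformers + peft). Model name and
# hyperparameters are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base)  # for preparing task data
model = AutoModelForCausalLM.from_pretrained(base)

# Train small low-rank adapter matrices instead of all base weights.
lora = LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of weights

# ...train on your task data with a standard Trainer loop, then save
# just the adapter, which is what gets deployed for serving.
model.save_pretrained("intent-adapter")
```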

Speaker:

Yeah. That makes a

Speaker:

lot of sense. I think I think the

Speaker:

The need for training something from the ground up, I

Speaker:

think is overrated for most applications. Right?

Speaker:

Why teach and model all the intricacies of the human

Speaker:

language when that is already done, and you could take it

Speaker:

from kind of a you You know, the example would be, like, if I owned

Speaker:

a store. Right? And I needed someone to work the cashier.

Speaker:

Right? I could have another child, raise that child, change

Speaker:

his diapers, send it to kindergarten, teach it to learn, read, and write.

Speaker:

And in about 10 years, depending on labor laws, let's say

Speaker:

15 years. I'll have someone who can work that cashier,

Speaker:

plus however much it costs. Now, obviously, I'm not comparing a child to an

Speaker:

LLM, but I mean, or you could just find an existing person

Speaker:

out there, and say, here's how my register

Speaker:

system works. This is the nature of the job, And I can kinda start from

Speaker:

there as opposed to start from 0. You start from the 50th floor as opposed

Speaker:

to start from the basement. That's exactly

Speaker:

right. Yeah. I often think about, you know, these,

Speaker:

pre-trained LLMs as, like, well, what if I had like an army of

Speaker:

like, capable high school students? You know, in high school, you study all the

Speaker:

general subjects at, kind of, a broad level. Right? So you know

Speaker:

a little bit about history, a little bit about how to write, a little bit

Speaker:

about how to... You're not really an expert on any of those. Well,

Speaker:

the really interesting thing becomes then how you do, like, the vocational training or kind

Speaker:

of, like, you know, the task-specific fine-tuning, as we think about it

Speaker:

in ML parlance. And, I think that's where the cool opportunities get

Speaker:

unlocked. It's really amazing to see the fact that you can scale up to, you

Speaker:

know, as many intelligent agents as you want, but then you need to... our

Speaker:

favorite customer quote is, generalized intelligence is great, but I don't need

Speaker:

my point of sale system to recite French poetry. Right. So it's great that

Speaker:

you can go ahead and, recite history and others, but, like, how do you do

Speaker:

something very individual is what our platform is, oriented on.

Speaker:

No. That's that's a good point. That's that's a good point. Like, I I often

Speaker:

say, like, you know, do you want your cardiologist to be

Speaker:

also be a CPA, Or do you want them

Speaker:

to be a good cardiologist? I know if I were under an operation, I'd

Speaker:

probably wanna go with someone who was just all in on cardiology,

Speaker:

You know? Yeah. And those are actually the

Speaker:

2 trends I think we're gonna start to see with Gen AI, overall.

Speaker:

I think, you know, one trend is going to be people are gonna start thinking

Speaker:

of use cases that are more creative than just, you know,

Speaker:

question answering chatbot. So, you know, I think, like,

Speaker:

9 months ago, everyone I was talking to was like, I want Chat

Speaker:

GPT for our enterprise, and I'd say, okay, what does that mean to you? And they'd

Speaker:

either shrug and say no idea or they would say like, you know, I wanna

Speaker:

be able to ask a question about... The truth is, if you had this access

Speaker:

to this, you know, army of agents that are like high school capable, I'm sure

Speaker:

we can think of more interesting things than just basic question answering.

Speaker:

And then the 2nd big change I think is we aren't gonna use as much

Speaker:

of these super general purpose APIs in production. They're the easiest way to

Speaker:

experiment and get started. In production, you're gonna want your cardiologist to be the

Speaker:

expert in medicine and you don't really care if they know how to change a

Speaker:

tire or not. Exactly. That that is a a really good way to

Speaker:

put it. And I think that, you know, people, we're

Speaker:

still have to realize that we're still in the very early stage of this,

Speaker:

for lack of a better term, revolution. Right? Like, you know, because you're right. Like, I

Speaker:

talk to customers, and they say, we wanna we wanna get all all in on

Speaker:

Gen AI. Okay. What are you gonna do? Well, we want a chatbot.

Speaker:

Okay. I don't know if you've seen

Speaker:

this. I'm sorry. Go ahead. Oh, I was gonna say,

Speaker:

And it's not necessarily a bad starting point, but, you know, there's so

Speaker:

much more out there. Sorry. Well, no. I mean, exactly. Right? It's like, I want,

Speaker:

if you could do anything in the world, what would you do? I don't know,

Speaker:

take a day off, like, you know, but but that's you're missing the point, like,

Speaker:

you are... there's a meme going around. Again, I don't know

Speaker:

if it's true, it's a screenshot where a car

Speaker:

dealership had implemented some kind of ChatGPT bot. You've

Speaker:

seen this, you're nodding. Right? Where it basically sold a guy a car

Speaker:

for a dollar, and basically, the person got it to

Speaker:

say that this is a legally binding contract. Basically, tricked the

Speaker:

chatbot into saying it. Totally. No backsies, I think, was the phrase it

Speaker:

used. Right? And he got it to say things like, oh, no. Absolutely.

Speaker:

I wanna make you a happy customer, And you can have this Chevy Tahoe for,

Speaker:

like, $1 or something like that. And I don't know

Speaker:

how that's gonna play out in a court. Obviously, I imagine a

Speaker:

dealership is gonna have some, lawyers look into that,

Speaker:

and I'm not a lawyer, but I I can I can easily see like, you

Speaker:

know, this is a great example of, to your point, do you really need your

Speaker:

point of sale system, you know, to be able to recite

Speaker:

French poetry? Right? Now, I guess if I were, you know,

Speaker:

a very niche kind of bookstore slash

Speaker:

coffee shop, maybe? But for the most part, no. Right? And

Speaker:

and obviously, you know, there I wouldn't classify that as a

Speaker:

guardrail. I would say that more as a domain kind of boundary.

Speaker:

But, you know, these chatbots are gonna need guardrails too. Right? Not just the

Speaker:

obvious things that we always hear about, you know, but also, you

Speaker:

know, you don't wanna be giving things away. I haven't priced

Speaker:

what a Tahoe cost, but I imagine it's much more than $1.

Speaker:

Yeah. I bet, too. Yeah. I think it's actually a function of 2 things. The first

Speaker:

is we need some better infrastructure on guardrails of what models can and can't

Speaker:

say. And actually, by the way, this is where fine tuning is actually very

Speaker:

useful. It restricts... like, it's one of the best ways to reduce hallucinations. It,

Speaker:

like, teaches the model this is the type of thing that you're supposed to be

Speaker:

outputting, but it's not bulletproof. And I think that

Speaker:

actually the more, meaningful longer term conversation

Speaker:

is if you believe, like, I believe, and I

Speaker:

think a lot of folks working in this industry do, that AI will

Speaker:

become kind of a dominant aspect of most businesses

Speaker:

over the next decade. That like the companies that embed

Speaker:

AI are going to be the ones that survive and have differentiated value.

Speaker:

The ones that don't are likely gonna be less competitive. If you believe

Speaker:

that, it's also hard to imagine that you're going to defer all

Speaker:

control of the model to a third party. And that's where

Speaker:

things like, you know, It's one thing to say, like, we need the guardrails. It's

Speaker:

another thing, like, if you realize that if those folks were using something

Speaker:

like, you know, a commercial API that's behind a walled garden where you

Speaker:

don't have access to the model, you don't have access to the model weights. They're

Speaker:

kind of limited in what they actually can do. They can post process the

Speaker:

output of the results, but they can never really get that fine granular

Speaker:

level of control. And that's why we think the future is gonna be open source.

Speaker:

Because ultimately, people are going to wanna own those models, own the outcomes

Speaker:

of the part of the IP that they think is gonna drive a lot of

Speaker:

their enterprise value in the future. So our like, I would say our our

Speaker:

bet as a company is really on 2 things: fine-tuning and

Speaker:

open source. And I think that, you know, the example you just gave is a

Speaker:

good example of why I think the world is gonna have to move in both of

Speaker:

those directions. No. That makes a lot of sense. I think that open

Speaker:

source is important for a number of reasons. I mean,

Speaker:

not the least of which is, you know, we we have seen recently that if

Speaker:

if if these things are behind a commercial firewall,

Speaker:

If, for instance, there was some kind of, I don't know, political shake

Speaker:

up inside of said company board, which of course would never

Speaker:

happen. Right? Never happened. Then

Speaker:

you are taking on that risk. Right? Which is, I think, another

Speaker:

reason why open source, just generally in industry, is

Speaker:

popular because decisions tend to be made at the community

Speaker:

level. Right? Now, there's obviously flaws with that approach

Speaker:

too, but It is, and I would use this as an example

Speaker:

of if you look at HTML and JavaScript Yep. Versus

Speaker:

say Flash and dare I say Silverlight. Right? Flash was

Speaker:

always a proprietary product. Silverlight, if people remember it, was also a

Speaker:

proprietary product, but HTML,

Speaker:

JavaScript had its flaws, but eventually, they did get their act together,

Speaker:

and it it has a certain more

Speaker:

implicit compatibility. And I think with AI, I think the

Speaker:

it's not so much about compatibility. It's implicit transparency.

Speaker:

You get with open source AI. Right. Is it perfect? Is it totally

Speaker:

transparent? No. That that's not the point. But the

Speaker:

point is you're starting at a much more transparent place, almost

Speaker:

by default, or maybe translucent,

Speaker:

as a default, as opposed to completely opaque.

Speaker:

Yeah. I I think that it's both the transparency and the

Speaker:

control that's critical. Yes. It's the fact that people do not only

Speaker:

introspect and understand what's happening, but they can edit and change, you know,

Speaker:

in instances. Even if, like, for a lot of our models, users do not

Speaker:

edit 99% of the pipeline, but it's important that they're

Speaker:

able to edit all of it, and that they do make the edits to the

Speaker:

1%. And I think that exists for open source. And I think from just like

Speaker:

an industry macro standpoint, you know, trying to fight open

Speaker:

source and developer platforms is like trying to fight physics,

Speaker:

basically. It's kind of against the natural working of those systems.

Speaker:

And so our view is that, you know, people are

Speaker:

gonna come out with amazing models. And some of them are gonna be commercial, and

Speaker:

some of them are gonna be open source. The open source size of the pie

Speaker:

is going to grow, and I think you've seen this already, right? Like, it

Speaker:

has caught up, so quickly. Like the

Speaker:

open source traction has caught up so quickly to everything else. Our

Speaker:

view is just like, what do you need when you want to use open source?

Speaker:

Well, you need the you need the infrastructure around it. You need to be able

Speaker:

to plug it into proprietary, settings. You need to be able

Speaker:

to create those guardrails around it. That's, you know, where we think about Predibase

Speaker:

providing the infrastructure for being able to use open source. Interesting.

Speaker:

Well, this is a fascinating conversation. We could probably go on for another hour or

Speaker:

And I definitely would love to have you or someone else from Predibase back, because

Speaker:

I think, you know, it's just a cool idea. Right? Like it and

Speaker:

and I think that it really fills a missing piece of the puzzle

Speaker:

in terms of making this, you know, when you say

Speaker:

YAML, when I think YAML, I think OpenShift, right, obviously, you know, I work at Red

Speaker:

Hat, that's kinda, but I mean, I think that,

Speaker:

it's one thing to open source the model. It's quite another to how do you

Speaker:

manage and control that animal? Right. Because these are

Speaker:

not these are not tiny little things. Right? These are

Speaker:

potentially very compute intensive activities. Right. So you

Speaker:

don't want you wanna be efficient. That's the way the world has gone.

Speaker:

Right? It's more compute intensive and,

Speaker:

heavier weight, and so that's where the infrastructure components become

Speaker:

critical for any company that's actually gonna use it. Absolutely. And you have to at

Speaker:

least, if you can't be 100% efficient, because you really can't,

Speaker:

but you wanna at least, prioritize towards compute efficient

Speaker:

activity. Because otherwise, you are literally throwing money out the

Speaker:

door. And I think that it looks like

Speaker:

your tool is really good at kind of making it

Speaker:

so it's compute efficient, like, or at least that that

Speaker:

it goes a long way to helping that. I'm sure you can probably do some

Speaker:

serious damage with any tool. Right? Like, I wouldn't give my 2

Speaker:

year old a chainsaw. You know what I mean?

Speaker:

But, now that's interesting. So

Speaker:

now we're gonna transition into the pre canned questions.

Speaker:

How did you find your way into data or AI? Like,

Speaker:

did you find AI or did AI find you?

Speaker:

That's an interesting question. I,

Speaker:

I first got into it just out of studying

Speaker:

computer science. You know, I when I went into university, I thought I

Speaker:

wanted to study economics. Really liked, you know, the theory

Speaker:

behind economics. I took an intro to computer science class because I thought it'd be

Speaker:

interesting. And that more or less just completely shifted where I went

Speaker:

because CS was actually magic. You know, economics is a great way to be

Speaker:

able to explain things that were happening in the world, but with computer science, you

Speaker:

could actually build systems. And that was really interesting.

Speaker:

And then I found the 1 piece that I think I liked just as much,

Speaker:

which was statistics. And the natural

Speaker:

marriage of computer science and statistics really is, you know, data and data

Speaker:

science. And so, I'd studied it for a while, and then

Speaker:

when I went to, you know, go work in a professional industry.

Speaker:

I first started off as a PM at Google, and I worked at completely different

Speaker:

things on Firebase, developer platform, authentication, security. I

Speaker:

remember somebody saying like, you know, you have to work on what you're most passionate

Speaker:

about. You know, as a new college graduate, I had no idea what I was passionate about

Speaker:

professionally. And so I thought back to, you know, the things that I'd studied that

Speaker:

I found the most interest in, that I found the most fun to work on.

Speaker:

And it really was those data science projects, honestly, starting with the early

Speaker:

Kaggle competitions that I did in 2013, where you were trying

Speaker:

to compete to see who could build the best housing prices model who could build

Speaker:

the best recommender system model, and you had to exploit all

Speaker:

these interesting nuances in data and models to be able to get there.

Speaker:

And so I just found it so fun. And then

Speaker:

I think after a little while, I found it frustrating

Speaker:

that everyone else didn't have sort of the same access to those types,

Speaker:

those types of experiences and tools. And so that's where the experience really

Speaker:

began. I would say, you know, early on, just having that academic

Speaker:

background and then seeing the problems kind of being manifested in Google and

Speaker:

eventually, you know, working as well on Kaggle, with the data science and machine learning

Speaker:

community there. Interesting. Interesting.

Speaker:

I see you did a brief stint in cybersecurity for a while,

Speaker:

Which is funny because I think people see that as a as a totally separate

Speaker:

discipline, and in a very real sense, it is. But I think that in

Speaker:

a very real sense, a big chunk of cybersecurity is

Speaker:

monitoring logs and input data and figuring out what's happening.

Speaker:

That all sounds familiar, doesn't it?

Speaker:

I think cybersecurity, you know, when I was doing cybersecurity, work, it

Speaker:

was very, very much in the early days, strategic, how to

Speaker:

think about risk postures at an enterprise level. Right. But I think what's

Speaker:

really interesting now is, cybersecurity and AI are gonna have

Speaker:

a very interesting marriage where cybersecurity is gonna be influenced

Speaker:

by AI. For example, we work with 1 company today that does open source supply

Speaker:

chain security, and they're looking at using LLMs to read code and be able to

Speaker:

do things like identify vulnerabilities, advise on remediations, and

Speaker:

others. And so one obvious area is going to be that

Speaker:

cybersecurity companies themselves are gonna get revolutionized with AI. But

Speaker:

But this is gonna be one of the industries where there's kind of like the

Speaker:

bidirectional arrow as well. AI is gonna need some cybersecurity

Speaker:

best practices too. Yeah. These weights are now,

Speaker:

open source. How do you think about whether or

Speaker:

not the security governance factors should be

Speaker:

on the inputs, you know, when the data is fed into the model,

Speaker:

in the model layer itself, like, how the model processes

Speaker:

that data, or on the outputs. Like, what is the framework for thinking

Speaker:

about, like, you know, which ones introduced what kind of risk? And the type of

Speaker:

industry that's had the most experience in this historically is the cybersecurity industry,

Speaker:

thinking about how we deploy software internally and others, and so that

Speaker:

marriage is gonna be, I think, really interesting. I bet there's gonna be really best

Speaker:

of breed companies in both worlds. I could totally see that.

Speaker:

I think that's a very good cogent response to,

Speaker:

you know, these are not isolated industries. Right. I mean, they

Speaker:

obviously have different origin stories, but I I could

Speaker:

totally see them merging. And to your point, right? I mean,

Speaker:

Yeah. If you look at potentially 2

Speaker:

things, right? One, the amount of input

Speaker:

data that you have, like, could that be poisoned in a way that could produce

Speaker:

negative effects later on in an LLM? And 2,

Speaker:

we don't really know the sort of, for lack of a better term, latent spaces

Speaker:

that exist in these extremely large complicated,

Speaker:

models. Like, I'm sure you've seen this, but there was a random

Speaker:

string of characters that would produce bizarre output

Speaker:

in ChatGPT. And there was also one that would basically short-circuit

Speaker:

the, the safety rails inside of

Speaker:

some of these LLMs too. And it was just like,

Speaker:

wow. I mean, you know, how was that figured out?

Speaker:

Was that random, or did somebody kind of understand that there are weird

Speaker:

latent spaces and how to manipulate that. I think that is gonna

Speaker:

be a new frontier opening up, in the

Speaker:

not too distant future. If it hadn't already happened,

Speaker:

honestly. Yeah. I agree. I agree. And I think

Speaker:

it starts with understanding that, You know, those those

Speaker:

bits of, I guess, entropy that feel random to us are,

Speaker:

are more features oftentimes than bugs. So the fact that the random characters

Speaker:

produce, like, a weird output, it's actually really interesting

Speaker:

because what that means is maybe I don't need to type out a full

Speaker:

English paragraph to get this model to do what I want. You know, there's really

Speaker:

cool things in prompt compression where people have basically been like, can I just

Speaker:

say, like, a couple of characters AFD, something that would mean

Speaker:

nothing to you and I, but the model understands that means, okay, go ahead and

Speaker:

pick up the dry cleaning on the way home and then make sure that you've,

Speaker:

you know, swung by and filled up... like, essentially a set of instructions that get compressed

Speaker:

into this model's internal representation? So I think we're barely

Speaker:

scratching the surface of it. It's one of many ways that, I think, the

Speaker:

LLM revolution is gonna be really interesting, in ways that we haven't fully

Speaker:

explored yet. I couldn't have said it better myself.

Speaker:

Our next question, what's your favorite part of your current

Speaker:

gig? My

Speaker:

favorite part is probably the part that's also, I think, one of the most

Speaker:

challenging is the space is moving so quickly. I know people

Speaker:

say that frequently, but the truth is I've heard people say that about different

Speaker:

technologies historically, and I'm like, yeah, it's moving faster than other

Speaker:

things. You know, for example, mobile moved quickly.

Speaker:

There were, over many years, transformative things that happened.

Speaker:

The timescale that our world is kind of dominated by, I'm gonna

Speaker:

say our world; I think I just mean, like, you know, the AI movement

Speaker:

so far over the last year, it's in weeks. Right? Like, every

Speaker:

few weeks, there's a new seminal, groundbreaking thing, whether it's,

Speaker:

Yeah. I I can think about the moments where, like, Llama got introduced as an

Speaker:

open source model. Its weights got leaked. That was amazing because it spurred

Speaker:

a whole new community. GPT 3.5 got upgraded to GPT

Speaker:

4, a new set of capabilities that came out there. Llama 2 came out

Speaker:

this year with commercially viable licenses and like, You know, really, I

Speaker:

think, best in class performance up to the

Speaker:

point that Mixtral came out, which was a, you know, mixture of experts

Speaker:

model, significantly smaller, doing as well as ChatGPT. This was only

Speaker:

a few days after Google released Gemini, you know, their own, model.

Speaker:

We have AWS in the race with Bedrock. It's kind of like, you know, an

Speaker:

interplay between different providers. I'm saying a

Speaker:

lot of sentences, but, like, the really interesting piece of it is all that's

Speaker:

really come out in the last 6 months, and I haven't even covered, like,

Speaker:

all the academic, you know, like It's wild. It's wild. Like, so I

Speaker:

was on a cruise, like, we were talking in the virtual green room, and I

Speaker:

had intermittent Internet, and I looked at my phone far more than I should,

Speaker:

for being on vacation, but it was just like Gemini happened,

Speaker:

AMD made some hardware announcements. And I know

Speaker:

hardware... The unintended

Speaker:

consequence of being compute intensive is that hardware starts to matter again.

Speaker:

Right? Yeah. If you were a software

Speaker:

engineer, obviously, mobile, let's take that out of the conversation.

Speaker:

But if you were a software engineer building websites, hardware wasn't really a major

Speaker:

concern. Right? It was kind of pushed to the side. I mean, it

Speaker:

mattered, when you got, like, your Amazon bill was through the roof

Speaker:

and you weren't as efficient as you should be. But I mean, it wasn't really

Speaker:

a major concern. Now we have let's say it's starting to be a limiting factor

Speaker:

in terms of, you know, how many H100s you can get your hands

Speaker:

on. Right? It's it's,

Speaker:

no. But, but you're right. Like, I mean, just I missed a week and I

Speaker:

still feel like I'm catching up and that was like almost 2 weeks ago. So

Speaker:

Yeah. And the, and that's the most exciting piece for us.

Speaker:

Right? Because all this change has created a lot of opportunity. So

Speaker:

We got a lot of popularity recently for something called LoRAX.

Speaker:

Mhmm. It's an open source project that we released that basically,

Speaker:

was just a problem we had to solve for ourselves. The industry is moving

Speaker:

quickly. We needed to allow people to fine tune and serve large language

Speaker:

models for free in our trial. Now every single one of

Speaker:

these LLMs requires a GPU, and sometimes bigger, heavier,

Speaker:

meatier GPUs. And so if we're giving away a lot of free trials to, you

Speaker:

know, people just on the Internet who are all using a GPU,

Speaker:

investors would not be the happiest. And so we needed to figure out a better

Speaker:

solution where we could actually serve many, potentially hundreds, of these

Speaker:

large language models on the same individual GPU. And

Speaker:

so we, we came out with a really cool technique to be able to do

Speaker:

that. We called it LoRAX, for LoRA Exchange.
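Based on the open source LoRAX project's documented Python client (pip install lorax-client), serving many fine-tunes off one base-model deployment looks roughly like the sketch below; the endpoint and adapter IDs are placeholders.

```python
# A sketch of the LoRAX idea: one deployment of a base model on a single
# GPU, with per-request LoRA adapters swapped in dynamically. Endpoint
# and adapter IDs are illustrative placeholders.
from lorax import Client

client = Client("http://127.0.0.1:8080")  # a running LoRAX server

# Two different fine-tunes served from the same base model deployment,
# selected per request via adapter_id.
support = client.generate(
    "Classify: 'my card was charged twice'",
    adapter_id="acme/intent-adapter",
    max_new_tokens=32,
)
summary = client.generate(
    "Summarize this ticket: ...",
    adapter_id="acme/summarize-adapter",
    max_new_tokens=64,
)
print(support.generated_text, summary.generated_text)
```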

Speaker:

And we open sourced it, and it got a lot of popularity. One of the reasons

Speaker:

that I think it got picked up in such a way was because it really

Speaker:

kind of just fed into the main thought process in the

Speaker:

moment, and everyone's staying up to date on kind of the latest. So, you know,

Speaker:

it kind of fed nicely into that hardware constraint, area of the world

Speaker:

as well as kind of a need that the market had. And so It's been

Speaker:

really fun, I think, to just be on top of that. Very cool. Very cool.

Speaker:

So we have 3 complete-this-sentence questions. The

Speaker:

first one is when I'm not working, I enjoy blank.

Speaker:

I have a very San Francisco answer to this question. But when I'm not

Speaker:

working, I enjoy being outdoors. And in

Speaker:

particular, I really enjoy biking, taking a road bike and going up a mountain,

Speaker:

because the reward at the end of that's amazing. And playing tennis, those are

Speaker:

probably the 2 things that, you know, I I enjoy the most. Very

Speaker:

cool. San Francisco is perfect for that sort of thing, with the bikes and

Speaker:

the mountains and the ocean. It's gorgeous. Yeah. Yeah. It's

Speaker:

gorgeous. I think the coolest thing about

Speaker:

technology the coolest thing in technology today is blank.

Speaker:

The accessibility. I think the coolest thing about technology today is the fact

Speaker:

that I can go ahead and run GPT four

Speaker:

or Llama 2 70 billion, the commercial variant of, you

Speaker:

know, the leading edge or the open source variant. I can run both

Speaker:

of them More or less for free, at least to try out

Speaker:

for, like, you know, a little while. And that's sort of the same thing that,

Speaker:

you know, big bank over here is gonna be using Or, you know,

Speaker:

leading technology company over there. Now, at least as the starting

Speaker:

point where it starts to diverge is like how, when you get heavier into the

Speaker:

customization and others. The coolest thing about technology to me is

Speaker:

and again, I think of it very much from, like, an AI-centric lens,

Speaker:

just given my day to day. But, it's the fact

Speaker:

that, you know, the graduate student, you

Speaker:

know, somebody abroad in a different country, and then, you know, the

Speaker:

ML engineer at a company like Netflix, all have some shared experience

Speaker:

of language based on technology that just came out this year

Speaker:

Because the barriers to entry are not significantly high to be able to get

Speaker:

started. Now, I think the barriers to entry are still too high to, you know,

Speaker:

go from prototype to production. That's what we wanna be able to lower, but that's

Speaker:

to me the most compelling thing that we've done. That's very cool.

Speaker:

The 3rd and final is: I look forward to the day when I can use

Speaker:

technology to blank.

Speaker:

That's a good question. I think I look forward to the day,

Speaker:

when I can use technology to, to be sort

Speaker:

of like the adviser and whiteboarding

Speaker:

buddy, if that makes sense. So if you think about,

Speaker:

like, what you often do with an advisor, it's, It's

Speaker:

actually generative in a lot of ways. You'll walk through a problem with them.

Speaker:

I do this with my dad all the time. And so, you know, he and

Speaker:

I will talk through Some challenge that I'm thinking about at work

Speaker:

or something else. And he doesn't have all the context, you know, that I

Speaker:

might, but he's able to apply these like general frameworks and come up

Speaker:

with a few different types of suggestions based on based

Speaker:

on that. And some of them, because he's coming from a very different place, Might

Speaker:

be different than the way that I thought about it. And I

Speaker:

actually see that as a capability for,

Speaker:

for technology as well. I mean, you

Speaker:

know, you've actually seen like companionship apps in terms of like, you know,

Speaker:

psychological help or behavioral help, or just having someone to

Speaker:

talk to is actually like a use case that these models have already

Speaker:

started to pick up on, within like a niche group of users. And what I

Speaker:

think would be interesting is, you know, if you think about what you probably lean

Speaker:

on friends or family and other types of things for, I

Speaker:

think should still be friends and family and others. They are the ones who know

Speaker:

you best, but the model can be like one additional source of that

Speaker:

input. And it's gonna be really cool when, like, you know,

Speaker:

if you're if you're working through something hard and you wanna go ahead and, you

Speaker:

know, you get, like, get a few ideas for how to be able to go

Speaker:

through it, you can text your family group, you can text your friend group, and

Speaker:

you can ask the model that knows you, and you can kind of pick the

Speaker:

best idea amongst those 3. That's a great idea. I think that, a

Speaker:

lot of the media hype around things like Replika AI and things like that has

Speaker:

been like, oh my god, it's gonna replace human interaction. And it's like, Are

Speaker:

they intentionally missing the point, or is it clickbait? Like, I can't tell.

Speaker:

Right? Are they clueless by default, or are

Speaker:

they clueless to make money? Not really sure. But I think that you're right.

Speaker:

It's meant to augment. Right? And I think that's a very healthy way to look

Speaker:

at it too, you know. Because I if I get stuck writing something. Right? Like,

Speaker:

I'll ask ChatGPT. Like, hey, how would you word this?

Speaker:

Right? Sometimes it comes up with a good answer, but at least it it kinda

Speaker:

clears the logjam in my head, where I'm like, oh, okay. Let me let

Speaker:

me go around it this way. I think that's a, I think that's an

Speaker:

underrated use for AI or these LLMs.

Speaker:

Yeah. I totally agree. Share something different about

Speaker:

yourself. We always joke, like, you know,

Speaker:

remember, it's a family, iTunes

Speaker:

clean-rated podcast. Something different about

Speaker:

myself. Yeah. I don't know if it's different or at least something that,

Speaker:

not a lot of folks know about me, like, when they first

Speaker:

meet me, but I'm a 1st generation immigrant, as is, like, my entire

Speaker:

family. So I was actually born, in India, came over, you know, when I was

Speaker:

a lot younger. So that I think is interesting because

Speaker:

I was both that, but also grew up right here in the Bay

Speaker:

Area. You know, I I think very much saw, like, the tech

Speaker:

I I think very much saw 2 things. One of them was just the US

Speaker:

kind of as a corollary and adjacency to India

Speaker:

where, like, my parents had spent the vast majority of their lives and, you

Speaker:

know, where we had come from. And then the second was like a very specific

Speaker:

part of the US with Silicon Valley that just had a

Speaker:

very interesting culture, some healthy disregard for the

Speaker:

rules in some regard, not always for the best, but sometimes for the best.

Speaker:

And a real kind of inclination towards, you know, moving very quickly and kind of

Speaker:

being on the latest, and very progressive in that way. And

Speaker:

so I think that this might be a little bit more of a backstory

Speaker:

than an interesting individual fact, but I do think that, you know,

Speaker:

immigration, especially to this area, I think

Speaker:

was kind of a very different experience, at least, than what

Speaker:

I think a lot of other folks that I've talked to have. Yeah. I often

Speaker:

wonder what it would be like to grow up in the Bay Area, and I've

Speaker:

met some people through through work and things like that who did. And they're like

Speaker:

It's hard because if you grew up there, it's kinda all

Speaker:

you know, so you don't really have a good, yeah, benchmark. Like, I grew up

Speaker:

in New York City, and people are like, oh my god. How could you grow

Speaker:

up there? I'm like, I don't know. It was just... So I

Speaker:

grew up in the Bay Area and then went to school in the Northeast and,

Speaker:

you know, there's some things you realize, definitely. One of them

Speaker:

is, yeah, fewer people wear, like, hoodies and, you know, flip-flops,

Speaker:

boat shoes are more of a thing. Like, there's all sorts of funny changes,

Speaker:

you know, that exist culturally, especially. I think the

Speaker:

biggest thing that I've kind of picked up on is, like,

Speaker:

the Bay Area, or at least

Speaker:

the environment I grew up in, has a very, like, risk-forward culture. It's kind

Speaker:

of a "why not, what's the worst that happens" mindset. Whereas I feel like a lot of other

Speaker:

areas are a little bit more steeped in tradition and view

Speaker:

that as a good thing. I think the Bay Area,

Speaker:

potentially, and not to say one is right or wrong, but I think the Bay

Speaker:

Area has a bit more of a culture of healthy disregard

Speaker:

for tradition. And, you know, I

Speaker:

think Sofia had the great quote about tradition,

Speaker:

that I'm forgetting. But, like, it's,

Speaker:

yeah, I think it's one thing that I definitely think about, especially the difference between,

Speaker:

like, for example, where I grew up and the Northeast, where I spent some time.

Speaker:

Right. Right. And I'm inferring, because you went to Harvard, that you

Speaker:

were in Boston, and Boston is kind of its own, yeah, its own corner

Speaker:

of the Northeast. If you ask somebody, like, you

Speaker:

know. I've lived in Europe, I've lived

Speaker:

in

Speaker:

New York, and now the DC, kind of Richmond, now

Speaker:

Baltimore. There are slight variations in culture, but like, I

Speaker:

can only imagine like how much of a shock it would have been from like

Speaker:

the Bay Area to, like, Boston, especially.

Speaker:

Right? Where I think things are far more rooted in tradition

Speaker:

there. Right? Yeah. And it's not a knock on it. Right? Like, I

Speaker:

will knock on their baseball team, but that's another story. Right?

Speaker:

But, you know, still, I mean, the

Speaker:

the Boston area is also known for its innovation in both

Speaker:

biotech and technology. Right? So these are not mutually exclusive

Speaker:

things. Right? They're just different approaches.

Speaker:

Absolutely. And both of them have worked, you know, really well for those respective

Speaker:

areas. One of them feels a lot more like home to

Speaker:

me. But I think, you know, it was fun and interesting to kind of see

Speaker:

those 2 differences, especially spending time in both cities.

Speaker:

Yeah. That's cool. That gives you a unique perspective on, you know, that the

Speaker:

US culture is not one monolith, it's just fragments of

Speaker:

different things. It's an interesting perspective. I almost

Speaker:

have to ask, like, was it as much of a culture shock coming to the

Speaker:

US or coming from the Bay Area? Well, honestly, the Bay Area to

Speaker:

anywhere else. Right? You know, the weird thing

Speaker:

is I didn't expect the culture shock. I expected the culture shock coming to

Speaker:

the US, both for me, you know, I was young, and especially for my family.

Speaker:

Yeah. I think that was there, but you're kind of expecting

Speaker:

it. And so it's always something that you're well prepared for. I don't think I

Speaker:

expected the culture shock going from the Bay Area to Boston.

Speaker:

Because these are 2 cities in the US. These are 2, you know, progressive

Speaker:

cities that are well educated in the United States. How different can they be?

Speaker:

And you don't actually notice the difference, I think on a one day or two

Speaker:

day visit, you kinda notice the difference when you actually spend a longer period of

Speaker:

time there and understand the undercurrent. So yeah. It

Speaker:

wasn't a shock actually as much as it was kinda cool. Like, I appreciated

Speaker:

that 2 places in the US could actually feel very different because,

Speaker:

you know, diversity is the spice of life. So actually, really, I liked

Speaker:

it even though it was maybe different from how I thought. That's cool. That's

Speaker:

cool. The winter must have been a good shock for you. The

Speaker:

winter was a shock in less of a positive way. Yeah. Diversity is the spice

Speaker:

of life, minus the weather. Yeah. I'll take

Speaker:

70 degrees and sunny year-round all day. Were you there during the year they

Speaker:

had, like, a record amount of snowfall? Like, something like fifteen

Speaker:

feet over the winter? I was. Yeah. Exactly. Yeah.

Speaker:

Yeah. Campus shut down. Yeah. I was a student then,

Speaker:

and, you know, as I was saying, very healthy risk

Speaker:

appetite. I think everyone was out in the yard, like, throwing snowballs at each

Speaker:

other while there was, like, a record blizzard. So it was

Speaker:

fun. It was less fun when the snow was still on the ground in May, in

Speaker:

June. That was when I was thinking, get out of here.

Speaker:

Do you listen to audiobooks at all? Yes. I

Speaker:

I read more often, but sometimes I do listen to audiobooks for convenience.

Speaker:

Do you have any recommendations?

Speaker:

I really like The Happiness Advantage by Shawn Achor.

Speaker:

It's, yeah, it's a book about how,

Speaker:

I think there's a thought process that, you know, like, success breeds happiness,

Speaker:

but this is also, like, work by a behavioral psychologist, on how happiness can breed

Speaker:

success and just how to be able to be in that mindset more often. And,

Speaker:

you know, it's a weird book because it's actually kind of styled as a business

Speaker:

book. But I actually think it's a lot about like personal development. And

Speaker:

so, yeah, that's definitely one I'd recommend.

Speaker:

Cool. Audible is a sponsor of the show. And if you go to thedatadriven

Speaker:

book.com, you will get 1 free book on us. And,

Speaker:

if you sign up for a subscription, you get a

Speaker:

subscription of knowledge, and we get a little bit

Speaker:

of a kickback for them being a sponsor. And

Speaker:

finally, where can people learn more about you and Predibase?

Speaker:

Yeah. Absolutely. So, the obvious and easiest answer there is of

Speaker:

course predibase.com. I think, you know, we've learned,

Speaker:

the easiest way to learn more is just to go ahead and try it.

Speaker:

And so you'll see things there like documentation, you'll see a bunch of

Speaker:

videos on our, blog page, which are short, 3 to 5

Speaker:

minutes, and our YouTube channel, Predibase, p

Speaker:

r e d i b a s e, actually has longer form, 1-hour pieces of

Speaker:

content that are more educational. But I'm a big believer that the

Speaker:

easiest way to actually learn is just to be able to get your hands dirty.

Speaker:

So if you click that try for free button, you'll get a few weeks and,

Speaker:

you know, credits. We'll give you the GPU out of the box so you can

Speaker:

run all these models yourself, and you can learn firsthand. That's usually the easiest

Speaker:

way, you know, to be able to get started. And then if you wanna

Speaker:

learn a little bit more about our underlying technology, we've open sourced

Speaker:

both of the key components. So for how to train models, we have Ludwig,

Speaker:

And then for how to be able to serve models, we have LoRAX. And

Speaker:

so those are the 2 L's that you can kind of use in order to

Speaker:

be able to understand how the tech works under the hood. Very cool.
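
For readers who want a concrete picture of that declarative workflow, here is a minimal sketch using Ludwig's Python API. It is an illustration, not a snippet from the episode: the feature names, column types, and CSV file names are placeholders, and the config shape follows Ludwig's documented input_features/output_features format.

    from ludwig.api import LudwigModel

    # Declarative config: describe the inputs and outputs, and Ludwig
    # assembles and trains an appropriate model from this description.
    config = {
        "input_features": [{"name": "review_text", "type": "text"}],
        "output_features": [{"name": "sentiment", "type": "category"}],
    }

    model = LudwigModel(config)

    # train() expects a dataset whose columns match the feature names above;
    # it returns training statistics, preprocessed data, and an output directory.
    train_stats, _, output_dir = model.train(dataset="reviews.csv")

    # predict() returns a DataFrame of predictions plus an output directory.
    predictions, _ = model.predict(dataset="new_reviews.csv")
    print(predictions.head())

Serving a trained model in production is the other half, which is where LoRAX comes in; the sketch above covers only the Ludwig training side.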

Speaker:

Thanks for joining us on the show, and thank you once again for your patience

Speaker:

as we work through some scheduling conflicts,

Speaker:

And, I'm glad we had this conversation. You're always welcome back on the

Speaker:

show, and I'll let the nice British AI lady finish the show.

Speaker:

Thanks, Frank, and thanks, Dev. What a

Speaker:

splendid conversation that was. It felt like

Speaker:

navigating through a maze of data with only the smartest chaps as my

Speaker:

guides. To our listeners, I hope your brains are

Speaker:

buzzing with as much excitement as mine is, metaphorically speaking,

Speaker:

of course, since my excitement is more of a series of well-organized

Speaker:

algorithms. To our dear listeners, if today's chat

Speaker:

has ignited a spark of curiosity in you, then I dare say we've

Speaker:

done our job. Remember, the world of AI is vast

Speaker:

and ever evolving, and it's thinkers and doers like Dev who keep the digital

Speaker:

wheels turning. Before we sign off, a gentle

Speaker:

reminder to keep your minds open and your data secure.

Speaker:

Until then, be sure to like, share, and subscribe as the

Speaker:

kids say these days.
