Artwork for podcast HockeyStick Show
HockeyStick #8 - Generative AI in Action
Episode 820th May 2024 • HockeyStick Show • Miko Pawlikowski
00:00:00 01:09:43

Share Episode

Shownotes

Demystifying Generative AI: Insights from Microsoft's Amit Bahree

In this episode, Miko Pawlikowski interviews Amit Bahree, a principal group technical program manager at Microsoft and the author of 'Generative AI in Action'. They delve into the complexities and applications of generative AI, particularly in a professional setting. Amit shares his journey into AI, the motivation behind his book, and provides insights into the future of AI technologies, including Large Language Models (LLMs), Small Language Models (SLMs), and the significance of prompt engineering. They also discuss the importance of multimodal AI for future developments and the role of open-source models in the AI community.

00:00 Welcome to Hockey Stick: Diving into Generative AI

00:59 The Genesis of "Generative AI in Action"

02:11 Amit Bahri's Journey into AI

03:36 The Multifaceted Role of a Technical Program Manager at Microsoft

06:20 The Sam Altman Saga and Microsoft's Position in AI

08:45 Why Generative AI is a Game Changer

13:07 The Evolution and Impact of AI Technologies

15:14 Exploring the Landscape of AI Models

18:22 Introducing Phi-3: A New Benchmark in AI

24:36 The Future of Small Language Models

32:33 Real-time AI: The Quest for Seamless Interaction

35:04 The Evolution of User Interfaces: From Apps to Voice Commands

36:49 Democratizing Technology: Real-World Examples of AI in Action

40:09 The Dark Side of AI: Ethical Dilemmas and Security Concerns

41:44 Navigating the New Landscape: Security and Ethical Considerations

45:06 Generative AI in Action: A Guide to Practical Applications

49:51 Exploring Image Generation Techniques and Their Applications

52:26 The Art and Science of Prompt Engineering

01:00:37 Future Predictions: Multimodality, SLMs, and System Improvements

01:05:55 The Role of Open Models in AI's Evolution

01:08:03 Reflecting on the Rapid Advancements in AI Technology

Transcripts

Miko Pawlikowski:

I'm Miko Pawlikowski, and this is Hockey Stick.

Miko Pawlikowski:

Starting with generative AI can be daunting.

Miko Pawlikowski:

There's a lot of hype, a lot of development, and a lot of change, daily.

Miko Pawlikowski:

It's easy enough to launch ChatGPT and ask for a poem on how Vim is superior to Emacs, but to get value from it professionally requires a bit more skill.

Miko Pawlikowski:

Today, I'm joined by Amit Bahri, The author of "Generative AI in Action", a brand new book published by Manning.

Miko Pawlikowski:

Amit is a principal group technical program manager at Microsoft, where he leads the engineering team that builds the next generation of AI products and services on the Azure AI platform.

Miko Pawlikowski:

He has over 25 years of experience in technology and product development, including the artificial intelligence and cloud platforms fields.

Miko Pawlikowski:

And yes, you will learn what his mom thinks about ChatGPT.

Miko Pawlikowski:

Welcome to this episode and thank you for flying hockey stick.

Miko Pawlikowski:

let's start right away.

Miko Pawlikowski:

why did you write the book?

Amit Bahree:

being in the AI platform team at Microsoft, one of my roles, which more or less became the day job over the last year and a half was meeting with a lot of our customers, which are generally large enterprises where

Amit Bahree:

everybody wanted to know how do I use Gen AI, obviously took over the world, as I joke, my mom's a ChatGPT expert

Miko Pawlikowski:

Oh yeah, I bet she is.

Amit Bahree:

and I basically, at the end of the day, got tired answering and guiding the same thing again and again across multiple customers.

Amit Bahree:

so I said, What if this could be put down on paper and they could just learn themselves rather than we being the bottleneck in many ways, right?

Amit Bahree:

So in, in full transparency, it was a selfish exercise.

Amit Bahree:

so I don't have to repeat myself again and again in doing this could just point them and say, Hey, go read this and that'll at least give you a jumpstart.

Miko Pawlikowski:

Yeah, exactly.

Miko Pawlikowski:

Read the book.

Miko Pawlikowski:

I love that.

Miko Pawlikowski:

that's a completely valid, perfect origin story.

Miko Pawlikowski:

so you mentioned that your mom is a generative AI expert, so I guess we'll interview her next time.

Miko Pawlikowski:

But, for you,v what was your moment?

Miko Pawlikowski:

When did you decide to go into AI?

Miko Pawlikowski:

obviously you've been in it for a while.

Miko Pawlikowski:

It wasn't as hot as it is right now back then.

Miko Pawlikowski:

Can you tell us a little bit about your story and how you ended up doing it?

Miko Pawlikowski:

What you're doing?

Amit Bahree:

I am actually not a data scientist.

Amit Bahree:

I'm not a machine learning engineer.

Amit Bahree:

I know how to build models, but that's not what I live and breathe and dream up in the middle of the night, as I know many of my colleagues do.

Amit Bahree:

That's their passion.

Amit Bahree:

in my previous role before Microsoft, one of the things I was learning was, emerging technologies, understanding from a technical point of view, what they are, how they work, how they could be used, or mostly in the context of an enterprise setting.

Amit Bahree:

And one of the technologies among a few was AI, a few years ago.

Amit Bahree:

In my role of looking at emerging tech, is how I got into AI.

Amit Bahree:

Of course, Gen AI or these underlying architecture principles that power these things today didn't exist.

Amit Bahree:

But I was quite fascinated.

Amit Bahree:

it was still a side job in the sense it was one of a few areas of emerging technologies to go dig and deep into.

Amit Bahree:

And then as that started getting more traction, I was the one eyed king in the kingdom of blind.

Amit Bahree:

Because, I knew more than the others, didn't know, doesn't mean I know most.

Amit Bahree:

And then I was stuck with that.

Amit Bahree:

And then grew into that and got fascinated.

Miko Pawlikowski:

as they say, 'the rest is history'.

Amit Bahree:

It's still early days.

Miko Pawlikowski:

so what does a principal group technical program manager, that's a mouthful, is that how you introduce yourself at parties?

Amit Bahree:

No.

Amit Bahree:

Microsoft likes long names and titles.

Amit Bahree:

titles.

Amit Bahree:

Being a aside, I basically have officially two day jobs, unofficially three day jobs So I sit in what we call the AI platform team.

Amit Bahree:

We are the product team that builds all the AI products that power other products or our end customers.

Amit Bahree:

I have formally two buckets of responsibilities.

Amit Bahree:

Microsoft, our leadership goals, we sign large contracts with customers, within which we promise them either new or better AI features.

Amit Bahree:

it could be brand new things that we're building with them or for them, or it could be improving existing features.

Amit Bahree:

So once we sign those contracts, those land on my plate to go deliver from a platform team perspective.

Amit Bahree:

So I'm responsible for a lot of the custom engineering on the platform, which is this.

Amit Bahree:

That's my first bucket of responsibilities.

Amit Bahree:

My second bucket of responsibilities is whatever we do custom in the first, make sure it's in the platform.

Amit Bahree:

Because if you keep being custom, then there's no platform left.

Amit Bahree:

So the way I want you and the listeners who will get to this to think about it is, these large deals that we sign are the catalyst for us to go do things in the platform that we already are thinking, maybe it's not prioritized enough.

Amit Bahree:

so they are a forcing function to go improve the platform at the end of the day.

Amit Bahree:

and that helps not just that one specific customer, but all the rest of them as well.

Amit Bahree:

And then my third unofficial one is anything and everything related to, Azure OpenAI coming from our, CEO and what we call our SLT, which is CEO and his, direct reports, in the context of customers where, it's a top of mind for many and.

Amit Bahree:

For many folks, their understanding is varied, which somewhat ties back to the genesis of the book.

Amit Bahree:

so when Satya meets other, CEOs and they have a question or they're not happy about something or they need guidance, those get sent over and say, here's the team is going to go help you.

Amit Bahree:

And so then I go in and from an engineering point of view, support, see what they need or what they want.

Amit Bahree:

So those are, that's my day job, right?

Amit Bahree:

So custom engineering.

Amit Bahree:

And then supporting, Azure OpenAI related things, from our leadership team.

Miko Pawlikowski:

and then there's your fourth job, which is writing books.

Amit Bahree:

That, indeed, that is also a moment of insanity in some ways, but yes, that is the graveyard shift, as I call it, because, it's after the work is done, and, which is never done, these days at least, so yes.

Miko Pawlikowski:

Of course.

Miko Pawlikowski:

I have to ask you, obviously, not that long ago, there was this entire drama of, Sam Altman being fired, and then rehired, and all of that.

Miko Pawlikowski:

And a lot of people were wondering a lot of things.

Miko Pawlikowski:

Satya was quite prominent during that entire conversation.

Miko Pawlikowski:

What's your take on what happened?

Amit Bahree:

couple of things.

Amit Bahree:

we were learning along with the rest of the folks on Twitter or Reddit or wherever one follows things, right?

Amit Bahree:

the conversations that, Satya and Sam were having was above my pay grade, just to be black and white about it.

Amit Bahree:

So we were following along and listening along just like rest of the world.

Amit Bahree:

I think the one difference is, we had a little bit in the machinery.

Amit Bahree:

Obviously in our team, we do work from an engineering perspective closely with OpenAI and they're a massive partner to us.

Amit Bahree:

So I think in some cases, maybe we are a little more empathetic, I would say, because it's a little more closer to home.

Amit Bahree:

And, it's one big virtual team is loosely speaking, how to think about it.

Miko Pawlikowski:

So there was one particular thing that I think it's interesting and, It might be that people are just reading way too much into that, but I think Satya went and said something along the lines of, 'don't you worry,

Miko Pawlikowski:

even if OpenAI stops existing tomorrow, we're basically well positioned to continue, the innovation' and all of that.

Miko Pawlikowski:

And a lot of people took it as saying, okay, they basically bought themselves OpenAI.

Miko Pawlikowski:

is that roughly what's happening?

Amit Bahree:

Couple of things.

Amit Bahree:

One is now I'm not a Microsoft spokesman.

Amit Bahree:

I'm just talking on my behalf.

Amit Bahree:

we don't own OpenAI.

Amit Bahree:

I don't think that is correct.

Amit Bahree:

I think people are reading too much into it.

Amit Bahree:

I think the thing I want the folks to understand is, Microsoft and Microsoft research investments in AI have been over 30 years.

Amit Bahree:

So it's not just today we've woken up.

Amit Bahree:

Or a few years ago, we've woken up and say, 'look, this is the thing to go in'.

Amit Bahree:

I think the difference really is my mom didn't know about it, nor did she care.

Amit Bahree:

now she does.

Amit Bahree:

so I think, where we're coming from in some ways it's not new.

Amit Bahree:

And it's just become more in the limelight and people are becoming more aware, but We've been at it for a while, and both from a research perspective, products perspective, it's just more in the limelight now.

Miko Pawlikowski:

Okay.

Miko Pawlikowski:

let's leave Microsoft alone and talk a little bit closer to your book.

Miko Pawlikowski:

So one of the questions that I keep asking everybody is.

Miko Pawlikowski:

Their reason to think why GenAI is such a massive deal, right?

Miko Pawlikowski:

Why is it such a big deal and why again, your mom, why does she know about it now?

Miko Pawlikowski:

And she didn't before, I don't think she knew about BERT.

Miko Pawlikowski:

I suspect, but she does know about ChatGPT and there's a good chance she's using ChatGPT, which is, next level.

Miko Pawlikowski:

And, what do you think was so special recently?

Miko Pawlikowski:

What's the like hockey stick moment from your perspective of what's changed that it became, a household name?

Amit Bahree:

it was ChatGPT itself that changed to make it a household name, and as we all know and perhaps understand what most people is, the roots of ChatGPT was a demo.

Amit Bahree:

It wasn't meant to where it is right now.

Amit Bahree:

And the fact that one doesn't have to know BERT or any of the other sort of technical mumbo jumbo, and I can just talk to it, I can just use it just as an end user.

Amit Bahree:

I think the simplicity is the power of it.

Amit Bahree:

And the breadth of what a language understanding, it can do versus as we call it now, the traditional AI, which is very odd, by the way, in the first place, but, in the old AI, the pre gen AI, which is not old again, it's very much valid, of course, today.

Amit Bahree:

was very task specific, where you go deep in a certain task.

Amit Bahree:

So if you're in a company, in an enterprise doing a certain thing, using that, you understand it, you get its value, you know why it's powerful.

Amit Bahree:

But you can't have a generic, free ranging, wider set of, conversations and thoughts, around it.

Amit Bahree:

So if you take a previous chatbot, for example, which is not powered by GenAI, and you say, and if Miko goes and says, hey, I'm hungry,

Amit Bahree:

It won't know what to do, I'm sorry.

Amit Bahree:

Whereas these things understand, they adapt, so I think the simplicity from a using perspective is the power.

Amit Bahree:

And that's why the likes of my mom and others in the world is talking about it, right?

Amit Bahree:

Because it's not technical mumbo jumbo that a handful of people understand and you geek out in the corner.

Amit Bahree:

I can just use it.

Miko Pawlikowski:

you behind this comparison?

Miko Pawlikowski:

This is the iPhone moment for, artificial intelligence in general, in particular, like large language models?

Amit Bahree:

is it the iPhone, the original one, or which was the one which got the 3G support, or when the App Store came up, is it that one?

Amit Bahree:

It's some variants out there, right?

Amit Bahree:

But, I look at it even simpler, because I think the, iPhone is still a very consumer thing at least.

Amit Bahree:

My world is very enterprising.

Amit Bahree:

Consumer is one side of the house.

Amit Bahree:

Enterprise is a very different kettle of fish in the sense of the problems and what they're trying to solve.

Amit Bahree:

So I think if you look at a consumer sense, like my mom, that is an iPhone sort of comparison moment.

Miko Pawlikowski:

If you go to Manning.com, you can actually browse portions of the book for free.

Miko Pawlikowski:

So if you're listening along to that, go to Manning.com, find the book, and, look for figure 1.

Miko Pawlikowski:

1.

Miko Pawlikowski:

It's a graph, that Amit took out of our world in data .org.

Miko Pawlikowski:

And it's called 'language and image recognition capabilities of AI systems, have improved rapidly'.

Miko Pawlikowski:

And it's basically plotting, the human performance benchmark, which goes from minus 100, meaning that it's pretty bad and goes all the way to zero where it's comparable, I think, or maybe equivalent to human.

Miko Pawlikowski:

And, For everybody who's listening to that as a podcast and not seeing this on video, it's showing different, machine learning, AI, trends, it's got handwriting, recognition, speech, recognition, image recognition,

Miko Pawlikowski:

and then it's got the reading comprehension and language understanding and what's mind blowing to me.

Miko Pawlikowski:

And I suspect this is why you chose this particular graph is that.

Miko Pawlikowski:

We've got the handwriting and the speech recognition that kind of goes slowly, looks linearly.

Miko Pawlikowski:

there was a little bit of progress and then somewhere in mid 2010s, it just goes out of control and, it goes all the way up, to very good results.

Miko Pawlikowski:

And then 2016, I think on the graph starts the reading comprehension.

Miko Pawlikowski:

It's basically, An arrow going straight up, same for language understanding.

Miko Pawlikowski:

This is within two years.

Miko Pawlikowski:

It goes from nothing literally to basically comparable to human performance.

Miko Pawlikowski:

why did it happen then?

Miko Pawlikowski:

What needed to happen this is not even a hockey stick.

Miko Pawlikowski:

This is just like the right angle here.

Miko Pawlikowski:

How do you explain that?

Amit Bahree:

true.

Amit Bahree:

I actually never thought of the right angle.

Amit Bahree:

I think it's, it's three things coming together, right?

Amit Bahree:

So one is aspects of AI and the research behind it have gotten better in that time frame, right?

Amit Bahree:

So we started getting deep learning, transformers I don't think quite existed at that point in time.

Amit Bahree:

so fundamental architecture changes, or improvements, from a model perspective, model architecture.

Amit Bahree:

So I think that's one.

Amit Bahree:

But I think crucially, maybe equally, maybe more crucially is availability of data at the scale you need.

Amit Bahree:

And then also compute most specific GPUs to, train and crunch through these.

Amit Bahree:

I think that it's that perfect storm of those three things coming together.

Amit Bahree:

if one of them didn't happen as much, it would be still slower.

Amit Bahree:

And that's why you see the linear progression in the others versus I don't know, is that a rocket thing?

Miko Pawlikowski:

basically vertical.

Amit Bahree:

so I think it's those three sort of things coming together.

Amit Bahree:

I personally believe, I don't think anything was planned or orchestrated.

Amit Bahree:

I think it's one of those happy accidents, how GPUs work and the number, the floating points it needs to do for graphics, which is gaming, is the same thing that AI models need to do.

Amit Bahree:

We as humans started spitting on more data, maybe thanks to social, thanks to actually iPhones and other smartphones and devices and whatnot.

Amit Bahree:

And then, cloud capabilities in the context of GPUs and compute, improved.

Amit Bahree:

I guess there's a fourth one, which is inherent, but A lot of system engineering things started coming online, right?

Amit Bahree:

How do you run these?

Amit Bahree:

Because it's not like running them on one GPU, for example.

Amit Bahree:

You need clusters of machines.

Amit Bahree:

So there's a fair amount of systems engineering, in the sense of reliability, resilience, and so on, under the covers that, that has to make it all happen.

Amit Bahree:

Otherwise it won't run.

Amit Bahree:

Lots of Physics and computer science.

Amit Bahree:

I keep saying that to my team, for example.

Amit Bahree:

so I think that's maybe a fourth dimension, which most people don't talk about, but, I think those are the things that perhaps enabled a bunch of this to go where we are right now.

Miko Pawlikowski:

There's another interesting reference that you have, it's called a survey of large language models and somehow I missed that I found it in your book.

Miko Pawlikowski:

So thank you for that.

Miko Pawlikowski:

and I think, page nine is where I found the figure three.

Miko Pawlikowski:

it's going to be very difficult to describe on a verbal way, but.

Miko Pawlikowski:

Imagine like a little anthill with a bunch of ants in it, swarming.

Miko Pawlikowski:

And each one of those ants is basically a model.

Miko Pawlikowski:

And, the figure is making a distinction between the ones that are basically open source, publicly available, and the ones that are closed source, and it's only.

Miko Pawlikowski:

graphing, up to, GPT-4 and LLAMA-2.

Miko Pawlikowski:

So there's, way more of that.

Miko Pawlikowski:

I think at some point I saw that, hugging face had a hundred thousand models uploaded to it.

Miko Pawlikowski:

And I suspect after Lama three, it's probably doubled since, it gives you a little bit of a perspective.

Miko Pawlikowski:

It's not just ChatGPT and it's certainly not just OpenAI.

Miko Pawlikowski:

And it, it shows you how much variety there is.

Miko Pawlikowski:

And, frankly, I've been looking at this things for a while now.

Miko Pawlikowski:

And I still, there's probably like half of this graph that I haven't actually even heard of, let alone, tried.

Miko Pawlikowski:

I keep using this word Cambrian explosion, but it really does feel like that.

Miko Pawlikowski:

They're just crawling out of every rock and hole, which is amazing.

Miko Pawlikowski:

This is, such an exciting time to be alive, is that it's the right way of putting that.

Miko Pawlikowski:

why did you choose that figure, for your book?

Amit Bahree:

I had two schools of thought when I originally said this would be a right one.

Amit Bahree:

I think one of them is what you touched on, yes, OpenAI and ChatGPT has the world's attention.

Amit Bahree:

but there's a lot of other innovation, a lot of other companies, a lot of other stuff going on as well.

Amit Bahree:

It's not only that.

Amit Bahree:

so I think it is more of awareness in that sense, because the book also is in my personal capacity.

Amit Bahree:

It's not a Microsoft-sponsored or a Microsoft book, right?

Amit Bahree:

So in that sense, I felt I would be doing a disservice if I didn't make folks, at least aware, because you just know what you know.

Amit Bahree:

So I think that was my one aspect.

Amit Bahree:

I think the second aspect was also.

Amit Bahree:

Showing lineage because a lot of these models are complex, as base models of training.

Amit Bahree:

they're super expensive, both in the sense of data gathering, cleaning it up, actual training costs, and so on and so forth, which many don't really have the appetite.

Amit Bahree:

or have the ability resources-wise to do that.

Amit Bahree:

So what I also wanted to show was, at the end of the day, it's still only a handful of base models that are further trained or fine-tuned and derived from.

Amit Bahree:

so it's a lineage aspect also I wanted to, because that gets lost in the noise as well.

Amit Bahree:

and again, the framing of the book is mostly in enterprises, so if you're in an enterprise setting, you just need to know the roots of the model you're using and the lineage it has.

Amit Bahree:

So you can make an informed decision if that's the right thing or not the right.

Miko Pawlikowski:

Speaking of which, that reminds me, Phi-3 released last week.

Miko Pawlikowski:

It seems to be punching above its, weight category, quite heavily.

Miko Pawlikowski:

were you involved in any capacity in that project?

Amit Bahree:

in a minor way.

Amit Bahree:

So if you go read the technical paper, I'm one of the 70 some people listed on that.

Amit Bahree:

it's a team sport.

Amit Bahree:

so the team that built the, the SLM, the Phi-3 is originally from our platform team.

Amit Bahree:

they've been moved out of that into the new GenAI team we've recently formed and publicly announced.

Amit Bahree:

so we work very closely with the team.

Amit Bahree:

even though I have roots in applied research, I don't think I can take credit to say I built the model, but I've been involved with it for sure.

Miko Pawlikowski:

you're on the paper.

Miko Pawlikowski:

That means you, you built it, you can claim that

Amit Bahree:

I think Sebastian and the others have been very kind where some of us have been involved in providing feedback and input and guidance and what have you.

Amit Bahree:

I think they've been quite kind and then they've done the right thing.

Amit Bahree:

But that doesn't mean I can take foot credit.

Amit Bahree:

the way I think it is, it takes a village.

Amit Bahree:

Each village needs an idiot, and that's me.

Amit Bahree:

It's an important role.

Amit Bahree:

Somebody has to do it.

Miko Pawlikowski:

Oh, wow.

Miko Pawlikowski:

That, is a lot of authors.

Miko Pawlikowski:

I just opened, and the paper, was released three days ago.

Miko Pawlikowski:

looks like it, and it is an impressive number of people working on that.

Miko Pawlikowski:

I've been reading people's opinions.

Miko Pawlikowski:

I haven't actually read the paper.

Miko Pawlikowski:

so I don't know how it explains how it's possibly this good.

Miko Pawlikowski:

it happened a few days.

Miko Pawlikowski:

Was it a week after LLAMA-3 was

Miko Pawlikowski:

. Amit Bahree: Roughly.

Miko Pawlikowski:

the main selling point being that they trained it on 15 trillion tokens or some ridiculous number like that.

Miko Pawlikowski:

And they were surprised that it kept getting better.

Miko Pawlikowski:

Sounds like this one was, trained on a much smaller corpus of text.

Miko Pawlikowski:

how do you explain, why it's so good?

Amit Bahree:

so there's two things here.

Amit Bahree:

I think it's, and it's in the paper, it's 3 trillion tokens.

Amit Bahree:

I think the, again, this is a genesis from Phi-2, which is a genesis from Phi-1, which is a genesis from ORCA2.

Amit Bahree:

Those are all research models.

Amit Bahree:

one of the things we've come around to seeing is in the context of these new categories of small language models, is highly curated data sets is better.

Amit Bahree:

so one reason why you see Phi-2 and Phi-3 doing so much better.

Amit Bahree:

Relative to, bigger models is because, a good chunk of the data is highly curated.

Amit Bahree:

There's two aspects to it, which we also publish.

Amit Bahree:

So there's this other paper, this textbooks is all you need if you or your readers have seen it.

Amit Bahree:

So basically a good portion of the corpus is high quality textbooks as input into the model to train on.

Amit Bahree:

And then the second aspect of, data is not common crawl sucking stuff off the web, but again, highly curated, web data, or a very small subset of the web data, combined with the, Textbooks.

Amit Bahree:

that is also an interesting research thing where it's going now to say is like for these smaller models.

Amit Bahree:

How much higher quality data sets does carry a lot of weight.

Amit Bahree:

and that's really a lot of what you're seeing.

Miko Pawlikowski:

So When people say curated, does it mean an army of humans?

Miko Pawlikowski:

Like selecting, reading that and annotating and like discarding low quality stuff.

Miko Pawlikowski:

Or is there like another model that does that work to pre select and it's models all the way down

Amit Bahree:

It's not an army of humans because that's not scalable and doable at, you can do it as a one

Miko Pawlikowski:

trillion tokens.

Miko Pawlikowski:

Yeah.

Amit Bahree:

Yes, you can do it as a one off maybe, but, especially.

Amit Bahree:

Phi-3 is a product.

Amit Bahree:

Phi-2 was a research model, two different things and from our perspective, the minute we're saying it's a product, we release it to production, it has to go through the right rigour and cycles from a Microsoft perspective.

Amit Bahree:

That means, We have to support it for a number of years.

Amit Bahree:

We have customers who are gonna use it and so on.

Amit Bahree:

So we can't just publish it with an army of people.

Amit Bahree:

'cause that doesn't really scale.

Amit Bahree:

So there is other models helping when you say how do you create it, at least in the context of this, it is synthetic data generated using, GPT-4, but then the humans are involved to make sure that, it is curated.

Amit Bahree:

Again, it's not an army of people, but it's.

Amit Bahree:

machinery evaluations and so on, machinery running to

Miko Pawlikowski:

So this is synthetic data we're talking about.

Miko Pawlikowski:

It's literally all generated by ChatGPT,

Amit Bahree:

most, most is generated by GPT-4.

Miko Pawlikowski:

So that always makes me wonder, if we train things on data coming out of a model.

Miko Pawlikowski:

I'm, obviously no expert on this, but intuitively it seems to me like that data generated by GPT-4, any model really it's going to have certain attributes to it that don't necessarily represent, the web, is that not a problem?

Amit Bahree:

Yes and no.

Amit Bahree:

I think one shouldn't be using the output of another model as your general data input only.

Amit Bahree:

I think you have to look at it in certain domains and specific of what you're trying to do.

Amit Bahree:

And then in that context, it would be okay.

Amit Bahree:

But that's where the human aspect also comes.

Amit Bahree:

You have to make sure evaluations are right.

Amit Bahree:

Cause guess what?

Amit Bahree:

The old school garbage in garbage out is still very much valid.

Amit Bahree:

but I think your intuition is correct in the sense.

Amit Bahree:

One shouldn't think of it as, 'hey, I can go use an LLM, spit it out, and then use that to go train my own model', in the breadth, in the broad sense of it.

Amit Bahree:

but you'll also hear of more recent papers coming, and more recent news where, in general, this is not Phi-3, but in general, we have reached the points where we are sucking in all of the available Internet that one's reachable or allowed to reach.

Amit Bahree:

And to train the models more and more, we are also then complementing it with the synthetic data, which, other AI is, generating,

Amit Bahree:

So I think you have to go put it back in which aspects of your existing models not doing great on, evaluating those, and then using that as a basis to strengthen that dimension, rather than just a more horizontal generic, if that makes sense.

Miko Pawlikowski:

Yeah, It certainly does.

Miko Pawlikowski:

And I think what I appreciated, I actually haven't seen the shortcut, the abbreviation SLMs for small language models until I opened your

Miko Pawlikowski:

Oh, okay.

Miko Pawlikowski:

which is, an indication of just how much focus we put on LLMs, the large language models.

Miko Pawlikowski:

And, I think that, to me, at least, I don't know if it's just like the part of me that loves running things on Raspberry Pis and gets excited about the possibility of actually running a decent enough model that I can speak to that actually runs on my phone or something like that.

Miko Pawlikowski:

so 3 billion parameters, does that mean roughly with 4 bit quantization that we can run it on effectively any phone at this stage?

Miko Pawlikowski:

Like it's going to need maybe a couple of gigs

Amit Bahree:

everyone's asking that.

Amit Bahree:

so on a certain profile, so I think we talk about an iPhone 14 with a Bionic processor.

Amit Bahree:

You can run it.

Amit Bahree:

It can do a certain number of tokens per minute sort of generations.

Amit Bahree:

I think.

Amit Bahree:

To be able to go run it for Miko or Amit as a, I can run it on a phone and as an experiment, what have you is one thing, versus the ability to run it at scale for a production deployment is a different thing.

Amit Bahree:

So yes, these are small language models and we do believe how LLMs after ChatGPT became a lot of hype.

Amit Bahree:

Some is good, some is not so good.

Amit Bahree:

SLMs will be the next set, in the context of the hype.

Amit Bahree:

But as I go to remind many of the folks I talk to, it's a small language model in relation to a large language model

Amit Bahree:

at the end of the day, I think 2.8 or 3.8 or whatever parameter we have on the mini one, because this is Phi mini, Phi-3 mini.

Amit Bahree:

It's also a family of Phi models.

Amit Bahree:

This is the smallest of the ones that should be coming out and the paper touches on the others, at the end of the day, three billion parameter or whatever the exact number isn't small just from a computer science perspective, it is still a pretty big, complex thing.

Amit Bahree:

Yes.

Amit Bahree:

Compared to hundreds of billions of parameters, it is small, but it is not small.

Amit Bahree:

I think I have to go remind people that.

Amit Bahree:

In relation, or relative to an LLM, yes, it's small, but by itself, it is still pretty complex and beefy in the sense of compute requirements and GPU requirements and, what it needs.

Amit Bahree:

It doesn't mean you'll go off and deploy a bunch of these on your Raspberry Pi with inference and, milliseconds and whatnot.

Miko Pawlikowski:

I'm asking this, but one of the reasons I am asking this is because, I don't know if you followed the launch of Humane AI, that little gadget that kind of looks like something from Star Trek.

Miko Pawlikowski:

and it looks like it hasn't been particularly well received because it's a little slow and a little clunky.

Miko Pawlikowski:

I think, I watched.

Miko Pawlikowski:

YouTube review of that and I think they basically destroyed it a little bit by showing just how long you have to wait because it's effectively just uploading it to a cloud somewhere and then downloading the response and it's just not there.

Miko Pawlikowski:

And, with Phi-3 and like the smaller models, all of a sudden everybody's thinking the same thing.

Miko Pawlikowski:

Can we make it native?

Miko Pawlikowski:

I think Apple announced some things about how they're going to work on making sure that the hardware in the newer iPhones.

Miko Pawlikowski:

to run this stuff at reasonable speed.

Miko Pawlikowski:

And this feels like this would be, another hockey stick moment for this things.

Miko Pawlikowski:

There's small language models where Siri doesn't suck Hey Google.

Miko Pawlikowski:

"Okay Google" works.

Miko Pawlikowski:

And Alexa actually listens to me, that kind of stuff.

Miko Pawlikowski:

do you think it was that one of the motivations of the smaller model?

Amit Bahree:

the premise you're touching on is, was one of the motivations.

Amit Bahree:

So if I rewind for a second, for a large language model, I go back, Again, if you cut through the hype for a second, laws of Physics and computer science.

Amit Bahree:

For these large language models, enormously complex, needs a lot of compute resources to run.

Amit Bahree:

And like any developer, programmer, computer scientist will tell you, laws of Physics, the scale means complexity, means latency, I have to process more things.

Amit Bahree:

It takes time to get results back and there is no ways to cut those corners at the end of the day.

Amit Bahree:

So where you're seeing latency or things are slower, it's because of that.

Amit Bahree:

from our perspective, there's also another dimension to run this at cloud at Azure level globally across the hundreds of data centers and what have you.

Amit Bahree:

that's not simple or cheap, So if we can reduce our costs to run this at scale, we can make sure the service is cheaper for our customers as well.

Amit Bahree:

I think this is also where, we as humans are awesome and we forget things.

Amit Bahree:

Because many of these models are exposed as an API.

Amit Bahree:

we as, at least developers for sure, have the expectation that it's an API call, so I'm going to get my response back in, milliseconds and what have you.

Amit Bahree:

because that's what we have been used to.

Amit Bahree:

The difference is, yes, it's an API call, but the machinery that's running behind, including the models itself, is super complex.

Amit Bahree:

and when things are slow, we get unhappy.

Amit Bahree:

So I think that also needs to be a mentorship.

Amit Bahree:

So if you package it all of this up, that's a big motivation of, in some cases, a small language model would make more sense.

Amit Bahree:

But I also want to outline this.

Amit Bahree:

It doesn't have the same power as a large language model.

Amit Bahree:

I see a lot of comparisons to the bigger models and all, which is good.

Amit Bahree:

It's early days, but at the end of the day, not an apples and apples comparison.

Amit Bahree:

For example, A lot of people, including me, have been guilty of just using GPT 4 as a knowledge database, more and more people are instead of googling or binging or whatever you do, you just ask the thing.

Amit Bahree:

So you're using it as a big, fancy database.

Amit Bahree:

So if I put that in the sense of the world knowledge, again, it's not factually correct whether it doesn't have the world knowledge, it only has the publicly accessible knowledge as of its cutoff training, but ignoring that

Amit Bahree:

point, the small language models will not have that because they've not been trained on that volume of data.

Amit Bahree:

So I think the other dimension is whilst the compute profile, what we've been talking is one, you have to think of the SLMs in the right use case.

Amit Bahree:

What am I trying to do?

Amit Bahree:

If I'm trying to do understand an entity in a workflow, I can use a small language model.

Amit Bahree:

I don't need the power of these large language models necessarily.

Amit Bahree:

equally if there's different languages that one has to use, not English, for example, a small language model may not be as powerful or as good as a large language model.

Amit Bahree:

So the way we should think about it is they shouldn't be competing.

Amit Bahree:

They're complementing each other.

Amit Bahree:

in what you're trying to solve at the right step, use the right model because the beauty again is they're an API call.

Amit Bahree:

So it's not that if you're developing an application, you're stuck with one thing for the whole duration.

Amit Bahree:

You can choose at the right step for the right thing, for the right power.

Amit Bahree:

So I use.

Amit Bahree:

Often with my teams and others, the analogy, like if a GPT or pick your model is like a Ferrari, if you're going to racing, you need a Ferrari, but if you are, if an SLM is like a Honda, and by the way, I don't get pick your brand,

Amit Bahree:

and you're stuck in morning rush hour traffic, and Honda is better than you pick the right thing for the right purpose.

Amit Bahree:

Is this really what I'm getting into?

Amit Bahree:

I would show them the compute profile.

Miko Pawlikowski:

I completely agree.

Miko Pawlikowski:

And I think these are separate use cases where I just want my okay Google and my Siri to not suck so much.

Miko Pawlikowski:

I want it to understand what I mean half of the time and not have to say the thing three times.

Miko Pawlikowski:

Not to wonder every time, what did I say differently now that it didn't catch the song that I wanted to play kind of thing.

Miko Pawlikowski:

And that would already be like a big improvement for me, just interacting with that thing.

Miko Pawlikowski:

when you were saying all those things, I was wondering whether there is a certain kind of minimal level where it will be, a certain number of tokens per second that will feel to most humans as real time is really what we're talking about here and beyond that point, probably doesn't matter.

Miko Pawlikowski:

if you can't read it faster than it's being produced and you're not going to have that, that feeling of slowness.

Miko Pawlikowski:

And there are some interesting.

Miko Pawlikowski:

things like Groq, the one with a Q at the end, I think there's suing Elon Musk over there, that has some dedicated hardware, and I saw some demo, I was doing something ridiculous, like 800 tokens a second on LLAMA 3.

Miko Pawlikowski:

So was it 70B or something?

Miko Pawlikowski:

Is it not just like the matter of waiting like 5-10 years for the dedicated hardware to get cheap and plentiful enough.

Miko Pawlikowski:

And it won't be so much of an issue?

Amit Bahree:

that's the whole story of computing history, if you go look at it, right?

Miko Pawlikowski:

Yeah.

Amit Bahree:

As hardware improves, but I think we also have to put it in the context of the scale of use.

Amit Bahree:

For example, if you have access to a data center with hundreds of GPUs of today's best in breed, let's say, and there's nobody else in, it won't feel slow to you, it'll be like, what's everyone complaining about?

Amit Bahree:

But if in the same data center, you have 4000 other users concurrently coming, it's a different story.

Amit Bahree:

I think I have to also remind people when you are doing comparisons or the expectations you have to think of in the sense of the load, the traffic, how much, now, we as a cloud provider, that's a lot of our headache and a lot of customers saying like, why do you think I'm paying you?

Amit Bahree:

but I also go back, yes, and laws of Physics don't change as well.

Amit Bahree:

but overall, I think, Nvidia announced a whole bunch of new stuff.

Amit Bahree:

at their conference quite recently, network speeds are improving or have been.

Amit Bahree:

So if you step back for a second, I think, just history of computing has been that hardware scales up and improves and helps the software.

Amit Bahree:

I think the one thing that I'm not, personally speaking, I can't predict the future.

Amit Bahree:

The one thing is, This is one of those back to your hockey stick points.

Amit Bahree:

The scale is almost at a global level.

Amit Bahree:

okay, it's not every human on the planet using ChatGPT or LLMs in some fashion, but quite a big percentage of people are.

Amit Bahree:

In some manner, on a daily basis, for some people, it is eight hours a day.

Amit Bahree:

My mom may be once every other, once a week or whatever it is she does.

Amit Bahree:

but the breadth of humans using it is much more broader than it ever has been.

Amit Bahree:

So with that context, even as other underlying system and hardware improves, I think the perception of, is it actually getting improving, maybe slower than perhaps in the past where it was still more niche, if that makes sense.

Miko Pawlikowski:

It does.

Miko Pawlikowski:

And I think to follow the train of thought that you started here, the potential is probably higher just because it's so much more intuitive.

Miko Pawlikowski:

You just talk to it.

Miko Pawlikowski:

my mom when she needs to install an app, it's a whole thing.

Miko Pawlikowski:

It takes a while.

Miko Pawlikowski:

She needs to get used to it.

Miko Pawlikowski:

She needs get comfortable with it, needs to remember the password.

Miko Pawlikowski:

There might be another pin.

Miko Pawlikowski:

it's a whole thing, but, once she gets some kind of interface that's built into her phone, or whatever, where she can just talk to it, that kind of clears a lot of barriers and, a lot of people are picturing this feature where your

Miko Pawlikowski:

phone is slowly turning to effectively listening device from Star Trek and, It's just doing what you want it to do.

Miko Pawlikowski:

And maybe integrates with all the apps.

Miko Pawlikowski:

I ordered that Rabbit R1.

Miko Pawlikowski:

I'm still waiting.

Miko Pawlikowski:

I don't know where the delivery is supposed to be, but, that's one of the visions of the future is right there.

Miko Pawlikowski:

You just talk to it and the model does things on your behalf, goes to this dodgy apps and clicks things, and you don't have to worry about that.

Miko Pawlikowski:

And you don't have a learning curve.

Miko Pawlikowski:

And I think that's a vision of the future that excites a lot of people.

Miko Pawlikowski:

And, I suspect we might see something like that in the near future because I don't see any roadblocks for

Amit Bahree:

No, I actually argue the other way.

Amit Bahree:

I see it actually is already happening now.

Amit Bahree:

And I can give you two, real examples.

Amit Bahree:

for example, I'm originally from India.

Amit Bahree:

And in India, as much as the country's made progress, there's still a, decent percentage of the population who is not very literate.

Amit Bahree:

Either they haven't finished school, or they dropped out early, or they've actually not gone to school.

Amit Bahree:

Now, it may be a small percentage at a country level, but if it's a country with 1.4 billion, a small percentage in absolute numbers is still a big number.

Amit Bahree:

a chunk of humanity.

Amit Bahree:

And in that, we're seeing, for many people who are not comfortable reading or writing, some of the cheaper devices they have, it's not an iPhone or an Android phone, but they have a speech.

Amit Bahree:

So they print, there's a big mic.

Amit Bahree:

In the middle of the phone, they can press that and talk to it and actually in natural language, in their language, they're asking questions and talking to it and that is, as it happens in some of these cases, it's out of some of our speech AI, which is understanding it and then responding back.

Amit Bahree:

So it's lowering the barrier and opening up this to a broader segment, which in the past was not possible.

Amit Bahree:

so that's one example, because they don't need to know the language to go type it in or what have you.

Amit Bahree:

They can just talk to it normally how they would talk to it.

Amit Bahree:

And then the second one was actually more of a ChatGPT example, which I think, Microsoft also published where it's, plugging in different languages for, again, in rural areas in India as farmers, like India is not, For those who don't know, it's not like the U.

Amit Bahree:

S.

Amit Bahree:

and others, where you have big farms with, hundreds and thousands of hectares or acres.

Amit Bahree:

They're usually small farms.

Amit Bahree:

Usually it's, the family which owns it.

Amit Bahree:

And they don't really have the muscle at an individual level, to go understand pricing and markets and what's happening as they want to go sell there.

Amit Bahree:

green or whatever they're growing.

Amit Bahree:

So in that sense, we talked about democratizing was how they're using ChatGPT to actually, getting basically real time market information.

Amit Bahree:

So they're empowered to go make a better decision, which until now was impossible because you need a computer, you need a modem, or you need to be online and those are the barriers.

Amit Bahree:

And it's like, they don't know how to use it, or in the language that they understand.

Amit Bahree:

So these are actually happening today, like in production, so to speak, live, and the way we want to think about it as is democratizing AI, which is when I go back to how you started asking me the question, when we started talking, if I go back to my mom's or the example you used with your mom of the barrier of a new

Amit Bahree:

app or a new interface, if we free them up or make it easier in many ways, those are the democratizing, elements that is happening is not only about your, Okay.

Amit Bahree:

how literate you are or not, but it's the, it's easing barriers basically.

Amit Bahree:

So of course it doesn't do everything.

Amit Bahree:

It doesn't mean all barriers are gone, but we see a lot of real examples, day to day life things that people are using it, which is absolutely fascinating.

Miko Pawlikowski:

there was this very popular demo of, I think it's called bland AI where they had a billboard with a phone number to call and you can have thousands and thousands of parallel conversations with an AI to do things like booking and, Basically get like a first line human in a, experience really.

Miko Pawlikowski:

And the demos were amazing.

Miko Pawlikowski:

And there's, like a million startups doing things around that at the moment, it also obviously has the dark side, right?

Miko Pawlikowski:

Where people are worried that what does it mean, Can you go and sway an election now by just calling everybody in the US and telling them something that they want to hear and, personalize the message.

Miko Pawlikowski:

It is a brave new world, a weird world that we're entering here, Where some things that, you could always technically go and call everybody in the U.

Miko Pawlikowski:

S., but it would take a while.

Miko Pawlikowski:

now with those things, maybe you can do it convincingly in a shorter period of time, And maybe not that expensively.

Miko Pawlikowski:

does that scare you?

Amit Bahree:

yes and no.

Amit Bahree:

I think that's true with any aspect of humanity or technology.

Amit Bahree:

You can use it for good, you can not use it for good.

Amit Bahree:

And it's a choice you have to make.

Amit Bahree:

So I think that's sort of one.

Amit Bahree:

So in that dimension, it's not something new that we haven't been doing.

Amit Bahree:

I think what is new or what is more dangerous, if that's the word I want.

Amit Bahree:

I'm not sure if that's the word I want, but I can't think of a better one.

Amit Bahree:

More concerning, is how easy it is.

Amit Bahree:

And unless What things to watch out for?

Amit Bahree:

How do you know what's true or not?

Amit Bahree:

So I think there's of course dimensions into it where we as humans have to recalibrate ourselves on, do I trust it or not?

Amit Bahree:

For example, Robocalling has been around for decades.

Amit Bahree:

The fact that I can cheaply call everyone is not the problem.

Amit Bahree:

now it is, it may sound like Amit or Miko's calling, which in the past You know it's not Amit or Miko calling.

Amit Bahree:

I think that's the really, things to think about and worry about.

Amit Bahree:

The way I reposition it as well, from a Microsoft perspective, and also I have a whole chapter in the book on that, is, the, there is new emerging threats from a security.

Amit Bahree:

So if you think of a traditional security aspect of your application or developer, DevStack, The way we're saying is look, there's additional new security threats you have to go think about.

Amit Bahree:

And it's easy to get wrapped up in all the negative, but if you step back and say, look, as there were paradigm shifts, as you went from client server two tier, and I'm going to show my age now, applications to distributed applications and then

Amit Bahree:

to web applications, There's a lot of goodness, but then it also opened up the exposure to, a different threat vector.

Amit Bahree:

The surface area was different.

Amit Bahree:

In some cases it was broader, in other cases it was actually contracted.

Amit Bahree:

And in that sense, this is no different.

Amit Bahree:

There is new emerging threats you have to think about and be cognizant of, and then also understand what is the risk of that.

Amit Bahree:

And sure, a threat could happen, but how often will it happen?

Amit Bahree:

And how do I mitigate that?

Amit Bahree:

Uganda will solve 100 percent everything.

Amit Bahree:

But you have to then hone it back down into what's your use case, how you're thinking about it, and so on.

Amit Bahree:

so instead of, either ignoring it, which is not good, or putting your head in the sand like it's all doom, neither of those dimensions are going to be helpful.

Amit Bahree:

So I think part of it is understanding that, yes, there is a new set of threats that are emerging.

Amit Bahree:

Be aware of those.

Amit Bahree:

How do you solve for those?

Amit Bahree:

How do you manage those?

Amit Bahree:

And then In the context of a use case, in the context of how you're using it.

Miko Pawlikowski:

It's a little bit like passwords, isn't it?

Miko Pawlikowski:

We rely on the fact that it's not practical for someone to go and brute force your password because it would take a thousand years.

Miko Pawlikowski:

And if someone goes and figures out how to make a computer that goes around the limitations of Physics and can do it a thousand times faster, all of a sudden a lot of passwords would be useless.

Miko Pawlikowski:

And I think it's a little bit like that, right?

Miko Pawlikowski:

we got a technology that made things possible now, that we're relying on them just not being practical from, time and cost perspective.

Miko Pawlikowski:

And now we have to deal with that.

Miko Pawlikowski:

And the genie's out of the bottle, as they say, I think.

Miko Pawlikowski:

and the cat's out of the bag.

Amit Bahree:

That's a great analogy.

Amit Bahree:

I actually like that.

Amit Bahree:

I'm going to steal that in other places.

Amit Bahree:

But you're right, like there was a time where we didn't need passwords.

Amit Bahree:

It wasn't a problem.

Amit Bahree:

And then there was a time where we needed passwords, but it was simple passwords.

Amit Bahree:

You could do hello1234 or password1 or what have you.

Amit Bahree:

And then it was like, time where, okay, it needs to be a little more complex.

Amit Bahree:

And now you can find these, buy these on the dark web and all.

Amit Bahree:

and hence you need more complex passwords.

Amit Bahree:

my one PSA is please use a password manager.

Amit Bahree:

10 years, 15 years ago, if you were chatting, the concept of a password manager would be so alien.

Amit Bahree:

And here now, I'm sure as you do, I do tech support for my family.

Amit Bahree:

Unpaid, of course.

Amit Bahree:

my everything is go use the password manager.

Amit Bahree:

Here's how you set it up.

Amit Bahree:

And why shouldn't you reuse passwords?

Amit Bahree:

And let the thing do the heavy lifting for you.

Amit Bahree:

But you save it, right?

Amit Bahree:

With your master password and whatnot.

Amit Bahree:

I think it's, yeah, it's the same analogy in that sense.

Amit Bahree:

it goes back to your thread vectors change society.

Amit Bahree:

Things are changing and, part of it is adapting.

Amit Bahree:

Some is good.

Amit Bahree:

Some is not good.

Miko Pawlikowski:

Let's circle back to your book.

Miko Pawlikowski:

ultimately, that's how I learned about you existing.

Miko Pawlikowski:

So as I was reading it, for anybody who's, interested in, go and pick it up on, manning.com, it's a very practical guide.

Miko Pawlikowski:

It's called "Generative AI in action" for a reason.

Miko Pawlikowski:

There is.

Miko Pawlikowski:

Little time spent on the underlying details.

Miko Pawlikowski:

There is obviously the intro that covers everything that you would expect in terms of what is generative AI, the architecture, high level, what it means, references, overview of LLMs, transformer, smaller language models, that kind of stuff.

Miko Pawlikowski:

And then it turns into basically a guide to show you what's possible with it, show you how you can go and call some API and get magic text being generated.

Miko Pawlikowski:

It shows you how to generate pictures, shows you how to generate other things like music, video, I think briefly code, all that kind of stuff.

Miko Pawlikowski:

So I'm picturing this really as a kind of guide that you get yourself when you want to get into this without wasting any time on things that are not necessary for your journey, I will get you from zero to one on that.

Miko Pawlikowski:

is that accurate description?

Miko Pawlikowski:

Am I doing a good, marketing pitch here?

Amit Bahree:

Mostly.

Amit Bahree:

Yeah.

Miko Pawlikowski:

Mostly.

Amit Bahree:

I'm not in marketing.

Amit Bahree:

Yes.

Amit Bahree:

I think that is an accurate description.

Amit Bahree:

I think the emphasis is on the "in Action" part, the premise of this is, you want to go build an app and right now.

Amit Bahree:

I go back to my year and a half of conversations from CEOs down across the Fortune 500 or whatever, which is our, from a work point of view, right?

Amit Bahree:

A lot of these large enterprises, but this is not about just large enterprises, it's about if you're a company, you have a set of products you want to improve or make new, how do I use this GenAI and ChatGPT and LLMs and everyone's heard about it.

Amit Bahree:

And they don't know where to start or how to start.

Amit Bahree:

So that's really what I was trying to do, right?

Amit Bahree:

There's broadly speaking three parts to the book.

Amit Bahree:

The first part is introductions and because you just know what you know, I can't just go dig into things without giving you some context and basis on what's possible, what's not possible.

Amit Bahree:

and that's the first part you're touching on.

Amit Bahree:

What I stay away from is it's not a science research book.

Amit Bahree:

I link to papers where there are people who generally are curious or they want to go deeper.

Amit Bahree:

So we leave those crumb trails in a way saying, if you want to dig more in your own time kind of a thing, here's the things you can go read up and then that'll expose you to more dimensions, right?

Amit Bahree:

So it's not a science book, techie book in that sense, because At least in an enterprise setting, most developers and CTOs and CIOs or CEOs, they want to see like, how is it going to solve my business problem?

Amit Bahree:

How do I do it?

Amit Bahree:

Some are interested in the science and the depth, but most just want to know at a high level, how it works deep enough, but not in the guts at least on the AI side of the science.

Amit Bahree:

so we leave the breadcrumbs and the trails pointing to papers where people can go deeper should they want to, but If you're a developer and you can use a set of APIs and SDKs, that is really for you and the way we say is because these, at least these LLMs are exposed as an API.

Amit Bahree:

You really don't need to know any of the AI sort of mumbo jumbo, any developer can pick it up easily.

Amit Bahree:

So that's certainly why I was trying to position it.

Amit Bahree:

Part one is getting you a sense of the world from a technical perspective, but not go super deep.

Amit Bahree:

And then part two and part three is where we start going deeper on, okay, how do I use this in my production, solving my business problem, what I'm trying to do.

Miko Pawlikowski:

making it very applicable, for example, at some point, the book is talking about image generation.

Miko Pawlikowski:

And there is a short description of generative adversarial networks, and it doesn't include Ian Goodfellow getting drunk and going to a fellow student's graduation and then arguing with them and then going home and implementing a proof of concept algorithm to prove the other people wrong.

Miko Pawlikowski:

And, next day discovering is actually working.

Miko Pawlikowski:

It's giving you the kind of applicable.

Miko Pawlikowski:

This is used for scenarios where the data is complex and diverse, requiring realism, suitable for high quality images, data augmentation, style transfer.

Miko Pawlikowski:

So it's prescriptive in a way, I would say, you give people, what they need to, get to get cracking with it.

Miko Pawlikowski:

speaking of which, let's talk a little bit about the images because you do cover a few interesting things like the VAE, the GANs, diffusion, vision transformers, to give people a sneak peek of what they're going to expect.

Miko Pawlikowski:

Can you talk about why they're interesting and why they might be, something that you should be paying attention to.

Miko Pawlikowski:

What are the breakthroughs?

Amit Bahree:

I think one aspect is where ChatGPT and the LLMs is just the language part, is taking the hype.

Amit Bahree:

And I think most people understand that there's a different set of tech related but different on images, right?

Amit Bahree:

and image understanding, image editing, the power of it on one hand, wherever you go on whichever social thingy with stable diffusion came out, lot of creativity on the image generation was there, but in a social

Amit Bahree:

setting, the thing really is, how do you expand that in a corporate, application setting and what can you do?

Amit Bahree:

It's one is like fun and wonderful in a personal social setting.

Amit Bahree:

But then how do I transfer that and then which area do I use in a work setting?

Amit Bahree:

Not even have to be work, each of these techniques have their own power.

Amit Bahree:

I think most people don't really care or maybe nor should they care, but in some cases where it would matter It's good to know what is the underlying tech, so I know what to ignore versus not to ignore.

Amit Bahree:

Because, again, the hype wraps up a lot of this.

Amit Bahree:

If you come back to it, it's more of helping people ground themselves a little, because at the end of the day, the tech is still the tech, right?

Amit Bahree:

What it is meant to do and how it is meant to do doesn't fundamentally move.

Amit Bahree:

if you're trying to solve one set of images, one kind of things, like diffusion models would be great for, that set of categories.

Amit Bahree:

And now there's multiple diffusion models.

Amit Bahree:

You can go pick which one you want.

Amit Bahree:

versus a transformer model.

Amit Bahree:

So again, we don't go, I have a few diagrams and images to, outline at a high level how these are, because there's papers on each topic, you can go read like hundreds of them.

Amit Bahree:

but the intention is just to know, look, there's different buckets and categories.

Amit Bahree:

Each has its own strengths.

Amit Bahree:

And if what you're trying to solve for, just make sure you connect those dots.

Amit Bahree:

I guess the other analogy is if you're writing a book, word is easier than notepad kind of a thing, right?

Miko Pawlikowski:

I was a little surprised to see a prompt engineering chapter, but I guess it makes perfect sense.

Miko Pawlikowski:

You need a little bit of basics.

Miko Pawlikowski:

What was your thinking, with that chapter?

Miko Pawlikowski:

What was the goal you wanted to achieve with it?

Amit Bahree:

in the context of LLMs, like prompt engineering is pretty crucial.

Amit Bahree:

it is how you steer the model fundamentally in many ways.

Amit Bahree:

the beauty of it is half art and half science.

Amit Bahree:

The frustration of it is it is half art and half science.

Amit Bahree:

but, fundamentally, at least with today's technology of where things are, prompt engineering is quite crucial.

Amit Bahree:

And the way we also tell many of our customers and I tell is look, you have to start thinking about prompts as your IP

Amit Bahree:

in many ways, and I'm not talking about simple prompts.

Amit Bahree:

Like I, in the book, I use simple prompts to make the point.

Amit Bahree:

So 'tell me a story about a panda' is not really IP in the context of a prompt.

Amit Bahree:

and then prompts are also closely tied to how a model understands it.

Amit Bahree:

So again, outside of simple prompts, where you are When you're using this concept of RAG, for example, as you start using a specific model or a family of models, which are closely related, you start picking up nuances on how the models interpreting things and working with things and so on.

Amit Bahree:

And then you're tweaking your prompts along with that, right?

Amit Bahree:

So it's cohesive together.

Amit Bahree:

and that intuition as you learn is also part of your IP and how you want to think about prompt engineering.

Amit Bahree:

That also means.

Amit Bahree:

There is no universal prompts.

Amit Bahree:

Again, outside of the simple ones, I'm not talking about the simple, straightforward prompts.

Amit Bahree:

So you should not, or one should not just say, if I'm, let's say, using GPT 3, 4, whichever, the same prompts, which are complex ones, I can pick up and expect to work on, let's say, LLAMA or something else.

Amit Bahree:

They will work, but do they work at the same level and the same evaluation, the same criteria?

Amit Bahree:

Probably not, because they are quite tied into how the model behaves.

Amit Bahree:

This is very loose, right?

Amit Bahree:

It's not a scientific thing, prompt engineering is quite crucial.

Amit Bahree:

It is how we talk, even though you are calling an API, but how you're talking to the model is through those.

Amit Bahree:

so I think it's worth spending time to understand how, what these are, how they work.

Amit Bahree:

There's a lot of hype around prompts as well.

Amit Bahree:

I would say don't believe all of it.

Amit Bahree:

The one final point I want to make on it is prompts is also one of the new threat vectors.

Amit Bahree:

So I touch on it in a later chapter, I touch a little bit on prompt injection in the chapter you've seen.

Amit Bahree:

but we go a little more deeper in one of the later chapters.

Amit Bahree:

But prompt injection, as an example, is one of the new threat vectors.

Amit Bahree:

It's not the only one.

Amit Bahree:

so again, understanding that as well, but prompts, in today's world gets quite crucial.

Amit Bahree:

At the end of the day, it's how we, in quotes, talk to the model.

Miko Pawlikowski:

That makes sense.

Miko Pawlikowski:

prompt engineering might be getting a little bit of bad rep just because of how many people are walking around saying that they have the ultimate prompt stuff like that.

Miko Pawlikowski:

But at the end of the day, you do need to learn how to to these things.

Miko Pawlikowski:

And it is one of the biggest frustrations.

Miko Pawlikowski:

It's almost like, You're talking to a cat sometimes it can suddenly freak out and do something very weird at that moment notice.

Miko Pawlikowski:

And there is little you can do to prevent that.

Amit Bahree:

Yeah.

Amit Bahree:

and so we call it, or at least I call it, is you have to think of prompts when you're talking to the model as a parent.

Amit Bahree:

So for those who have had children or who are toddlers right now, it is what we call parentology.

Amit Bahree:

Somebody said this to me in one of my meetings and I loved it and I stole it from them.

Amit Bahree:

So if you're a toddler, your memories retention is lower.

Amit Bahree:

so many often you have to keep repeating.

Amit Bahree:

It's the classic, don't stick your finger in the wall socket, a thing.

Amit Bahree:

Saying it one time doesn't help, you have to keep repeating.

Amit Bahree:

The way I want, generally speaking, folks to think is, your model's like a toddler, you have to keep repeating, keep thinking about it, right?

Amit Bahree:

and as silly as it may sound, it's like basic stuff, like for example, one of the side effects is what's called hallucinations where, non grounded.

Amit Bahree:

So you will get responses back, which are made up and it's not factual.

Amit Bahree:

That may be okay in one dimension.

Amit Bahree:

If you are writing a creative story, it may not be okay in another dimension where in a business setting, you're answering things based on some policy or information or what have you.

Amit Bahree:

so in the prompt it figures like simple things like do not make up any information, only answer from this, you would think that would be obvious, so your intuition of a cat is not very far off

Miko Pawlikowski:

that's an interesting comparison.

Miko Pawlikowski:

Let's do one more.

Miko Pawlikowski:

you talk about RAG in your book and I think a lot of people, I have heard the term and know that there is something to do with getting fresher data, can you give us an explanation for a five year old version of what that is and how it works

Amit Bahree:

should open ChatGPT on the other screen and say, 'explain RAG for a 5 year old in summary'.

Amit Bahree:

RAG is Retrieve, Augment, Generate, right?

Amit Bahree:

So the technique originally came from Meta, Facebook, as the research paper.

Amit Bahree:

But fundamentally, it is crucial when you are using large language models, specifically in the context of a company or a business or what have you.

Amit Bahree:

Basically, it is also a little clunky right now, but what it does is, as the name suggests, the model that you're using just knows what it knows, what it's been trained on, which is public data.

Amit Bahree:

That's one.

Amit Bahree:

And then as with these things, there's a training cutoff, right?

Amit Bahree:

At some point, you say, okay, I'm done collecting data.

Amit Bahree:

I need to go off for a few weeks or a few months or whatever it is and go train this thing and then spit out a model and go through a bunch more other alignment and this and that, and then eventually have a model available.

Amit Bahree:

So Online, when you go and see, a lot of people using RAG to get fresh information, which is post training data, that is absolutely valid use case.

Amit Bahree:

For many others, the other thing is my proprietary information.

Amit Bahree:

So especially in a company setting, your proprietary internal information, corporate knowledge, the model doesn't know because it's never seen it.

Amit Bahree:

In fact, if it does know that, then fundamentally there's a different problem.

Amit Bahree:

Because it shouldn't know that.

Amit Bahree:

for your business workflow, you often need to bring in your internal proprietary knowledge, whether it's a CRM or a database or an ERP, or you're solving a ticket or what have you, depending on the use case.

Amit Bahree:

The only way the knowledge, you can bring in the knowledge is through This technique of RAG, so retrieve, augment, generate.

Amit Bahree:

Retrieve means I'm retrieving the information, which could be from my corporate enterprise systems, or, Google or Bing to get more fresh information.

Amit Bahree:

I'm augmenting it in my prompt, which goes back to prompt engineering.

Amit Bahree:

And then based on that, I'm saying please generate, or whatever I'm trying to do.

Amit Bahree:

Generation could be a summary, or entity extraction, or depending on, whatever I'm trying to do.

Amit Bahree:

But that's what RAG is doing.

Amit Bahree:

It also gets clunky, by the way, because, it is the first generation.

Amit Bahree:

I do expect things to get improving in that.

Amit Bahree:

you talk about complexities of RAG, but if I have to get proprietary in-house information, if I have to get more fresher information, the only ways I can do that is through RAG, without retraining a whole model, which, in theory, is an option practically for, I guess 99% of people is not an option.

Amit Bahree:

I don't know if that was for a five year old, but

Miko Pawlikowski:

Yeah, that might have been a six and a half, maybe even seven, but I let, we'll let it

Amit Bahree:

Thank you.

Miko Pawlikowski:

time.

Miko Pawlikowski:

Okay.

Miko Pawlikowski:

so this is basically what, you're going to see if you look at, the early access version of the book, tells me that there is six more chapters coming very soon and there's a chapters 8-13 that cover things like, More on

Miko Pawlikowski:

RAG, telling models, application architecture for GenAI apps, can they have evaluation and ethical on GenAI.

Miko Pawlikowski:

I, I think at some point we're going to have to get you back to talk about the rest of the book.

Miko Pawlikowski:

but before I let you go, I wanted to, ask you for a few predictions.

Miko Pawlikowski:

from where you stand, where, are we going to see the next evolutions and breakthrough,

Amit Bahree:

one is , Multimodality,

Amit Bahree:

which basically, a lot of people today, primarily when they're using GenAI and the likes of ChatGPT is in one mode, i.e.

Amit Bahree:

language, text.

Amit Bahree:

But I do expect multimodality where I'm starting to combine language, images, text, video, and what have you, together.

Amit Bahree:

Not just generation, but input.

Amit Bahree:

We already are seeing that, by the way.

Amit Bahree:

That's already here today, like GPT V, which is vision, being one example of that.

Amit Bahree:

But more and more multimodality, because our real world is that as well, right?

Amit Bahree:

So I see one and that happening.

Amit Bahree:

I do see SLMs to accelerate more, as we touched on.

Amit Bahree:

Again, they're not better, they're different.

Amit Bahree:

there's times you need one, and there's times you need the other, and there's times you need both.

Amit Bahree:

But, I do see more and more on that front because for many use cases, I need simple things.

Amit Bahree:

I don't need all the other power.

Amit Bahree:

so I do see that accelerating a lot.

Amit Bahree:

And then I also see a third dimension is the, underlying, systems engineering things improving to be it cost effective from how much hardware and GPUs I need to run it to, latency around it, things like memory profile and so on and so forth.

Amit Bahree:

so I do see those sort of three and I guess I want to sneak in a fourth one, which is also all of the responsible AI aspects, which is one of the later chapters, we touched on the likes of prompt engineering.

Amit Bahree:

I know I talked a little bit on hallucinations, but the new harmful things one can do.

Amit Bahree:

That's also a cat and mouse thing.

Amit Bahree:

I do see more research breakthroughs

Miko Pawlikowski:

Do you expect we're still going to be doing transformers a year or two or three from now?

Miko Pawlikowski:

Do you think it was big enough of a breakthrough that it's going to stay

Amit Bahree:

I honestly don't know what I can tell you is it's what everybody's doing at the moment, which is not going away anytime soon.

Amit Bahree:

That's one side of it.

Amit Bahree:

Having said that.

Amit Bahree:

I think it's also pushing a lot of other areas around it where we can do things better.

Amit Bahree:

for example, we didn't touch on it, but each model has this concept of what we call a context window.

Amit Bahree:

How much, how big my prompt can be and in reality how many tokens can it be and how much can it send back?

Amit Bahree:

So on one hand a lot of people get happy Hey, if I have a longer token, my context window is longer.

Amit Bahree:

It means I can stuff in more things I can ask you more things or I can generate more things On one hand, that's good.

Amit Bahree:

People get happy about it.

Amit Bahree:

What then I come and have to remind them like each token one length Increase is a quadratic increase in compute So it's four times, extra costly in the sense of computing profile.

Amit Bahree:

So just having a longer token, context window isn't necessarily good.

Amit Bahree:

So there is research going on now to say, how can we do that?

Amit Bahree:

How can we, derivatives of the transformer architecture, which is, how can we increase?

Amit Bahree:

the token windows without having a quadratic increase on the compute profile.

Amit Bahree:

and that ties back to the attention mechanics of how the transformer architecture works.

Amit Bahree:

the way I would say it's the first one which has, reached the scale.

Amit Bahree:

And then now there's other research damage that's happening to make those profiles better.

Amit Bahree:

will there be another big two?

Amit Bahree:

Which is, better than this, I'm sure, in the sense of humanity, absolutely.

Miko Pawlikowski:

plus like you alluded to, there is a lot of value to being the first thing that's good enough, right?

Miko Pawlikowski:

when we look at how technology works, it's better to have something that's good enough today than to have the perfect or, ideal solution much later.

Miko Pawlikowski:

And typically there is enough momentum.

Miko Pawlikowski:

By the time the better thing comes that it might not be as, attractive as one would think.

Miko Pawlikowski:

I think what was the paper from Google talking about basically a method of achieving infinite, attention span, or was that some other research that you're alluding to?

Amit Bahree:

There, there's a few, there's a few papers.

Amit Bahree:

So there's one, in fact, Microsoft has on, like, how can I do 2 million tokens.

Amit Bahree:

that's one example.

Amit Bahree:

There's another one which is research going on called Ring Attention, which is different.

Amit Bahree:

I can't remember, I think it was Google?

Amit Bahree:

I can't recall off the top of my head.

Amit Bahree:

so there's multitudes of things going on.

Amit Bahree:

In parallel, like active research, on, how do we look at this differently.

Amit Bahree:

And then, that's just a context window.

Amit Bahree:

There's other things, for example, like when we touched on RAG, I said, it's a clunky way of doing things.

Amit Bahree:

we didn't go deep in it, but it's a clunky way.

Amit Bahree:

So there's other things happening, like graph, can I do graph with RAGs, and so on and so forth.

Amit Bahree:

it's not only in one dimension, I was using one of these as an example, but across multitudes of dimensions.

Amit Bahree:

There is active research going on to improve those.

Amit Bahree:

And as you start, as that starts formulating, cause look, research is one, getting something as a product that is deployable and running.

Amit Bahree:

That you can consistently, is a whole separate sort of scale and of its own separate complexities.

Amit Bahree:

but across these multiple dimensions as they come together, it'll just suddenly get improved.

Amit Bahree:

To your point, this is like version one, and it's a mad race across the board from academia to, commercial and whatnot.

Amit Bahree:

So it'll just be improving is how I see it.

Miko Pawlikowski:

the open models eventually prevailing and taking over?

Miko Pawlikowski:

there's a lot of talk.

Miko Pawlikowski:

Obviously people are excited about LLAMA-3.

Miko Pawlikowski:

That's I think a lot of people call it GPT-4 class, comparable, a model that's effectively free to use.

Miko Pawlikowski:

And, obviously Microsoft doing their own, research and releasing the Phi-3, in the open as well.

Miko Pawlikowski:

Do you see this models eventually becoming the defacto standard?

Amit Bahree:

they certainly have a place, for sure.

Amit Bahree:

I think, there's no question about that.

Amit Bahree:

I don't know if they're the de facto standard or not.

Amit Bahree:

I think the challenge would come down to is at the end of the day, with the current state of technology, training a model is super expensive.

Amit Bahree:

There is no shortcut around that.

Amit Bahree:

So even if you have an open source model in the sort of near term, there's only a handful of companies who have the technical know how have the sort of the muscle, have the compute profile to be able to do that.

Amit Bahree:

so more and more until again, unless, as there's more fundamental breakthroughs to open that up more.

Amit Bahree:

a lot of the open source, it's back to that, the ant analogy used for, with the, one of the diagrams we have from the paper in the book.

Amit Bahree:

the part I was trying to show there also is the roots are very few models that are derived from that.

Amit Bahree:

So I think once there's a lot happening, it's at the end of the day, there'll be just a handful of people who are publishing those and exposing those, that others are deriving from.

Amit Bahree:

So until that happens, as in fundamental breakthroughs at a cost point of view, where it becomes cheaper.

Amit Bahree:

It all doesn't need hundreds of thousands of whatever it is, GPUs plus I don't know how many billions of tokens of data, to train them.

Miko Pawlikowski:

Oh, don't worry.

Miko Pawlikowski:

I

Amit Bahree:

it won't be

Miko Pawlikowski:

my crypto mining farm in my garage.

Amit Bahree:

There you go.

Amit Bahree:

That is one way to do it.

Amit Bahree:

I think open source won't still be constrained with a few source models, which they go derive from.

Amit Bahree:

But the fact, if you, the other way If you just look in the last one year, 12 months, which is nothing in the sense of humanity and technology, just see how much progress and how much improvement the models have made across both dimensions, whether they're open source or closed source or what have you.

Amit Bahree:

It is fascinating.

Amit Bahree:

and, the fact that there's literally new models every day is a good thing, but also not a good thing.

Amit Bahree:

So it has to stabilize to some extent.

Amit Bahree:

At some point it will.

Amit Bahree:

but I think the open source community is absolutely critical.

Amit Bahree:

On the flip side, a lot of research breakthroughs also are coming from the research labs where, there's, at the end of the day, deeper pockets and muscle sports in the sense of financial and compute and data as well.

Amit Bahree:

It's a fascinating world we are in, which is your, one of your opening statements, because at least for a geek and somebody in the industry, these are far and few moments that one gets.

Amit Bahree:

So it's absolutely fascinating.

Miko Pawlikowski:

Yeah, we'll be Sitting down with the grandchildren saying, ah, I remember in my day

Miko Pawlikowski:

They released the first capable.

Amit Bahree:

they're like, what?

Amit Bahree:

you use hundreds of GPUs and all this stuff?

Amit Bahree:

Why?

Amit Bahree:

I can just run it on my phone or whatever the phone looks like.

Amit Bahree:

I don't know.

Miko Pawlikowski:

Yeah, exactly.

Miko Pawlikowski:

You are so wasteful back in the day.

Miko Pawlikowski:

Really very clunky

Amit Bahree:

That's right.

Miko Pawlikowski:

Well, we're going to have to wait a little bit until that materializes, but I completely agree It's a very interesting time to be alive, and I'm certainly grateful That I get to experience that.

Miko Pawlikowski:

Amit, it's been a pleasure to host you.

Miko Pawlikowski:

Thank you so much for coming

Amit Bahree:

Thank you for having me.

Links

Chapters

Video

More from YouTube