HockeyStick #6 - AI-powered Developer
Episode 6 • 6th May 2024 • HockeyStick Show • Miko Pawlikowski
Duration: 00:45:26

Shownotes

Generative AI in Software Development: A Future Without Coders?

In this episode of HockeyStick, Miko Pawlikowski interviews Nathan B. Crocker, CTO at Checkr and author of 'AI-Powered Developer,' exploring the impact of generative AI tools like ChatGPT and Copilot on software development. They discuss the book's insights into using AI as a junior developer, its appeal to different levels of software practitioners, and experiences with generative AI for coding tasks. The conversation covers AI's role in designing, testing, refactoring, and understanding code, addressing job security concerns for software engineers. They also tackle the effectiveness of local LLMs versus online models, the evolving landscape of AI in coding, and future directions for developers using AI tools.

00:00 Welcome to HockeyStick: Exploring Generative AI for Code

00:23 Diving Into AI-Powered Development with Nathan B. Crocker

00:44 The Practical Guide to AI in Coding: Insights and Experiences

02:47 The Revolutionary Impact of AI on Software Development

04:46 ChatGPT: A New Era of Coding Assistance

08:57 The Magic of Copilot in Your IDE

10:40 Navigating the Challenges of Custom Code with AI Tools

14:46 Designing Software with AI: Beyond Just Code

17:45 Refactoring and Upgrading with AI: A New Frontier

20:27 The Quirks of AI: From Training Data to Practical Use

23:34 Exploring the Limits of AI in Software Testing

24:01 Exploring AI in Testing and Development

24:25 Harnessing AI for Software Testing

25:08 AI's Role in Depreciation Calculations and Asset Management

25:59 Understanding and Describing Code with AI

28:47 Security Insights and Ethical Considerations in AI

32:05 AI in Infrastructure and Deployment

37:36 Evaluating Local LLMs and Their Capabilities

42:18 The Future of Coding and AI: Predictions and Perspectives

44:01 Closing Thoughts and Next Steps for the Author

Transcripts

Speaker:

I'm Miko Pawlikowski, and this is HockeyStick.

Speaker:

Today, we're talking about generative AI for code.

Speaker:

you know, whether software engineers should start worrying

Speaker:

about their job security.

Speaker:

We're talking about chatting to LLMs to help you design, build,

Speaker:

test, and understand software, both online and offline, as

Speaker:

well as tools like the Copilot.

Speaker:

I'm joined by Nathan B.

Speaker:

Crocker, the author of AI-Powered Developer published by Manning

Speaker:

and the CTO and co-founder at Checkr, a tokenization startup.

Speaker:

He just finished his book and we're covering his experience with

Speaker:

using AI as a junior developer.

Speaker:

Welcome to this episode and thank you for flying HockeyStick.

Speaker:

So I had the pleasure of reading your book.

Speaker:

I would say that it's a real practitioner's guide to using

Speaker:

AI to work with code.

Speaker:

It's fairly light on details, so nothing to scare people off.

Speaker:

It basically jumps right into how to get value out of artificial

Speaker:

intelligence, if you're working with code.

Speaker:

Would you like to tell us a little bit how you ended up writing this book?

Speaker:

I actually had a number of co-workers and other developers who were telling me,

Speaker:

"Hey, you got to check out this stuff".

Speaker:

It's really something.

Speaker:

So this was November of 2022.

Speaker:

and, I had no idea what it was.

Speaker:

I, I started looking into it.

Speaker:

It piqued my interest, but I needed a deep motivation to actually

Speaker:

dive into it and really incorporate it.

Speaker:

Because if you just use something periodically, lightly, you're

Speaker:

not really engaged with it.

Speaker:

So I pitched Manning the book, they liked the idea, and it

Speaker:

really is my journey through learning how to use these tools.

Speaker:

my journey is really going to mirror the practitioners that are

Speaker:

reading it as they work through it.

Speaker:

So who is it for?

Speaker:

What's the requirement to get value out of this book?

Speaker:

You should have some familiarity with Python.

Speaker:

if I just take a step back, it's really for anyone.

Speaker:

early journey, mid journey, as a software developer, as a software architect.

Speaker:

I suppose as a business analyst, you could derive some value.

Speaker:

All the examples are in Python. There's a couple of microservice chapters, so you

Speaker:

should have some familiarity with that, but it really is about taking you through

Speaker:

things that you may or may not be familiar with, and working with gen AI to really

Speaker:

teach you some of these concepts as well.

Speaker:

Yeah, maybe you graduated from computer science program and you're like fresh out

Speaker:

of college and you want to know how to take your development to the next level.

Speaker:

That would really be the target demo.

Speaker:

I think the next level part of it is actually the keyword here.

Speaker:

One of the first things that people see when they open your book, I

Speaker:

think it's literally like page one, is the silent promotion.

Speaker:

Everybody all of a sudden overnight became an engineering manager.

Speaker:

Who basically can have a pretty good junior working for them for free, with

Speaker:

no labor laws or anything like that.

Speaker:

Why is that such a big deal?

Speaker:

it's such a big deal because it used to be the rubber duck.

Speaker:

Like you'd have a partner that you can work with, that you can tell to

Speaker:

do things that you can bounce ideas off of, you can look to for some

Speaker:

answers, because their thinking is going to be different from yours.

Speaker:

so they might have a different tack and a different approach.

Speaker:

you could farm out a lot of the work that you don't want to do.

Speaker:

The repetitive boilerplate, CRUD operations.

Speaker:

It's all there, but like any junior developer, a super smart junior developer

Speaker:

that is, they can produce some bafflingly poor thinking and wind up with just some

Speaker:

nonsense that isn't necessarily usable.

Speaker:

so you gotta watch them.

Speaker:

I suspect that if we were to partition the body of listeners to this, or maybe even

Speaker:

developers in general, There'll be almost none in the category of I haven't heard

Speaker:

of it or I haven't even played with that.

Speaker:

Other than people who might be on a very long vacation, "Cast Away" style.

Speaker:

I don't see how you can really escape that.

Speaker:

Then you probably have a category of people.

Speaker:

'Okay, I played with it.

Speaker:

I went and talked to ChatGPT, it spat out some code.

Speaker:

I saw roughly what you can do, but I never really got much value out of that'.

Speaker:

And then the category of people who actually go and use it day-to-day.

Speaker:

Because it is helping their job.

Speaker:

There might be some caveats to that.

Speaker:

Obviously, data privacy and not knowing where the code actually goes and not

Speaker:

knowing whether it's going to be trained on and that kind of stuff that can,

Speaker:

throw a wrench in some people's work.

Speaker:

So should we start with the category of people who might have played with

Speaker:

it a little bit, and, they went to ChatGPT, asked the same questions,

Speaker:

"how tall is this building?"

Speaker:

And, "can you search that for me?"

Speaker:

And they stalled there. From a basic development point of

Speaker:

view, what kind of value can we extract just chatting to ChatGPT?

Speaker:

What can it do?

Speaker:

I had an interesting experience very early on when I was doing research

Speaker:

for the book, I had it on my phone, I would carry it around and just

Speaker:

periodically I would hand my phone to someone just to give them a taste.

Speaker:

I remember I was at a party and a woman, she was, an expert, it was

Speaker:

archeology or maybe it was art history.

Speaker:

And she really started asking it some questions.

Speaker:

Some of them were factually incorrect, but over the course of her conversation

Speaker:

with ChatGPT, she became really impressed with the accuracy and justification for

Speaker:

some of the answers it was providing.

Speaker:

She would say like, 'why did you refer to this as the oldest, example of this

Speaker:

architecture or painting style?' And it gave her a fairly convincing reason.

Speaker:

I think there's value in, exploring this technology, even if you're not

Speaker:

going to use it in your everyday development effort, just because it

Speaker:

gives you a sense of the future.

Speaker:

things are going to dramatically change, there are implications that are

Speaker:

rippling all throughout academia now.

Speaker:

I feel that you just should keep yourself informed about what's coming, where

Speaker:

these changes are going to be made, and how it could potentially affect you.

Speaker:

So from the point of actually going and using that, the way that, you described

Speaker:

in your book, what's the experience like at this very lowest level of just

Speaker:

launching ChatGPT, asking some questions.

Speaker:

Because I remember doing that a while back: it would spit out some code and I had to

Speaker:

copy-paste it, and I had to add imports.

Speaker:

how good is it right now?

Speaker:

You're fresh off writing chapters about that.

Speaker:

How useful is it?

Speaker:

It largely depends on whether you're using GPT-3, 3.5, or 4.

Speaker:

4 is much better, exponentially better than 3.5, but it's solid, I would say.

Speaker:

There are a lot of caveats.

Speaker:

You're most likely going to be looking at code that is one to two years old.

Speaker:

So it was trained on data that was potentially from an old version of a library.

Speaker:

They may have had breaking changes.

Speaker:

Especially if it's a fast-moving language. Like, I was trying

Speaker:

to build something in Rust, and I asked ChatGPT to generate

Speaker:

some code and it wouldn't even compile with the newest compiler.

Speaker:

It's good, I would say, but it's not perfect.

Speaker:

It's a long way from perfect.

Speaker:

What are some remarkable limitations that you bumped into?

Speaker:

You must have seen some interesting, funny stuff.

Speaker:

Can you share some of that?

Speaker:

I haven't seen really outrageous things where it was just making up,

Speaker:

libraries or frameworks or anything.

Speaker:

I would say it's unremarkable in its banality.

Speaker:

the Rust example was probably the best one, but it wasn't great.

Speaker:

I wish I had something funny. To jump ahead a little bit:

Speaker:

I had a really hard time in the testing chapter trying to get it to

Speaker:

write good tests. And, you know, I don't even remember if I mentioned it

Speaker:

in the book, but at one point I just gave up and wrote the test myself.

Speaker:

Because it was very hard to get it to understand what the unit under test was,

Speaker:

what I was actually trying to accomplish with my test.

Speaker:

No matter how much context I added, it was always trying to do

Speaker:

something just completely different.

Speaker:

when I was reading that chapter, I was also thinking at the

Speaker:

back of my head, "What does it say about the training data?"

Speaker:

The tests are so poor.

Speaker:

yeah.

Speaker:

Yeah.

Speaker:

Are all those tests just so poorly written that, that's where I end up?

Speaker:

But yeah, let's touch on that in a sec

Speaker:

who needs tests?

Speaker:

We'll test it in production.

Speaker:

It'll be fine.

Speaker:

There you go, testing in production, everybody.

Speaker:

yeah, don't worry.

Speaker:

We can cut that out.

Speaker:

that was a joke.

Speaker:

the most annoying bit was just that you have to chat.

Speaker:

So then obviously you've got things like Copilot that you also cover in your book

Speaker:

or that just plug into your VS Code or whatever.

Speaker:

What's the added value of that?

Speaker:

Is it just that it works as autocompletion and it's more syntactic,

Speaker:

and you don't have to copy the code?

Speaker:

Do you get any other bonuses out of that?

Speaker:

the real value to a tool like, Copilot, again, versus ChatGPT is it

Speaker:

does keep you in the IDE and it can keep you in that flow state, where

Speaker:

it's only you and the code, right?

Speaker:

Whereas you're not having to pull yourself out of the context,

Speaker:

move to a different window.

Speaker:

And for certain projects, the actual code quality for Copilot was better.

Speaker:

Just on a line-by-line basis, or class-by-class basis.

Speaker:

that's almost certainly due to the fact that it was fine

Speaker:

tuned specifically for code.

Speaker:

That's the main benefit I've found. It's always adding helpful suggestions,

Speaker:

sometimes not-so-helpful suggestions too.

Speaker:

Like I don't need it to add a comment about the name of the file that I'm

Speaker:

working on, but, if I can start to define a method and then it gives me a

Speaker:

possible implementation, even if I don't accept it, it's at least, showing me one

Speaker:

possible implementation that I could use.

Speaker:

maybe it's not the exact one, the one I wanted, but having that suggestion

Speaker:

can be very valuable to clarify my thinking or to even, change it.

Speaker:

Maybe it's a better implementation than I was thinking of.

Speaker:

So those are the major advantages I found.

Speaker:

It always works very well in demos, where you've got the usual suspect,

Speaker:

an HTTP server in a popular framework in a popular language, and

Speaker:

you do something that has been done to death a million times on GitHub.

Speaker:

How well does it work with a custom code base?

Speaker:

Oftentimes you find yourself in a situation where your company has a

Speaker:

decent or large amount of code libraries, stuff the model obviously

Speaker:

wasn't trained on, because it's not in the public domain, it's not on GitHub.

Speaker:

How well does it work in these kinds of situations?

Speaker:

You'll face some challenges there if you're working on a very niche

Speaker:

problem. For example, you're trying to write an API gateway or something.

Speaker:

I suppose there are probably good open source examples out there, but if you're

Speaker:

working in a fairly niche industry, everything is going to be closed source.

Speaker:

You'll probably struggle with its suggestions.

Speaker:

I don't think they're going to be particularly helpful.

Speaker:

Although it's going to try, and in that trying, maybe it does inspire you.

Speaker:

maybe it does give you one possible implementation.

Speaker:

it's really good at just generating something.

Speaker:

and if nothing else, It can help you plan your approach.

Speaker:

And you could ask it questions inline and have it answer. One of the more

Speaker:

interesting things that I found as I was working with Copilot specifically,

Speaker:

one of the almost magical things is you type in a question in a comment

Speaker:

and then suddenly you prompt it for an answer and it'll give you one.

Speaker:

You're like, that's fairly interesting.

Speaker:

I wouldn't have thought to take that tack, but then I used it over and over.
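
For readers who haven't seen this trick, here's a minimal, hypothetical Python sketch of the shape of it: the comment is the prompt, and the body below it is the kind of completion Copilot tends to offer. The function itself is an illustrative assumption, not output captured from any model.

```python
# Q: What's an idiomatic way to load a JSON config file in Python?
import json

def load_config(path: str) -> dict:
    # The kind of body a comment-prompt typically elicits:
    # open the file, parse it, and return the resulting dictionary.
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```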

Speaker:

yeah.

Speaker:

There are these moments of magic, and a famous quote about

Speaker:

sufficiently advanced technology being indistinguishable from magic.

Speaker:

Yeah.

Speaker:

People get that a lot.

Speaker:

And I agree with that.

Speaker:

but how does it actually work?

Speaker:

So let's say I've got my VS Code open, and I've got some code and it's got

Speaker:

some imports and existing code base.

Speaker:

does it upload the whole thing to OpenAI, to be able to generate the useful things?

Speaker:

What's the context that OpenAI ends up having somewhere in the training data?

Speaker:

Yeah, it is going to encode the context, which will result in a good

Speaker:

portion of that code being uploaded.

Speaker:

OpenAI promises, pinky swear, that it's not being saved anywhere.

Speaker:

I don't think we have any way to assess whether that's true or

Speaker:

not, not to delve into conspiratorial thinking, but you should be careful.

Speaker:

Certainly if you're working on proprietary software. But that's why there are

Speaker:

other alternatives that are entirely offline, that you can delve into if

Speaker:

you're really very privacy concerned.

Speaker:

cause yeah, frankly, we don't know how it's being used

Speaker:

once it leaves our machine.

Speaker:

We'll definitely touch base on Llama and other alternatives that

Speaker:

you discuss in your book. But

Speaker:

is there a way to control it, or at least tell it, 'okay, only upload this

Speaker:

folder', or is it just fully automatic?

Speaker:

It just decides by itself what it sends.

Speaker:

you can tell it, but is it going to honor that?

Speaker:

I don't really have a good answer.

Speaker:

Okay.

Speaker:

So something to check

Speaker:

yeah.

Speaker:

Something to check.

Speaker:

for anybody who's now browsing manning.com, there is a live version

Speaker:

where you can see elements of the book.

Speaker:

Figure 2.18 is a nice summary.

Speaker:

There are a bunch of figures: there's a circle for unsupported, a triangle for

Speaker:

supported, and a square for exclusively supported. It's comparing ChatGPT, just being used

Speaker:

by itself, to Copilot and CodeWhisperer.

Speaker:

and it's summarizing whether it can generate methods, classes,

Speaker:

projects, generate documentation, switch languages and stuff like that.

Speaker:

So for anybody who wants to delve a little bit more into the

Speaker:

details, I think that's very handy.

Speaker:

For anybody who might be beyond that, so they went to ChatGPT, they

Speaker:

spoke to it, and they got some code, and it was a generally pleasant

Speaker:

experience, and they want more.

Speaker:

They use Copilot.

Speaker:

what's the next kind of checkpoint?

Speaker:

Where do they go from there?

Speaker:

How do they start designing software at a bit higher level

Speaker:

than just snippets of code?

Speaker:

How useful is the AI here?

Speaker:

For example, I was designing something yesterday and I was working through

Speaker:

a conversation with ChatGPT. That is one of the key things that ChatGPT

Speaker:

excels at, helping you to design the software, to really underscore that.

Speaker:

Not just to design your application.

Speaker:

Not just lines of code, though it's perfectly capable of that.

Speaker:

Not even just the classes, but, like, here are the patterns that you want to apply.

Speaker:

Here's the architecture.

Speaker:

Have it generate some of the documents in a text format,

Speaker:

so PlantUML or Mermaid.

Speaker:

Those are really good, useful things, because then you can always

Speaker:

take those, save those, and pass them back to ChatGPT to refresh the context.
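
As a concrete illustration of the kind of text-based artifact being described, here's a minimal Mermaid class diagram (the class names are hypothetical) that you could save alongside the code and paste back into a chat later to restore context:

```mermaid
classDiagram
    class Asset {
        +string name
        +float purchase_price
        +current_value(years) float
    }
    class DepreciationStrategy {
        <<interface>>
        +depreciate(price, years) float
    }
    Asset --> DepreciationStrategy : uses
```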

Speaker:

So yeah, as a co-founder and CTO of a startup, I found it

Speaker:

really invaluable, as a partner to help me design that software.

Speaker:

I think one of the things that really opened my eyes was that I never thought

Speaker:

to talk to ChatGPT about open source alternatives, and maybe trying to

Speaker:

select a database and talking about the different properties, like it

Speaker:

was just second nature for me to open the different docs and just start

Speaker:

comparing features and stuff like that.

Speaker:

And it never occurred to me that I can just go and ask ChatGPT because it's

Speaker:

got quite a lot of knowledge about that.

Speaker:

Yeah.

Speaker:

I think in the book you're talking about open source alternatives

Speaker:

to what you're writing, which is

Speaker:

an IT asset management system. Actually, I don't know if this part's going to work,

Speaker:

just so you're aware, or just be advised.

Speaker:

I got a lot of feedback that it was really boring, right?

Speaker:

That people didn't like the actual project that you have

Speaker:

to work on throughout the book.

Speaker:

But I wanted it to be like a boring book on a boring topic, a boring

Speaker:

application, because most of what we write is not interesting.

Speaker:

We pick up data and we shuffle it and we move it around, right?

Speaker:

A lot of what we do is not exciting.

Speaker:

it was definitely intentional.

Speaker:

but, again, maybe something to fix in this in a second edition, if it's coming.

Speaker:

But one of the more interesting things about my engagement model with these

Speaker:

tools as I worked with them, to pick up on what you were saying about learning

Speaker:

more about a database, or having it help select a database, or selecting

Speaker:

open source projects, is that very early on

Speaker:

I was being extremely prescriptive.

Speaker:

I would say, create software that's using this library, in this framework,

Speaker:

and this language, and all of that.

Speaker:

but later on, and even to this day, when I have a problem, I

Speaker:

feed in the business requirements.

Speaker:

And then I ask it to make recommendations for me.

Speaker:

and then I can assess those.

Speaker:

but at the very least it starts the process.

Speaker:

it gets it going.

Speaker:

so hopefully, that answered your question or was at least in the neighborhood.

Speaker:

Yeah, definitely in the neighborhood, same district.

Speaker:

Same zip code.

Speaker:

Yeah.

Speaker:

the way my mind works is that when I hear the idea of a free, junior available 24/7

Speaker:

my mind wanders to things like already mentioned docs, We hinted at tests coming

Speaker:

a bit later, but I think one of the things that are painful in more than one

Speaker:

way and people never want to do them is refactoring and upgrading to a new version

Speaker:

of something or maybe changing language, which is surprisingly labor-intensive.

Speaker:

It always ends up being more work than it looked initially.

Speaker:

How good, is the AI at the moment in this kind of things?

Speaker:

Refactor, rewriting in a different language, upgrade a library.

Speaker:

Can you just say:

Speaker:

' Hey, this is a library with a breaking change.

Speaker:

Give me the new version of the library and updated tests and everything'?

Speaker:

If the post-breaking-change version was in the training

Speaker:

data, you should be fine.

Speaker:

if not, you're going to have a more involved conversation.

Speaker:

but more generally, it does really well in translating from one language to another,

Speaker:

specifically programming languages.

Speaker:

I couldn't

Speaker:

assess its quality on English to French or something like that.

Speaker:

But I can tell you, there was a few examples where I was working in Python

Speaker:

and then I said, 'Oh, what would this look like in Go?' And it gave me

Speaker:

just a literal translation into Go.

Speaker:

And I was like, 'this doesn't feel very idiomatic, make it idiomatic.'

Speaker:

And it would be as good as or better than I would have written it myself.

Speaker:

so it does surprisingly well in going from one language to another.

Speaker:

And then, on to refactoring: you can ask it for certain patterns that

Speaker:

you may want to apply as you refactor, different design schemes. Like, maybe

Speaker:

I need to pull out an interface.

Speaker:

Maybe I need some kind of parent class.

Speaker:

Maybe this needs to be an adapter, or you take your pick from the Gang

Speaker:

of Four, and it knows them and can provide examples in any language

Speaker:

you can think of that it was trained on.

Speaker:

So it can take a lot of that drudgery away and

Speaker:

a lot of that anxiety away.
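
To make "pull out an interface" concrete, here's a minimal Python sketch of the kind of refactor you might ask for; the names are hypothetical, not from the book:

```python
from typing import Protocol

# Before the refactor, callers depended on one concrete notifier class.
# Pulling out an interface (a Protocol here) lets implementations be swapped.
class Notifier(Protocol):
    def send(self, message: str) -> None: ...

class EmailNotifier:
    def send(self, message: str) -> None:
        print(f"emailing: {message}")

class SlackNotifier:
    def send(self, message: str) -> None:
        print(f"posting to Slack: {message}")

def alert(notifier: Notifier, message: str) -> None:
    # Depends only on the interface, not on any concrete class.
    notifier.send(message)

alert(EmailNotifier(), "disk almost full")
alert(SlackNotifier(), "disk almost full")
```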

Speaker:

One of the most important benefits that we can derive from at least their

Speaker:

current implementation of these tools and of genAI is to just keep us going,

Speaker:

to keep us motivated, to keep us engaged, to keep us building software.

Speaker:

It can be really mentally taxing, and this can help ease some of that

Speaker:

intellectual heavy lifting. Not that we should just subordinate our thinking

Speaker:

to it entirely, but it can help.

Speaker:

Did you notice any discrepancies between quality, in different languages?

Speaker:

Because what I'm picturing is that the body of training data

Speaker:

came from somewhere like GitHub,

Speaker:

probably.

Speaker:

And if you look at GitHub, there's going to be a disproportionate

Speaker:

amount of JavaScript, of questionable quality too, but you're probably going to

Speaker:

have an increasing and quite significant amount of Go as well.

Speaker:

but you might not have too much...

Speaker:

Haskell?

Speaker:

Yeah, Haskell,

Speaker:

I don't know, SQL, whatever it is.

Speaker:

did you notice anything funny about that?

Speaker:

I would say that is

Speaker:

roughly in line with what I observed, and I wouldn't necessarily have

Speaker:

dug too deep into very niche languages.

Speaker:

But definitely the examples that you're gonna find, if you're working in

Speaker:

Python, Go, JavaScript, or TypeScript.

Speaker:

like those are going to be more voluminous and likely higher quality.

Speaker:

the one time I tried to use it to write, Rust, it failed spectacularly.

Speaker:

It was beautiful.

Speaker:

It was glorious.

Speaker:

I was trying to, throw together an API gateway.

Speaker:

Just see how, just how difficult was this going to be.

Speaker:

and in Rust, I wanted something high-performance.

Speaker:

And I just, I asked it to start writing some code and it created a number of

Speaker:

files, and none of it worked well together and it wouldn't compile.

Speaker:

Although it's Rust, so it would take a while to convince the

Speaker:

compiler that it's good enough.

Speaker:

but yeah, it was, not the most pleasant experience.

Speaker:

But also, to be fair, at the time, I only spent a few hours learning

Speaker:

the basic syntax of Rust, so I don't know really what I was expecting.

Speaker:

So was it ChatGPT, or was it me, or was it a mixture of both?

Speaker:

probably the latter.

Speaker:

Yeah, I think we all occasionally bump into those weird restrictions

Speaker:

based on the training data.

Speaker:

One that I keep remembering was when I wanted Midjourney to generate

Speaker:

for me a picture of Triceratops.

Speaker:

And it would give me any other dinosaur when I was asking

Speaker:

for it, but not this one.

Speaker:

It was all T-Rex and T-Rex.

Speaker:

then I started throwing random names, give me Brontosaurus, and

Speaker:

it just gave me a Brontosaurus.

Speaker:

So I was very upset at the time, I made peace with that.

Speaker:

And there are some things that just weren't in the training set and

Speaker:

they didn't emerge from training.

Speaker:

come on, a triceratops?

Speaker:

They're like the best.

Speaker:

Yeah.

Speaker:

You would think so, right?

Speaker:

Very weird.

Speaker:

Yeah.

Speaker:

And about listening about this from midjourney, this still needs fixing.

Speaker:

This is months later and you still can't get a decent triceratops.

Speaker:

I was using DALL-E and I asked it for a pug-a-pegacorn.

Speaker:

So that's a pug, a unicorn and a Pegasus.

Speaker:

And I got a pretty good one, pretty good representation.

Speaker:

And then I said, make it cute.

Speaker:

And it was the most adorable thing I've ever seen.

Speaker:

Wow.

Speaker:

but I.

Speaker:

Did not try a triceratops

Speaker:

I know what I'm going to do after this.

Speaker:

Yes, I encourage everyone to go create their own Pug-a-peg-a-corn

Speaker:

exactly.

Speaker:

Let's move to testing software.

Speaker:

so you already said a little bit about how difficult it actually is.

Speaker:

Can you give a more concrete example?

Speaker:

What is wrong with the tests it's generating some of the time?

Speaker:

In this case it was really struggling to figure out what I was actually

Speaker:

trying to do with the test.

Speaker:

specifically, it was an integration test.

Speaker:

And so I was trying to go mostly end to end in terms of serving data over REST.

Speaker:

It was largely missing the point of the actual test,

Speaker:

which was very strange.

Speaker:

it was in Python, so there should've been a number of

Speaker:

instances in the training data

Speaker:

to cover this.

Speaker:

Did you just say, 'I want an integration test, test everything', or did you

Speaker:

describe more, end to end, 'I would like the data to flow through the whole thing'?

Speaker:

Yeah, I felt it was fairly comprehensive.

Speaker:

I think at that point, specifically,

Speaker:

I was having Copilot write the test, and I believe I even went to ChatGPT and

Speaker:

asked it, 'how would I write a prompt to get Copilot to do an integration

Speaker:

test, an end-to-end test, for FastAPI,

Speaker:

and the payload would look like this?'
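
For orientation, here's a minimal sketch of the kind of end-to-end test being described, using FastAPI's built-in TestClient; the app module and the /assets endpoint are hypothetical, not the book's actual project:

```python
from fastapi.testclient import TestClient
from main import app  # hypothetical module exposing a FastAPI app

client = TestClient(app)

def test_create_and_fetch_asset():
    # Post a payload end to end, then read it back through the API.
    payload = {"name": "laptop", "purchase_price": 1500.0}
    created = client.post("/assets", json=payload)
    assert created.status_code == 201

    asset_id = created.json()["id"]
    fetched = client.get(f"/assets/{asset_id}")
    assert fetched.status_code == 200
    assert fetched.json()["name"] == "laptop"
```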

Speaker:

I eventually started having ChatGPT write my prompts for me,

Speaker:

which it did surprisingly well.

Speaker:

And it's meta.

Speaker:

that's very meta.

Speaker:

Okay.

Speaker:

Was there a really good use case in terms of testing?

Speaker:

Unit tests it was perfectly fine at, even in some cases where I felt

Speaker:

it should have gotten stuck.

Speaker:

So, again, not to get too specific about the actual ITAM,

Speaker:

the IT asset management project that runs all throughout the corpus of the book:

Speaker:

in accounting, assets depreciate at a certain rate, and generally

Speaker:

accepted accounting principles outline a few different ways that you can do it.

Speaker:

And so I used a strategy pattern, and I had a number of different

Speaker:

ways, each of them, to calculate that depreciation.

Speaker:

So the depreciation of the asset, maybe it's straight line.

Speaker:

So it's like over five years.

Speaker:

So one fifth of the value is lost every year and you can write part of that off,

Speaker:

but again, I'm not an accountant, so this does not count as, financial advice, but,

Speaker:

or...

Speaker:

Or medical, yeah, I'm not a doctor. But it works surprisingly

Speaker:

well, I was pleasantly surprised.
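
For readers who want to see the shape of it, here's a minimal sketch of the strategy pattern described, with a straight-line strategy over five years; the class and field names are hypothetical, not the book's actual code:

```python
from typing import Protocol

class DepreciationStrategy(Protocol):
    def depreciate(self, purchase_price: float, years_in_service: int) -> float: ...

class StraightLineDepreciation:
    """Lose an equal fraction of the value each year over the useful life."""
    def __init__(self, useful_life_years: int = 5) -> None:
        self.useful_life_years = useful_life_years

    def depreciate(self, purchase_price: float, years_in_service: int) -> float:
        yearly_loss = purchase_price / self.useful_life_years
        return max(purchase_price - yearly_loss * years_in_service, 0.0)

class Asset:
    def __init__(self, name: str, purchase_price: float,
                 strategy: DepreciationStrategy) -> None:
        self.name = name
        self.purchase_price = purchase_price
        self.strategy = strategy

    def current_value(self, years_in_service: int) -> float:
        # Delegate to whichever accounting strategy was injected.
        return self.strategy.depreciate(self.purchase_price, years_in_service)

laptop = Asset("laptop", 1500.0, StraightLineDepreciation(useful_life_years=5))
print(laptop.current_value(2))  # 900.0: one fifth (300.0) lost per year
```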

Speaker:

Fair enough.

Speaker:

So we've written some code, we've designed some software.

Speaker:

Let's say that we tested it for the most part.

Speaker:

but the reality of it is that we're probably going to spend more time

Speaker:

reading code and understanding code.

Speaker:

perhaps the code that we wrote a couple of years back.

Speaker:

Yeah.

Speaker:

How well does the describing-existing-code part actually work at the moment?

Speaker:

Yeah, it works surprisingly well at translating the code that you wrote

Speaker:

into very simplified answers, descriptions of

Speaker:

here's how it functions.

Speaker:

Here's how it's working.

Speaker:

here's what it expects.

Speaker:

You can even have it describe an entire system to you.

Speaker:

I did not, though, attempt to do what is probably one of the

Speaker:

hardest things, within that space.

Speaker:

And that is, I didn't feed it a Perl program and ask it what it actually did.

Speaker:

I have this feeling it probably would have broken ChatGPT.

Speaker:

Sorry.

Speaker:

Taking pot shots at Perl.

Speaker:

you should have given it a regular expression in

Speaker:

Oof.

Speaker:

and tried to see what happens.

Speaker:

And then next thing you know, OpenAI's knocking at your door, kicking you out.

Speaker:

That sounds about right.

Speaker:

Or a T-1000 is just kicking in the door.

Speaker:

I guess in my mind there's this limitation on the context

Speaker:

length that you can feed it, right?

Speaker:

So if your code base becomes significantly large,

Speaker:

is that not going to be a problem in getting it to even describe it?

Speaker:

to describe your entire codebase, yes.

Speaker:

but you can start to chunk it up.

Speaker:

you can work around that limitation by sending it only pieces.

Speaker:

And you're probably not going to get the full context there but

Speaker:

it can help guide your intuition.

Speaker:

That's why if you have some kind of class diagram or some architectural

Speaker:

diagram that's text-based,

Speaker:

like PlantUML, then you can distill your entire code

Speaker:

base into a single document.

Speaker:

Now, again, if it's a code base of thousands of classes,

Speaker:

you could still hit those limitations, but it's going to be your best bet

Speaker:

to get a distillation in natural language of what your classes or what

Speaker:

your code is attempting to do.

Speaker:

It really does excel at method-by-method descriptions of what the code does.
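
In practice that chunking workflow is just a loop over pieces of the code base. Here's a hedged sketch using the OpenAI Python SDK; the model name and chunk size are assumptions, and real code would want to split on function boundaries rather than raw character counts:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def summarize_code(source: str, chunk_size: int = 4000) -> list[str]:
    # Naive chunking by character count; splitting on function or class
    # boundaries would preserve more meaning per chunk.
    chunks = [source[i:i + chunk_size] for i in range(0, len(source), chunk_size)]
    summaries = []
    for chunk in chunks:
        resp = client.chat.completions.create(
            model="gpt-4",  # hypothetical choice
            messages=[{
                "role": "user",
                "content": "Describe in plain English what this code does:\n" + chunk,
            }],
        )
        summaries.append(resp.choices[0].message.content)
    return summaries
```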

Speaker:

Between manually browsing through the code trying to understand the intent, and

Speaker:

Jarvis, it's halfway there, right?

Speaker:

It's not quite, here's the intent and here's what it imported and

Speaker:

here's my recommendations, Mr.

Speaker:

Stark.

Speaker:

It's more, here, I can ask about this method.

Speaker:

Can't be bothered to read it.

Speaker:

It's 2,000 lines, and it can give me the gist of it.

Speaker:

Exactly.

Speaker:

Exactly.

Speaker:

there's also the security aspect that, you're discussing in one of the chapters.

Speaker:

Can you talk about that a little bit?

Speaker:

Yeah, and actually there's a funny story that's embedded in that too.

Speaker:

It's good at, picking up on what we were just talking about, the non-

Speaker:

exclusive path: it can explain ways that your code might be exploited.

Speaker:

It's not the same as having a security expert on your team.

Speaker:

it will miss things.

Speaker:

but it's definitely better than nothing.

Speaker:

And it can make some pretty great recommendations in terms of

Speaker:

how you can structure your code.

Speaker:

One of the funny things: I really wanted an example of

Speaker:

a SQL injection in the book.

Speaker:

So I actually asked, ChatGPT to give me an example of a SQL injection.

Speaker:

but it wouldn't.

Speaker:

No matter how I tried to coerce it. No, I swear

Speaker:

I'm not doing this for evil.

Speaker:

This is just for illustrative purposes only.

Speaker:

And it just would not give me a valid SQL injection exploit

Speaker:

that I could include in the book.

Speaker:

so do with that as you will.

Speaker:

Yeah.

Speaker:

that's a very interesting ethical, discussion about that.

Speaker:

There's probably gonna be some way you can, I don't know if you heard about

Speaker:

that exploit where, if you asked it to do something nefarious, it would say no, but

Speaker:

if you asked it to do something and that something equals ASCII art of something

Speaker:

nefarious, there was no problem at all.

Speaker:

So I suspect, like it's very hard to Limit a model like that because there's endless

Speaker:

opportunities to express it differently and you only need one of them to work.

Speaker:

So an interesting one.

Speaker:

So it won't let you do a SQL injection even when you pinky

Speaker:

swear it's for the good.

Speaker:

And I tried to do that, it was an exploit a little bit earlier on

Speaker:

where you could give it a persona of DAN, and DAN is allowed

Speaker:

to do things that ChatGPT isn't.

Speaker:

And still, it wouldn't let me do it, as, I think it

Speaker:

was DAN, and it was an acronym.

Speaker:

but yeah, similar thing, but maybe I should have tried out ASCII art next time.

Speaker:

But listeners, do not intentionally put SQL injection exploits in your code.

Speaker:

Oh yeah, that needed to be said.
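
Since the model wouldn't produce one for the book, here, for illustration only, is the textbook shape of the vulnerability, and more importantly the fix, using Python's built-in sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "nobody' OR '1'='1"  # attacker-controlled string

# Vulnerable: user input is spliced straight into the SQL text.
rows = conn.execute(
    f"SELECT role FROM users WHERE name = '{user_input}'"
).fetchall()
print(rows)  # [('admin',)] -- the OR '1'='1' clause matched every row

# Safe: a parameterized query treats the input as data, not as SQL.
rows = conn.execute(
    "SELECT role FROM users WHERE name = ?", (user_input,)
).fetchall()
print(rows)  # [] -- no user is literally named "nobody' OR '1'='1"
```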

Speaker:

What are some of the examples of what it was able to figure

Speaker:

out from your code, in terms of security holes and stuff like that?

Speaker:

Do you have any interesting examples of success?

Speaker:

What did it actually find?

Speaker:

I would have to go back and consult the book.

Speaker:

My code is just so good that there were no exploits to be made.

Speaker:

No, that's not...

Speaker:

And there you go.

Speaker:

So Nathan doesn't want to share too much

Speaker:

about the book.

Speaker:

You're going to have to go and buy it.

Speaker:

Yeah, one of the things I should have mentioned up front is,

Speaker:

this was the first time that I had ever built a true application in Python.

Speaker:

I had used it for scripts previously, just to do something, text

Speaker:

modification, things like that.

Speaker:

But I never built an actual application.

Speaker:

It actually helped me learn how to build applications while using it.

Speaker:

There's a book called

Speaker:

'Octopus, My Teacher'. I guess for you, it's more like 'ChatGPT, My Teacher'.

Speaker:

All right.

Speaker:

that's good.

Speaker:

we've covered, I think most of the big chunks other than

Speaker:

actually running the software.

Speaker:

Let's say that it runs, we package that.

Speaker:

And then we've got things like Docker, Terraform, the YAML hell that comes

Speaker:

with Kubernetes. On one hand, I would expect that this is fairly repetitive,

Speaker:

so ChatGPT would excel.

Speaker:

It's not a very tricky language.

Speaker:

It's just very verbose, and the white spaces make your life miserable.

Speaker:

How good is it with that kind of stuff?

Speaker:

It was actually really good with working out YAML and making different

Speaker:

scripts, helping build out dev pipelines through GitHub Actions, things like that.

Speaker:

it did really well.

Speaker:

One of the very interesting things that I discovered, though,

Speaker:

was that CodeWhisperer, the AWS

Speaker:

generative AI large language model, actually doesn't support

Speaker:

anything but programming languages.

Speaker:

So it didn't even understand how to do, like, Terraform

Speaker:

infrastructure as code, which you'd think it would be very good at.

Speaker:

that was a bit surprising.

Speaker:

but it's by design, it's intentional.

Speaker:

it's hard to see it as a limitation

Speaker:

Curious.

Speaker:

So do they have another tool for the YAMLs of the world?

Speaker:

Or did they just out-of-scope it?

Speaker:

Yeah, just out-of-scoped it. I didn't try the, what is it, Cloud-something, not

Speaker:

CloudFront, but whatever their deployment-based,

Speaker:

infrastructure-as-code-specific thing is.

Speaker:

I didn't try that.

Speaker:

maybe I should have.

Speaker:

But I was shocked.

Speaker:

I think I even mentioned that in the book, that I had originally

Speaker:

intended that chapter to be written using CodeWhisperer.

Speaker:

Let's say, for example, you want a quick Dockerfile. You can write it.

Speaker:

It's not too hard, but why do it if you can get it for free?

Speaker:

So you open a Dockerfile and you write in a comment what you want it to

Speaker:

do and you hit tab and the magic happens.

Speaker:

That's roughly what you need to do.

Speaker:

Yeah, roughly.

Speaker:

add a prompt as it were in a comment.

Speaker:

it doesn't have to be a comment.

Speaker:

you can just add the prompt and then later delete it.

Speaker:

Have it generate the Dockerfile for you, or the Kubernetes file.
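
Concretely, the workflow looks something like this: a hedged sketch of a Dockerfile where the leading comment plays the role of the prompt (the app, port, and base image are hypothetical), and the lines below it are the kind of completion you'd expect:

```dockerfile
# Prompt-as-comment: "Create a Dockerfile for a Python 3.11 FastAPI app
# served by uvicorn on port 8000."

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```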

Speaker:

I don't know if I tried using patterns in Terraform, just

Speaker:

to ease some of the repetition,

Speaker:

to not necessarily have the sprawling mess that Terraform can become.

Speaker:

but, I'm sure it could accommodate that as well.

Speaker:

Both Copilot and ChatGPT did seem to have extensive knowledge

Speaker:

of Terraform syntax and features.

Speaker:

So that all together adds up to a pretty competent junior developer, like you

Speaker:

described at the beginning, one that you need to supervise, but it can do a lot of

Speaker:

the legwork for you and much faster too.

Speaker:

Did you follow Devin, the supposedly first AI-driven coworker?

Speaker:

no, that's interesting.

Speaker:

Tell me more.

Speaker:

it was a few weeks ago, they made this big announcement.

Speaker:

There was a video with a demo showing basically doing the

Speaker:

whole thing from scratch.

Speaker:

So not only did it do the Copilot stuff, but it bootstrapped the whole project,

Speaker:

generated all the files, and had, like, a browser window that it had access to as well,

Speaker:

to actually go and verify that it works.

Speaker:

Obviously, it was doing something that it always does in these demos, which

Speaker:

is, an HTTP server with a REST API.

Speaker:

Which is cheating if you ask me.

Speaker:

a lot of people were very impressed and there was a lot of, angst among people

Speaker:

on the internet arguing over whether this is the end of software engineering

Speaker:

as we know it, or whether it's a scam.

Speaker:

And then, a few days ago, there was a critique that resurfaced about Devin and

Speaker:

that entire project, and whether it was a little bit polished up in the demo.

Speaker:

In other words, just a typical software demo.

Speaker:

Yeah, that's just a typical software demo.

Speaker:

So I think we have a similar problem, maybe with different stakes, to

Speaker:

self-driving cars: it can't be, like, 95% good without supervision, it has

Speaker:

to be, I don't know, 99% good

Speaker:

before we can leave it without our supervision. And sure, a badly written API,

Speaker:

Probably most of the time it's not gonna get anybody killed, fingers crossed, but,

Speaker:

you still need that supervision, right?

Speaker:

and Devin was supposed to do away with that.

Speaker:

So I'm looking forward to seeing how that story develops and

Speaker:

how they answer the critique.

Speaker:

And I guess when people can actually go and play with it, we'll

Speaker:

find out whether it was all fluff.

Speaker:

It's another...

Speaker:

Yeah, no, that's interesting.

Speaker:

I have been following, there was a story not long ago that, because of

Speaker:

the proliferation of ChatGPT and Copilot and the like, software

Speaker:

has been getting less and less secure.

Speaker:

Because it is easy for bugs to introduce themselves if you're

Speaker:

really just copying and pasting.

Speaker:

So that's why, we're not at that stage yet.

Speaker:

We may never be.

Speaker:

Where it's just, there's no human in the loop, right?

Speaker:

For these things that it can just generate code on its own

Speaker:

and, push it to production.

Speaker:

the role of a professional developer is here to stay for the foreseeable future.

Speaker:

We're just going to be better at what we do.

Speaker:

again, if we're mindful and not allowing these bugs to just creep in.

Speaker:

I like the optimistic point of view here, but yeah, a lot of people

Speaker:

I think would agree with you that this is like it was going to happen.

Speaker:

Although when you look at some of the software, you do wonder how much of

Speaker:

that was actually supervised and how much was just dumped automatically.

Speaker:

But that's for another story altogether.

Speaker:

Let's talk about the local LLMs, and how good they are by comparison, because

Speaker:

I've poked both, but I've never really done like a side by side comparison

Speaker:

to really tell how good they are.

Speaker:

You used Llama 2, I think, and OpenOrca, and did some side-by-side comparisons.

Speaker:

So how good are they compared to what you get with Copilot?

Speaker:

Actually, this was probably my favorite chapter, to write

Speaker:

and to do the research on.

Speaker:

It was just super fun.

Speaker:

It was an old Llama 2 model.

Speaker:

It's generations old at this point.

Speaker:

so I've been meaning to revisit it.

Speaker:

It produced competent text,

Speaker:

natural language processing, 'give me a description of this'.

Speaker:

the code quality was not great.

Speaker:

But again, this was six months ago or more.

Speaker:

So I'm sure that the model is 10 times better now.

Speaker:

So definitely worth revisiting.

Speaker:

Yeah, I would say on balance, most of the models that I was

Speaker:

running locally did not perform as well.

Speaker:

but they performed competently.

Speaker:

so if you were in a pinch and you didn't have access to the internet and you'd had

Speaker:

some foresight and downloaded these models prior, you could still get the job done.

Speaker:

but they wouldn't necessarily be my go to.

Speaker:

Although I did, just recently,

Speaker:

download a new model, and it does seem to be much better at this point.

Speaker:

I think it was Mistral.
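
For anyone who wants to reproduce that fully offline setup, here's a hedged sketch using the llama-cpp-python bindings; the GGUF file name is an assumption, and you'd substitute whatever quantized model you downloaded ahead of time:

```python
from llama_cpp import Llama

# Load a locally downloaded, quantized model; nothing leaves the machine.
llm = Llama(model_path="./models/mistral-7b-instruct.Q4_K_M.gguf")

out = llm(
    "Write a Python function that returns the nth Fibonacci number.",
    max_tokens=256,
)
print(out["choices"][0]["text"])
```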

Speaker:

that's one of the more interesting areas, in my mind, because that helps

Speaker:

get around some of the unknowns.

Speaker:

because it, I did, I turned off my wifi.

Speaker:

I pulled the network cable.

Speaker:

I made sure that I was entirely off the network, prior to using them

Speaker:

because I wanted to make sure no context was leaving my computer.

Speaker:

If privacy of your code, of personal data, is a primary concern,

Speaker:

it's probably the best option out there today.

Speaker:

I think this is a really good argument for that.

Speaker:

A lot of people will be in a situation where their employers are just not

Speaker:

comfortable with the code just going somewhere,

Speaker:

no matter what pinky swears you got.

Speaker:

And I think that this opens, like the remainder of the

Speaker:

market that really matters.

Speaker:

And I think we're all waiting for Llama 3 to drop, any week now.

Speaker:

I'm just worried that it might be too big to run comfortably on your M3.

Speaker:

Even with quantization. But let's see, it might actually not increase

Speaker:

in size and yet become more competent.

Speaker:

There are other models too, like SantaCoder, which I think at some

Speaker:

point was very popular as well.

Speaker:

Did you manage to get a workflow that's more like Copilot and

Speaker:

less like chatting to ChatGPT?

Speaker:

that would be a really good challenge.

Speaker:

To try to turn one of these into more of a Copilot model.

Speaker:

It was ultimately one that I just couldn't get done in time.

Speaker:

So, no.

Speaker:

hopefully you knew what you were getting yourself into, but you

Speaker:

wrote an AI book, so you have to update it every three months now.

Speaker:

that is true.

Speaker:

One of my favorite titles of late was another Manning book, but it

Speaker:

was the completely out of date, or the complete, yeah, obsolete.

Speaker:

Exactly.

Speaker:

A book on, generative AI.

Speaker:

I thought that was really clever.

Speaker:

lean into it,

Speaker:

yeah, that's a book by David Clinton.

Speaker:

We had him on the podcast a couple of weeks ago, and I think it's

Speaker:

called 'The Complete Obsolete Guide to Generative AI'. It's hilarious.

Speaker:

I had so much fun actually just reading that.

Speaker:

The humor is at a nice level there.

Speaker:

Definitely the highest possible recommendation, buy that book over mine.

Speaker:

Probably worth mentioning, when the Devin thing was going on, there was

Speaker:

some kind of open source response.

Speaker:

I think it was SWE, OpenSWE, something like that.

Speaker:

It's probably easily googleable, and I haven't gotten to actually

Speaker:

testing it out, but I was supposed to.

Speaker:

I think what we really need to get to is to get one of those open models

Speaker:

to behave, 85% as well as Copilot.

Speaker:

And then it's basically game over.

Speaker:

If you can run it on your laptop, there's no subscription, there's no data leaving.

Speaker:

it's a no brainer at that stage.

Speaker:

what are we doing with the CPU cycles and the GPU cycles on the

Speaker:

MacBook when we're developing?

Speaker:

Anyway.

Speaker:

Yeah.

Speaker:

Yeah.

Speaker:

And it just makes sense from a corporate

Speaker:

decision-making view: you could host it, not necessarily centralized,

Speaker:

host it trained off your own data.

Speaker:

like that's, yeah, that's really game over.

Speaker:

although, yeah,

Speaker:

Another way of saying that:

Speaker:

It's just the beginning of the fun.

Speaker:

yeah.

Speaker:

Beginning of the arms race.

Speaker:

Yeah.

Speaker:

What's the next thing here?

Speaker:

What do you expect to come in the coming months and years, Obviously

Speaker:

wild predictions, all the usual disclaimers, but what's your take?

Speaker:

What's next in coding and AI?

Speaker:

I'm gonna give a boring answer.

Speaker:

I think it's just gonna be incremental improvement.

Speaker:

AGI, if it's possible, is a long...

Speaker:

No, what is it, what are they calling it?

Speaker:

Artificial General Intelligence, yeah, AGI.

Speaker:

Yeah.

Speaker:

AGI, yeah.

Speaker:

I think that's a ways off, if at all, if it's even feasible. We'll see incremental

Speaker:

improvements, where the models, I don't want to say hallucinate less, because

Speaker:

that's a feature, not a bug, but where they get more and more

Speaker:

refined, the output becomes more timely, we're starting to see where

Speaker:

it can actually connect live to the internet. So I just think there's going

Speaker:

to be incremental advances like that.

Speaker:

Until there's a real breakthrough, and that will change the game, in

Speaker:

the same way that the transformer changed the way that we did natural

Speaker:

language processing and text generation.

Speaker:

And until there's something like that, it's just going to

Speaker:

be incremental improvement.

Speaker:

So life goes on, NVIDIA makes more chips, they make faster chips, they become even

Speaker:

bigger and they leave the gamers behind even further and we make bigger models, we

Speaker:

train them better and we get a little bit closer to a superstar junior developer.

Speaker:

Is that roughly what we're talking about here?

Speaker:

that's what I would predict.

Speaker:

I'm happy to be wrong, unless it's completely detrimental to all of

Speaker:

our fine men and women out there, giving their blood, sweat and tears

Speaker:

every day in developing software.

Speaker:

Not investment advice, by the way.

Speaker:

Yes, exactly.

Speaker:

Another disclaimer for our US-based clientele.

Speaker:

And what's next for you, Nathan? Do you have an eye on the next book?

Speaker:

I have a couple of ideas brewing.

Speaker:

I really am going to avoid doing just a second edition of this.

Speaker:

Even though I've alluded to it several times. But I have a couple of ideas

Speaker:

that are really brewing, in terms of: okay, so now we know how to use them.

Speaker:

We've had some practice, so now let's apply it to very

Speaker:

specific, very niche problems.

Speaker:

and are there ways that we can extend it?

Speaker:

Are there ways that we can, train it on our own data, things like that.

Speaker:

that's where I would see the next logical, area for me to move into.

Speaker:

but I would still want something very practical.

Speaker:

so yeah, I'll probably just wind up doing a second edition.

Speaker:

the book once again is called "AI-Powered Developer".

Speaker:

It's published by Manning, which means that it has been available in

Speaker:

the early access for a while now.

Speaker:

So if you go to manning.com, you can get immediate access and start reading that.

Speaker:

It's currently in production, which means that it's going to take a little bit of

Speaker:

time before it actually hits places like Amazon in a physical copy and before

Speaker:

Nathan can have a party to celebrate that.

Speaker:

my guest was Nathan B.

Speaker:

Crocker, the co-founder and CTO at Checkr.

Speaker:

Thank you very much, Nathan.

Speaker:

I'll see you

Speaker:

Thank you.

Speaker:

It was a pleasure.

Speaker:

Be well.
