Help us become the #1 Data Podcast by leaving a rating & review! We are 67 reviews away!
Anthropic just dropped their entire internal data playbook. Here's what they're doing and how it affects your career.
Join 30k+ aspiring data analysts & get my tips in your inbox weekly: https://datacareerjumpstart.com/newsletter
Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training: https://datacareerjumpstart.com/training
Want to land a data job in less than 90 days? https://datacareerjumpstart.com/daa
Ace The Interview with Confidence: https://datacareerjumpstart.com/interviewsimulator
Read Anthropic's full data playbook: https://claude.com/blog/how-anthropic-enables-self-service-data-analytics-with-claude
TIMESTAMPS
00:00 - Anthropic dropped their data playbook
02:39 - Why AI analytics keeps failing
05:24 - How they hit 95% accuracy
09:24 - What a Claude skill is
14:39 - None of this is actually new
17:09 - Still hiring data people
CONNECT WITH AVERY
YouTube Channel
LinkedIn
Instagram
TikTok
Website
Mentioned in this episode:
July Cohort of DAA
Join the July Cohort of DAA and become an analyst! Be sure to check out our current deal to save BIG! See you in class!
Avery Smith-3: So Anthropic, the makers
of Claude, literally just dropped an
2
:absolute masterclass on how they analyze
data internally, and they posted a blog
3
:post that is four thousand five hundred
words, and there's a lot in there.
4
:So I summarized that entire blog
post, and I will explain it to you
5
:like you're five years old in today's
episode, and literally you can steal
6
:it and learn how to analyze data
just like a Claude data analyst.
7
:So this is what Claude
is actually claiming.
8
:They're claiming that they now
do self-serve analytics, which
9
:is kind of a funny phrase.
10
:Basically, it means allowing non-technical
people, non-data analysts to do data
11
:analytics in easy ways, and this has
been a thing for the last decade or so.
12
:In fact, it's one of the main reasons
why Tableau and Power BI became
13
:so important with dashboards is it
allows business people, non-technical
14
:people to actually kind of analyze
their data in predefined ways.
15
:It's been really hard to do for
the last ten, fifteen years.
16
:Now, basically, Anthropic just
tweeted that they are able to do
17
:ninety-five percent accuracy on
all of their business analytics
18
:queries with Claude, which is crazy.
19
:That basically means that w- if anyone has
some sort of an analytics question, they
20
:can answer it now with ninety-five percent
accuracy using this internal playbook.
21
:So what are they actually doing, and
how can you replicate it in your own
22
:organization, or how can you bring
this to an interview to make you a
23
:more marketable aspiring data analyst?
24
:So, basically, like I said, self-serve
analytics has always kind of sucked.
25
:It's when non-technical people are
analyzing the data sets, and there's
26
:basically two different ways to do it.
27
:Option A is you open up to everyone,
which basically means you have
28
:non-data analyst people trying to
analyze data, and a lot can go wrong.
29
:You can get really messy,
different queries, maybe
30
:messy dashboards, conflicting
definitions, those type of things.
31
:Or you lock it all down, which basically
means that, uh, you create a bajillion
32
:different types of dashboards, but it
never really answers anyone's question
33
:when they want it the way they want it.
34
:And, uh, that's been, that's
been tricky in the past.
35
:So now there's AI, and now you
:can give, you know, Claude…
36
:You can give someone Claude or
ChatGPT and point it to a database,
37
:and you can have them ask ChatGPT
or Claude questions to the database.
38
:Uh, but there's a big issue.
39
:Number one, that we all think the
AI doesn't hallucinate, doesn't
40
:lie, doesn't make things up, and
it does, and it can be wrong.
41
:Uh, and number two, it gives everyone
like, "Oh, this is a hundred percent
42
:accuracy," but it's, it's not, and
that can cause a lot of issues.
43
:So, um, you know, AI is a great
solution for self-serve analytics, but
44
:it causes a lot of problems as well.
45
:So how did, uh, Anthropic
actually solve it?
46
:Because what they're claiming
that ninety-five percent of their
47
:business analytics queries are now
automatedly solved by Claude, and
48
:they're ninety-five percent accuracy,
accurate, um, which is a big claim.
49
:Like that's basically like,
"Hey, Claude is now our company
50
:data analyst, essentially."
51
:Now, I, I will mention here, um, that
the data team can now work on bigger
52
:and better problems that are like
:less SQL monkey questions, right?
53
:Um, so it's not like they're
getting rid of their data
54
:analyst or their data scientists.
55
:It's just you don't have to do as
many ad hoc reportings, and you can
56
:just focus on more important things.
57
:And just managing this Claude
infrastructure of creating this
58
:company-wide, uh, self-serve
analytics platform is a beast, and
59
:we'll get to that here in a second.
60
:Um, basically, in this article,
their thesis is data is very
61
:different than software.
62
:If you've, you know, heard about Claude
or Codex, um, for programming and software
63
:engineering, it can do those things
really, really well out of the box.
64
:Um, because coding has
lots of right answers.
65
:There's ways to test things.
66
:There's documentation that goes with code.
67
:Um, and all those, you know,
infrastructure can basically
68
:catch hallucinations.
69
:It's a more solved problem.
70
:Analytics, it's quite a bit
different because there's only one
71
:right answer, and you don't really
know what the right answer is.
72
:There's no way to actually test
what the answer is versus i-in
73
:programming, you're like, "Does this
box open up if I click the button?"
74
:You can test that.
75
:There's no way to know, like if I ask
Claude for the m- you know, the mean
76
:of our sales over the last month, you
really have to like go actually run the
77
:query yourself to make sure that Claude's
not giving you, uh, a false answer.
78
:So, um, their, their argument
is we're not having issues
79
:coming up with code generation.
80
:It's basically all of the context
and verification that goes around
81
:solving a business analytics problem.
82
:And LLMs historically have been pretty bad
at this, uh, for a multitude of reasons.
83
:One is that we give it unclear directions.
84
:I don't know about you guys, but if
you're anything, uh, like me, you don't
85
:necessarily give Claude or ChatGPT the
most specific instructions on planet
86
:Earth, and there's some ambiguity.
87
:And the problem with that is, like,
it can go into the database and, like,
88
:it thinks it knows what you're talking
about, but it finds a different column,
89
:or it's not using the same definition.
90
:You're not basically on the
same page as ChatGPT unless you
91
:give really explicit directions.
92
:Number two, there's data staleness,
which basically means that your database
93
:is constantly changing, uh, over time.
94
:Definitions change, tables change,
and, uh, these AI LLMs, they're not
95
:really good at following with that.
96
:Like, they don't have the business
context, the domain context that you
97
:may have as a human being on the other
side of like, "This is why we made
98
:those changes," you know, "This is
why it's better," so on and so forth.
99
:And then number three is it just doesn't
know where to find the right thing.
100
:Like, it thinks the data's in there,
it's looking, but it's not entirely sure.
101
:Avery Smith-4: So here's what
Anthropic did to try to solve this
102
:problem, and they're calling it
Anthropic's Agent Analytics Stack.
103
:And there's basically four
different stages right here, and
104
:each one is built to try to take
one of those previous problems
105
:that we talked about and solve it.
106
:So the first one is data foundations,
and basically, it just means you
107
:have really solid data foundations.
108
:It means you're very clear on what
a table is, what it actually has,
109
:what a row represents, what a column
represents, and how often it's updated.
110
:Um, number two is you only have one
source of truth, and the idea is
111
:if you have a sales table in your
database, you don't have, like,
112
:another sales table in your database.
113
:Like, there's only one sales
table, and that is the sales table.
114
:There are no other sales tables.
115
:And for some of you guys listening
who might be more junior data analysts
116
:or aspiring data analysts might be
thinking, "Well, that makes sense.
117
:Why would it ever be a different case?"
118
:And the issue is when you get to, like,
large organizations, something like
119
:Anthropic or when I worked at ExxonMobil,
you gotta think that there's literally
120
:seventy thousand plus employees, and all
of them might need access to that table,
121
:and they might need it slightly different.
122
:So you might have someone that's
like, "Oh, this is their sales
123
:table, but we only need the weekly
averages," so they create, you know,
124
:the weekly average sales table.
125
:And then there's someone else who's like,
"Oh, well, we actually only need the
126
:sales from Monday, Wednesday and Friday,"
and so they create this other table.
127
:And basically, you just get a bajillion
versions of really the same table.
128
:So, uh, one source of truth,
really important here.
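To make the one-source-of-truth idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `sales` table, its columns, and the `weekly_sales` view are hypothetical names chosen for illustration, not anything from Anthropic's post. The weekly averages are defined as a view over the one sales table instead of being copied into a second table, so the two cannot drift apart.

```python
import sqlite3

# Hypothetical canonical table: one row per sale.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (sale_date, amount) VALUES (?, ?)",
    [("2024-01-01", 100.0), ("2024-01-03", 200.0), ("2024-01-08", 50.0)],
)

# The weekly numbers are a view over the one sales table, not a second copy.
conn.execute(
    """
    CREATE VIEW weekly_sales AS
    SELECT strftime('%Y-%W', sale_date) AS week, AVG(amount) AS avg_amount
    FROM sales
    GROUP BY strftime('%Y-%W', sale_date)
    """
)


def weekly_averages(connection: sqlite3.Connection) -> list[tuple[str, float]]:
    """Read weekly averages from the view, ordered by week."""
    return connection.execute(
        "SELECT week, avg_amount FROM weekly_sales ORDER BY week"
    ).fetchall()


before = weekly_averages(conn)

# A late-arriving sale lands in the canonical table only...
conn.execute(
    "INSERT INTO sales (sale_date, amount) VALUES (?, ?)", ("2024-01-09", 150.0)
)
# ...and the view picks it up without anyone maintaining a copy.
after = weekly_averages(conn)
```

Anyone who needs "weekly sales" reads the view, and there is still only one place where a sale is recorded.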
129
:Number three, they develop skills.
130
:These are like Claude skills for
LLMs that specifically do a repeated
131
:task with specific instructions and
maybe even some, uh, accompanying
132
:code to make it really repetitive.
133
:LLMs have inherent
randomness built into them.
134
:They are non-deterministic, as in
you don't get the answer every time,
135
:the same answer every time you ask
the same question, and skills helps
136
:make it more deterministic, that
there actually is a specific answer.
137
:This is exactly what you should be doing.
138
:So it's basically like instructions and
almost code files to actually follow
139
:every single time this gets asked.
140
:And the fourth one is validation, and
that is making sure that the LLMs are
141
:actually doing what you think they
are and validating their answers.
142
:So let's dive in a little bit deeper.
143
:So like I said, uh, layers one and layers
two, basically this is just having good
144
:data governance and good data foundations.
145
:One source of truth.
146
:Um, they also make sure that they have
like little, uh, descriptions for each one
147
:of your different tables that describes
what the table is and what it isn't.
148
:Uh, you know, LLMs are really good
at reading text, so if you add a
149
:little bit of text with your tables
that explains what's going on, the
150
:LLM understands the context a little
bit better versus just looking at the
151
:rows and the columns and guessing.
152
:Um, you can think of this as
like a README file for your data.
153
:In code, in building software, in
software engineering, in programming,
154
:we've always had README files.
155
:If you're unfamiliar, a README
file, you can just think of it
156
:as like a summary of the actual
what's going on in your code base.
157
:Like all of these different folders,
all these different files, all these
158
:different code scripts, what's going on.
159
:So it's just a human way to
describe what's going on for
160
:your code or your different, you
know, databases in this case.
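A rough sketch of what a "README for your data" could look like in Python. The `TableDoc` class and every field name in it are assumptions made for this example; the blog post's actual format may differ. It only captures the items mentioned above: what the table is, what it is not, what a row represents, what the columns mean, and how often it is updated.

```python
from dataclasses import dataclass, field


@dataclass
class TableDoc:
    """Hypothetical description record for one table."""

    name: str
    description: str  # what the table is
    not_for: str      # what the table is not
    row_grain: str    # what one row represents
    refresh: str      # how often it is updated
    columns: dict[str, str] = field(default_factory=dict)

    def to_text(self) -> str:
        """Render the description as plain text an LLM (or a human) can read."""
        lines = [
            f"Table: {self.name}",
            f"What it is: {self.description}",
            f"What it is not: {self.not_for}",
            f"One row = {self.row_grain}",
            f"Updated: {self.refresh}",
            "Columns:",
        ]
        lines += [f"  - {col}: {meaning}" for col, meaning in self.columns.items()]
        return "\n".join(lines)


sales_doc = TableDoc(
    name="sales",
    description="All completed orders.",
    not_for="Refunds or abandoned carts.",
    row_grain="one completed order",
    refresh="daily",
    columns={"sale_date": "Date the order completed", "amount": "Order total in USD"},
)
readme_text = sales_doc.to_text()
```

The rendered text can sit next to the table so the model reads it first instead of guessing from rows and columns.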
161
:And they also feed it
company knowledge maps.
162
:So for this system, they give it roadmaps,
org charts, decisions, so like a bunch
163
:of business context that isn't data.
164
:It's not data related.
165
:It's all business and domain related,
but that extra information helps the
166
:LLMs make smarter choices on how to
actually analyzing the da-- how to
167
:analyze the data based off of what
the, what the context says. So they
168
:actually tried an experiment here,
which I thought was really interesting,
169
:where they basically took all the data
analysts' and all the data scientists'
170
:old SQL files, and they said, "Here,
Claude, you know, learn from these.
171
:These are all, all the things that
our engineers and our analysts and our
172
:data scientists have done over time.
173
:Uh, learn from it."
174
:And it actually didn't really
help, which was really interesting.
175
:Um, it didn't know what code to use when.
176
:Um, and they found that there
was a right answer eighty percent
177
:of the time, but Claude wasn't
good at pulling that answer out.
178
:And so what's actually been the biggest
skill, uh, uh, I guess the biggest,
179
:uh, unlock is actually having skills.
180
:And that went from twenty-one
percent accuracy in actually
181
:analyzing data to ninety-five
percent accuracy in analyzing data.
182
:And if you're unfamiliar with,
like, what a Claude skill is, or
183
:I think they have some equivalent
in ChatGPT and OpenAI and Codex.
184
:But basically a, an LLM skill,
an AI skill is a reusable
185
:step-by-step pattern to follow.
186
:Think of it almost like a recipe for
AI LLM models to actually follow.
187
:So like I said, majority of the
time they're written kind of like
188
:a human would write them, and it's
just like, "Hey, AI, do exactly this.
189
:Step one, step two, step three.
190
:Look out for this.
191
:Be aware of this."
192
:And it might have some coding files
specifically like, "This is what your code
193
:should look like if you generate code."
194
:Um, so they…
195
:It, it's, it's essentially what a
senior analyst's thoughts written down
196
:on paper, uh, for a specific task.
197
:So you might have a skill on how
to, you know, create a, a bar, a
198
:bar chart, or you might have a skill
on how to do a hypothesis test or
199
:A/B testing or something like that.
200
:And it's basically like you have
your, your team get together and write
201
:down exactly what the process is.
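The recipe idea can be sketched in a few lines of Python. This is only an illustration of "steps plus things to watch out for" written down once and reused. The `Skill` class and its fields are hypothetical and are not the actual file format that Claude skills use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Hypothetical written-down procedure for one repeated task."""

    name: str
    when_to_use: str
    steps: tuple[str, ...]
    watch_outs: tuple[str, ...] = ()

    def render(self) -> str:
        """Produce the same instruction text every time the skill is used."""
        out = [f"Skill: {self.name}", f"Use when: {self.when_to_use}", "Steps:"]
        out += [f"  {i}. {step}" for i, step in enumerate(self.steps, start=1)]
        if self.watch_outs:
            out.append("Watch out for:")
            out += [f"  - {item}" for item in self.watch_outs]
        return "\n".join(out)


bar_chart = Skill(
    name="bar-chart",
    when_to_use="Someone asks to compare a metric across categories.",
    steps=(
        "Confirm which table and column hold the metric.",
        "Aggregate the metric per category.",
        "Plot categories on the x-axis, sorted by value.",
    ),
    watch_outs=("Do not mix currencies in one chart.",),
)
instructions = bar_chart.render()
```

Because the text is generated from one stored definition, every request for that task starts from identical instructions, which is the point of the skill.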
202
:It's like a standard operating procedure
that you'd give to a junior analyst,
203
:"Hey, follow this," except for now the
junior data analyst is Claude or an AI. One
204
:issue they saw was if you don't actually
update these skills, like if you don't
205
:like constantly add to them and improve
them, that the accuracy slides over time.
206
:They actually were at ninety-five
percent accuracy, and then they
207
:jumped down to sixty-five percent
accuracy in only a few weeks.
208
:Um, so you need to make sure
you're updating your skills.
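One way to notice that kind of slide is to keep a small set of questions with known answers and re-score the agent on a schedule. The sketch below assumes that setup. The `accuracy` function, the example questions, and the `stale_agent` stand-in are all hypothetical, and the numbers are made up for illustration.

```python
from typing import Callable


def accuracy(
    cases: list[tuple[str, float]],
    answer_fn: Callable[[str], float],
    tol: float = 1e-6,
) -> float:
    """Fraction of known-answer questions that answer_fn gets right."""
    if not cases:
        raise ValueError("need at least one test case")
    correct = sum(
        1 for question, expected in cases if abs(answer_fn(question) - expected) <= tol
    )
    return correct / len(cases)


# Questions whose answers a human has already verified (made-up values).
known_cases = [
    ("orders last week", 120.0),
    ("orders last month", 510.0),
    ("active users today", 42.0),
    ("refunds last week", 3.0),
]

# Stand-in for an agent whose instructions went stale: one definition changed
# and nobody updated the skill, so one answer is now wrong.
stale_answers = {
    "orders last week": 120.0,
    "orders last month": 510.0,
    "active users today": 57.0,
    "refunds last week": 3.0,
}


def stale_agent(question: str) -> float:
    return stale_answers[question]


score = accuracy(known_cases, stale_agent)
needs_attention = score < 0.9  # the threshold is an arbitrary choice for this sketch
```

Running a check like this regularly turns "accuracy slid over a few weeks" into an alert instead of a surprise.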
209
:And the last thing is they wanted to make
sure that their skills were everywhere.
210
:So analytics is really changing.
211
:Uh, and this-- You probably haven't
seen this in big organizations now.
212
:It's just kind of rolling out to
maybe, you know, these more frontier
213
:trillion-dollar companies, um, and maybe
like small solopreneurs like, like me.
214
:Um, but the way that we do
data analytics is changing.
215
:So obviously, like in the past, you'd use
Excel to do data analytics, and there's
216
:still literally billions and billions of
Excel files that we will analyze in Excel.
217
:Uh, but gradually, you know, ten,
fifteen years down the road, I'm
218
:not sure if that will be the case.
219
:We will probably be analyzing data
in a different way than we are now.
220
:And before you're really scared and like,
"Oh my gosh, this is awful, AI's coming
221
:for my job," well, just think about this.
222
:Uh, basically, Power BI
came out fifteen years ago.
223
:So fifteen years ago, there were
like basically no dashboards.
224
:Tableau was around, but not super popular
at the time, yet it was about to be.
225
:Uh, about twenty eighteen it
started to get really popular.
226
:So it's just like, yes, the way that
we analyze data changes over a decade.
227
:That's the truth.
228
:Um, and just know that right now we are
moving into, you know, analyzing our data
229
:with these chatbots, and those chatbots
may be in multiple different places.
230
:So for example, at my company, um, I try
to analyze data on, you know, my YouTube
231
:watches or my podcast listens, and I've
been trying to tr- to automate that as
232
:much as I can or make it easier for me
to follow, you know, all these analytics.
233
:And so we actually have a bot that
will help me with these analytics where
234
:I can just ask it natural language
questions like, "How many, uh, views
235
:did the last YouTube video get?"
236
:You know, "How many listens
did this podcast episode get?"
237
:And we can actually do that on a website
that I've built and also in our Slack.
238
:So they want to make sure that they
have the truth and those-- these skills
239
:avail-available everywhere, whether
it's, you know, you're coding, whether
240
:you're using like a website or a
dashboard or whether you're in Slack.
241
:So those are the keys to having
good skills in your organization.
242
:And the last thing is, even
if it has a good skill, how
243
:do you know that it's correct?
244
:And that's what we call verifications.
245
:And so what, what Anthropic's doing, what
Claude's doing is for any analytics they
246
:do, they have the sources in the footer.
247
:Like this is where we
got this information.
248
:This is how we calculate it.
249
:This is the table we used.
250
:Um, so that way it's like very clear
that you could look at the table and
251
:be like, "Oh, that is the right table,"
or, "It's not even the right table."
252
:They also have a freshness
and a version stamp on every
253
:data model and how old it is.
254
:So like think about like i- if
your data changes over time.
255
:They're basically timestamping
everything, so that way you know,
256
:okay, we can trace it back to this
database on this day type of a thing.
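A small sketch of what an answer with its sources in the footer might look like. The `Answer` class, its field names, and the footer layout are assumptions for illustration; the post describes the idea, not this exact structure.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Answer:
    """Hypothetical answer object that carries its own provenance."""

    value: str
    source_table: str
    calculation: str
    data_as_of: date    # freshness stamp
    model_version: str  # version stamp of the data model

    def with_footer(self) -> str:
        """Return the answer followed by where it came from."""
        return "\n".join(
            [
                self.value,
                "---",
                f"Source table: {self.source_table}",
                f"Calculation: {self.calculation}",
                f"Data as of: {self.data_as_of.isoformat()}",
                f"Data model version: {self.model_version}",
            ]
        )


answer = Answer(
    value="Average revenue last month: $30,000",
    source_table="sales",
    calculation="AVG(amount) grouped by month",
    data_as_of=date(2024, 1, 31),
    model_version="v12",
)
report = answer.with_footer()
```

With the table, calculation, and date attached, a reader can check whether it was the right table before trusting the number.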
257
:Uh, they're also doing correction
harvesting, which is a really fancy way
258
:to say they're giving the AI feedback.
259
:So every time that this Claude
data analyst gets something wrong,
260
:the humans are saying, "Hey,
you actually did this wrong.
261
:You know, you're supposed to
grab from database A, and you
262
:grabbed it from database B."
263
:Or maybe you, you know, you
did your query wrong some way.
264
:And every time that feedback goes
from the human to the agent, the agent
265
:actually updates itself, and it's
like, "Oh, okay, I'm gonna mark that
266
:as something to try in the future."
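A minimal sketch of that feedback loop, assuming corrections are stored as short notes and appended to the instructions the next time the same skill runs. `CorrectionLog` and its methods are hypothetical, and how the real system updates itself is not described in this episode.

```python
from collections import defaultdict


class CorrectionLog:
    """Collects human corrections per skill and folds them into instructions."""

    def __init__(self) -> None:
        self._notes: dict[str, list[str]] = defaultdict(list)

    def record(self, skill: str, note: str) -> None:
        """Store a correction once, ignoring exact duplicates."""
        if note not in self._notes[skill]:
            self._notes[skill].append(note)

    def apply(self, skill: str, instructions: str) -> str:
        """Return the instructions with any past corrections appended."""
        notes = self._notes.get(skill, [])
        if not notes:
            return instructions
        bullets = "\n".join(f"- {note}" for note in notes)
        return f"{instructions}\n\nCorrections from past reviews:\n{bullets}"


log = CorrectionLog()
log.record("monthly_revenue", "Read revenue from database A, not database B.")
log.record("monthly_revenue", "Read revenue from database A, not database B.")
updated = log.apply("monthly_revenue", "Compute average monthly revenue.")
untouched = log.apply("bar_chart", "Draw a bar chart.")
```

The same mistake only has to be pointed out once, and the note travels with the skill from then on.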
267
:And the last thing they add is basically
before it gives any answer back to the
268
:human, they run a second agent against
it that's called an adversarial review.
269
:And basically, if, if you are the AI
data analyst and you come up with an
270
:answer and you're like, "The average over
the last, you know, the average revenue
271
:over the last month was thirty thousand
dollars," this ad-adversarial re-review
272
:comes in and says, "Is it though?
273
:Like, does, does that actually make sense?
274
:Uh, like, it's been this for the last
month and this for the last month.
275
:Are you a hundred percent sure?"
276
:Um, it's basically trying to prove
the first agent incorrect before
277
:actually giving them the model,
the information to the human.
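The real reviewer described here is a second AI agent. As a rough stand-in, the sketch below uses plain rules to play the same role: it recomputes the number independently and asks whether it is plausible next to recent history. The function name, parameters, and thresholds are all assumptions for illustration.

```python
from statistics import mean


def adversarial_review(
    claimed: float,
    raw_values: list[float],
    history: list[float],
    rel_tol: float = 0.01,
    max_ratio: float = 3.0,
) -> list[str]:
    """Return a list of objections; an empty list means the claim survived."""
    if not raw_values:
        return ["No underlying rows were provided, so the claim cannot be checked."]

    objections: list[str] = []

    # Check 1: recompute the average from the raw rows instead of trusting it.
    recomputed = mean(raw_values)
    if abs(recomputed - claimed) > rel_tol * abs(recomputed):
        objections.append(f"Recomputed mean is {recomputed:.2f}, not {claimed:.2f}.")

    # Check 2: "Is it though?" Compare against what previous periods looked like.
    if history:
        baseline = mean(history)
        if baseline > 0 and not (
            baseline / max_ratio <= claimed <= baseline * max_ratio
        ):
            objections.append(
                f"Claim is far from the recent baseline of {baseline:.2f}."
            )

    return objections


# A claim that matches the data and the history passes with no objections.
ok = adversarial_review(30000.0, [29000.0, 31000.0], [28000.0, 32000.0])
# A claim that is off by a factor of ten is challenged on both checks.
challenged = adversarial_review(30000.0, [2900.0, 3100.0], [2800.0, 3200.0])
```

Only answers that come back with an empty list of objections would be passed on to the human.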
278
:So that way, it's like almost like a peer
review, a double check from an agent to
279
:actually make sure that the analytics
is correct. So this might be really
280
:interesting to some of you guys, and this
might be really scary to some of you guys.
281
:You're like, "Oh my gosh, these
AI agents are coming for my job."
282
:Well, the first thing I'll tell you
that none of this is actually new.
283
:It's just kind of packaged
in a fancy prettified way.
284
:Like, if you literally take AI out of
this, it's just pure data fundamentals,
285
:things that we've had for decades.
286
:We've talked about this for years.
287
:Like, yes, it's good to
have good data quality.
288
:Yes, it's good to have good data
governance, like to actually know
289
:what, what tables mean and what
columns mean and what rows mean.
290
:Yes, we should repeat
our analysis when we can.
291
:If we can analyze the data in a
uniform way, we should do that.
292
:And yes, we should have verification.
293
:Like, if I do an analysis,
someone else should check it
294
:to make sure it all looks good.
295
:This is not new.
296
:It's just AI-fied, essentially.
297
:The next step is this is actually a ton
of work to do, and really I don't see, you
298
:know, a whole lot of companies being able
to pull this off bec- other than, like,
299
:Anthropic, for example, because Anthropic
has literally trillions of dollars.
300
:Uh, you know, they're growing like crazy.
301
:They have tons of employees.
302
:But all that documentation, all that
governance, all that quality, all
303
:that metric mapping and, you know,
adding all the business information
304
:to Claude, it takes hundreds of hours.
305
:It takes so much time.
306
:Before we even talk about maintenance,
like we talked about how they slipped
307
:from ninety-five percent accuracy
to sixty-five percent accuracy
308
:by not maintaining their skills.
309
:Like, there's so much upfront
work and so much maintenance
310
:work on this that it's insane.
311
:I'm not the only person
who actually noticed this.
312
:Uh, Kristen Lum said, "This work takes
hundreds and hundreds of upfront hours
313
:at any moderately sized organization, and
that's not even counting maintenance."
314
:So there is tons of work to be done
even if this is working, even if this is
315
:set up, you know, at normal companies.
316
:I mean, I'm not Ex-ExxonMobil.
317
:I haven't been at
ExxonMobil in five years.
318
:I have no clue where they're at.
319
:I have no insight.
320
:A lot of people that I knew
there no longer work there.
321
:But, like, just like the security
and privacy concerns that
322
:Exxon would have about all of
this would take years to solve.
323
:Not, not even like
implementing and setting it up.
324
:Maybe that's changed, I don't know.
325
:But my point is these large
organizations, even ones with
326
:billions of dollars, this is gonna
be difficult for them to pull off.
327
:Um, the crazy thing about all this
is they literally just gave this out.
328
:It's like they literally give you a skill
sheet, um, a skill file that you can
329
:literally just copy and use for your own
personal analysis, or you can use it on
330
:your team and organization's analysis.
331
:Um, I have a little part of it right
here, or you can just go to the
332
:blog post and find the full file.
333
:My point here, though, is with all these
jobs are- with all these things that we
334
:have to be doing for AI to become a good
data analyst, it's like Anthropic's not
335
:getting rid of the data analyst right now.
336
:They have four hundred roles open, and
eight of them at least are in data.
337
:They have four thousand seven hundred
and forty-two employees on, on
338
:LinkedIn and, uh, one- one thousand
four hundred and seventy-eight of
339
:them deal with data, and a hundred and
ninety-six of them are data analysts.
340
:So if this company that has mastered
ninety-five percent accuracy, the AI data
341
:analyst is still hiring data people, I
think that AI jobs aren't going away.
342
:Like, this is the company
that if they could get rid
343
:of humans, they would, right?
344
:If you've heard the CEO talk about
it, he thinks it's happening,
345
:and you don't really see that
in their hiring numbers yet.
346
:Um, my point of view is like this is
literally going to free you up to do
347
:higher value work, including creating
and maintaining systems like this.
348
:Like, like I said, like you guys
as data analysts are the people
349
:best suited for the AI period.
350
:Like, you guys know numbers, and
if you can compare numbers with
351
:AI, you're going to be undefeated.
352
:You're gonna be employed for a really
long time, and just the fact that you're
353
:listening to this right now tells me
you're one of those people because
354
:you're interested in data, you're
interested in AI, and if you can really
355
:carve a niche that's AI plus data, I
think you're gonna land an awesome job.
356
:I think you're gonna get
promoted to an awesome job.
357
:I think you're gonna make a lot of money
in your career for a really long time.
358
:So if you found this fascinating,
my name's Avery Smith.
359
:Please hit subscribe because I really
want to talk about how data and AI
360
:intertwine over the next six months,
and I want you to be on this journey.
361
:I will see you in the next episode.