201: What I ACTUALLY Do as a Data Analyst
Episode 201 • 10th March 2026 • Data Career Podcast: Helping You Land a Data Analyst Job FAST • Avery Smith - Data Career Coach
00:00:00 00:12:49


Shownotes

Help us become the #1 Data Podcast by leaving a rating & review! We are 67 reviews away!

I'm a senior data analyst with 10+ years of experience and I'm breaking down exactly what I did, what tools I used, and what problems I solved across very different industries.

💌 Join 30k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://datacareerjumpstart.com/newsletter

🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://datacareerjumpstart.com/training

👩‍💻 Want to land a data job in less than 90 days? 👉 https://datacareerjumpstart.com/daa

👔 Ace The Interview with Confidence 👉 https://datacareerjumpstart.com/interviewsimulator

⌚ TIMESTAMPS

00:00 – What nobody tells you about data analyst work

01:00 – Predicting refinery outcomes with math models

04:05 – When data analytics meets machine learning

07:00 – Finding needles in millions of log files

09:23 – How one analysis ended up driving marketing & sales

🔗 CONNECT WITH AVERY

🎥 YouTube Channel

🤝 LinkedIn

📸 Instagram

🎵 TikTok

💻 Website

Mentioned in this episode:

🚀 March Cohort — Data Analyst Bootcamp (Starts March 9th)

Ready to break into data analytics? Our March cohort kicks off with a live call on March 9th at 7pm ET where you'll meet your peers and mentors on day one. Save 20% when you enroll now, plus get two free bonuses: 6 months of Data Fairy (your AI co-pilot through the bootcamp) and a bonus course — "The AI-Proof Analyst: Why Thinking Still Wins." Claim Your Spot → https://datacareerjumpstart.com/daa


Transcripts

Avery Smith: I'm a senior data analyst with 10-plus years of experience. What did I do in those 10 years? What tools did I use? What problems did I solve? That is the topic of today's episode, and I'm gonna tell you everything so that you know what to expect as a data analyst in the future. I've had a really vast career where I've worked for one of the biggest oil and gas companies in the world, and I've also worked for a 10-person biotech startup that you've never heard of before. So let's get into it.

By the way, if you're new here, my name is Avery Smith and I try to share useful data content that will help you start your data career. If that's of interest to you, you gotta check out my newsletter. 30,000 other aspiring data analysts are already subscribed. Go to datacareerjumpstart.com/newsletter or find the link in the show notes down below.

So the first company I wanna talk about is ExxonMobil. What was it like being a data analyst and a data scientist at ExxonMobil? Obviously this is one of the biggest companies in the world. There's like 70,000 employees and they do a lot of different things. Now, I worked in the downstream part of the business, which basically means the refiners. These are the people that are taking oil and turning it into gasoline, essentially.

And what do we do there as data analysts? Well, we tried to make a mathematical model of every single part of the refinery. I don't think this is, you know, groundbreaking to those who are in the oil and gas business or any sort of manufacturing business. If you can create what's called a digital twin, or like a math twin, of your process, you'll be able to experiment with the math model instead of experimenting in real life. So you can be like, well, if I twisted this temperature, or I changed this pressure, or we, you know, added this new oil, what would change? Would we make more money? Would we make less money? What would go well? What would go poorly? Instead of actually experimenting in real life, you can experiment with these simulations with your data model, and that way you don't actually have to do it in real life.

Now, to create these models, there's lots of different ways that you can do them. I'm not getting into the nitty-gritty of modeling these types of things. But when you think model, the simplest version that you can think of in your head is linear regression. And if you're not familiar with linear regression, you definitely learned it in school. It's the simple thing of y equals mx plus b. That's the simplest form. So basically you have an input, an x. Based upon your input, can you predict what the output is going to be? If it is, you know, a linear relationship, you'll be able to have the slope, that's the m, and some sort of a y-intercept, that's the b, and basically guess what the output, the y, is going to be based on the x.
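
As a rough illustration of the y equals mx plus b idea described here, the following is a minimal sketch that fits a line to a handful of points with ordinary least squares. The data values and the `fit_line` helper are made up for this example and are not from the episode.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope: covariance of x and y divided by the variance of x.
    m = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    # Intercept: the fitted line passes through the point of means.
    b = y_bar - m * x_bar
    return m, b


# Hypothetical readings: x is an input setting, y is the measured output.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]  # generated from y = 2x + 1

m, b = fit_line(xs, ys)
prediction = m * 6.0 + b  # guess the output for a new input x = 6
print(f"slope m={m:.2f}, intercept b={b:.2f}, prediction={prediction:.2f}")
```

Once m and b are known, predicting a new output is just plugging a new x into the line, which is the "guess the y from the x" step.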

Now you can make that a lot more complicated. You could do multivariate linear regression, which is like y equals m1 x1 plus m2 x2 plus m3 x3. Oh, it's so confusing.
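
For reference, the three-input form spoken aloud here can be written out as a single equation. The intercept term b is an assumption carried over from the single-variable form y equals mx plus b, since the spoken version leaves it out.

```latex
y = m_1 x_1 + m_2 x_2 + m_3 x_3 + b
```

Each input x gets its own slope m, and the prediction is the sum of those contributions.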

But my whole point here is that we were doing these mathematical models, and the simplest form that you can think of is linear regression. So I created a lot of these models as a data analyst. And I also used data analytics to try to understand our simulation results better. We'd actually run dozens, hundreds, thousands of simulations trying, you know, different things. Well, what if this pressure went up by a little bit, or this temperature went down? To actually look at a thousand different results is really hard to do, so we used data analytics to try to understand the results a little bit better.
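
To make the "run many what-if simulations, then summarize them" idea concrete, here is a small sketch. The `toy_margin` function is an invented stand-in for a process model, and the temperature and pressure values are hypothetical; none of it reflects a real refinery model.

```python
from itertools import product
from statistics import mean


def toy_margin(temperature, pressure):
    """Made-up stand-in for a process model; NOT a real refinery equation."""
    return 100.0 - 0.02 * (temperature - 350.0) ** 2 + 1.5 * pressure


# Hypothetical what-if grid: every combination of the two settings.
temperatures = [330.0, 340.0, 350.0, 360.0, 370.0]
pressures = [8.0, 10.0, 12.0]

runs = [
    {"temperature": t, "pressure": p, "margin": toy_margin(t, p)}
    for t, p in product(temperatures, pressures)
]

# Summarize instead of reading every run: average margin per temperature.
avg_by_temperature = {
    t: mean(r["margin"] for r in runs if r["temperature"] == t)
    for t in temperatures
}
best_run = max(runs, key=lambda r: r["margin"])

for t, avg in avg_by_temperature.items():
    print(f"temperature={t:.0f}: average margin={avg:.1f}")
print("best run:", best_run)
```

With thousands of runs the same pattern applies: group the results by the setting of interest and compare aggregates rather than inspecting runs one by one.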

And a lot of this was done in a Power BI dashboard, so I used a lot of Power BI dashboards right there. And to do the modeling, we actually did a lot in Excel, believe it or not, and we did a lot in Python, and we even used a more proprietary software that you don't hear about a whole lot. It's from SAS. It's called JMP, J-M-P. So those are the tools that we were using at Exxon, and the problem that we were trying to solve was basically: hey, if we wanna make changes inside of our huge manufacturing system, can we actually come up with a way to test it before testing it in real life, so we can kind of know what to expect?

I think that's common for, you know, manufacturing. I think that's common for any sort of time series data you might have: if you can create a model, it's useful for the company to be able to predict the future and figure out what's going to happen. A lot of the time this type of analytics is called prescriptive analytics, where you're not just trying to predict what's going to happen in the future, but trying to decide, if you make these changes, how the system will basically be affected.

The next data job I wanna talk about was when I was a data analyst at this nano biotech startup, like think 10 people when I joined the company. This company made really cool nano sensors. So think of it as almost like a Game Boy game, like from the olden days. That's like the size of this little board. And on this board there was a bunch of different sensors this, you know, chemistry company had built. The sensors would basically react to what was in the air, and we would track how their electricity, basically, their amperage or their current through these different sensors, would change when different chemicals in the air hit them.

So, for example, if you were holding it in the air, you know, all the lines would be kind of stagnant. But let's say you brought an orange next to it. It would basically smell the orange, and each sensor would react differently to that orange being nearby. And when you have an array of these 12 different sensors, you can basically create the equivalent of a fingerprint, but for smells. So think of it as a smelling device that would basically take smell prints.

My job as a data analyst there was to actually look at the time series data, 'cause we'd run these experiments where you'd have basically background noise for a certain amount of time, and then you'd introduce something like an orange for maybe 30 seconds, and then take the orange away. We'd look at these time series, and we were trying to use the time series data to actually create these smell prints.
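
One simple way to turn an experiment like this into a smell print is to compare each sensor's average reading during the exposure window with its average during the background window. The sketch below assumes that approach; the `smell_print` helper, the sensor names, and the readings are all hypothetical, and the startup's actual method is not described in the episode.

```python
from statistics import mean


def smell_print(readings, baseline_end, exposure_end):
    """Per-sensor shift: mean during exposure minus mean during baseline.

    readings maps a sensor name to its list of samples over time.
    Samples [0, baseline_end) are background; [baseline_end, exposure_end)
    are taken while the sample (e.g. an orange) is near the board.
    """
    return {
        sensor: mean(series[baseline_end:exposure_end]) - mean(series[:baseline_end])
        for sensor, series in readings.items()
    }


# Hypothetical current readings for three sensors, one sample per time step.
readings = {
    "sensor_1": [1.0, 1.0, 1.0, 1.6, 1.6, 1.6, 1.0],
    "sensor_2": [2.0, 2.0, 2.0, 1.5, 1.5, 1.5, 2.0],
    "sensor_3": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
}

orange_print = smell_print(readings, baseline_end=3, exposure_end=6)
print(orange_print)
```

The result is one number per sensor, so a 12-sensor board would give a 12-value vector for each thing it smells.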

And that's a very difficult thing to do. Most of the time it actually took machine learning. So once again, this is maybe a more advanced data analyst role, 'cause in most data analyst roles you're not really using machine learning. This type of machine learning is often called classification, where you're basically trying to match data to a certain category based on its values. So for example, I could bring an apple near it, right? And the sensors would react. Maybe they'd all go down, and if I brought an orange next to it, maybe all the sensors would go up. And so you can come up with some sort of an algorithm that would be like, okay, if the sensors go down, it's an apple. If they go up, it's an orange.
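
A minimal sketch of classification, using the illustration where an apple pushes the sensors down and an orange pushes them up. It uses a nearest-centroid rule, which is one simple classifier chosen for this example and not necessarily what was used at the startup. All numbers and names are hypothetical.

```python
from math import dist
from statistics import mean


def centroid(vectors):
    """Average the training vectors of one class, sensor by sensor."""
    return [mean(column) for column in zip(*vectors)]


def classify(sample, centroids):
    """Return the label whose centroid is closest (Euclidean) to sample."""
    return min(centroids, key=lambda label: dist(sample, centroids[label]))


# Hypothetical smell prints: one value per sensor (change from baseline).
training = {
    "apple": [[-0.5, -0.4, -0.6], [-0.6, -0.5, -0.4]],  # sensors go down
    "orange": [[0.5, 0.6, 0.4], [0.4, 0.5, 0.7]],       # sensors go up
}
centroids = {label: centroid(vectors) for label, vectors in training.items()}

unknown = [0.45, 0.5, 0.5]
print(classify(unknown, centroids))
```

Adding a new category only requires more labeled examples, which is why this kind of approach scales past two fruits better than a hand-written up-or-down rule.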

Now that's really oversimplifying it, because apples and oranges, those are only two of the things that exist in the universe, right? There's so many different things that exist. We were playing for a little bit bigger stakes. You can think of when you go through a TSA line and sometimes they, you know, swab you, and they're trying to see if you have any drugs or any bombs on you. That was kind of the stakes that we were playing with in some of our use cases. So I would take this data that oftentimes, you know, was time series based. We usually had like 12 to 16 to 24 different sensors on there. And I would try to make these smell prints using classification models in machine learning.

Now, a lot of the time I was doing this in Python. Python's great for doing things in machine learning. There were even some simple algorithms that I created that were based in Excel, but they were pretty simple. The more complicated stuff I was doing in Python at the time. Also, just because we were doing a lot of these experiments, SQL would've been really helpful. We weren't actually using SQL as much as we should have. Looking back on it, we really should have been using SQL a little bit more.
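
As a sketch of why SQL helps when experiments pile up, the example below stores a few runs in a table and summarizes them with one query. It uses Python's built-in `sqlite3` module with an in-memory database; the table name, columns, and rows are invented for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE experiments (
        id INTEGER PRIMARY KEY,
        sample TEXT NOT NULL,       -- what was held near the sensors
        sensor_count INTEGER NOT NULL,
        exposure_seconds REAL NOT NULL
    )
    """
)

# Hypothetical experiment log entries.
conn.executemany(
    "INSERT INTO experiments (sample, sensor_count, exposure_seconds) VALUES (?, ?, ?)",
    [
        ("orange", 12, 30.0),
        ("orange", 16, 30.0),
        ("apple", 12, 45.0),
    ],
)

# One query answers "how many runs per sample, and how long on average?"
rows = conn.execute(
    """
    SELECT sample, COUNT(*) AS runs, AVG(exposure_seconds) AS avg_exposure
    FROM experiments
    GROUP BY sample
    ORDER BY sample
    """
).fetchall()

for sample, run_count, avg_exposure in rows:
    print(sample, run_count, avg_exposure)
conn.close()
```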

The third experience I wanna tell you about was when I was doing my own data science consultancy firm, and I got hired by a cybersecurity company to help them with a few things. So obviously we live in this digital age. Cybersecurity is really important, so there's a lot of opportunity in cybersecurity. And the interesting thing about cybersecurity is a lot of the data is hidden in logs, because basically anything you do online, anything you do on the internet, gets logged one way or another. Like, it's in there. They're capturing everything, but when you capture everything, you're kind of capturing nothing at the same time, because it's really hard to figure out what's the signal amongst so much noise.

And so this company in particular was basically getting a bunch of internet logs for companies in what you can consider their workspaces. So for instance, all of their Microsoft logs, all of their Google logs, if they're using Slack, their Slack logs, maybe their employee customer history. Just think of anything a company might be interested in from a cybersecurity standpoint. We were just getting a bunch of the logs.

Now in these logs, there's maybe little needles in the haystack. There's maybe little gems that can be pulled out. It requires a lot of analysis to try to figure out what's in there. Just imagine you're getting a ton of hay and you have to find this little needle. And so my job was to go in there and try to see if there were any needles, anything that was really worth diving into and investigating more, and also just summarizing everything that was happening. This is how many logins you had on Google today. This is how many, you know, logouts you had on Microsoft. This is how many users you had from these different states.
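
The summaries listed here (logins per service, users per state) amount to counting over parsed log events. A minimal sketch, assuming the logs have already been parsed into dictionaries; the field names and events are hypothetical.

```python
from collections import Counter

# Hypothetical, already-parsed log events; real logs need parsing first.
events = [
    {"service": "google", "action": "login", "user": "ana", "state": "UT"},
    {"service": "google", "action": "login", "user": "ben", "state": "TX"},
    {"service": "microsoft", "action": "logout", "user": "ana", "state": "UT"},
    {"service": "microsoft", "action": "login", "user": "cy", "state": "TX"},
    {"service": "google", "action": "login", "user": "ana", "state": "UT"},
]

# Count events per (service, action) pair, e.g. Google logins.
activity = Counter((e["service"], e["action"]) for e in events)

# Count distinct users per state.
users_by_state = {}
for e in events:
    users_by_state.setdefault(e["state"], set()).add(e["user"])
user_counts = {state: len(users) for state, users in users_by_state.items()}

print("Google logins:", activity[("google", "login")])
print("Microsoft logouts:", activity[("microsoft", "logout")])
print("Users per state:", user_counts)
```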

Just, like, for these giant enterprise organizations where they have thousands of employees and a bunch of things going on, how do you know everything's going okay? Are you sure that everyone is where they say they are? Are you sure you don't have any intruders, you know, people accessing stuff from a place that they probably shouldn't? Those types of things.
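
One very simple way to look for the "is everyone where they say they are" kind of needle is to flag logins from a location a user has never logged in from before. This is an illustrative rule only, not the method used for the client, and the `flag_unusual_logins` helper and its data are hypothetical.

```python
def flag_unusual_logins(history, new_logins):
    """Return logins whose state was never seen before for that user."""
    seen = {}
    for user, state in history:
        seen.setdefault(user, set()).add(state)
    # A user with no history at all is flagged too (empty known set).
    return [
        (user, state)
        for user, state in new_logins
        if state not in seen.get(user, set())
    ]


# Hypothetical (user, state) pairs pulled from past and current logs.
history = [("ana", "UT"), ("ana", "UT"), ("ben", "TX"), ("ben", "CA")]
new_logins = [("ana", "UT"), ("ben", "NY"), ("dee", "UT")]

print(flag_unusual_logins(history, new_logins))
```

A flagged login is only a candidate for investigation; travel or a new hire would trip this rule too.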

So we were basically taking these huge dumps of logs that weren't really important, that weren't really interesting, and aggregating them and trying to find the interesting things, and then also making sure that nothing nefarious was going on. To do that analysis, I was actually using Python for all of it, but I could really choose what tool I wanted to. I just chose Python personally because I'm very comfortable in Python. I'm decently good at Python, and I can do things quickly with Python. I probably couldn't have done this as easily in Excel. You probably could have done similar stuff in SQL if you wanted to. One thing I really like about Python is it can do anything. Maybe not extremely well, but it can do anything.

So I was doing all my analysis in Python, and I was creating data visualizations in Python. They even used a lot of the insights I found, in terms of aggregates. They basically aggregated all of their customers' data and would publish a yearly or biannual report of cybersecurity incidents, with graphs that I was creating and some of these KPIs or metrics that I was monitoring. That way they could kind of inform the cybersecurity field and all of their customers about the trends and what we were seeing from a big-picture standpoint. And that was actually really useful, 'cause people would start to read that and be like, oh, I really like this company. I wanna work with them. And that would bring in new customers. So even though I was doing that analysis for individual customers at an individual level, that analysis actually ended up being really useful for their marketing team as well, to get more sales and more customers in the pipeline.

Now, I've actually worked for way more than just these three companies. I've probably done work for about 12, including the Utah Jazz, Harley-Davidson, and some other really big names like MIT. If you want to hear more about those, I'll be talking about them more in my newsletter. So you can subscribe at datacareerjumpstart.com/newsletter, and I'll be talking more about these experiences in the newsletter. But if you want me to talk about it on the podcast or YouTube as well, let me know in the comments down below, and maybe I'll do some future episodes on that if we get enough comments. As always, thanks for watching and I'll see you in the next one.
