Inna Tokarev Sela on Approaching Data Challenges with Generative AI

Welcome to another episode of "Data Driven," where we dive into the ever-evolving world of data science, AI, and data engineering. Today's special guest is Inna Tokarev Sela, CEO and founder of Illumix. Join hosts Frank La Vigne, BAILeY, and Andy Leonard as they unpack Inna's groundbreaking insights into generative AI, the future of data management, and the intricacies of AI cost effectiveness.

Inna reveals the origin of her company's name, "Illumix," and discusses the pressing risks of 2025, particularly the total cost of ownership for managing generative AI. She highlights the inefficiencies of data customization and proposes a shift towards moving AI closer to the data to reduce costs. Through the unique lens of Illumix’s approach, Inna explains how they aim to illuminate organizational data by using a virtual semantic knowledge graph based on industry ontologies and business logic.

Timestamps

00:00 Ina Tokarav Sala: CEO of Illumix, AI readiness pioneer.

05:57 ROI and data are crucial for decisions.

08:56 Intermediate stage: copilots, insights, static dashboards persist.

11:12 Illumax targets structured data market, unlike others.

14:29 Bad data skews predictive analytics, causing errors.

19:48 Data modeling efficiency increases with virtual assistants.

22:33 E-commerce evolution: convenient online shopping preferred.

25:27 2025's biggest risk: High generative AI costs.

27:07 Focus on domain knowledge and metadata utilization.

31:44 Predicting patterns is profound, not crazy.

36:09 Industry trends are cyclical, like fashion trends.

37:49 Repatriating data due to AI cost efficiency.

40:47 Data processing everywhere raises security concerns.

45:00 Founder freedom: Experimentation unlike SAP's structure.

49:11 I'm considered controversial for being very visionary.

52:29 Truth's evolution parallels past technological shifts.

54:39 Frank's World: Kids show on recycling, BBC.

57:09 Thank you, Ina Tokarev Saleh, for insights.

Speaker: 00:00:00

Welcome back to Data Driven, the podcast that peeks into the

Speaker: 00:00:03

rapidly evolving worlds of data science, artificial intelligence,

Speaker: 00:00:07

and the underlying magic of data engineering. Today's guest

Speaker: 00:00:11

is someone who's redefining the rules of the game in AI and data,

Speaker: 00:00:15

Ina Tokarev Saale. She's the CEO and founder of

Speaker: 00:00:18

Illumix, a company pioneering the use of generative semantic

Speaker: 00:00:22

fabric to make organizations AI ready. We'll dig into how

Speaker: 00:00:26

Ina's background as a frustrated data user sparked her innovative

Speaker: 00:00:29

journey, why 80% of enterprise decisions still aren't data

Speaker: 00:00:33

driven, and her bold vision for a future with app free workspaces

Speaker: 00:00:36

where AI copilots handle the heavy lifting. Oh, and we're

Speaker: 00:00:40

tackling the ultimate question. If the future is already here,

Speaker: 00:00:44

why does it still feel so delightfully chaotic? Sit

Speaker: 00:00:47

back, grab your favorite coffee mug, or a Maryland state flag

Speaker: 00:00:51

one if you're feeling fancy, and let's dive in.

Speaker: 00:00:57

Alright. Hello, and welcome back to Data Driven, the podcast where we explore the emergent

Speaker: 00:01:00

fields of data science, artificial intelligence, and, of course, it's all made

Speaker: 00:01:04

possible by data engineering. And with me today is my most favoritest

Speaker: 00:01:08

data engineer in the world, Andy Leonard. How's it going, Andy? It's going well,

Speaker: 00:01:12

Frank. It always warms my heart when you introduce me like that. Well, you are

Speaker: 00:01:15

my most favorite data engineer. Well, that's cool. You're well, you're my

Speaker: 00:01:19

most favoritest. I like, there's so many things. Right? Data

Speaker: 00:01:23

scientist, developer, evangelist.

Speaker: 00:01:27

I mean, there's all sorts of cool things that you do. Super,

Speaker: 00:01:30

certified person. What are you up to in certifies in certification?

Speaker: 00:01:34

12. Wow. Yeah. I'm in I'm in the

Speaker: 00:01:38

New York City area code now. So that's good. Next

Speaker: 00:01:42

up, the Bronx area code 718. So Wow. That's a

Speaker: 00:01:45

big jump. Yeah. Yeah. We're we're working on we're working on it, and I'm at

Speaker: 00:01:49

760 some odd consecutive days. I'm at the point now

Speaker: 00:01:52

where when I post anything on about Pluralsight

Speaker: 00:01:56

or, my number, the search or the number of

Speaker: 00:02:00

days, Pluralsight always sends me a congratulations, Frank. Keep

Speaker: 00:02:04

going. So, like, I'm on their radar now. So which is really

Speaker: 00:02:07

nice. I don't know. It's super cool. Yeah. It is super cool, which reminds me

Speaker: 00:02:10

I still have to do 2 days. But in the

Speaker: 00:02:14

virtual green room, we were talking about coffee mugs. We

Speaker: 00:02:17

were. And, we're we're I don't have a coffee mug with

Speaker: 00:02:21

me today, but, there's an

Speaker: 00:02:25

interesting anecdote from a previous show, which I think the show is live now, about

Speaker: 00:02:28

the Maryland state flag coffee mug, which is, pretty funny.

Speaker: 00:02:32

So today we have with us a very special guest,

Speaker: 00:02:35

Ina Tokarav Sala. She's the CEO and founder

Speaker: 00:02:39

of Illumix, and a pioneer

Speaker: 00:02:43

of generative semantic fabric, which I wanna know more about that, but it

Speaker: 00:02:47

empowers organizations with AI readiness throughout her career

Speaker: 00:02:51

leading data products, monetization, and as a data

Speaker: 00:02:54

stakeholder. Ina recognized the oxymoron of our

Speaker: 00:02:58

domain. Despite huge investments in data and analytics,

Speaker: 00:03:02

most business decisions are still not based on these data or

Speaker: 00:03:05

insights. And when I read that, I felt that one.

Speaker: 00:03:11

So she, she works she founded this company,

Speaker: 00:03:15

Lumix, which is, the the byline says, get your organization

Speaker: 00:03:19

data generative AI ready. So So welcome to the show, Ina.

Speaker: 00:03:23

And, tell us about this. Like, because I think this is a big problem

Speaker: 00:03:26

with generative AI. Well, first off, let's tackle the big

Speaker: 00:03:30

one, which is the idea that despite all this money that's been

Speaker: 00:03:34

thrown at data and analytics for at least 2 decades, probably

Speaker: 00:03:37

longer, a lot of decisions are not data driven.

Speaker: 00:03:44

Yeah. Fine. Can you hear me? Because

Speaker: 00:03:48

I see a little bit Yeah. We can hear you. Okay.

Speaker: 00:03:51

So yeah. Thank you. You're totally right. The the benchmark says

Speaker: 00:03:55

only 20% of decision making in enterprise is based on data.

Speaker: 00:03:59

And to me, I I have been around for a

Speaker: 00:04:02

while. So 25 years in data analytics, and it was

Speaker: 00:04:06

always about cloud, big data. But

Speaker: 00:04:10

what it actually boils down to? Are you able to

Speaker: 00:04:14

pull out whatever analysis of data you need when you have, like, question on

Speaker: 00:04:18

hand? Not really. And this is a situation in majority

Speaker: 00:04:22

of enterprises, right? Even if those huge data

Speaker: 00:04:25

teams and huge investments in infrastructure and all of that.

Speaker: 00:04:29

And to me, the biggest promise

Speaker: 00:04:33

of of LLMs in enterprise setting is to

Speaker: 00:04:37

to bring the contextual and relevant data

Speaker: 00:04:41

to the stakeholders in need.

Speaker: 00:04:44

Right? In this experience which is impromptu which

Speaker: 00:04:48

means it's improvised, it's governed and hallucination free, it's

Speaker: 00:04:52

transparent. So I I would totally love have to

Speaker: 00:04:56

have this experience where I'm in my Slack or Teams, right, and

Speaker: 00:05:00

I've been able to to chat with my data copilot

Speaker: 00:05:03

and ask a question and get the answer I can base decision happen.

Speaker: 00:05:07

Right? Not just an answer. I should be reverse engineering

Speaker: 00:05:11

with, you know, bunch of people.

Speaker: 00:05:15

Interesting. Interesting. But I don't think that I think that

Speaker: 00:05:19

the companies, they

Speaker: 00:05:22

they they they throw a lot of data. They store a lot of data. They

Speaker: 00:05:26

analyze a lot of data. But a lot of at the end of the day,

Speaker: 00:05:29

not all decisions, but a lot of decisions are not based on just the direct

Speaker: 00:05:32

decision of the data. They're based on quite frankly a lot

Speaker: 00:05:36

of it's particularly the higher the, higher the

Speaker: 00:05:40

level. Sometimes it's based on what's good for the person, not

Speaker: 00:05:43

necessarily the organization or the business, let alone the customer.

Speaker: 00:05:48

Do you think what are your thoughts on that? I'm familiar with the saying,

Speaker: 00:05:51

if you touch your data long enough, it will confess. That's

Speaker: 00:05:55

right. It goes exactly to the domain.

Speaker: 00:05:59

So I guess you can you can massage the results

Speaker: 00:06:03

right? But, secondhandly, when an

Speaker: 00:06:06

employee comes to me with suggestion with a business plan with,

Speaker: 00:06:10

you know some project I always ask like what's the ROI like what's

Speaker: 00:06:14

it going to be to spend and what's the impact on on you know

Speaker: 00:06:18

other activities and and what it's going to be on expense of

Speaker: 00:06:22

so having numbers having data to you

Speaker: 00:06:25

know to the basic decision or to bring to your boss is

Speaker: 00:06:29

always has been a struggle and it's still struggle today so I

Speaker: 00:06:32

think it overweights maybe some you know,

Speaker: 00:06:36

reluctance to have open data for all just for the

Speaker: 00:06:40

sake of of being able to to have specific context on it.

Speaker: 00:06:45

Interesting. That that is very interesting. And, you know, that I

Speaker: 00:06:49

think that's been the the purpose of a lot of

Speaker: 00:06:53

data driven activities in in corporations globally

Speaker: 00:06:57

is, you know, and for a very long time is how do you convert

Speaker: 00:07:01

data in its raw natural form into

Speaker: 00:07:05

information? Mhmm. And, you know, and and

Speaker: 00:07:09

defining information as, something I

Speaker: 00:07:13

can glance at and know, you know,

Speaker: 00:07:16

almost instantly how my enterprise is performing.

Speaker: 00:07:20

And that was kind of my opening line 20 years ago when I

Speaker: 00:07:24

started in data warehousing is to go talk

Speaker: 00:07:27

to a decision maker, CIO, CEO,

Speaker: 00:07:31

and, you know, try and do a very small, project,

Speaker: 00:07:35

a phase 0. And just ask them that, how do you know?

Speaker: 00:07:39

And the surprising answer, yeah, even then it was surprising,

Speaker: 00:07:43

was, you know, something along the lines of, well,

Speaker: 00:07:47

people email, information to

Speaker: 00:07:51

a lady out front or a secretary assistant guy out front,

Speaker: 00:07:55

and he or she compiles it and puts it into this summary,

Speaker: 00:07:59

and then they tell me. And so, you know, 1 PM

Speaker: 00:08:03

every day or, you know, Monday on 1 PM. I know how we

Speaker: 00:08:06

did last week. Something like that. It's very

Speaker: 00:08:10

manual processes. So does

Speaker: 00:08:14

does Illumix, address that? The

Speaker: 00:08:18

manual part? Yeah. Yeah. Totally. So

Speaker: 00:08:22

I don't think reports will go anywhere, but I think we'll

Speaker: 00:08:25

have, you know, at least 3 types of

Speaker: 00:08:29

experience with data. So I do I do believe in

Speaker: 00:08:33

application free future where you have a

Speaker: 00:08:37

question or a task and then you have a launcher and you

Speaker: 00:08:40

just, you know, articulate whatever request you have.

Speaker: 00:08:44

And in the background whatever applications, workloads, and data have

Speaker: 00:08:48

been engaged with each other to to basically come up with the

Speaker: 00:08:51

results. Right? So I do believe in this future. Right? So this is

Speaker: 00:08:55

the ultimate. Right? But I think we will have this intermediate

Speaker: 00:08:59

stage where we'll have a lot of copilots or

Speaker: 00:09:03

assisted insights in, in the context of

Speaker: 00:09:07

applications you're already using. So using your CRM systems, you will have

Speaker: 00:09:11

all kind of insights, suggestions, you know, data driven,

Speaker: 00:09:15

actions which which might come up with the system in your

Speaker: 00:09:19

workflow inside your context. Right? And you might have to have

Speaker: 00:09:23

this pure experience when you do go to analytic systems like BI

Speaker: 00:09:27

or something else where you do have your static dashboards,

Speaker: 00:09:31

day after day, same way that I go to, you know, to to my

Speaker: 00:09:35

CRM dashboards and see how pipeline is going and all of that. So I do

Speaker: 00:09:39

not them need to them to change. Right? I don't want to go to some

Speaker: 00:09:42

chatbot and and ask again and again the same question, like, what's the pipeline

Speaker: 00:09:45

conversion today? Right? I do want to have those static dashboards where I just,

Speaker: 00:09:49

you know, sneak peek and see if everything in line and

Speaker: 00:09:53

we we in the benchmark. So those three types of experiences, I

Speaker: 00:09:57

do not think they're going to to evaporate in

Speaker: 00:10:00

the future. Right now, we are mostly bound to the last type of

Speaker: 00:10:04

experience of being in the closed garden of our BI tools,

Speaker: 00:10:08

like this 3 modeled analytic experience and then we'll have this

Speaker: 00:10:12

phase where we do have embedded experience. Majority of the companies are

Speaker: 00:10:15

already suggesting some kind of improvements in the

Speaker: 00:10:19

space, some better, some halfway, let's

Speaker: 00:10:22

say. And and the ultimate goal is to to have this

Speaker: 00:10:26

launcher when for for majority of ad hoc

Speaker: 00:10:29

task of questions, you will have this improvised experience.

Speaker: 00:10:33

So a follow-up on that. You mentioned Copilot, and,

Speaker: 00:10:38

Microsoft has been the company that I've heard using that term most

Speaker: 00:10:41

often for some sort of digital assistance. It

Speaker: 00:10:45

to me, outsider looking in, although I I use the

Speaker: 00:10:48

tools, it it seems to have been a quantum leap,

Speaker: 00:10:53

this year in that technology. It just seems like last year, they were

Speaker: 00:10:56

talking about things that it might help with, and I've seen

Speaker: 00:11:00

all sorts of examples of this. But have you seen that? Has that been

Speaker: 00:11:04

your experience that in the last 12 months, these type of

Speaker: 00:11:07

assistants have just, you know, taken a giant step forward?

Speaker: 00:11:11

Mhmm. I will address this question together with the previous one, like, how

Speaker: 00:11:15

Illumax is is positioned in in this context. So I

Speaker: 00:11:19

do see many projects in the companies

Speaker: 00:11:23

which, and mainly, they're providing

Speaker: 00:11:26

copilots, for call centers or support centers

Speaker: 00:11:31

and mainly based on document summarization.

Speaker: 00:11:35

Right? So document summary is more,

Speaker: 00:11:39

lightweight and and risk averse use

Speaker: 00:11:43

of LLM technology where I can actually go and check the document

Speaker: 00:11:46

itself based on the resource. Right? So it's kind of and documents

Speaker: 00:11:50

are already articulated with lots of context in

Speaker: 00:11:54

business language. So it's kind of low hanging fruit and majority

Speaker: 00:11:58

of the companies go to the direction including, Microsoft.

Speaker: 00:12:02

Where Elamax goes Elamax actually,

Speaker: 00:12:05

tackles the market which is less,

Speaker: 00:12:09

less digested, the market of structured data. So you mentioned you

Speaker: 00:12:13

started your career in warehouse and, so warehouses,

Speaker: 00:12:16

databases, data lakes, business applications such as supply

Speaker: 00:12:20

chain, ARP, CRM, and all of that. All of that

Speaker: 00:12:24

con defined as structured data space. And despite the

Speaker: 00:12:27

name, it couldn't be less structured than it is at the

Speaker: 00:12:31

moment. Right? So you have If it is structured, it's not structured

Speaker: 00:12:35

the way you need it. Yeah. Exactly. So the nay namings are not meaningful, like

Speaker: 00:12:37

abbreviations, frank table, or for like abbreviations,

Speaker: 00:12:41

the, frank table or and this

Speaker: 00:12:45

transformation or alias. Right? So all those weird names especially under

Speaker: 00:12:48

SAP systems. I love that and and no

Speaker: 00:12:52

single source of truth. Right? In documents, you might have versions, but you do

Speaker: 00:12:55

still have some alignment to single source of truth. In data, you

Speaker: 00:12:59

can have many definitions even in the same

Speaker: 00:13:03

data source. And the thing is, if you put semantic

Speaker: 00:13:06

models like semantic search on top of them and it works by proximity,

Speaker: 00:13:11

you might have hallucinations and random answers every time you engage

Speaker: 00:13:15

with the tool. So this this is where we chose with

Speaker: 00:13:18

Illumix to to tackle the problem as,

Speaker: 00:13:22

basically, defining as a 3 step approach.

Speaker: 00:13:25

Right? The first step is getting data AI

Speaker: 00:13:29

ready. So there is no yeah. There is

Speaker: 00:13:33

no way of using generative I or AI analytics in general

Speaker: 00:13:36

if you do not have other data. But for analytics, which is

Speaker: 00:13:40

served to you as BI dashboard, it's actually feasible to do

Speaker: 00:13:44

manual data massaging. Right? Well, fun. Yeah.

Speaker: 00:13:48

Yeah. That's fun. That's near and dear to my heart as a as a data

Speaker: 00:13:51

engineer, data quality. Because

Speaker: 00:13:56

you can have the, you know, the fastest, best presentation, the

Speaker: 00:13:59

slickest graphics, and it could be totally lying to

Speaker: 00:14:03

you. And back, you know, even from the days of of

Speaker: 00:14:07

data warehousing all the way through today's semantic models and

Speaker: 00:14:10

dashboards, it's a the the quality

Speaker: 00:14:14

of the data store you're reporting against,

Speaker: 00:14:17

That that data quality, if you were to measure it, you know, there's a number

Speaker: 00:14:21

of ways to do it. But it's well north of

Speaker: 00:14:25

99% of that. And people see that, and they go, wow.

Speaker: 00:14:29

That that's super good. And it's like, no. No. It didn't. You can't do

Speaker: 00:14:32

predictive analytics off of something that's 99%

Speaker: 00:14:36

because that that 1% of bad data or

Speaker: 00:14:40

incorrect data or duplicate data will skew your results.

Speaker: 00:14:44

And what often, you know, the the layperson doesn't understand

Speaker: 00:14:48

is that if it lies to you and tells you you're gonna make a $1,000,000,000,

Speaker: 00:14:53

that's just as bad as it telling you you're only gonna make a

Speaker: 00:14:57

$1,000,000 if the if the truth is you're gonna you're at about 25,000,000.

Speaker: 00:15:01

That's your real projection if you were to follow that line out and do the

Speaker: 00:15:04

extrapolation, you know, properly. And you can make

Speaker: 00:15:08

bad decisions with an overestimation just as easily,

Speaker: 00:15:12

maybe more so than if it's an underestimation. Yeah.

Speaker: 00:15:16

Exactly. So this goes to to, to the ground truth of

Speaker: 00:15:20

your results as good as your data is. And you cannot

Speaker: 00:15:23

trust, simple semantic search

Speaker: 00:15:27

to solve all these problems for you. And

Speaker: 00:15:31

so for us, the baseline, the first use

Speaker: 00:15:35

case is to get data AI ready or generative AI ready And we

Speaker: 00:15:38

do use generative AI for that from day 1. We actually generated company

Speaker: 00:15:42

from 2021. Yeah. It's funny to say now. It it was very hard

Speaker: 00:15:46

to explain to our investors back then what it actually means.

Speaker: 00:15:51

Yeah. You know, I I get it. I mean, if you build on a crooked

Speaker: 00:15:54

foundation, you you can't get anything straight, you know,

Speaker: 00:15:57

out of that. So that makes perfect sense to me. And it and,

Speaker: 00:16:01

please correct me if I'm mischaracterizing, the work that Illumix

Speaker: 00:16:05

does. But is it automated,

Speaker: 00:16:09

AI automated, data quality? Is that really what you're

Speaker: 00:16:13

after? So, basically, we automated full

Speaker: 00:16:16

stack of LLM deployment for structured data, and it takes the

Speaker: 00:16:20

AI readiness part. AI readiness, which means we have automated

Speaker: 00:16:24

reconciliation, labeling, sensitivity tagging Okay.

Speaker: 00:16:27

Like lots of lots of data preparation which is automated.

Speaker: 00:16:32

Gartner actually named us as a call vendor for that lately. We have

Speaker: 00:16:35

this layer of a context automation. Right? So so any

Speaker: 00:16:39

LLM, any semantic model needs context and this context and reasoning

Speaker: 00:16:43

usually rebuild by data scientists. To me, it's controversial

Speaker: 00:16:46

because, you know we had data modelers which didn't

Speaker: 00:16:50

understand business logic and now we have data scientists who do not necessarily

Speaker: 00:16:54

fully understand business logic and the model into black

Speaker: 00:16:57

box experience of context. Right? So ElamX

Speaker: 00:17:01

reverses process. We actually automate context and we wrap it

Speaker: 00:17:05

up in augmented governance workflow so business people or

Speaker: 00:17:08

governance folks can actually certify it. So it's auto generated

Speaker: 00:17:12

context for LLMs but certifiable by humans. We do

Speaker: 00:17:16

believe that we need to bring human to the loop, right, to to certify

Speaker: 00:17:19

it. Yeah. And the last I love I'm sorry. I have

Speaker: 00:17:23

interrupted you, like, 3 times now, and I apologize. I haven't met 2. I

Speaker: 00:17:26

thought you paused. So finish please finish your thought.

Speaker: 00:17:31

No. No. I'm saying, like, 3 parts. So you already did data governance and the

Speaker: 00:17:34

actual alarm deployment because you need to interact with the whole thing, and the interaction

Speaker: 00:17:38

to have to has to be explainable and transparent. You need to understand

Speaker: 00:17:42

how, especially on structured data, you need to understand how

Speaker: 00:17:46

the question was calculated based, sorry, how answer was

Speaker: 00:17:49

calculated based on questions and how, data was

Speaker: 00:17:53

actually sourced, what's the lineage, what is the governance and access

Speaker: 00:17:57

control through search your clients. So all of that should be on the interaction layer.

Speaker: 00:18:01

So AI readiness, governance, and the interaction layer explainability to

Speaker: 00:18:05

the end user. Absolutely. Okay.

Speaker: 00:18:09

Thanks. And I do apologize again for the

Speaker: 00:18:13

interruption. So my my characterization of it as something that's just

Speaker: 00:18:16

data quality is is way low. There's a little bit of overlap between

Speaker: 00:18:20

data quality and what you're describing. You're talking about taking this into

Speaker: 00:18:24

that next level that is specific to, generative

Speaker: 00:18:28

AI and perhaps other, you know, AI related,

Speaker: 00:18:32

AI adjacent technologies, machine learning leaps to mind and stuff like

Speaker: 00:18:35

that. But your the tagging, the categorizing,

Speaker: 00:18:39

and all of the things you're describing there, that is next level.

Speaker: 00:18:43

And it's very interesting to me that you're

Speaker: 00:18:47

using AI to get data ready for AI.

Speaker: 00:18:51

That's an interesting combination. Mhmm. It makes sense, though. Right?

Speaker: 00:18:55

You can kinda scale out human capability with AI. I

Speaker: 00:18:58

think that's you you kind of alluded that with Newman in the loop. Right? Like,

Speaker: 00:19:02

I think I think where you were kinda going with that, again, don't wanna speak

Speaker: 00:19:06

for you, but it's like the idea that AI isn't gonna replace

Speaker: 00:19:10

humans. It's just gonna make humans more productive. Yeah.

Speaker: 00:19:13

For sure. Augment us because frankly speaking, no one

Speaker: 00:19:17

wants to to model data, you know, as their

Speaker: 00:19:20

career. We want to solve problems. Right? And to solve

Speaker: 00:19:24

problems, we we have to to understand what the problems are

Speaker: 00:19:28

And letting AI to surface the problems as alerts and for us

Speaker: 00:19:32

to to resolve them as conflicts takes, you

Speaker: 00:19:35

know, 1% to 10% of the time that it should take,

Speaker: 00:19:40

where we are busy, you know, wrangling data still. And, you know,

Speaker: 00:19:43

it's sad to some extent because data is growing and we cannot keep up.

Speaker: 00:19:48

No. That's a good point. Even if even if there are people out there and

Speaker: 00:19:51

some of our listeners may really do like modeling data. Right? But, you

Speaker: 00:19:54

know, Dow, they can model about 10 times the amount of data or maybe

Speaker: 00:19:58

a 100 times more. Right? And then ultimately, the expectation of

Speaker: 00:20:02

what a, you know, what a person

Speaker: 00:20:05

can do in a set period of time is gonna go up just

Speaker: 00:20:09

because I I I think I think you're on to something there. Plus,

Speaker: 00:20:13

I also I would also, like, double click on the idea that you said earlier,

Speaker: 00:20:16

which I think was very intriguing, was this notion of

Speaker: 00:20:20

a lot of the apps that you use would kind of fade away. You just

Speaker: 00:20:22

have this virtual assistant. You know, I I think back to

Speaker: 00:20:26

there's a number of scenes in, you know, Star Trek The Next Generation where they

Speaker: 00:20:30

have a conversation with the computer. Right? Mhmm. You know, you they

Speaker: 00:20:33

don't they use the computer. They get stuff done. There's no

Speaker: 00:20:37

Microsoft Word. There's no PowerPoint. Right? Like, there's no, like, it's

Speaker: 00:20:40

just the the there is no application. The application is kind of invisible. It

Speaker: 00:20:44

becomes the computer. And I think that's a very

Speaker: 00:20:48

intriguing kind of way. And if you had told me that a year ago, I

Speaker: 00:20:51

would have been very skeptical. Now I look at it, I'm like, I

Speaker: 00:20:55

mean, it's it's it's almost inevitable.

Speaker: 00:20:58

Yeah. Yeah. I agree with you. Futures here,

Speaker: 00:21:02

it's not evenly distributed as people say. So I

Speaker: 00:21:06

guess, you know, when you're attending conferences in Bay Area,

Speaker: 00:21:09

it's already it's already here. It happens. Right

Speaker: 00:21:14

and when you go to let's say Europe we

Speaker: 00:21:18

even just say you know just say a EU act in

Speaker: 00:21:21

Europe is is ramping up so it's all about

Speaker: 00:21:25

controls and and this is great So I do not think that regulation and

Speaker: 00:21:28

innovation, actually, jeopardize each other. I think

Speaker: 00:21:32

they should go hand by hand and, that's where I see

Speaker: 00:21:36

industry is going. So so East Coast approach, majority of our customers

Speaker: 00:21:40

are coming from East Coast US, Pharma,

Speaker: 00:21:44

financial services, insurance, highly regulated data

Speaker: 00:21:48

intensive companies. They have now,

Speaker: 00:21:51

sometimes even inventing standards for generative AI

Speaker: 00:21:55

implementations because everything is so new but companies

Speaker: 00:21:58

want to go fast. Right? So no one wants

Speaker: 00:22:02

to to downplay risks on one hand. On the other

Speaker: 00:22:06

hand, everyone want to, you know, to implement generative AI

Speaker: 00:22:10

and see the productivity cuts. It's, you know, it's evident productivity

Speaker: 00:22:13

cuts are already here with all those co pilots summarization,

Speaker: 00:22:18

what have you and this is where we are today. So I

Speaker: 00:22:22

think like again Bay Area running fast

Speaker: 00:22:26

and east is coming up with regulation. We will meet somewhere

Speaker: 00:22:30

in between. I believe in both. Well, if you kind of,

Speaker: 00:22:33

like, look at, like, historically, you know, when .coms first

Speaker: 00:22:37

started, right, there were a number of, hey. Look. You know, we're gonna sell pet

Speaker: 00:22:41

food online. Right? Like, and then it was

Speaker: 00:22:44

like, back in the dial up days, it didn't really make a lot of

Speaker: 00:22:48

sense. So it would just be easier for me to go to the store.

Speaker: 00:22:52

Whereas now, I mean, if you think about ecommerce, obviously,

Speaker: 00:22:55

Amazon is the £2,000,000,000 gorilla in the

Speaker: 00:22:59

room. I like, do I really

Speaker: 00:23:03

wanna think about, you know, dealing particularly as we get into the holiday season, do

Speaker: 00:23:06

I really wanna deal with the traffic at the mall or the store when I

Speaker: 00:23:10

can just click on something, either have, you know, groceries delivered

Speaker: 00:23:13

or, you know, I'm I'm okay waiting 2 days for

Speaker: 00:23:17

something to come up if I don't have to deal with them all.

Speaker: 00:23:21

Yeah. Totally. What's what's the difference between Black Friday

Speaker: 00:23:24

and Cyber Monday? No. It's not. Right? Like not really. Yeah.

Speaker: 00:23:28

Yeah. So it's like Not anymore. I remember Yeah. You

Speaker: 00:23:32

know? So we're recording this just before Black Friday. And,

Speaker: 00:23:36

you know, this whole idea of, you know, going to the store, get

Speaker: 00:23:40

the best deals, it's like, do I really wanna deal with the

Speaker: 00:23:44

crowd? No. Yeah. Although ironically, the name for the

Speaker: 00:23:47

podcast came on a Black Friday, while I was

Speaker: 00:23:51

at a Dunkin' Donuts, drinking coffee, waiting waiting

Speaker: 00:23:55

in line actually to get so there's a I'm a Krispy Kreme

Speaker: 00:23:59

person. So I'm Ah, okay. Yeah. So With you and

Speaker: 00:24:03

I, right, definitely. Right here. This is before we had a Krispy Kreme

Speaker: 00:24:06

near us. So it's I I have split sides, but yeah. Yeah.

Speaker: 00:24:10

Jeff's JT. He's a mess. From up north. So they are

Speaker: 00:24:14

they're Dunkin' Donuts. I've noticed this. They're Dunkin' Donuts, like, north of

Speaker: 00:24:17

Virginia. And he's in Maryland. I'm in Virginia. Then down

Speaker: 00:24:21

south, you rarely see a Dunkin' Donuts. I see more Dunkin' Donuts down

Speaker: 00:24:25

south than Krispy Kreme's up north, though, for sure. Yeah. But

Speaker: 00:24:28

I They're they're from Boston. That's why. Yeah. Oh, that's why. And then So at

Speaker: 00:24:32

Krispy Kreme's from Atlanta. And plus, it's funny. Right? Like, so I live in

Speaker: 00:24:35

Maryland Mhmm. Which depending on who whom you ask is either

Speaker: 00:24:39

north or south. So that's right. That's true.

Speaker: 00:24:43

Interesting. Interesting. We're a quarter state for sure. Yeah. That that's

Speaker: 00:24:47

that goes safe for Virginia. But I wanted to follow-up on, you know, you've

Speaker: 00:24:51

been we've been talking about all the cool stuff. I'm

Speaker: 00:24:55

gonna try and say this correctly. Illumix. Is that correct? Am I getting it

Speaker: 00:24:58

right? So Illumix name

Speaker: 00:25:01

from Illuminating the Dark Side of Organizational Data.

Speaker: 00:25:05

Illuminate like illuminate. Illuminate. I like that. And x x

Speaker: 00:25:09

for the x factor. Excellent. X for the x

Speaker: 00:25:13

factor. Yeah. What? And I'm not asking you to I'll

Speaker: 00:25:16

just ask a question. What are the risks in in what you're doing?

Speaker: 00:25:21

And, you know, what are the risks you're aware of and how are you addressing

Speaker: 00:25:24

those? Yeah.

Speaker: 00:25:28

So I think the biggest risk of 2025

Speaker: 00:25:31

is going to be, a TCO, total cost of

Speaker: 00:25:35

ownership. So already today,

Speaker: 00:25:39

it's, it's very hard for organizations to to

Speaker: 00:25:42

monitor where the generative AI tokens are spent.

Speaker: 00:25:47

And the benchmark say that 80%

Speaker: 00:25:50

of LLM tokens actually spend on customization

Speaker: 00:25:55

of off the shelf models. And that's not a good news because

Speaker: 00:25:58

which means ROI is is pretty low on on the actual

Speaker: 00:26:02

production use of generative AI in in enterprise.

Speaker: 00:26:05

And I think it doesn't get any better because the

Speaker: 00:26:09

customizations techniques which are used today gains a black box

Speaker: 00:26:13

performed by super expensive data scientists and

Speaker: 00:26:17

they're not very scalable for data that you don't want to, you know,

Speaker: 00:26:20

to schmooze around. I think it's cost prohibitive actually to bring data

Speaker: 00:26:24

to AI. You need to bring AI to data. So so putting

Speaker: 00:26:28

data in some graph structures for graph, frog, and all of that, it's to me,

Speaker: 00:26:32

it's cost prohibitive. So this is why I think that, the Telumex

Speaker: 00:26:36

position for 2025 is actually favorable because we bring this

Speaker: 00:26:39

transparency. We do create this, a virtual,

Speaker: 00:26:43

a semantic knowledge graph, which is transparent to certify, which is

Speaker: 00:26:46

created for business people. Based on business

Speaker: 00:26:50

logic. We do use extensively industry ontologies and so on so forth.

Speaker: 00:26:54

And I think the the most interesting part about generative AI is

Speaker: 00:26:58

we do not necessarily going to mimic processes that

Speaker: 00:27:02

the humans performed. Mhmm. We're going to invent

Speaker: 00:27:06

those processes. Right? So new new processes and new workflows. So

Speaker: 00:27:09

right now, a generative AI is deployed like like

Speaker: 00:27:13

analytics is deployed, which means you you have to

Speaker: 00:27:17

label your data, check the quality, usually manually, and then

Speaker: 00:27:21

you have to to prepare the test set which is fed

Speaker: 00:27:24

into customization of the model and then you actually provide the

Speaker: 00:27:28

context to on every question. So this is

Speaker: 00:27:32

very old fashioned or, you know, 40 years old

Speaker: 00:27:35

machine learning technique to to actually train generative

Speaker: 00:27:39

vi. So this is why why I'm saying that, many companies are

Speaker: 00:27:43

probably going to to mimic what Equinox does in the sense

Speaker: 00:27:47

that you have to you have to be focused on domain

Speaker: 00:27:50

specific knowledge, reason, ontologies, and knowledge graphs. You have

Speaker: 00:27:54

to onboard your customers automatically via metadata because

Speaker: 00:27:57

metadata has the factor all

Speaker: 00:28:01

activities in organization documented for us. We're

Speaker: 00:28:04

just under utilizing them, right? And then you bring your

Speaker: 00:28:08

business people, your domain experts, your governance teams to the

Speaker: 00:28:11

loop because you can simply cannot bring this business acumen,

Speaker: 00:28:16

to, you know, to data. You have to bring data to to those people.

Speaker: 00:28:20

That's an interesting thing because I've seen the the particularly is this this this

Speaker: 00:28:24

statistic around 80% of the tokens are being used to

Speaker: 00:28:27

manipulate the data. I have a microcosm example of that

Speaker: 00:28:31

where I use AI to augment my blog post, my blog

Speaker: 00:28:35

that I create, and I finally took

Speaker: 00:28:40

a closer look at this because I was spending a lot more on

Speaker: 00:28:43

the OpenAI API than I really wanted to. And I'm like, well,

Speaker: 00:28:47

what exactly am I I'm using a product called Fabric.

Speaker: 00:28:51

And I'm like, wait, what exactly is the source of this prompt? And I look

Speaker: 00:28:55

at it, and I'm like, I can't. It's a lot. It's a long prompt. And

Speaker: 00:28:58

I'm like, I really don't need that. Right? So we are gonna do a deep

Speaker: 00:29:01

dive in a show on Fabric at some point. Not not the Fabric Andy

Speaker: 00:29:05

works with, but there's an open source thing called fabric. There's

Speaker: 00:29:08

a I'm sure there are lawyers right now that are doing their

Speaker: 00:29:12

holiday shopping based on how much money they're gonna make off of this

Speaker: 00:29:15

dispute. But, the the short of it is, like,

Speaker: 00:29:19

I realized, like, well, no wonder why I spent so much money. I'm sending all

Speaker: 00:29:22

of this in my prompt plus the content. So I

Speaker: 00:29:26

actually in the verse before you joined in, Andy and I were talking, and I

Speaker: 00:29:29

was like, I actually got a really good result based on a more optimized

Speaker: 00:29:33

prompt. You know? And, you know, strictly speaking, it's

Speaker: 00:29:36

not I I like your approach of bringing the AI to the data rather than

Speaker: 00:29:39

bringing the data to the AI because that is expensive.

Speaker: 00:29:43

You know, I I think that bringing the AI to the data will be less

Speaker: 00:29:47

expensive. How less, I think, remains to be seen. But I like that approach,

Speaker: 00:29:50

right? Because that's typically what we've done, you know, and we've seen

Speaker: 00:29:54

huge upsides to that, whether it's from Hadoop bringing the

Speaker: 00:29:58

compute to the data rather than vice versa. I like that

Speaker: 00:30:02

approach. And it's backed by historical precedent. Right? So it's not

Speaker: 00:30:05

completely gonna be this crazy idea. It's just a very sensible

Speaker: 00:30:09

idea. Yeah. Yeah. I believe the future was already

Speaker: 00:30:12

invented. Right? So it's just the inclination of technologies we already have.

Speaker: 00:30:16

It's been healthy about it. So, we had

Speaker: 00:30:19

machine learning practices which are very healthy like feature

Speaker: 00:30:23

exploration, feature definitions and then we had neural net brute

Speaker: 00:30:27

force and then majority of companies used combination of both,

Speaker: 00:30:31

right, to to to be optimized. This is what I think what's happening with

Speaker: 00:30:34

generative AI. So this, you know, wild west of brute

Speaker: 00:30:38

force or great spend is going to be replaced by methods

Speaker: 00:30:42

which have, like, this automated context filtering or pre

Speaker: 00:30:45

processing and then use like fraction of your budget to to actually

Speaker: 00:30:49

run the query. Yeah. I remember hearing about a lot

Speaker: 00:30:53

of this in the late nineties. And, I worked for a company who

Speaker: 00:30:57

was a big SAP shop. I see you have a history with SAP. Yeah. And

Speaker: 00:31:01

this lady and and and so we were an we were the IT department. So

Speaker: 00:31:04

we were in the basement, but the analytics team back then was in

Speaker: 00:31:08

a closed in space inside the basement. So it was

Speaker: 00:31:11

like even more like, you know, I was the web developer, so I didn't

Speaker: 00:31:15

have a window, but I could see the window about 50 feet away.

Speaker: 00:31:19

But, like, when you when when you went

Speaker: 00:31:23

into this, like, you know, further enclosed space deeper into

Speaker: 00:31:26

the the the the the depths of the IT department,

Speaker: 00:31:31

there was the database team. And and and and in the back of that area

Speaker: 00:31:34

was the analytics group. And I remember this lady telling me

Speaker: 00:31:40

that she was working with these things called OLAP cubes. Oh, wow.

Speaker: 00:31:44

Yeah. And I was like, what is that? And then she went on this thing

Speaker: 00:31:47

and, you know, I'm remembering a conversation, oh my god,

Speaker: 00:31:51

almost 30 years ago. But I just remember walking away with,

Speaker: 00:31:55

like, that sounds either crazy because she's talking about,

Speaker: 00:31:59

like, you know, figuring out patterns. Right? So, you know, will

Speaker: 00:32:03

rainfall patterns in Australia affect not just the agricultural

Speaker: 00:32:07

side of the chemical business, but also the plastics purchasing

Speaker: 00:32:10

versus rainfall in the Amazon versus this and all of

Speaker: 00:32:14

that? And I just remember walking away from that conversation as I as I

Speaker: 00:32:18

as I as I leave the depths of the IT department back to my normal

Speaker: 00:32:22

kinda, basement. Back to the regular basement from

Speaker: 00:32:25

the sub basement. I remember thinking that is either the craziest thing I

Speaker: 00:32:29

ever heard or the most profound thing I ever heard, which

Speaker: 00:32:33

now with the, hindsight of time, it turns out it was the most profound

Speaker: 00:32:36

thing. Yeah. You you can think about it as

Speaker: 00:32:40

semantic layers of, you know, that era. Right?

Speaker: 00:32:44

Mhmm. Right. And I think You know go ahead.

Speaker: 00:32:48

I'm sorry. Sorry. I think it's delayed between the

Speaker: 00:32:51

between the connection. So I think around the same time I was

Speaker: 00:32:55

doing my bachelor and my project was about multi dimensional

Speaker: 00:32:59

theory. So multi dimensional geometry,

Speaker: 00:33:03

of these neural nets. So basically, you model neural nets as multi

Speaker: 00:33:07

dimensional graph and it does operational research calculations.

Speaker: 00:33:11

So it's exactly the same. You you model your universe in a

Speaker: 00:33:15

graph. Back then it wasn't MATLAB. We didn't have any, you

Speaker: 00:33:18

know, neural nets Right. Structures or graph structures and so you're

Speaker: 00:33:22

modeling in MATLAB in this weird language,

Speaker: 00:33:26

a graph which has a neural nets on there. And

Speaker: 00:33:30

this is exactly like modeling all of cubes. Right? A

Speaker: 00:33:33

multidimensional representation of your reality. Now,

Speaker: 00:33:36

unfortunately, we have a new technologies which,

Speaker: 00:33:40

which are semantic and context. Right? Large language

Speaker: 00:33:44

models and graphs, which do the same thing but much

Speaker: 00:33:48

more efficiently. Yeah. So this is amazing. Like, I

Speaker: 00:33:52

think it goes back to what you said. You know, The future's already here. It's

Speaker: 00:33:55

just not widely distributed yet, which I think is a William Gibson

Speaker: 00:33:59

quote, or is it a Esther Dyson quote? I forgot.

Speaker: 00:34:04

But it's one of those 2 kinda luminaries. Yep.

Speaker: 00:34:07

You you said what I was going to say, you know, and it

Speaker: 00:34:11

was, you know, more of what off of what Frank

Speaker: 00:34:15

said is it turns out that we're just

Speaker: 00:34:18

doing more nodal analysis and vector

Speaker: 00:34:22

geometry as a result of that. That's it did all start

Speaker: 00:34:26

with multidimensional and and grow from there. And

Speaker: 00:34:30

that's where these algorithms, like nearest neighbor

Speaker: 00:34:33

originated, was in that math. So

Speaker: 00:34:38

Yeah. Yeah. Great minds. Exactly. Exactly.

Speaker: 00:34:41

Alike. Exactly. Now you're

Speaker: 00:34:45

complimenting me. Thank you. I I feel I feel better

Speaker: 00:34:49

when smart people in the room agree with me.

Speaker: 00:34:53

No. I'm on the right path. You know, I employ

Speaker: 00:34:56

millennials. So so having people with experience in multidimensional

Speaker: 00:34:59

geometry and all of cubes, it's just a miracle to me to to start

Speaker: 00:35:03

with. You know? People now like Python, neural

Speaker: 00:35:06

nets, we do actually, the average age in in in

Speaker: 00:35:10

Lumex is around 35, 37, something like that. So we do

Speaker: 00:35:14

have like also pretty experienced folks, you know, but new talent,

Speaker: 00:35:18

they, they they're not familiar with all all of that.

Speaker: 00:35:22

And I think it's actually a disadvantage because,

Speaker: 00:35:26

when when you do know different patterns in architecture Yeah.

Speaker: 00:35:29

You can model them with new technology. Right? Make them more

Speaker: 00:35:33

efficient, but you already know what works and what doesn't, and it

Speaker: 00:35:36

helps. That yeah. That's a great point. The old

Speaker: 00:35:40

experience, you know, the experience that we have from doing this for

Speaker: 00:35:43

decades is that we see the patterns that have

Speaker: 00:35:47

repeated over time, architectural patterns and design patterns. And,

Speaker: 00:35:51

you know, and we know that they've

Speaker: 00:35:55

I I love that how you said that. The, you know, the future's already been

Speaker: 00:35:58

invented. We we realize that if we reapply some of these

Speaker: 00:36:01

patterns, that there are use cases for them, not just now, but

Speaker: 00:36:05

also in the future. So totally get you.

Speaker: 00:36:09

Too, you know, like,

Speaker: 00:36:12

you know, it it is painful to think that, you know, we've been in this

Speaker: 00:36:16

industry for decades. Right? It is a little hurts a little bit. But,

Speaker: 00:36:20

like, also, if you're listening to this, you've not been in the industry for

Speaker: 00:36:23

decades, and you're thinking like, woah. You know, what are these what are these

Speaker: 00:36:27

old geezers now? I would point out when I was

Speaker: 00:36:31

a young kid in the industry and, you know,

Speaker: 00:36:35

client server was like the new hotness. Right?

Speaker: 00:36:39

And, you know, the whole notion of going back to,

Speaker: 00:36:43

you know, cloud and and and and and, you know, terminal

Speaker: 00:36:47

and an old mainframe geezer basically said to me, like, this is just

Speaker: 00:36:51

this industry has a cycles. Right? It's like the fashion industry. This goes in

Speaker: 00:36:54

style. This goes out style. And it was like, I had that moment

Speaker: 00:36:58

of, like, wait. I think he's on to something, but he's just an old geezer,

Speaker: 00:37:02

so I won't listen. So, you know, so so

Speaker: 00:37:06

if you are a young buck, like, or,

Speaker: 00:37:11

buck is a male deer, right? What would be a Yes. A doe. A young

Speaker: 00:37:14

doe. So if you're a young buck or a young doe, I grew up

Speaker: 00:37:18

in New York City. So all of this wildlife thing is brand new. I'm here

Speaker: 00:37:22

for you. I'm here for you, Frank. So, you

Speaker: 00:37:26

know, listen to, like, some of the things that these, you know, more

Speaker: 00:37:29

experienced colleagues will say. Yeah. You know,

Speaker: 00:37:34

if you don't believe it right away, just put it on the shelf in your

Speaker: 00:37:36

mind because you're gonna need it later. It'll come up at some point.

Speaker: 00:37:40

And it's like, if you look at kind of, you know, everybody ran to the

Speaker: 00:37:44

cloud. Right? And cloud is effectively like a

Speaker: 00:37:47

mainframe effectively. Right? The same philosophy. Right? Centralized

Speaker: 00:37:51

computing somewhere else. Right? And then your browsers become

Speaker: 00:37:54

the terminals, terminals with fancy graphics, but terminals nonetheless.

Speaker: 00:37:58

Now I think you're gonna start seeing it kind of we're about due for a

Speaker: 00:38:02

seismic shift backwards, right, as people kinda move

Speaker: 00:38:06

repatriate data and things like that. Particularly, I think driven by AI

Speaker: 00:38:10

because of the cost of some of this. You know, I had this debate,

Speaker: 00:38:14

you know, the other day. It was like, you know, if if one of these

Speaker: 00:38:18

super clusters with, you know, a 100, 8 100,

Speaker: 00:38:21

all of this, if it costs, say, $500,000,

Speaker: 00:38:26

right, I could probably do the math, and that probably means

Speaker: 00:38:30

about, you know, there's a certain break even point,

Speaker: 00:38:34

and it's probably after about 7 or 8 fine tunings or full

Speaker: 00:38:37

on trainings where it's just cheaper to have it. Just buy it.

Speaker: 00:38:41

Yeah. Yeah. Yeah. Totally on that. And also, you

Speaker: 00:38:45

know, salary skills are the most expensive part. So you

Speaker: 00:38:48

want to spend it on your business specific problems and

Speaker: 00:38:52

not generic problems you can solve with software. Right? So

Speaker: 00:38:56

it's always like that. Yeah. Yeah. So,

Speaker: 00:39:00

I do think that, basically capacity to process data

Speaker: 00:39:04

is is going to be a challenge. Right? And this is why we

Speaker: 00:39:08

see that, that majority of,

Speaker: 00:39:11

of I would even say countries not

Speaker: 00:39:14

only specific enterprises, kind of gear

Speaker: 00:39:18

up with, with GPUs, FPGAs,

Speaker: 00:39:21

whatever hardware you have. Right? So do you see it in

Speaker: 00:39:25

middle east, in emirates? They they have national generative

Speaker: 00:39:28

vi grid and they're building it for, you know, not only government companies

Speaker: 00:39:32

but also private companies. We see the same in Europe

Speaker: 00:39:36

and I would assume, you know, US based telcos

Speaker: 00:39:40

are going to to provide those data centers with GPU soon

Speaker: 00:39:43

enough, right, for, you know, for everyone to purchase as an

Speaker: 00:39:47

alternative to the public cloud. Yes. And we'll

Speaker: 00:39:50

see it. So this is for starters. And second one, the second part where

Speaker: 00:39:54

you don't need, this, you know, heavy machinery,

Speaker: 00:39:58

you might just have your variables processing

Speaker: 00:40:02

parts of whatever generated AI on your end before sending to the cloud

Speaker: 00:40:06

because you do not necessarily need to to process everything in a central

Speaker: 00:40:10

manner. We basically have pretty powerful machines on

Speaker: 00:40:13

our hands or in our hand, you know, as

Speaker: 00:40:17

glasses as well. We can see that, and it's

Speaker: 00:40:21

going to be part of the processing. So the processing is going to be distributed.

Speaker: 00:40:24

You bring AI to your data, where your data is. You do

Speaker: 00:40:28

not shift your data all the time. It's not, it's not

Speaker: 00:40:32

cheap anymore. And we'll have this, as you mentioned,

Speaker: 00:40:35

those central repositories of mass processing

Speaker: 00:40:39

and those distributed powerhouses which are

Speaker: 00:40:43

small enough to to process data on on edge.

Speaker: 00:40:47

I think you're right. I think you're gonna see a set of data being processed

Speaker: 00:40:50

in one place. I think it's gonna be everywhere. There's gonna be some

Speaker: 00:40:54

and and I think that that introduces some interesting, consequences. Right?

Speaker: 00:40:58

So my wife works in IT security, and I can immediately hear her voice in

Speaker: 00:41:02

the back of my head. Contrary to what you think, ladies, we do

Speaker: 00:41:05

listen. We just don't always pay attention. But

Speaker: 00:41:09

I can hear her like, well, if compute's happening everywhere,

Speaker: 00:41:13

gee, couldn't like that be poisoned anywhere.

Speaker: 00:41:16

Right? I think I think that's going to be the next kind of thing. Right?

Speaker: 00:41:20

It's and it's again, it's a pattern. Right? Advancement.

Speaker: 00:41:23

Bad actors take advantage for that. Problem happens. And

Speaker: 00:41:27

then then that's the new thing. Right? So it's almost like you're you're building like

Speaker: 00:41:30

a, like a like a like a layer cake. Right? Like, you know, the cake

Speaker: 00:41:33

goes down then the frosting. The cake is the innovation. The frosting is

Speaker: 00:41:37

security, and then so on and so on. So Yeah. Yeah. Yeah.

Speaker: 00:41:40

So it basically back to the semantics. What we started is

Speaker: 00:41:44

semantic ontology as a baseline for generative AI.

Speaker: 00:41:48

It has multiple benefits. Single source of truth, of course, has the

Speaker: 00:41:52

benefits for accuracy. But also, if you're passing every

Speaker: 00:41:56

question to this semantic ontology context,

Speaker: 00:41:59

it's almost impossible to poison it because we're going to either

Speaker: 00:42:03

match to part of your logic or Right. Right. We're going to

Speaker: 00:42:07

miss. So it's it's another layer of security if you think about

Speaker: 00:42:10

it. So, so yeah.

Speaker: 00:42:14

That's an interesting point. All new. Yeah. All new ontology, all new

Speaker: 00:42:18

semantics have governance meaning, it has

Speaker: 00:42:21

accuracy meaning, it has also security meaning.

Speaker: 00:42:27

And also if you want to have single source of truth you have to to

Speaker: 00:42:30

have means to distribute it to those edge devices or

Speaker: 00:42:33

to to bring it back to central location and without ontologies, without

Speaker: 00:42:37

semantic layers, simply it's impossible to do that. I was gonna

Speaker: 00:42:41

say, like, the the the infrastructure, not just the computer infrastructure, but the

Speaker: 00:42:44

logical infrastructure to distribute this stuff,

Speaker: 00:42:48

it's probably not a trivial problem. That's the first thing that popped in my mind.

Speaker: 00:42:51

I was like, you know, like, oh, yeah. You're right about the distributed

Speaker: 00:42:55

activity on this data, but, wow, what does that

Speaker: 00:42:59

look like? What do updates look like? Like, the whole like, it's a it sounds

Speaker: 00:43:02

like a growth industry to me.

Speaker: 00:43:07

Definitely. Yeah. Yeah. I don't it's, it's

Speaker: 00:43:10

what we call, engineering problem. Right? So

Speaker: 00:43:14

creating ontology is data science or generative AI problem, but

Speaker: 00:43:17

distributing it, maintaining it, thinking it's its engineering problem.

Speaker: 00:43:21

Engineering problems tend to to have engineering solutions. Oh, Oh,

Speaker: 00:43:25

that's a good point. That's a good way to look at it. I like that.

Speaker: 00:43:27

I like that. So did you wanna do the, premade questions?

Speaker: 00:43:31

Because we haven't we've gone a few shows without them. If you're okay with those,

Speaker: 00:43:34

Ina, we can we can ask them. If not, that's fine

Speaker: 00:43:38

too. Of course. Yeah. Sure. Mhmm. So they're not they're not complicated.

Speaker: 00:43:41

They're more kinda just general questions. I pasted them in the chat.

Speaker: 00:43:46

But the first question and and you've had a a pretty

Speaker: 00:43:50

significant career with SAP and and before that. How'd you

Speaker: 00:43:53

find your way into this space? Did you find data or did

Speaker: 00:43:57

data find you? I

Speaker: 00:44:01

found my way to data by being frustrated

Speaker: 00:44:04

user. Right? So I started in engineering

Speaker: 00:44:08

and it was evident to me that

Speaker: 00:44:12

using data as engineer is not enough. You have to go to

Speaker: 00:44:15

data management. You have to fix those things because otherwise

Speaker: 00:44:19

I will I will going to be frustrated for the end of my life. Right?

Speaker: 00:44:22

So I went to data management analytics to to solve the problem

Speaker: 00:44:26

and I discovered that, as you mentioned, every experience

Speaker: 00:44:30

has a footprint. So my experience with graphs and with

Speaker: 00:44:34

operational research and multidimensional geometry and all of that is so

Speaker: 00:44:38

useful for data management. And it was actually exhilarating.

Speaker: 00:44:42

That's true. Like and I like that because, like, every experience does leave

Speaker: 00:44:46

a footprint. Like, you know, that that's cool. I'm gonna I'm gonna pull that out

Speaker: 00:44:50

as a special quote for the episode. That's a great quote. Yeah. So

Speaker: 00:44:54

our next question why we do these? Yeah. Is what's your favorite part of your

Speaker: 00:44:58

current gig? My favorite part of being a

Speaker: 00:45:01

founder is is

Speaker: 00:45:05

unlimited ability of experimentation,

Speaker: 00:45:09

right? So majority of my day actually say no

Speaker: 00:45:13

to things, not to experiment, which is which is hard, which is not fun part,

Speaker: 00:45:17

right? But, still, we can

Speaker: 00:45:21

make decisions and we can do

Speaker: 00:45:24

new stuff every day. So as a founder,

Speaker: 00:45:28

it's been very, very different than enterprise setting. And don't don't take

Speaker: 00:45:32

me wrong. Like, SAP is a huge place of growth and had

Speaker: 00:45:36

very, fulfilling career at SAP, you know, building

Speaker: 00:45:39

stuff, founding p and l's, running big organizations,

Speaker: 00:45:43

but but been able to to actually, you know,

Speaker: 00:45:46

start anything new. And, like, right now, we have this customer

Speaker: 00:45:50

and they want to to try Illumax on in

Speaker: 00:45:54

parallel on the newest, you know, newest BI

Speaker: 00:45:58

tool with semantic layer or and on the oldest

Speaker: 00:46:02

warehouse on premise at once. I'm like, okay. Challenge accepted.

Speaker: 00:46:05

Yeah. And next Wow. Yeah. And next day, you know, engineer

Speaker: 00:46:09

comes with we have this academic data set and they have these benchmarks.

Speaker: 00:46:13

Let's beat them. I'm like, yeah, let's do it. It could be cool stuff.

Speaker: 00:46:17

Right? Lovely. So, you know, you know, it's to some extent,

Speaker: 00:46:20

so we don't need to justify it, you know, business wise and but but in

Speaker: 00:46:24

majority of cases, we can. Cool.

Speaker: 00:46:28

We have a couple of complete the sentences. When I'm not working, I

Speaker: 00:46:31

enjoy blank. I used to

Speaker: 00:46:35

enjoy doing jogging and yoga when I'm not working.

Speaker: 00:46:39

Right? So right now when I'm not working which means when I'm not

Speaker: 00:46:43

traveling I just spend time with my family. Whatever

Speaker: 00:46:47

is the plan for the weekend if it's just you know Netflixing,

Speaker: 00:46:51

or cooking or hiking whatever is the plan I just

Speaker: 00:46:55

join So sometimes just, you know, plan it. But spending time with my

Speaker: 00:46:58

family has become, indulgence and I'm

Speaker: 00:47:02

very focused on that. Cool. Very cool. Our

Speaker: 00:47:05

next is I think the coolest thing in technology today

Speaker: 00:47:09

is blank. I think the coolest tech is

Speaker: 00:47:13

thing right now is not in tech. It's actually the

Speaker: 00:47:16

pull from CEOs of companies

Speaker: 00:47:20

for technology. This is something which didn't experience for decades.

Speaker: 00:47:24

So we were pushing cloud and big data and machine learning and deep learning. We

Speaker: 00:47:27

were explaining to business stakeholders why do they need that. Mhmm.

Speaker: 00:47:31

And now, so you're all coming and saying, okay, I want to have

Speaker: 00:47:35

chatbot experience for x y that, so just

Speaker: 00:47:38

build it. This is actually I think this is the coolest

Speaker: 00:47:42

part because it's kind of a removes majority of the friction that

Speaker: 00:47:46

we had to to deploy technology in the past.

Speaker: 00:47:50

Interesting. On our 3rd and final complete the sentence,

Speaker: 00:47:54

I look forward to the day when I can use technology to blank.

Speaker: 00:48:00

So many things. You know, travel has

Speaker: 00:48:04

been so frustrating lately, and, I

Speaker: 00:48:07

don't think what happened because it's like kind of technology goes

Speaker: 00:48:11

forward but airline, you know, travel technology,

Speaker: 00:48:15

hospitality technology in general, I don't feel it bridges a

Speaker: 00:48:18

gap. So I really look forward to the

Speaker: 00:48:21

future where I can just have this comment, this prompt

Speaker: 00:48:26

of plan, this conference in Dallas on

Speaker: 00:48:29

x and the system already knows all by preferences and

Speaker: 00:48:33

just done. Oh, boy. It would be it would be fantastic.

Speaker: 00:48:38

Yeah. That that the travel experience as I I've had to

Speaker: 00:48:41

travel quite a bit, like, for the past, like,

Speaker: 00:48:45

couple months, and it's just like, oh my god. Like, it never was

Speaker: 00:48:49

great, but awful is not a word I remember. But it's post

Speaker: 00:48:52

pandemic, I think it's gotten way worse. It's like there's just so many small things

Speaker: 00:48:56

that you could be done a lot better. I'm I'm a 100% with you on

Speaker: 00:48:59

that one. So true. So our our next

Speaker: 00:49:03

question is to, ask you to share something

Speaker: 00:49:06

different about yourself.

Speaker: 00:49:11

Sharing something different about myself. I think I'm a controversial

Speaker: 00:49:15

person in general. So, so some people,

Speaker: 00:49:20

so some people agree with, you know, with the degree

Speaker: 00:49:24

of, of living in the future. So I,

Speaker: 00:49:27

I, you know take myself as person who is very much in the future so

Speaker: 00:49:31

all this seed happening and I might be a little bit you know ahead because

Speaker: 00:49:34

I see the technology being developed in my mind is already there, it's already

Speaker: 00:49:38

used right? So and so where this is

Speaker: 00:49:42

where I see myself controversial because you know in majority of the

Speaker: 00:49:46

cases, then you sit over family dinner

Speaker: 00:49:50

and say, you know, we're still paying our bills

Speaker: 00:49:53

online when we have this notification. Right?

Speaker: 00:49:57

So everyday technology has

Speaker: 00:50:00

developed a lot. And when I'm speaking about this application

Speaker: 00:50:04

free future and, you know,

Speaker: 00:50:08

automated, x y zed. Sometimes or many

Speaker: 00:50:12

oftentimes on everyday level, we are still not there and

Speaker: 00:50:16

this is where people think that I'm too visionary or too

Speaker: 00:50:20

too dreamer on that. Interesting.

Speaker: 00:50:23

No. I'm with you on that one.

Speaker: 00:50:28

Growing up, I was the technical person in the family. So

Speaker: 00:50:32

Yeah. They don't they don't know what you're talking about. Right? I I I love

Speaker: 00:50:36

how the, you know, or, you know, they all they

Speaker: 00:50:39

all get confused until the printer breaks and then suddenly

Speaker: 00:50:43

But you're the smartest people in the room. That's why you're the smartest person in

Speaker: 00:50:46

the world. Alright. So where can people find out more about you and

Speaker: 00:50:50

Illumix? I love socializing on

Speaker: 00:50:53

LinkedIn. I don't know that many people think LinkedIn became a

Speaker: 00:50:57

marketing tool. I still see tons of valuable

Speaker: 00:51:00

discussions and I just absolutely love keeping in touch

Speaker: 00:51:04

on LinkedIn and and see the latest and greatest and I also share quite a

Speaker: 00:51:08

bit. So LinkedIn would be the the most

Speaker: 00:51:11

straightforward way in Atokaropsala on LinkedIn.

Speaker: 00:51:15

We do have blogs and I actually write many of

Speaker: 00:51:19

them. So if you go to illumeg.ai/blocks,

Speaker: 00:51:23

you will see lots of materials written on semantics,

Speaker: 00:51:27

on ontologies, on generative AI governance. So those

Speaker: 00:51:31

topics which are close to my heart, and we communicate quite

Speaker: 00:51:35

frequently on that. Very cool. Very cool. Very cool. So

Speaker: 00:51:39

so Audible is a sponsor. And if you

Speaker: 00:51:42

would, like to take advantage of a free month of

Speaker: 00:51:46

Audible on us, you can go to the datadrivenbook.com.

Speaker: 00:51:51

I just tested the link. That's why I was looking over here for anyone watching

Speaker: 00:51:55

the video. And it works. Sometimes it doesn't. And

Speaker: 00:51:59

we ask, our guests, do you have, do first, do you

Speaker: 00:52:02

listen to audio books? And if so, can you recommend 1? If

Speaker: 00:52:06

you don't listen to audio books, just a a good book.

Speaker: 00:52:11

I do listen to audiobooks. I also podcast, more

Speaker: 00:52:14

frequently recently. I I'm not sure this book is

Speaker: 00:52:18

already on Audible, but, if not, it's going to be

Speaker: 00:52:21

in Audible soon enough. So it's Nexus by Yuval Noah

Speaker: 00:52:25

Harari. It is audible. I have it in the library already.

Speaker: 00:52:28

Yeah. Amazing. So it speaks about the truth

Speaker: 00:52:32

in the age of generative AI. Right? Interesting.

Speaker: 00:52:36

What's the truth? What's the ground truth? And I was

Speaker: 00:52:40

actually in the lunch party in SoHo, New York, you know when Yuval

Speaker: 00:52:44

was speaking about you know how how technology

Speaker: 00:52:47

and what we see right now is not very different from what we experience

Speaker: 00:52:51

in you know middle age like when when Gothenburg

Speaker: 00:52:55

and printing was was a new thing and like what was

Speaker: 00:52:58

printed actually was you know rumors

Speaker: 00:53:02

and juicy stuff rather than scientific books and this

Speaker: 00:53:06

is where what we see right now in, you know, in chatbots and internet, on

Speaker: 00:53:09

social overall. So it's it's interesting parallels that he's

Speaker: 00:53:13

taking about what's what truth is in generative

Speaker: 00:53:17

AI age where what truth were was, like, 20 years

Speaker: 00:53:20

ago or even, like, 500 years ago. Yeah.

Speaker: 00:53:24

We're the we're the same species with the same problems and the same drama

Speaker: 00:53:28

and the same drivers. Like, it's just our tools have changed, whether

Speaker: 00:53:32

it's a printing press or, you

Speaker: 00:53:36

know, celebrity gossip or whatever or fake news

Speaker: 00:53:39

or anything like that. Plus, I also think the, you know, there's an old phrase

Speaker: 00:53:43

like who watches the watchers. Right? Like Mhmm. Who decides what's

Speaker: 00:53:46

misinformation and who decides what's true? I think. I think

Speaker: 00:53:50

because misinformation could be, you know, there there's

Speaker: 00:53:54

a image of me robbing a bank. Right? Like, you know?

Speaker: 00:53:57

Mhmm. Mhmm. I thought, Frank, I thought when the US

Speaker: 00:54:01

Marshals put you into the witness protection program, they said

Speaker: 00:54:05

we couldn't bring up you robbing a bank any any longer.

Speaker: 00:54:09

Misinformation. You gotta be careful because, like, one of the things I I wanted the

Speaker: 00:54:13

flow was so good. I didn't wanna interrupt it. But, like, one of the things

Speaker: 00:54:15

was I was experimenting with fine tuning an LLM locally.

Speaker: 00:54:19

Mhmm. And I'm basically trained it on information about my blog. My blog's

Speaker: 00:54:23

been around since 1995. Right? Or my site has been around since 1995.

Speaker: 00:54:28

One of them hallucinated this really great origin story for my

Speaker: 00:54:31

website. It was awesome. It was awesome. I'm like, I like that

Speaker: 00:54:35

better. So basically, it said that Always. Always.

Speaker: 00:54:39

It was really good. It was basically that Frank's World started as a

Speaker: 00:54:42

show, a kids TV show in the nineties on

Speaker: 00:54:46

the BBC or channel 4. I forget. Like one

Speaker: 00:54:50

of the big British channels. And it was about a talking

Speaker: 00:54:53

trash can named Frank that would teach kids about the importance

Speaker: 00:54:57

of, recycling. That's my favorite part.

Speaker: 00:55:01

And it was and it was the best part was that it was it was

Speaker: 00:55:04

the first professional project of the guys who did Sean the sheep and Wallace and

Speaker: 00:55:08

Gromit. Yeah. And I'm like so I

Speaker: 00:55:12

I I pinged the guy I worked with. Has this ever been a show?

Speaker: 00:55:15

Because no. Not that I ever heard of. And I looked over it. I couldn't

Speaker: 00:55:18

find it. But and then what I did was as an experiment, I fed

Speaker: 00:55:22

that that whole paragraph that it came up with into

Speaker: 00:55:26

notebook l m. Mhmm. Notebook l m

Speaker: 00:55:30

took that and ran with it. There's, like, a 20

Speaker: 00:55:33

minute audio, and it is the funniest thing because it basically

Speaker: 00:55:38

talks about the early environmental movement. They said it was the Britain's

Speaker: 00:55:41

answer to, Captain Planet. Like, they made up all the

Speaker: 00:55:45

stuff. And now it's documented. So now someone is going

Speaker: 00:55:49

to pulling to pull some information. And if you have Right now it's out there.

Speaker: 00:55:53

Right. And I guess to your point earlier about Lumix, like, if you start

Speaker: 00:55:56

building a crooked foundation, right, like, that eventually as

Speaker: 00:56:00

it moves on, it's gonna so, I mean, who knows, like, couple of years

Speaker: 00:56:04

from now, like, Wikipedia may say, like, there might be a

Speaker: 00:56:08

Wikipedia article about this TV show didn't exist. We're talking about it. We're feeding

Speaker: 00:56:11

the machine. That's fascinating.

Speaker: 00:56:15

Yeah. And it was a so a little bit on the books. I have to

Speaker: 00:56:18

mention it, like, in a couple of sentences. So, in US

Speaker: 00:56:22

a legal entity actually is a citizen. It

Speaker: 00:56:26

has social number. Right. So, technically machines

Speaker: 00:56:30

can create legal entities. They can vote, they can,

Speaker: 00:56:34

you know, they can create information and this information is,

Speaker: 00:56:37

you know, created with social number, with identifiers. So it's actually real

Speaker: 00:56:41

information. It's not fake news. It's created by social number.

Speaker: 00:56:45

And so this is how you create, like, this new truth. Right?

Speaker: 00:56:49

And, and how do you control that? So it's an interesting aspect of what's,

Speaker: 00:56:53

what even is defined as ground truth.

Speaker: 00:56:57

That's true. Everybody needs to define it. I think that's gonna be the question of

Speaker: 00:57:00

the 20 That's a big deal. Mhmm. Yeah. Well,

Speaker: 00:57:03

awesome. It's been great. We wanna be respectful of your time. This has been an

Speaker: 00:57:06

awesome show. Yeah. We'll let Bailey finish the show. And

Speaker: 00:57:10

that's a wrap for today's episode of data driven. A massive

Speaker: 00:57:13

thank you to Ina Tokarev Saleh for joining us and sharing her

Speaker: 00:57:17

fascinating insights into the world of generative AI, semantic

Speaker: 00:57:20

fabrics, and the ever evolving relationship between humans,

Speaker: 00:57:24

data, and decision making. If you're as inspired as we

Speaker: 00:57:27

are, be sure to check out IllumiX and follow INA on LinkedIn for

Speaker: 00:57:31

more thought leadership in the AI space. As always, thank

Speaker: 00:57:35

you, our brilliant listeners, for tuning in. Don't forget

Speaker: 00:57:39

to subscribe, leave a review, and share this episode with your data

Speaker: 00:57:43

loving friends or that one colleague who insists they don't trust

Speaker: 00:57:46

AI. We'll convert them eventually. Until next

Speaker: 00:57:49

time, stay curious, stay caffeinated, and remember,

Speaker: 00:57:53

in a world driven by data there's no such thing as a trivial

Speaker: 00:57:57

question, just fascinating answers waiting to be found. Catch

Speaker: 00:58:00

you next time on Data Driven.

Share Episode

Shownotes

Timestamps

Transcripts

Follow

Links

Chapters

Video

More from YouTube