In today's episode, we're thrilled to have Niv Braun, co-founder and CEO of Noma Security, join us as we tackle some pressing issues in AI security.
With the rapid adoption of generative AI technologies, the landscape of data security is evolving at breakneck speed. We'll explore the increasing need to secure systems that handle sensitive AI data and pipelines, the rise of AI security careers, and the looming threats of adversarial attacks, model "hallucinations," and more. Niv will share his insights on how companies like Noma Security are working tirelessly to mitigate these risks without hindering innovation.
We'll also dive into real-world incidents, such as compromised open-source models and the infamous PyTorch breach, to illustrate the critical need for improved security measures. From the importance of continuous monitoring to the development of safer formats and the adoption of a zero trust approach, this episode is packed with valuable advice for organizations navigating the complex world of AI security.
So, whether you're a data scientist, AI engineer, or simply an enthusiast eager to learn more about the intersection of AI and security, this episode promises to offer a wealth of information and practical tips to help you stay ahead in this rapidly changing field. Tune in and join the conversation as we uncover the state of AI security and what it means for the future of technology.
00:00 Security spotlight shifts to data and AI.
03:36 Protect against misconfigurations, adversarial attacks, new risks.
09:17 Compromised model with undetectable data leaks.
12:07 Manual parsing needed for valid, malicious code detection.
15:44 Concerns over Agiface models may affect jobs.
20:00 Combines self-developed and third-party AI models.
20:55 Ensure models don't use sensitive or unauthorized data.
25:55 Zero Trust: mindset, philosophy, implementation, security framework.
30:51 LLM attacks will have significantly higher impact.
34:23 Need better security awareness, exposed secrets risk.
35:50 Be organized with visibility and governance.
39:51 Red teaming for AI security and safety.
44:33 Gen AI primarily used by consumers, not businesses.
47:57 Providing model guardrails and runtime protection services.
50:53 Ensure flexible, configurable architecture for varied needs.
52:35 AI, security, innovation discussed by Niamh Braun.
Hello, listeners. And welcome back to another thrilling episode
Speaker:of data driven. In today's episode, we delve deep into the
Speaker:fascinating and, let's be honest, slightly terrifying world of
Speaker:generative AI and security risks. Joining us is
Speaker:Niamh Braun, co founder and CEO of Noma Security,
Speaker:who's on the front lines of keeping your AI driven project safe from
Speaker:digital mischief. So grab a cuppa and let's get data
Speaker:driven. Well, hello, and welcome back to Data Driven, the podcast where we explore
Speaker:the emergent fields of AI, data science, and, of course, data
Speaker:engineering. Speaking of data engineering, my favoritest data
Speaker:engineer in the world can't make it, today. But we
Speaker:have an exciting, conversation queued up with Niv Braun,
Speaker:who is the cofounder and CEO of Noma. Noma
Speaker:is a security firm that focuses on effectively
Speaker:he'll describe it more eloquently than I can, but effectively thinks about
Speaker:security in the context of data and AI across the
Speaker:entire life cycle. Welcome to the show, Niv. Hey,
Speaker:Frank. Happy to hear you, bro. Yeah. It's good to have
Speaker:you. And and security is one of those things where I've been thinking about more
Speaker:lately. Right? So my background was a software engineer and,
Speaker:you know, software engineers historically have not thought of
Speaker:security. Then I made the transition into data engineering and data
Speaker:science, and, traditionally, security is not really at top
Speaker:of mind, for them either. Now I
Speaker:kinda look at this, and I kinda look at the landscape that we're in where
Speaker:enterprises are deploying LLMs,
Speaker:generative AI solutions, on top of the predictive AI solutions,
Speaker:fast and furiously, and not thinking about
Speaker:security ramifications. So what are your what's your take on
Speaker:that? 100% agree. I think that, it's
Speaker:even like the the the the current, like, timing is even more fascinating
Speaker:than the than just, like, a new technology. Because exactly like you said, like,
Speaker:Frank, like, we all like the data practitioners. We all know that, like, security is
Speaker:not, like, our top priority. And by the way, like, by, like, like, this is,
Speaker:like, how it should be. Like, we are focusing on the business and, like, drive,
Speaker:like, drive, like, the business forward. And this is why we're, like, this is
Speaker:what we're paid for. The problem is that
Speaker:because we're not, like, in this kind of, like, mindset, we also, like, like
Speaker:any technologies in the company, also, like, create some risk. What we see right
Speaker:now is the LLM drive, which is pretty cool, is that for the
Speaker:first time, the security teams started to put
Speaker:the focus and, like, the spotlight on the data and AI teams. Because until
Speaker:now, let's be honest, they were focusing only on the software developers and
Speaker:their SDLC and the CICD and all these areas. Like, we were,
Speaker:like, you know, like, in the shadow. And we were, like, able, like, to act
Speaker:like exactly like, like, like, completely freely as we wanted.
Speaker:But now when, like, the security team start, like, to put the spotlight on the
Speaker:data and AI teams, what they understand is that it's not
Speaker:only this new kind of LLM threats, but also all
Speaker:the basic principles of security are not implemented
Speaker:in the data engineers and the data science teams. Nobody, like, scans all the
Speaker:code in our notebooks, for example, unlike the software developers that, like, all
Speaker:their code is being scanned. Nobody helps us to
Speaker:find configurations in our data pipelines or our
Speaker:MLOps tools or our AI platforms, like Databricks, for example.
Speaker:Like, nobody, like, provide us this ability to to find it easily,
Speaker:unlike, again, the software developers that they receive all this coverage
Speaker:and everything. Like, on the moment that they have, like, the smallest misconfigurations
Speaker:in their SCM or their their CICD, they
Speaker:will immediately, like, receive, like, a notification, like,
Speaker:helping them exactly, like, how to secure it. And also eventually,
Speaker:like, in the run time, in the runtime, in software life cycle, in
Speaker:classic like software application, we also have a lot of API security and web
Speaker:application firewalls tools that help us to protect the application in the
Speaker:runtime. But now specifically in LLM, this is, like, very
Speaker:related also, like, to what you said. Like, there are new kind of adversarial attacks,
Speaker:all the prompt injection and model jailbreak and stuff like that.
Speaker:And, again, nobody, like, else would like to protect it, like, in real time. And
Speaker:I think that this is, like, one of, like, the main shift that we see
Speaker:today in this area. We understand that the spotlight
Speaker:moved to the data and AI teams, but we need to make sure that we
Speaker:do, like, both. Like, we start with, like, a new kind, like,
Speaker:trendy, like, risk that we want to make sure that we are protected from.
Speaker:But also that for the first time, after a lot of years, we're
Speaker:starting also, like, to implement the basic security measurements
Speaker:needed in our area. But the most important thing, of course,
Speaker:is to continue and, like, do it without slowing us down. Like, we need to
Speaker:make sure that, like, everything, like, all the different, like, security measurements that
Speaker:we take still provide us the ability to move fast, to enable
Speaker:the data sent the data science and the data engineering teams to
Speaker:continue and, like, innovate, but in a secure way.
Speaker:You know, that's a good point because I never thought about scanning a notebook for
Speaker:errors. Right? Shame on me. Right? Like for code
Speaker:security I mean, not errors, but, you know, security vulnerabilities. That's not something
Speaker:that I have seen done in practice. I mean, the the
Speaker:closest I've seen where security has been an issue for
Speaker:anyone in this space is,
Speaker:basically using protected, you know, Python
Speaker:libraries, right, or or Python library repos, right, where they're those
Speaker:are scanned by, I forget the name of the 3rd party that'll do it where
Speaker:you just basically say you point your Python instance to there. Yeah. Because
Speaker:I also think that Internal Artifactory. Yes, exactly. So
Speaker:like, what, because I often
Speaker:wonder, you know, people just like to install.
Speaker:God only knows what's in there. I can tell that, like, it already, like, happens.
Speaker:Like, I don't know if you heard, but for example, like, like,
Speaker:like, pretty recently, PyTorch, for example. Right.
Speaker:PyTorch that we all know was compromised. We all know and love. We're most people
Speaker:love. It was compromised. Like, specific version of PyTorch, a
Speaker:malicious actor succeeded to to put some
Speaker:code inside that basically,
Speaker:collected all the the the secrets and token that you have in the
Speaker:environment and sent it to DNS. Now we all
Speaker:know, like, how much like like, how many downloads, like, PyTorch have.
Speaker:And most times, where PyTorch is downloaded to through, like, to
Speaker:all these different, like, notebooks, wherever they be, JupyterOps,
Speaker:SageMaker, Databricks, like, we all use them.
Speaker:And it I can tell that, like, it caused us to a lot of, like,
Speaker:problem. I can tell, like, like, like, firsthand, like, we saw, like, a lot
Speaker:of organizations that were compromised because of this attack.
Speaker:And it happens all the time. And by the way, if you mentioned, for example,
Speaker:like, if you already, like, touched the point of, of open source,
Speaker:now you have also Hugging Face, which is completely different area. Now it's
Speaker:not only Open Source packages. It's all these different Open Source
Speaker:Hugging Face models and Hugging Face datasets. And there,
Speaker:all these internal artifact are completely useless because they don't even
Speaker:scan these models. It's completely different technology, completely different, like,
Speaker:heuristics in order to find it. And, therefore, you start to
Speaker:see kind of, like, trends for for the attackers. They started to
Speaker:upload a lot of backdoored and a lot of malicious models
Speaker:into Hugging Face. I can tell you, like, we personally, we already,
Speaker:like, detected, I think, almost, like, 100, back
Speaker:or the malicious models, on Hugging Face because it's a wild
Speaker:west. Right. Because how do you because these these models, first off,
Speaker:they're physically large files. Right? So that there's that's a factor.
Speaker:Right? I don't know how Hugging Face makes money. I'd be
Speaker:curious to have someone on the show talk about that. But, you know,
Speaker:they're doing the service. And, how would you even scan? I
Speaker:mean, that's a good question. Right? What types of vulnerabilities have you sent have you
Speaker:found so far? And how does one even scan, like, a safe
Speaker:tensor or g file? Like, how do you what's what's
Speaker:that look like? Right? Obviously, I'm pretty sure, you know, McAfee
Speaker:antivirus doesn't have a thing for that. But, like Exactly.
Speaker:But, how do you even do that? I'm just curious. Yeah. So this is, like,
Speaker:exactly, like, the problem. Like, it's even, like, in in in the models, like, it's
Speaker:even, like, a a more, like, the the risk there, like, is
Speaker:more, like, clearer because as you know, a lot of time, like,
Speaker:these models in hanging face are even, like, in pickle. And, like, pickle is, like,
Speaker:by design, like, insecure, like, file. And so
Speaker:binary dump, right, of, like, the memory space. Yeah. Like, in the deserialization
Speaker:process, like, basically, you can, like, put, like, any kind of, like, malicious,
Speaker:action that you'd like, that, like, the attacker can. So we see,
Speaker:like, different attacks. Like, most of the attacks come today, like, from pickle files.
Speaker:Some also, like, not even, like, in the deserialization process, but also, like, in the
Speaker:model code itself. For example, like, if you ask
Speaker:for a specific example, like, share something that we
Speaker:detected, like, recently. We found, like, a very,
Speaker:let's say, a popular, open source, LLA model that we all
Speaker:know. But we know that, like, a it has a lot of, like, different
Speaker:versions. And one of the version was actually a docker
Speaker:that took the original model, wrapped it up with few
Speaker:lines of code in the model, which what they did is that every
Speaker:input to the model and every output from the model
Speaker:was also sent to the attacker, which basically
Speaker:just received full visibility and observability to all the
Speaker:runtime application and production. So, like, all the organizations that,
Speaker:like, use this model. And performance wise, the
Speaker:data scientist, of course, they cannot, like, detect it because performance
Speaker:wise, it worked perfectly because it took the original model. So nothing to be
Speaker:suspicious about. If we want the data
Speaker:scientist, every new open source model that they like, like
Speaker:in Hugging Face, they'll start, like, to open, like, these files and the binaries and,
Speaker:like, to start, like, to looking, like, in their own hands, they're manually
Speaker:for, like, a for a for risk. First, like, of course,
Speaker:like, we understand that this is not their expertise and, like, it it
Speaker:like, we want to be secured, but, like, like, even, like, worse,
Speaker:we just spend all their time on security. And I think that
Speaker:this is, like, the worst stuff. Actually, it's not the worst. I think that, like,
Speaker:the worst, and this is also, like, something that, like, I saw recently in several
Speaker:organizations is just, like, to block everything. Organizations
Speaker:that, like, understand, okay, Hugging Face model, it's, like, true, like, a secure, like,
Speaker:in secure area. Let's block it. Let's say, like, to
Speaker:all the data scientists in the organization, you're disallowed to use HAG interface model. I
Speaker:think this is, like, the worst. That seems like a mistake because
Speaker:because the people are gonna find a way. Well, 1, where you can't stop the
Speaker:signal. Right? That was a line from, a movie.
Speaker:They can't, kudos if people know who that what movie that is.
Speaker:But, you know, if you block Huggy Face, people are gonna find a way
Speaker:around that. They're gonna put it on a thumb drive at
Speaker:home and then bring it in. So percent. This is, by the way, also, like,
Speaker:what you see, like, with this kind of, like, internal Artifactory. You see that, like,
Speaker:once you get to you you create for the r and d or create for
Speaker:the developers or for the data scientists, you create some level of, like,
Speaker:friction. They will just find a way out to, like, bypass
Speaker:it and to to lower this, this friction.
Speaker:Right. So so couple of questions.
Speaker:One, I've seen, improper naming
Speaker:Not improper naming, but but basically using, names,
Speaker:like, that's looks similar to what should be. Yeah. Type will split. Type
Speaker:type splitting. That's it. I've seen that, which is kind of, I guess,
Speaker:kind of, you know, dollar store approach. But also,
Speaker:how does how does it if you wanted to look through these model files, as
Speaker:far as I know, they're just I just looked at them. I just see binary
Speaker:stuff. Like, how would you look for malicious code in there? Because I think you're
Speaker:right. That's not a skill set the average AI engineer or data scientist
Speaker:would have. Yeah. So, basically, like, you need, like, to manually kind of, like,
Speaker:parsing it because, like, you have, of course, like, the the binary file, but most
Speaker:times, it's not only, like, the binary file. You label for, like, the the code
Speaker:file that run, like, run the model, and you label for, like, the, in
Speaker:case it's, like, pick a, like, the deserialization process, that you can, like,
Speaker:parse and then, like, to see, like, the code there. But then you
Speaker:need also, like, you know, like, you have, like, 2 phase. 1st, you need to
Speaker:to parse it, you know, like, to see, like, the code, but then you need
Speaker:also, like, to be able to read code and to understand which
Speaker:one is valid and which one is malicious, which is also, like, completely, like, you
Speaker:know, like, you need expertise in this area. If you see bash
Speaker:commands, is it okay or not? Do you see access to the
Speaker:Internet? Okay or not? Like, you you need, like, to have, like,
Speaker:some, like, detectors in there that, that know how to do it, like, build
Speaker:by by expert or something. So how would you even detect
Speaker:that if you found it? Like, how was this found? Was this just somebody looking
Speaker:in network packets? Or, like, what how was it discovered? I'm
Speaker:just curious. Yeah. This specifically was, like, by our
Speaker:security research team. Okay. Yeah. That's like, looks a
Speaker:lot if, a lot like all the time, like, you know, all these different kind
Speaker:of, like, open source and third party models in order to to help
Speaker:our users to make sure that, like, everything that they use
Speaker:is is valid. And again, most importantly, without slowing
Speaker:them down. They can just, like, download and, like, run, like, with everything that they
Speaker:that they want. And in case, we see something that is,
Speaker:that is suspicious, we know how to detect it and to to help them to
Speaker:to secure it. Interesting. Interesting.
Speaker:Because I know a lot of people, you know, they they've been downloading
Speaker:these models from Hugging Face. And just taking it on
Speaker:faith, and I've heard that these things don't call
Speaker:out to the Internet. Mhmm. And I fell into that. And then
Speaker:I kinda had this moment of paranoia where I'm like, how do I know?
Speaker:I mean, the only way I'm a I'm just a humble data scientist. Right? Like,
Speaker:so the only way I would think about it would be to have a firewall
Speaker:rule that would block network traffic going up for that box.
Speaker:And I'm sure there's probably workarounds to that too. I mean, are these
Speaker:attacks are these attacks that sophisticated yet?
Speaker:Yeah. Yeah. And, like, also, like, most times you don't, like, the data
Speaker:science, like, they don't want, like, to permanently, like, to close, like, the Internet, like,
Speaker:the outbound because also, like, the application needs it. And also, like, the, you
Speaker:know, like, the the in order, like, to download, like, the dependencies and the models
Speaker:you needed. So most times, like, just, like, to block the Internet, it doesn't solve
Speaker:everything. It was, like, more, like, in the past that everything was, like, network based
Speaker:only. Today, when you have, like, also, like, the applicative layer here, so
Speaker:it's, like, a bit more sophisticated.
Speaker:But yeah. Wow. So
Speaker:the safe tensor format, as I understand it, what you
Speaker:know, you basically digitally sign or somebody
Speaker:digitally signs the contents of it. Is that is
Speaker:that a correct understanding? Yeah. So it's end up like a
Speaker:like, in general, first thing, of course, that, like, a safe denture is, like,
Speaker:much more secure. Okay. I already like by design, and as long
Speaker:as we as the industry will go, like, more and more, like, towards
Speaker:this road, because today, like, we still see, like, tons of light pickles.
Speaker:But as long as we progress, like, all as an industry, we'll already,
Speaker:like, be, like, in a bit better situation. It's not
Speaker:perfect, of course. We still see some issues. And, of course, organizations still
Speaker:need, like, to have some security measurements and processes
Speaker:to make sure that, like, they're aware of what,
Speaker:like, Hang in Face are using. But I think that it's already, like,
Speaker:going to be a bit better. I can tell you something that, actually,
Speaker:like, recently one of our one of our partners told me,
Speaker:which was pretty cool, very similar to what you said that you
Speaker:start, like, to feel a lot of concerns about this area.
Speaker:VP data science of a very big like,
Speaker:Fortune Fortune 500, like, very big, like, corporate. And you kind
Speaker:of, like, the the head of, like, the older data science, like, groups here. And
Speaker:they told me, you know, Niv, I I already know
Speaker:what I'm going to be fired about, like, in a in the next, like,
Speaker:24 months, and it's going to be about that. I know for sure, like, we're
Speaker:using, like, so much, like, Agiface models. I know for sure that I'm this is,
Speaker:like, the reason that I'm going to be fired, like, one day. Because today, like,
Speaker:we're using it, like, freely. We are also, like, very creative. We're not, like,
Speaker:only using, like, the most popular LAMA model, but, like, we're to,
Speaker:like, take advantage of this great advantage of the platform, which is, like,
Speaker:the amount and the diversity of the model that you have there. But I have
Speaker:no no doubt that we create so many risks that we're just,
Speaker:like, not exposed yet, that I'm going to to pay with it,
Speaker:like, with my head. So it it's
Speaker:it's pretty cool because it's not it's not always that you see,
Speaker:r and d and business owners that are so concerned
Speaker:about security even before the security team arrived
Speaker:to them. But they're already aware of this risk. And it's something that
Speaker:we start, like, to see more and more because, you know, it's just like it's
Speaker:it's too obvious. Like, the the the window is open and everybody see it.
Speaker:Yeah. I I would suppose that's in in a in a very kinda strange way
Speaker:that's bit progress, right, where people think about security beforehand.
Speaker:Like, even if they don't know I mean, I think this this this VP,
Speaker:you know, is pretty spot on. Like, what concerns me about the widespread
Speaker:adoption of these models and particularly Hugging Face, so there are no knock on Hugging
Speaker:Face. I think whatever you get your models Mhmm.
Speaker:I mean, we just don't know. And these things are just complicated.
Speaker:Right? I mean, they are by design complicated with 1,000,000,000,000 of
Speaker:parameters. In some cases, I guess, 1,000,000,000,000. But also, you
Speaker:know, they have this ability to even
Speaker:even if everything worked out well, even even assuming everything is
Speaker:fine, right, in terms of the operationalization of these things,
Speaker:There's still the chance that the model itself and its
Speaker:training was poisoned. So, like,
Speaker:I I mean, like, there's just so many because when I my wife works in
Speaker:IT security, and I was all excited. It was about a year and a half
Speaker:ago. I I was talking to her about LLMs and stuff like that
Speaker:and chat GPT and and and those types of things.
Speaker:And I was like, oh, well, you take all this data and you train a
Speaker:model and you you distill down this graph and this and this. And then she's
Speaker:like, that sounds like a big attack surface to me. Yeah.
Speaker:And I was like like, data poisoning in the classic one and data
Speaker:poisoning can be, like, in in in 2 levels or, like like, someone
Speaker:like poisoning your data or exactly what you say,
Speaker:somebody just, like, this way, like,
Speaker:create backdoor in, in third party models and open source
Speaker:models that then, like, everybody downloads. Right.
Speaker:Right. And we wouldn't know, like, what's
Speaker:the I mean, the defense against that seems very
Speaker:intricate. Not impossible, but very delicate and intricate.
Speaker:So in in in classic application security, there is a
Speaker:great practice called SBOM. SBOM is a software
Speaker:billing of material. Basically, it means that, you get, like, in
Speaker:specific format, kind of like visibility to all the different
Speaker:software components that build your application. One of the things that
Speaker:now we're also, like, part of the building is a
Speaker:official framework of OWASP, the nonprofit organization
Speaker:around security of AI and machine learning. And
Speaker:what you have there is for the first time you have like double layer
Speaker:of visibility. The first one is just like to understand
Speaker:what models I'm even using in the organization. Everything, like
Speaker:what models like, include in my application. It can be open
Speaker:source models. It can be self developed models. Also, by the way, not only not
Speaker:only LLM, of course, also like vision, NLP, like everything else.
Speaker:And also third party models that are embedded as part of the application, they
Speaker:are not open no. They are not open source. For example,
Speaker:if software engineer add API call as part of the application
Speaker:to OpenAI, in this way, they embed
Speaker:LLM as part of the application. This is also like one of, like, the models
Speaker:that you are using, but you you you want to know this is all my
Speaker:AI and model inventory that I'm using as Spyro as part of the application.
Speaker:And in addition to that, you have even the deeper context there, which
Speaker:is also like what you referred to. It's not only this is
Speaker:the list of the model that I'm using, but for each one, you want to
Speaker:understand on what dataset it was trained, what data
Speaker:maybe also like it has access to in case it's in production, let's say, with
Speaker:RAG architecture. You want to understand, like, the deep context
Speaker:of all these, like, models, what I'm using, but also, like, what
Speaker:happens, like, in this specific, like, model. Sometimes
Speaker:it's, as you said, for to to understand what data was trained on a
Speaker:model before, like, I'm starting, like, to use it by 3rd party, a
Speaker:lot of time is even, like, internally in the organization.
Speaker:Because once we start to train a lot of models,
Speaker:we want to make sure that we don't violate
Speaker:any policy that we have in the organization, either it's for compliance or
Speaker:security. For example, one of the things that, like, we are like, I keep, like,
Speaker:hearing a lot of time from, from security and legal and privacy
Speaker:teams is that, look, we instruct all the
Speaker:organization not to train any sensitive
Speaker:data, PII, PCI, PHI, any other sensitive
Speaker:information on our models. But except instructing
Speaker:it and speak about it, nobody knows if it
Speaker:happens. And we don't provide also our data
Speaker:teams tools that will help them to
Speaker:detect it in case it, like, it happens, like like, not in purpose. For
Speaker:example, I can tell you, like, one of the thing that we saw very
Speaker:recently. Big organization, a huge Fintech company,
Speaker:that data scientist unintentionally trained all the
Speaker:transaction of the application on one of the models. Now it's
Speaker:a, like, crazy big violation there of, like, compliance and
Speaker:security. The data scientist did this unintentionally. They
Speaker:truly, like, didn't know it. If they had something that, like, would help them, like,
Speaker:the basic visibility that you mentioned before, it will truly, like, help them to
Speaker:start, like, to continue, like, innovate and just, like, in case something like bad happens,
Speaker:to be alerted in that. And so I see that, like, the the data training
Speaker:is also, like, very, very important point also internally and not
Speaker:only the external data train on the external models that we're embedding and
Speaker:downloading. So you mentioned, OWASP. So just
Speaker:for the benefit of folks who may not know, because most of our listeners are
Speaker:either data engineers, data scientists. What is OWASP? And what is the
Speaker:I think it's with the OWASP 10? Yeah. So
Speaker:OWASP in general, it's a amazing organization that,
Speaker:is like a nonprofit one that helps basically,
Speaker:we combine a lot of people together, gather together in order to make
Speaker:sure that all our industry is much more secured with a lot of
Speaker:different security initiatives in a lot of different aspects, mainly of like product
Speaker:security, but not only. Product security is like application
Speaker:security. It's building security.
Speaker:Specifically in OASP, you have several different types of
Speaker:projects. So for example, one type of project is the OSP10,
Speaker:top ten, that basically takes different areas
Speaker:and define the top ten risks in this specific area. So it can be top
Speaker:ten for API, top ten for
Speaker:CICD. And now there is also like top ten for LLM.
Speaker:Addition framework, like, there are a lot of like different tools. Specifically,
Speaker:if someone wants to understand a bit more about like the wide
Speaker:landscape and the risk around AI and machine learning,
Speaker:the framework that I would like recommend on, highly recommend on, is
Speaker:amazing and very comprehensive called the OWASP AI
Speaker:Exchange. A group of people, again, gathered together,
Speaker:that covered not only LLM, but all the basic
Speaker:principles and risk in data pipelines and MLOps
Speaker:and start from the building and up to the runtime and start from the
Speaker:classic machine learning and up to Gen AI, very comprehensive,
Speaker:very also practical, which is very important and
Speaker:speaks in both language, on both languages. On one hand,
Speaker:of course, security, but on the other, also like very oriented
Speaker:for data and machine learning and AI practitioners.
Speaker:Interesting. Interesting.
Speaker:What what do you see
Speaker:well, here's what I mean, I'll have a lot of questions, but one of them
Speaker:is, do you think the 0 what do you think the
Speaker:0 trust approach is a good starting point? I don't think
Speaker:it's the answer here like it is kinda everywhere else. But do you think that,
Speaker:that type of philosophy of don't trust anything?
Speaker:Right? Kind of like, I mean, is that because you you mentioned this
Speaker:early when I talked about network firewalls, right, where the old approach of thing
Speaker:is just pull the plug or set up rules. And that used
Speaker:to work, but there's plenty of other ways around it, Both I think
Speaker:kind of low skill, mid skill, and certainly high skill
Speaker:ways around that. What do you you mean then 0
Speaker:trust is meant to address that. What are your thoughts on like I
Speaker:mean, is that the pro is that the mindset that either
Speaker:security folks in this space would have to take on? Like, it's more
Speaker:if they well, they probably already have. Right? Yeah. I think you're,
Speaker:like, I think you're actually, like, the the you you you perfectly
Speaker:defined it because I believe that 0 Trust is exactly like you say, it's kind
Speaker:of like a, like, kind of like a mindset. It's not like a very
Speaker:accurate, like, technical approach, but it's kind of
Speaker:like more like a a philosophy with some level of implementation.
Speaker:I believe that, like, the right mindset and, like, the right framework to look
Speaker:on a on a security for AI and, like, all the building
Speaker:and also, like, the runtime is basically to take all the
Speaker:different principles that we are all already aware
Speaker:of. Like we are all, like I'm saying, like the security industry,
Speaker:we are all already aware of on classic software development,
Speaker:building and runtime, and to implement it on the
Speaker:data and AI lifecycle. For example, if we mentioned, like,
Speaker:code scanning, so code scanning the notebooks, we mentioned open source,
Speaker:so checking all the all the Ag interface models. But it's not only that.
Speaker:For example, one of the things that, like, we see, a lot of attacks that
Speaker:we, like, we had recently in the security area are around the
Speaker:CICD. A few years ago, there was a big attack called
Speaker:SolarWinds, that basically, yeah, so you know it
Speaker:perfectly, just for the audience that, like, are not familiar with the specific details
Speaker:in, like, very high level attacker that exploited and
Speaker:misconfigurations in CICD tools. And this is
Speaker:basically how they succeeded, like, to start, like, this whole huge attack and
Speaker:breach. Now one of the things that, like, it taught us all as an industry
Speaker:is that until now we were focusing on, like, securing only
Speaker:our code. Now we understand that the code is not enough. We need to make
Speaker:sure that the building tools are also well configured. So
Speaker:we start, like, to see a lot of, like, tools that help us to make
Speaker:sure that we don't have misconfigurations in the CICD and the SCMs and all
Speaker:these different kind of tools. But when we are going to our domain, when
Speaker:we go to the data and AI teams, as we know, we just use different
Speaker:stack. We use all these data pipelines and model
Speaker:registries and MLOps tools and platforms like Databricks and Domino
Speaker:and Snowflake and stuff like that. The configuration, as we know, is
Speaker:not like neverwhere. Most time, it's even wider. This is why it's
Speaker:not managed by DevOps. It's managed by us, by the data teams. It's managed by
Speaker:MLOps teams, by data infra, by data platform. And we're doing a
Speaker:lot like, a great job in order to optimize all the configuration for the
Speaker:product. We're not security experts. We don't want to be
Speaker:security experts and, like, start, like, to spend a lot of time in that. But
Speaker:nobody else just like to very easily find all these different kind of misconfigurations.
Speaker:And this is also a threat and, like, attack vector that we
Speaker:started, like, to see a lot in the field today. I can tell you that,
Speaker:like, we see tons of attacks around
Speaker:different misconfigurations in tools like Airflows and Databricks
Speaker:and stuff like that. And I think this is also like a very, very important,
Speaker:like, mindset, like, to be in. And in addition to that, of course, we have
Speaker:all the all the runtime and all the adversarial attacks there.
Speaker:There are specifically, if I mentioned in the
Speaker:OSPI exchange, so OSPI exchange covers everything.
Speaker:The OSPI 10LLM specifically is more
Speaker:covering this LLM, like,
Speaker:specific risk. And then you have, like, all the adversarial attacks, like prompt injection
Speaker:and model jailbreak and model dn out of service, model dn out of wallet,
Speaker:etcetera. So basically, the mindset should
Speaker:be we already know security very well. We already have, like, these
Speaker:principles. Until now, we just haven't
Speaker:implemented them on the data and AI teams,
Speaker:tools, and technology. And this is exactly what we start, like, to
Speaker:what we, like, need, like, to start to do. And this is what we see
Speaker:also that, like, you know, like, now we have no reason. Like, we all see,
Speaker:like, these different kind of attacks. So we start to see that all the organizations
Speaker:were, like, starting to to already, like, walk the walk.
Speaker:Wow. Yeah. I I often wonder too, like, what you
Speaker:mentioned the pipelines being a vulnerability or an
Speaker:attack surface. Right? Like, or a potential vulnerability.
Speaker:I often wonder now, like, when, you know, we're looking at agentic
Speaker:AI, right, where these things aren't just LLMs,
Speaker:right, producing text or going through these materials.
Speaker:We're giving them, you know, abilities,
Speaker:right, to influence pipelines, right, to to or to
Speaker:whatever. Right? Like, that just seems to me like a giant
Speaker:security risk. I mean, telling someone you know, there's there's multiple ways to
Speaker:break an LOM. Right? Like, obviously, there's the the the $1 Chevy
Speaker:Tahoe. Right? Where the guy did that. Right? Pretty low tech
Speaker:approach, pretty brute force ish.
Speaker:But I often wonder, like, well, what
Speaker:what sorts of things are agentic systems gonna open up?
Speaker:Like, what does that look like? I think that this is exactly like where we
Speaker:we will start, like, to see, like, the very big LLM,
Speaker:breaches, that we'll have. I believe that, by the
Speaker:way, my belief is that the the how does the
Speaker:attack start will still be, like, in a lot of cases,
Speaker:very similar to what we see today. But the impact of the
Speaker:attack will be much, much, much, much, much higher because now like the
Speaker:model cannot only like, promise you a
Speaker:$1 a car, but you can throw, like, I already like
Speaker:send the order, can send the car to you, can like book your
Speaker:hotel, can do like everything there, can share with you, like, the data
Speaker:of maybe, like, other customers in the application because it is,
Speaker:like, a RAG architecture, and it is also, like, different, like, tools
Speaker:that provide him the ability to maybe even, like, write different codes
Speaker:to the application. And then it might also like start like different types
Speaker:of remote code execution. As long as we are going to
Speaker:provide to these NLMs more privilege, more access,
Speaker:more tools, more abilities, the impact of the risk
Speaker:that they will be able, like, to cause will be much higher. I still
Speaker:believe again that that pack vectors are going to start from more or less,
Speaker:like, the same areas, like prompt injection and model jailbreak,
Speaker:but they they eventually, like, the outcome of these attacks will be much
Speaker:higher. I could see that. Because we're giving them
Speaker:actuators, so to speak. Right? Like we're not we're we're
Speaker:giving them agency. Right? Like where they could actually do real damage as
Speaker:opposed to because one thing in saying you're gonna give somebody
Speaker:a $1 Chevy Tahoe. It's quite another to actually place the order,
Speaker:sign off on the invoice, and then ship it. Right? Yep. And what
Speaker:if you'll do, like I don't know. Like, you'll you'll start, like, to see it
Speaker:also, like, in banks and in investments. They will start, like, to transfer
Speaker:your money. They will start, like, to invest, like, to buy stock. They will like,
Speaker:the the the the amount of, like, potential impact here is, like, a
Speaker:crazy high. I believe, by the way, that eventually, this is going to be one
Speaker:of the things that, like, we'll see also, like, slow down the adoption, not
Speaker:less than the than the technology or, like, finding, like, the
Speaker:right use case. Yeah. No. I could see
Speaker:that. I I just think that we're just setting, as an industry.
Speaker:We're setting ourselves up for a huge exploit that we
Speaker:haven't figured out is already there yet.
Speaker:And so so what what
Speaker:can AI engineers, data scientists,
Speaker:data engineers do today to make things
Speaker:better? I know we can't fix it because we don't know what's we really don't
Speaker:know what's broken. I think that's one of the frustrating and kind of fun things
Speaker:about security work is, like, it's not that there's no vulnerabilities.
Speaker:You haven't discovered any vulnerabilities yet. Right? There are no unknown there are
Speaker:always un there are always unknown unknowns.
Speaker:But if you have an unknown unknown or a known thing,
Speaker:you can you can say that you pretty much figured that out. But there's this
Speaker:whole aspect, which I don't think data scientists
Speaker:fully appreciate. I think they can understand the concept of the unknown
Speaker:unknowns. But in terms of the consequences of it, I don't
Speaker:think I think it's gonna take 1 major solar wind style
Speaker:issue or CrowdStrike style issue to make people conscious
Speaker:of of that. But how do
Speaker:we how do we prepare ourselves? Right? You can't
Speaker:stop the hurricane, but you can board up your windows. Right? Like, you
Speaker:know, how do you Yeah. I and I totally
Speaker:agree that, like, what's going through, like, to to shake every everybody
Speaker:will be, like, the the first SolarWinds or, like, the 4 log 4
Speaker:j attack that we see, like, in these areas. I think that,
Speaker:like, I think that you broke it very well
Speaker:and that we need to relate to both categories.
Speaker:1st is, like, the known,
Speaker:which already, like, exist. Like, we know that, like, you know, like,
Speaker:we see that as scientists. Like, we are not a scientist.
Speaker:And we see that one of the the things that, like, we see
Speaker:in in in our code in compared to software developers
Speaker:is that we don't give a
Speaker:tip on, like, everything, around security.
Speaker:Like, you'll see, like, tons of exposed secrets in plain
Speaker:text. You'll see tons of, like, test and, like, the sensitive data
Speaker:just like playing. And, like, it's state, like, exposed, like, in the notebooks.
Speaker:You'll see that we download, like, any dependencies without, like, like,
Speaker:even, like, think about it. Even so that, like, yeah, it looks like maybe, like,
Speaker:a bit suspicious and stuff like that. So it's it's far
Speaker:from from the basic. Let's make sure that, like, what we know that is not
Speaker:best practice, just, like, start, like, to implement it. And
Speaker:then regarding the unknown unknown, so, of
Speaker:course, like, you don't know how to handle it. I think that, like, as you
Speaker:as you said, you can start to prepare yourself. How do how do you
Speaker:prepare yourself in security? It's basically to be very
Speaker:organized and to to make sure that you have, like, the right visibility and
Speaker:governance. As long as you have, for example, like, you know how to build,
Speaker:like, your your AI or the machine learning bomb. You know all the
Speaker:different, like, models that are built or embedded as part of the application,
Speaker:and you have, like, the right lineage, which one
Speaker:was trained on which dataset, etcetera.
Speaker:Once, for example, that now let's say we'll continue with the
Speaker:examples of of Hugging Face. Like, a new Hugging Face
Speaker:model is is is now, like, published as a like, someone,
Speaker:like, found that it's, like, malicious. You because you prepared
Speaker:yourself and you have, like, the right visibility, you are able to go
Speaker:and very easily search exactly, like, if you use it and
Speaker:where you use it in all your organization. And this is also
Speaker:because you prepare yourself. This is exactly what happened, like, in Log 4
Speaker:j. In Log 4 j, it was like a dependency that
Speaker:found as a critical vulnerable. And a lot of
Speaker:organization, what they spent, like, most of the time is to try to understand
Speaker:where they even use this Log4j. And they seem that, like, if you prepare
Speaker:yourself, you are like, if you are organizing everything, you'll already
Speaker:be very, very, like, ready for the for the
Speaker:attack of, like, the unknown unknown. And, of course, everything
Speaker:in addition to to, you know, like, learning and, like, educating
Speaker:yourself. If you start, like, to understand, you'll go
Speaker:to, I don't know, Databricks, for example. A lot of people use Databricks. You'll
Speaker:go and, like, start, like, to see what are, like, the best practices of how
Speaker:to, like, configure your Databricks environments and what are, like, the best practices
Speaker:there. It's something that you can, like, find very easily, like, in the Internet. You
Speaker:don't need, like, to to do it, like, from scratch.
Speaker:But I'll say that, like, you you know, like, it's still, like, when we are
Speaker:aware of that, it's not still, like, the the top of our mind as the
Speaker:data practitioner to start looking, like, in our free time for this
Speaker:kind of concept. Right. I mean, that's a good point.
Speaker:Right? The fundamentals are still fundamental. Right?
Speaker:You know, making sure, you know, you track what
Speaker:your dependencies are. Right? So that way, if there's a breach in a hugging face
Speaker:model, like you said, you'll know right away whether or not it
Speaker:impacts you or not. Also too, I think you're
Speaker:right. This isn't top of mind for AI practitioners. Right?
Speaker:Even when I code, like, an app, my met
Speaker:my thought process are very different than when I'm in a notebook.
Speaker:Mhmm. It's just different wiring.
Speaker:Yep. And by the way, it's kind of like, it's kind of
Speaker:like a paradox because most times on the notebooks,
Speaker:we are connected to much more sensitive information than on our
Speaker:ID. Right. No. Exactly. So
Speaker:it's kind of it's like the worst, one of the worst case
Speaker:scenarios. Right? And and you're right. Like, people wanna work with real
Speaker:data, and they they just assume that if they're on a system that's
Speaker:secured and internal, they
Speaker:they, they don't have to worry about such things,
Speaker:which I think you're right. Like, with these systems that have access to
Speaker:sensitive data, these pipelines, I mean, it's one of those
Speaker:things where we need to start thinking about this. And what would you do
Speaker:you think that there's a, like, a career path for, like, an AI security engineer?
Speaker:Right? So it's not just a security engineer, like, in a traditional
Speaker:sense. Right? But also a someone who specializes
Speaker:in AI related issues. You think that's a growth industries? I
Speaker:have, like, no doubt that we are going to like to see more. Like, we
Speaker:already see these kind of practitioners in the field. I have no
Speaker:doubt that it's going, to be more and more frequent. And in
Speaker:addition to that, I believe that, like, even in the future, it's it's going to
Speaker:be even, like, several different, like, roles. For example, one of the
Speaker:things that, like, a lot of people that we work also, like, very closely with
Speaker:are AI red teaming. Right. It's not even,
Speaker:like, just like a AI security engineer, like, general one. Specifically around,
Speaker:like, credit teaming because all these kinds of adversarial
Speaker:attacks on models are very different, requires
Speaker:different techniques, different tactics. And the red teamers are the
Speaker:ones that, like, to, like, learning all these different
Speaker:types of adversarial attacks and how to, like, check your model,
Speaker:in your organization. And by the way, specifically in this
Speaker:area, I do feel that it's kind of, like, top priority and
Speaker:like top of mind also for the data science
Speaker:team. Like you do see that on LLMs,
Speaker:once they are deployed into production, the data
Speaker:scientists, they are kind of like understand that there are a lot of risk there
Speaker:and they are starting, like, to take also, like, responsibility even completely, like, regard
Speaker:regardless of the security team to make sure that, like, we we
Speaker:we reduce some of the risk there. Now the risk is not only
Speaker:security. The first thing is security, like, to try and, like, make sure
Speaker:that you are secured from all these different adversarial attacks or that you know how
Speaker:to detect sensitive data leakage, for example, as part of the response and stuff
Speaker:like that. In addition to that, it's also a lot of time
Speaker:like safety risks. You want to make sure that once you deploy LLM into
Speaker:production, your model doesn't give any financial advice to your
Speaker:customers, doesn't give any health advice in case it's not your business.
Speaker:So you then have, like, these kinds of responsibility, or example, like in the
Speaker:Chevy example that you gave, that you just, like, you don't just, release
Speaker:free cars or flights or books or a tail off, like, anything
Speaker:like that. So I think that because the
Speaker:the the the amount of potential risks are
Speaker:so high on the run time. In this area, I
Speaker:believe that, like, the data scientists already understood that this is, like,
Speaker:under their responsibility. They see it also as part of,
Speaker:like, being a professional data scientist. If I
Speaker:deploy this model, it has, like, a lot of, like, accuracy, but,
Speaker:like, it creates all these different kinds of risk.
Speaker:I would define myself as not a super professional data
Speaker:scientist, unlike on the supply chain, unlike in the
Speaker:notebooks that if I code a code that is not secure, I wouldn't say that,
Speaker:like, it's not professional. I would say that, like, it's okay. You're just, like, focusing
Speaker:on the business. So I do believe that we start, like, to seeing this shift
Speaker:also, like, in the mindset of the data scientist because of the risk of
Speaker:the Gen AI, but now it's also, like, like, a move
Speaker:to to all the the development and the building practices
Speaker:that we have. Yeah. And I think data
Speaker:scientists are acutely aware that LLMs
Speaker:are just taking they mean, we talk we we call it hallucinating when
Speaker:they get things wrong. But realistically, they're
Speaker:always hallucinating to a very real degree. Right? It's just they
Speaker:happen to be correct. And what these things are doing
Speaker:under the hood is they are looking for patterns of words.
Speaker:Sometimes those patterns of words are wrong, obviously wrong.
Speaker:And sometimes they may give out sensitive information
Speaker:inadvertently. So I can talk at least at least there's some common sense out there
Speaker:when they when they do realize these things are higher risk than I think
Speaker:we've been led to believe. Yeah. Actually, I love this this finish. They are,
Speaker:like, hallucinating, like, all this time. Sometimes they really find it
Speaker:as wrong. Like, they do the same thing as always. Right.
Speaker:Right. The they don't know they're hallucinating because they're just operating normally.
Speaker:And so when they go in a different direction and I've noticed
Speaker:that, you know, kinda like a little bit of, you you know, off by a
Speaker:little bit, and then then then it generates an off by a little bit, off
Speaker:by a little bit. I ran an experiment with a hallucination, and
Speaker:I read it through I ran it through a bunch of models and each one
Speaker:of them didn't do any fact checking, which I mean, realistically,
Speaker:I wouldn't expect that. Right? In the future, I think that'll be kind of table
Speaker:stakes. But, you know, it would just go through. So
Speaker:I took a hallucination, fed it through notebook l m, which then
Speaker:create even more hallucinations. Right? So it took this little
Speaker:genesis of something that was wrong and then made it even crazier wrong,
Speaker:which I think is an interesting kinda statement and and and
Speaker:also is a risk. Right? Like hallucination on top, compounding
Speaker:other hallucinations. And I don't think we've really seen that yet because we've
Speaker:only really seen for the most part, I've only seen one kind
Speaker:of model in production. But if you have these models that will kinda work together
Speaker:as agents or, you know, whether they're agents
Speaker:that do things or agents that it's different LLM discrete LLMs that talk
Speaker:to one another. They can get things wrong and make things worse. I mean, I
Speaker:haven't I think it's too soon to tell either way, honestly. Yeah. But, like, the
Speaker:like, theoretically, like, it makes a lot of sense. I think in general, like, we
Speaker:don't see, like, a lot like, we hear a lot about Gen AI.
Speaker:I think that, like, the level of adoption and the amount
Speaker:of business use cases that, like, businesses
Speaker:found are not that high yet. I think that, like, the
Speaker:most of the usage today is done by, like, consumers, like,
Speaker:like, directly, like, from, from the foundation model providers, like OpenAI and stuff
Speaker:like that for day to day, like, jobs, like, you know,
Speaker:like, reviewing mails and stuff like that.
Speaker:The the big businesses are still trying to find these
Speaker:different, like, use cases. I do believe that the that the
Speaker:agents are going, like, to open a lot of different use cases
Speaker:around it. Right. Right. I could I could see that. And
Speaker:I think I think it's just too soon to make a statement
Speaker:either way. But I think grounding yourself in the fundamentals
Speaker:is probably always a good idea. Mhmm.
Speaker:And probably a good a good
Speaker:approach. So so tell me about NOMA. What is is it NOMA? I
Speaker:I don't wanna make sure I pronounce it. NOMA. Okay. NOMA. Security. What does
Speaker:NOMA do? Is it security firms that focus on this space? You
Speaker:mentioned red teaming. Is that is that a sir service you offer?
Speaker:Yeah. So NOMA basically is an like, our name is Nomo
Speaker:Security. The domain is Nomo dot security. So it's Oh, okay. Sorry about
Speaker:that. No. No. We're good. So, so, yeah,
Speaker:what we do is, like, secure the entire data in the AI life cycle.
Speaker:Basically means that we truly, like, cover it end to end. Like, we enable, like,
Speaker:the data teams and the machine learning and the AI teams, to continue and
Speaker:innovate while we are securing them without
Speaker:slowing down. And this is like the the like, we are built from, like,
Speaker:data practitioner, like, the company. So this is, like, our main focus,
Speaker:meaning that we start, like, from the building phase. So if we
Speaker:said, like, notebooks and hugging face models and all these different stuff and the
Speaker:misconfigurations are on all the different stack and all the envelopes
Speaker:tools and AI platforms and data pipelines and stuff like that. So we are
Speaker:connected seamlessly on the background, and,
Speaker:basically assist the the data teams to to work securely,
Speaker:without changing changing anything in the workflows.
Speaker:And then also, like, we provide, as you said, the red teaming.
Speaker:Before you're deploying the model into production, you want to
Speaker:understand what is the level of, of
Speaker:robustness and security that the like, that your model has. And
Speaker:what we do is we had, like, a big research team that,
Speaker:like, builds, simulated, thousands of different
Speaker:attacks. And then we dynamically start to run all these attacks against
Speaker:your models, showing you exactly, like, what kind of, like, tactics
Speaker:and techniques your model is vulnerable to, and exactly
Speaker:also how to mitigate and improve it to be more
Speaker:robust. And then the 3rd part is also the runtime.
Speaker:We are mapping, we're scanning all the prompts and all the
Speaker:responses in real time, making sure that you don't
Speaker:have any risk on both sides. The security, we are detecting all these
Speaker:different kind of, like, a host and a little, like, adversarial tax prompt
Speaker:injection, model jailbreak, etcetera. We check also the responses for
Speaker:sensitive data leakage and stuff like that. But in addition, also the
Speaker:safety. We see a lot of organizations that the data scientists, as we
Speaker:said, they understand the risk of deploying
Speaker:models into production. And this is why not even, like, the security, but more like
Speaker:the the Chevy example and, like, the the health advice and stuff like that.
Speaker:So they built for their own, model
Speaker:guardrails in order to make sure that they are, like, controlling what
Speaker:are, like, the topics that the model is be able like, is allowed or
Speaker:disallowed to communicate about. And what we do is basically to save
Speaker:them also like this time. We also provide them, like, all this
Speaker:runtime protection already, like, as a service. You can define exactly what kind
Speaker:of, like, detectors and in native language, what kind of, like, policies you want
Speaker:to make sure that are enforced. And then we also, like, protect it in the
Speaker:run time. So, basically, we just, like, cover you, like, end to end, start from
Speaker:the building and up to the run time. It starts from the classic data engineering
Speaker:pipelines and machine learning and up to gen AI. Interesting. Interesting.
Speaker:It sounds like something I think is totally, I think, a
Speaker:needed needed service and and skill
Speaker:set. Because you're right. Like, I mean, there's just so many risks
Speaker:here, and the hype around Gen
Speaker:AI is so over the top.
Speaker:It is gonna be revolutionary, but
Speaker:maybe not in the way you think. Right? And I always call back to the
Speaker:early days of the dotcom. Right? Where it was pets.com. There was,
Speaker:you know, this.com, that, you know, like all these crazy things. But the
Speaker:real quote unquote winner of, you know,
Speaker:.com was some guy in Seattle selling books.
Speaker:Mhmm. Right? No one no one I mean, selling books. Like, really?
Speaker:Like, not, you know, and it's
Speaker:it's interesting to see how I think I
Speaker:think that the the obvious use case for chat for for
Speaker:LLMs thus far has been chatbots. Right? Customer
Speaker:service type things. I think that's really only the
Speaker:the the the the the surface of it. I think for me, what
Speaker:I've seen is most impactful is the ability for natural language
Speaker:understanding and their ability to understand what's happening in a in
Speaker:a block of text. And I think
Speaker:that that has enormous potential. I
Speaker:agree. A lot of risks too. Right? Because what if, you know, what if
Speaker:I I mean, to your point. Right? You wanna make sure these things stay on
Speaker:topic. Right? Like, I don't if I'm talking to a
Speaker:financial services chatbot and I say, hey, I have
Speaker:my my leg kinda hurts. Right?
Speaker:It's, you know, the risk of moving into health care, like, it's just kind
Speaker:of, I don't how mature are those guardrails? Because I've
Speaker:not really seen a good implementation of
Speaker:it yet. Yeah. So, you know, like, I
Speaker:don't want to to give ourself, like, a compliment, but,
Speaker:we Oh, you guys are pretty good at it? Yeah. Like, we're pretty good. Like,
Speaker:we were, like, you know, like, with fortune 5 100, with fortune 1 100.
Speaker:Not in vain. But, yeah, I believe that in general, specifically, like, when we speak
Speaker:more, like, on the guardrail side, I see that the most important thing is
Speaker:to make sure that it's, it's building the
Speaker:right architecture to be very flexible and easily
Speaker:configure for the organization because eventually, like, you know, like, each
Speaker:organization is completely different needs, completely different
Speaker:context to the calls, like, in their customers, internally to their employees.
Speaker:So everything should should be, like, very easily configured, but very flexible.
Speaker:Interesting. Interesting. I wanna I I could talk for
Speaker:another hour or 2 with you because this is this is a fascinating space.
Speaker:Where can folks find out more about Noma and you? I you think it's Noma
Speaker:dot security? Yeah. Noma dot security. Can't believe that's now
Speaker:a top load pain, but,
Speaker:and, any any, NOMA dot
Speaker:security, you're on LinkedIn, and, anything
Speaker:else you you'd like the folks to find out more?
Speaker:No. I had, like, a great time speaking with you, Frank. Great.
Speaker:Likewise. And for the listeners out there, if you're a little bit
Speaker:scared and a little bit paranoid about generative AI and LLMs,
Speaker:then I think we had a good conversation. Because I think we need a little
Speaker:bit of that fear in the back of our heads to guide us and
Speaker:maybe think about security issues. A
Speaker:little bit of thought ahead of time will probably save you a lot of problems
Speaker:later. And want to lose some. That's
Speaker:that's all I got, and we'll let the nice British AI,
Speaker:Bailey finish the show. Well, that wraps up another
Speaker:eye opening episode of data driven. A big thank you to Niamh
Speaker:Braun for sharing his expertise on the critical intersection of AI,
Speaker:security, and innovation. If today's conversation didn't make
Speaker:you double check your data pipelines or rethink your Hugging Face
Speaker:downloads, well, you're braver than I am. As always,
Speaker:I'm Bailey, your semi sentient MC, reminding you that while
Speaker:AI might be clever, it's never too clever for a security breach.
Speaker:Until next time, stay curious, stay secure, and
Speaker:stay data driven. Cheerio.