Are you confident in the origins of your AI training data?
In this episode, we unravel the complexities of generative AI with IP attorney Joy Butler, delving into data provenance, liability, and the transformative role of diversity in AI.
We discuss the importance of verifying your AI training data to avoid copyright issues, the need to keep confidential information out of AI prompts and to comply with contractual obligations, and how AI companies are offering indemnification and adopting licensing models to provide legal assurances and mitigate liabilities.
Tune in to stay ahead in the fast-evolving AI domain!
🔍 Three Key Takeaways:
Resources Mentioned:
More About Our Guest:
Joy R. Butler helps companies and investors craft innovative business models, mitigate risks, and devise strategies for ventures into new markets, technologies, and product lines in the entertainment and digital technology industries. She is a graduate of Harvard College and Harvard Law School.
Connect with Joy Butler:
Charity Mentioned: https://girlswhocode.com/
Connect with Erin to learn how to use intellectual property to increase your income and impact. hourlytoexit.com/podcast.
Erin's LinkedIn Page: https://www.linkedin.com/in/erinaustin/
Hourly to Exit is Sponsored By:
This week’s episode of Hourly to Exit is sponsored by the NDA Navigator. Non-disclosure agreements (NDAs) are the bedrock of protecting your business's confidential information. However, facing a constant stream of NDAs can be overwhelming, especially when time and budget constraints prevent you from seeking full legal review. That's where the NDA Navigator comes to your rescue. Designed specifically for entrepreneurs, consultants, and business owners with corporate clients, the NDA Navigator is your guide to understanding, negotiating, and implementing NDAs. Empower yourself with legal insights and practical tools when you don’t have the time or funds to invest in a full legal review. Get 20% off by using the coupon code “H2E”. You can find it at www.protectyourexpertise.com.
Think Beyond IP YouTube Page: https://www.youtube.com/channel/UCVztXnDYnZ83oIb-EGX9IGA/videos
Music credit: Yes She Can by Tiny Music
A Team Dklutr production
Hello, ladies.
Speaker:Welcome to this week's episode of the Hourly to Exit podcast.
Speaker:I have a very special guest today.
Speaker:My law school classmate, Joy Butler.
Speaker:Joy, welcome.
Speaker:And thank you so much for joining us.
Speaker:Thank you, Erin.
Speaker:I am honored to have been asked to be a guest.
Speaker:Well, we're very excited to have you because AI could not be more
Speaker:top of mind for this audience.
Speaker:And so, as someone who has written extensively and spoken
Speaker:about AI, I definitely wanted to have you on to go deep.
Speaker:So before we get started, would you introduce yourself to the audience?
Speaker:Sure.
Speaker:So, as you already shared, I am an attorney, and in my law firm practice
Speaker:I provide product counsel services.
Speaker:That essentially means I provide a combination of strategic and legal advice
Speaker:to companies that are going into new lines of business, launching new
Speaker:products or new features of existing products, or forming strategic partnerships.
Speaker:And I come by that from two areas of law where I have deep, in-depth
Speaker:knowledge. That includes the technology side, where I have worked on
Speaker:and helped to structure probably, literally, over 1,000 contracts over the course
Speaker:of my career: all the contracts one would need when one is doing business
Speaker:online and in digital technology, including end user license arrangements
Speaker:and terms and conditions. The other prong of my in-depth legal
Speaker:knowledge concerns entertainment and copyright, and this is where
Speaker:you and I overlap quite a bit.
Speaker:So I work on a lot of creative content contracts, advise companies on
Speaker:protecting their copyrights and trademarks, and work with companies
Speaker:that want to use someone else's content, doing a lot of work
Speaker:in the rights clearance area.
Speaker:Just to give your audience a little more of a flavor of the
Speaker:types of projects I might work on, most of them are in the digital
Speaker:technology and entertainment space.
Speaker:So a couple of projects include helping an entertainment social media
Speaker:network launch, and working with an e-commerce retail site that was incorporating a lot
Speaker:of album cover work and original artwork.
Speaker:Another was an ad-supported stock simulation game.
Speaker:And here's something that may resonate with your audience: helping
Speaker:a professional in the finance area take this niche financial service
Speaker:he was offering and convert it into an online software-as-a-service
Speaker:product.
Speaker:So that is me in a nutshell.
Speaker:Awesome.
Speaker:When did you first, well, I think I can even tell you the day I first heard about AI.
Speaker:Where were you when you first heard about it?
Speaker:What was the context and what were your initial thoughts?
Speaker:I don't remember the first time I heard about ChatGPT.
Speaker:Right.
Speaker:That may be what you're referring to.
Speaker:Yeah.
Speaker:But actually, within my practice, I have for quite some time been experimenting
Speaker:with trying to take some of my knowledge and develop it into digital tools,
Speaker:making it more accessible to people.
Speaker:As you know, I've written a couple of books on my areas of in-depth knowledge.
Speaker:So one of the things I've been experimenting with is taking some of that
Speaker:knowledge and offering it in a digital format. One experiment I believe I shared
Speaker:with you was a contest and promotion tool, which asked a number of questions
Speaker:and then gave you kind of a checklist
Speaker:of the legal questions you might ask before going forward with that.
Speaker:And I've actually used a tool that a lot of attorneys and,
Speaker:well, it is an interview construction tool targeted to the legal space.
Speaker:It's called Docassemble.
Speaker:It's actually open source, and I spent a little bit of time
Speaker:tinkering around with that. That is a long way to answer your question.
Speaker:I was familiar with automation and artificial intelligence
Speaker:through that process.
Speaker:But when ChatGPT came to my attention, that may have been around the same time
Speaker:as it came to everyone else's attention.
Speaker:I kept hearing about it and
Speaker:right.
Speaker:Right.
Speaker:I guess I'd heard about it, but it was just noise to me, kind
Speaker:of like blockchain or crypto.
Speaker:I don't need to know that.
Speaker:I don't want to know it. Until finally I could no longer
Speaker:ignore it, which was during
Speaker:Yeah.
Speaker:an MCLE where I needed to get some credits so I wasn't delinquent.
Speaker:And so I'm listening to this one about AI, and they were talking
Speaker:about ChatGPT in particular, and they're describing what you could do.
Speaker:And they have these samples and I'm like, it can do
Speaker:what?
Speaker:And so while I'm still in there, you know, it was just online.
Speaker:I'm silly.
Speaker:God forbid I go someplace in person.
Speaker:and then I'm on my computer, like.
Speaker:Doing stuff with it.
Speaker:I'm like, Oh my God, this is bad.
Speaker:And that was, well, it was February 2023, and that was my initiation.
Speaker:So the last year has been, actually, a fire hose of information
Speaker:and changes in that time.
Speaker:So I think ChatGPT, it's the AOL of our times.
Speaker:It's this technology that's been around for a while, but we finally have this
Speaker:application that has made it much more accessible and user-friendly
Speaker:for a much wider group of people.
Speaker:Yeah.
Speaker:I mean, I guess, you know, when you think about it, artificial intelligence has
Speaker:been around a while. I mean, obviously we've always had autocorrect and things
Speaker:like that, and all those things were artificial intelligence, right?
Speaker:Things like Alexa and Siri, right?
Speaker:I mean, those are versions of it.
Speaker:We just didn't think of it the way that we think of AI now.
Speaker:Exactly.
Speaker:It's been around for a while.
Speaker:We just finally got a killer app in ChatGPT.
Speaker:Right.
Speaker:Awesome.
Speaker:So a lot of questions that I get are around: where's this data coming from?
Speaker:What is the black box of generative AI, in particular, that we're talking about?
Speaker:And what do I need to worry about?
Speaker:Are they taking my prompts, and what are they doing with them?
Speaker:A client who is signing an agreement to utilize a contract
Speaker:review AI: what are the issues regarding using one of those?
Speaker:So everybody has questions about what happens when I use AI and
Speaker:what do I need to worry about?
Speaker:And where does that data come from, and what is my exposure?
Speaker:So I would just like to start from the top.
Speaker:I think most of the audience is familiar with AI, but let's talk
Speaker:about what training data is.
Speaker:Like, where does it get its information from?
Speaker:How does it get in there?
Speaker:And, yeah, just start there with a general.
Speaker:Yeah.
Speaker:When we talk about AI models and some of the copyright and licensing
Speaker:issues, there are kind of two categories.
Speaker:The first category is the input,
Speaker:and the second category is the output,
Speaker:when we're talking about generative AI. So when you mention training
Speaker:material, you're talking about the first category, the input.
Speaker:And there has been a lot of controversy over whether or not the training
Speaker:material that is required to train these models can be used without permission.
Speaker:Because what the foundation models do (and when I say foundation models, I mean
Speaker:the maybe eight or ten models around) is literally take millions of pieces
Speaker:of content into their kind of black box and analyze them so that the model
Speaker:can be a general-use large language model.
Speaker:What many of these models do is source that data by getting data from
Speaker:anywhere that they can, including scraping the Internet for millions
Speaker:and millions of pieces of data.
Speaker:So there's been, as I said, a lot of controversy around
Speaker:whether or not permission is required for them to do that.
Speaker:And what many of these models are relying on now is an argument that their use of that
Speaker:material as training material qualifies as a fair use under the Copyright Act.
Speaker:I believe there are a number of factors that will gradually
Speaker:push these AI foundation models towards licensing that material.
Speaker:One of them is that there have been a number of lawsuits filed
Speaker:against them, charging them with copyright infringement and other related infractions
Speaker:over their use of this material.
Speaker:And
Speaker:while all of those suits are still pending, and they may
Speaker:take a very long time to play out,
Speaker:I think we're going to see progress towards more licensing prior to that.
Speaker:And that's because people are very anxious
Speaker:to use generative AI.
Speaker:But before they use it, they want some comfort level that
Speaker:their use of that material is not going to subject them to any type of
Speaker:copyright infringement or other claim.
Speaker:So in order to make their customers comfortable with the fact that they
Speaker:can use this material without taking on any legal liability, we are seeing more
Speaker:and more of these AI companies gradually move towards licensing the content.
Speaker:I want to follow up on that, but first I'm going to step back just a second,
Speaker:because you said large language models, and then we have machine
Speaker:learning and we have generative AI.
Speaker:Are those synonyms?
Speaker:Or are they all different elements?
Speaker:Okay, I'm
Speaker:not the expert here, but I'll share with you my understanding.
Speaker:So, the large language models are the general models that can produce
Speaker:the generated output. That means they take all of the input,
Speaker:all of that training material, and they basically analyze it to see
Speaker:what the relationship of each data point is to the other data points.
Speaker:So when you ask it to produce something, it is estimating, or
Speaker:putting forth, its analysis of what word should come next, or what
Speaker:should come next in this particular graph, which is why it needs so much
Speaker:training material from which to learn.
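The "what word should come next" idea described above can be illustrated with a toy sketch. This is not how ChatGPT or any production large language model is built (those are neural networks trained on vastly more material). It is a hypothetical word-pair counter, with invented function names and an invented training sentence, that only shows what estimating the next word from training material can mean.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical training material, invented for this illustration.
TRAINING_TEXT = "the cat sat on the mat and the cat ate the fish"
MODEL = train_bigrams(TRAINING_TEXT)
```

Calling `predict_next(MODEL, "the")` picks the word that most often followed "the" in the training sentence. With so little training material the guesses are crude, which is one way to see why real models need so much of it.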
Speaker:Got it.
Speaker:Okay.
Speaker:Now, going back to what you mentioned about where it's going, towards licensing, because users
Speaker:of AI want to know that they're not going to get sued when they use the output:
Speaker:what does that mean for all of the current data that has been scraped
Speaker:from the Internet and all these places
Speaker:previously? I mean, aren't the data sets
Speaker:and our use of AI, as is, almost like too big to fail?
Speaker:What could happen with these lawsuits that are happening right now, if there
Speaker:are billions of pieces of, let's say, pirated information in, say, the ChatGPT,
Speaker:the OpenAI, training data set?
Speaker:What could the possible remedy be if they lose?
Speaker:Okay, so I do want to separate this into two categories again, because
Speaker:when we talk about infringement, there are two separate questions.
Speaker:The first question is whether or not
Speaker:the process of the AI companies taking in data as training material and using
Speaker:it to train their model is copyright infringement.
Speaker:That's one question.
Speaker:And then the second question is, if you as a user of these models
Speaker:use them to produce generated content,
Speaker:is there any legal liability for you?
Speaker:Now, there are circumstances that could be imagined where it's
Speaker:possible for the model's use of training data to be considered a fair use,
Speaker:but maybe the way you've used it in creating output is infringing
Speaker:or violating in some way.
Speaker:I'm not saying that scenario has actually come up or may come up
Speaker:often, but one can imagine a set of circumstances where that might be true.
Speaker:So, back to your original question, where is all this going?
Speaker:What are the potential remedies?
Speaker:Well, one remedy with respect to these lawsuits is that they will settle with
Speaker:a lot of these companies, because the companies that have sued them have been
Speaker:the largest companies with the most resources, and very large organizations
Speaker:like the Authors Guild. So they may settle, come to some agreement
Speaker:on what a settlement fee should be.
Speaker:And it's also possible that part of their settlement might be a
Speaker:licensing agreement going forward.
Speaker:That resolves matters for the large organizations that have sued and
Speaker:the large private companies. If it's an organization or association representing
Speaker:much smaller players, it remains to be seen how much might flow to them
Speaker:as part of any judicial settlement.
Speaker:It may be that, as opposed to a private settlement, we might get
Speaker:some sort of a judicial settlement.
Speaker:I think
Speaker:it's perhaps less likely, but it might be one of the outcomes, and that
Speaker:might be a settlement like something that was proposed in Google Books.
Speaker:Now, the Google Books lawsuit, if anyone remembers this lawsuit from 2015:
Speaker:this is the lawsuit that came out of Google Books starting its program
Speaker:where it digitized millions of books and used them, and still uses
Speaker:them today, to give a snippet of books in response to our search.
Speaker:So that is one of the cases on which a lot of these AI model companies
Speaker:rely when they argue that their use of the training material is a fair use.
Speaker:For those who remember, the parties in the Google Books case initially tried to resolve
Speaker:it via a judicial settlement agreement that would have permitted
Speaker:the snippets of those books and allowed the digitization. But that judicial
Speaker:settlement, or the private settlement that was proposed, went
Speaker:very much beyond just providing snippets, which is one of the reasons
Speaker:that it was ultimately not approved by the court. The case kept going on, and the court
Speaker:ultimately said, okay, well, we're stripping out all these things
Speaker:you tried to do in the settlement,
Speaker:but as consolation, we've decided, Google Books, that your use is a fair use.
Speaker:So it might be that some of the parties try to move in that direction of some
Speaker:type of a settlement that encompasses both small and larger players.
Speaker:Some of the other types of resolutions that have been thrown
Speaker:out include kind of a collective
Speaker:that would be parallel to the way we collect public performance
Speaker:royalties in the music industry.
Speaker:So, for example, when a song is performed on the radio, songwriters
Speaker:receive some income from the songs they've written being played.
Speaker:Well,
Speaker:the radio station is not going out and entering into license agreements
Speaker:with the millions of songwriters.
Speaker:They have collectives, in the case of music ASCAP and BMI and a couple of others,
Speaker:that have these collective agreements where they issue blanket licenses.
Speaker:So something like that has been proposed, potentially, for
Speaker:the training material space.
Speaker:So that brings in both rights owners with very large catalogs and rights
Speaker:owners with very small catalogs.
Speaker:The Copyright Office had a comment period where it asked a bunch of industry
Speaker:players what they thought of this, and most of the people who commented were
Speaker:very much in favor of direct licensing, or perhaps even aggregated licensing.
Speaker:Now, that may be in part because that's where larger companies are going to
Speaker:get kind of the premium licensing,
Speaker:because the direct licenses we've seen to date have been between
Speaker:AI model companies and very large organizations, for millions of dollars.
Speaker:Just like what we're talking about in the collective licensing example,
Speaker:those very large companies are not going to enter into license agreements
Speaker:with millions of small players.
Speaker:There'll be some balancing where they, too, can participate.
Speaker:One potential example of how this might be alleviated is through aggregators.
Speaker:So, one aggregator we have right now is the Copyright Clearance Center,
Speaker:which is aggregating scientific papers for use in training material, and
Speaker:that allows smaller rights owners to participate in having their material
Speaker:used as training material, and being paid for it,
Speaker:if that's what they choose to do.
Speaker:An example in this space I've seen come forward as a startup is called Dappier,
Speaker:and that is a startup that is dedicated to getting those smaller rights owners,
Speaker:giving them the opportunity to participate in being a part of training material,
Speaker:and making that training material more accessible to both the large AI models
Speaker:and, you know, the smaller AI companies that might have fewer resources and not
Speaker:be as able to compete when, you know, license agreements are going
Speaker:for millions and millions of dollars.
Speaker:Yeah.
Speaker:Yeah.
Speaker:I mean, it sounds like this would all have to be prospective.
Speaker:I mean, if the AI companies have been scraping the Internet for we
Speaker:don't know how long, is it even possible to distinguish one piece of data
Speaker:in the data set from another? I don't know, like, how would you compensate for all
Speaker:the information that's already in there,
Speaker:in order to parcel out payments, whatever fraction of a penny that I might
Speaker:get for something? And going forward,
Speaker:if you are a small content creator, kind of your everyday content creator,
Speaker:like the audience here, it would then be on you to make sure that
Speaker:your content is registered somewhere.
Speaker:So you'd be part of some aggregator that has a license
Speaker:who is getting paid by the AI, AI
Speaker:Okay.
Speaker:So, several issues in that question.
Speaker:Okay.
Speaker:Let's go with that first part, where you talk about kind of the provenance:
Speaker:what was the source of the data?
Speaker:Is it even traceable?
Speaker:And this is one of the pain points.
Speaker:And this is also where that analysis comes in about whether your output subjects
Speaker:you to any type of liability.
Speaker:So, back up for a second.
Speaker:If you are any type of content creator, and you are trying to determine
Speaker:whether or not the content you've created is violating any rights,
Speaker:you need to know its source, right?
Speaker:So for the output that they have, if that provenance is not available,
Speaker:you, using the generative AI, can't even do that analysis.
Speaker:So that's part of the pressure on the AI model companies:
Speaker:not just waiting for these lawsuits to play out, but making their
Speaker:potential customers comfortable that you can use our AI models and they can
Speaker:produce output that you can then use.
Speaker:And so part of having to do that is knowing the provenance.
Speaker:Now, the extent to which they currently do that,
Speaker:I don't know.
Speaker:AI model companies have often
Speaker:been quite opaque about how the sausage is being made.
Speaker:On the outside, though, again as another example of where the industry
Speaker:is going, there has cropped up another kind of startup in this space, called
Speaker:Fairly Trained, which is offering certification for AI model companies
Speaker:that have produced their models relying solely on an authorized data set.
Speaker:And then, you know, the theory is, if you are a company that wants to leverage AI,
Speaker:you can get more comfort in knowing that you're relying on an AI model, an
Speaker:AI company, that is Fairly Trained certified.
Speaker:And the last time I checked, there were only a few dozen companies that had
Speaker:that certification, but that may grow.
Speaker:So, maybe enterprise users would go for the Fairly Trained type,
Speaker:because they're much more concerned, frankly, than most everyday users
Speaker:about the quality of that output.
Speaker:It seems like if they're using it to create public-facing
Speaker:materials, they would want that Fairly Trained data set behind it.
Speaker:Do they also give reps and warranties, when you go through them,
Speaker:regarding the quality of the output?
Speaker:Would Fairly Trained,
Speaker:or someone who has licensed their
Speaker:data from Fairly Trained, then have representations
Speaker:and warranties in their terms of use?
Speaker:Fairly Trained doesn't license data.
Speaker:Fairly Trained is a certification program.
Speaker:So if an AI company wants this certification to show everyone that
Speaker:they have relied on an authorized data set, then this is a
Speaker:certification that they can apply for.
Speaker:Got it.
Speaker:Okay.
Speaker:Because I believe that there are some platforms that do provide
Speaker:indemnification, although they have a bunch of provisos. Where's that going,
Speaker:so that users feel more
Speaker:comfort there, right?
Speaker:So, I mean, I think that's part of their responding to this pain point
Speaker:of needing to make their customers more comfortable with using their product.
Speaker:They are providing certain indemnifications. It remains to be seen
Speaker:how effective those indemnifications would be if a customer were actually sued.
Speaker:And as you mentioned, they do have a lot of exclusions. Personally, I think that is
Speaker:just sort of an intermediate stopgap, and they are going to be pushed more towards
Speaker:more licensing of their data sets.
Speaker:And I would say, while we wait for this to play out, I mean, as you
Speaker:know, these lawsuits could take, and probably will take, a very long time.
Speaker:The Google Books case, for example, on which the AI companies are
Speaker:relying, took 10 years before finally reaching that conclusion that Google
Speaker:Books digitization was a fair use.
Speaker:So I would say, in the interim, AI companies and those
Speaker:producing AI models should look more to using authorized data sets for
Speaker:construction of their models, and authorized data sets with a traceable
Speaker:provenance, so that their customers, when using the output or wanting
Speaker:to put the output into use, can
Speaker:know what the source is and do that analysis of: is this
Speaker:violating any copyright?
Speaker:Is this violating any right of publicity or trademark or anything else?
Speaker:I would say, for the companies that want to leverage AI, when you're looking for
Speaker:partners, you do want to look at partners who are using authorized data sets.
Speaker:Right now, what I see is that a lot of companies, brands, and
Speaker:companies in the film and television industry are actually leveraging AI.
Speaker:It is being leveraged, but they're using it for a first draft or a proof
Speaker:of concept, for things that are iterative and need to be turned around
Speaker:very quickly. They're not using it
Speaker:as part of the final consumer-facing output, due to those copyright
Speaker:reasons: both the reasons we just discussed, fear of having any type
Speaker:of legal liability, and also because there are limitations on the degree to
Speaker:which you can protect output that's generated by artificial intelligence.
Speaker:Right, so when they have an
Speaker:authorized data set,
Speaker:does the output come with footnotes? What does that look like?
Speaker:Do you know?
Speaker:Have you seen what that looks like? Does it
Speaker:tell us what the sources are?
Speaker:Like, does it identify them?
Speaker:Yeah.
Speaker:Oh, oh, oh, I see.
Speaker:When it is authorized.
Speaker:Right.
Speaker:To my knowledge, it is not coming with anything.
Speaker:And you're talking about the Fairly Trained component, right?
Speaker:Yeah, it is not coming with anything,
Speaker:but that does need to be a path toward which we're traveling.
Speaker:And there has been a lot of conversation about that in this space, that
Speaker:it needs to be, you know, marked, needs to be traced in terms of: what was the source?
Speaker:What did you rely on to do that?
Speaker:Yeah.
Speaker:As far as the magic that happens inside of a generative AI
Speaker:platform, do we know what that is?
Speaker:Or is that kind of the trade secret of each company?
Speaker:Or is there general technology that makes the magic happen?
Speaker:I am not the expert on the technology inside of the AI models.
Speaker:I can share what I know.
Speaker:In part, it does depend on the approach that they've used, whether it's supervised
Speaker:learning or unsupervised learning,
Speaker:which, to make it very simple, depends on how much you assisted the machine.
Speaker:Like, did you mark things and tell it, this is a dog and this is a cat?
Speaker:Or did you just give it kind of millions of pictures and kind of let
Speaker:it figure it out? When you let it figure it out, when it's unsupervised,
Speaker:it is more of a black box in terms of how it got to that answer,
Speaker:which brings up all sorts of other
Speaker:societal issues, right?
Speaker:I think it's a time to play it about
Speaker:a, uh, interesting.
Speaker:Can we wrap up with some best practices, just for your everyday kind of ChatGPT or,
Speaker:what is the Google one, Gemini,
Speaker:user, when they're using it? For this audience, the expertise-based
Speaker:business, maybe they're using it to create first drafts or to help them with
Speaker:social media posts or something like that. Like, just some general best practices.
Speaker:Sure. You want to be circumspect about any confidential or proprietary
Speaker:information you include in a prompt. You may want to anonymize it.
Speaker:You need to keep in mind that whatever output you get from the
Speaker:AI model may not be eligible for copyright protection. If this is
Speaker:something, material or output, that you are passing on to a client or to a
Speaker:customer, you may need to disclose that use. And you have to consider
Speaker:the extent to which you're using the output: are
Speaker:you using it just for a little bit of assistance in modifying a few sentences?
Speaker:Or are you actually producing images with it, or producing an entire report with it?
Speaker:You're going to want to make sure that your procedures for using generative
Speaker:AI are consistent with the contract that you have with your customer.
Speaker:If you want to know whether or not your material, your prompts, are being
Speaker:incorporated into the training data and being used to further train that AI model,
Speaker:take a look at the terms and conditions.
Speaker:To give you an example, for ChatGPT, if you're using the free model
Speaker:and it is recording the history
Speaker:of your prompts, then the prompts that you put in
Speaker:there are subject to being included as part of future training data.
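The advice above about anonymizing confidential details before they go into a prompt can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a complete safeguard: the function name `anonymize_prompt`, the two patterns, and the placeholder labels are all hypothetical, and real confidential information takes many forms that simple patterns will miss.

```python
import re

# Hypothetical patterns, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def anonymize_prompt(prompt, confidential_terms=()):
    """Replace emails, US-style phone numbers, and listed terms with placeholders."""
    cleaned = EMAIL.sub("[EMAIL]", prompt)
    cleaned = PHONE.sub("[PHONE]", cleaned)
    for i, term in enumerate(confidential_terms, start=1):
        if not term:
            continue  # skip empty strings, which would match everywhere
        cleaned = re.sub(re.escape(term), f"[PARTY_{i}]", cleaned, flags=re.IGNORECASE)
    return cleaned
```

A user would pass the draft prompt plus a list of client or project names to mask, review the cleaned text by eye, and only then paste it into the AI tool.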
Speaker:Yeah.
Speaker:So, if along the left-hand side there you can scroll
Speaker:through and see all your chats, like
Speaker:I have,
Speaker:that means it is going into the training data.
Speaker:That is excellent.
Speaker:I can't say with certainty that it is going into the training data.
Speaker:But I would say it is susceptible to being used.
Speaker:It's like they have not provided you, ChatGPT has not provided
Speaker:you, any representation that they will not use it for training.
Speaker:Right.
Speaker:Very good.
Speaker:Thank you for making that distinction.
Speaker:Thank you for this.
Speaker:This podcast is to help create a society and an economy
Speaker:that works for more of us.
Speaker:So I love to ask my guests if there is an organization or a person who is doing the
Speaker:good and hard work to help make an economy that works for more of us. Is there
Speaker:one that you'd like to share with us?
Speaker:Sure. I really like organizations whose mission it is to
Speaker:bridge the digital divide.
Speaker:And one of my favorites is Girls Who Code, which has as part of its
Speaker:mission introducing more women
Speaker:into the technology field. And that's very apropos to our conversation
Speaker:today, because as part of making AI, you know, beneficial for all humankind, we
Speaker:really do need a diverse perspective.
Speaker:Yeah, I mean, we know that just when we talk about the training data sets, like
Speaker:what data is going in there, obviously
Speaker:the output is only as diverse as the input, right,
Speaker:and how it's being trained.
Speaker:And I know that has come up in a number of controversial ways as well,
Speaker:whether something's leaning this way or that way, but we definitely want to make
Speaker:sure everyone has a voice in the future.
Speaker:Thank you for that one.
Speaker:And we will put that in the show notes along with how people can reach you.
Speaker:Where do you hang out, Joy?
Speaker:And how can people get in touch with you to find out more?
Speaker:Sure. So I'm always happy to chat with people doing innovative
Speaker:things with technology, especially in the digital technology, online, and
Speaker:entertainment space. They can find me through my website, which is
Speaker:www.joybutler.com.
Speaker:And I'm also
Speaker:on LinkedIn as Joy Butler.
Speaker:Awesome.
Speaker:Well, thank you so much.
Speaker:And, yes, everyone, please follow Joy and let us know if you have
Speaker:any other questions about AI.
Speaker:I know it's constantly evolving.
Speaker:There's always going to be something new and we can continue
Speaker:this conversation in the future.
Speaker:Thanks again, Joy.
Speaker:Thank you.