Sunday, October 22, 2023

WSJ Podcast interview with OpenAI creators/Wash Post Op-Ed "AI learning from stolen IP"

"A Conversation with OpenAI’s Sam Altman and Mira Murati

Two of the creators of ChatGPT discuss job disruption, data and the ‘person-ness’ of AI chatbots with WSJ’s Joanna Stern"

This 20-minute podcast interview has some startling information about possible futures for AI programs like ChatGPT.

And even scarier prospects about the directions other companies developing these technologies might be willing to take.

https://www.wsj.com/podcasts/the-journal/a-conversation-with-openais-sam-altman-and-mira-murati/7c89e85f-9d7e-4569-b67d-6a777374eada

Excerpts from the episode transcript, available at the link above --

"Kate Linebaugh: Another problem OpenAI has faced are lawsuits from writers like George R.R. Martin and John Grisham to the comedian Sarah Silverman. They're alleging copyright infringement because their copyrighted work was used to train the company's AI models. One lawsuit calls ChatGPT, "systematic theft on a mass scale." OpenAI has said it trains its AI models on publicly available information. The company has also said that it respects the rights of creators and authors and that many creative professionals use ChatGPT.

Joanna Stern: Sam, I'll ask you about the data, the training data. Obviously there's been maybe some people in this audience who may not be thrilled about some of the data that you guys have used to train some of your models. Not too far from here in Hollywood, people have not been thrilled, publishers. When you're considering now as you're walking through and going to work towards these next models, what are the conversations you're having around the data?

Sam Altman: So we obviously only want to use data that people are excited about us using. We want the model of this new world to work for everyone. And we want to find ways to make people say like, "You know what? I see why this is great. I see why this going to be a new." It may be a new way that we think about some of these issues around data ownership and how economic flows work, but we want to get to something that everybody feels really excited about. But one of the challenges has been people, different kinds of data owners have very different pictures. So we're just experimenting with a lot of things. We're doing partnerships of different shapes. And we think, like with any new field, we'll find something that sort of just becomes a new standard. Also, I think as these models get smarter and more capable, we will need less training data. So I think there's this view right now, which is that models are going to have to train on every word humanity has ever produced or whatever. And technically speaking, I don't think that's what's going to be the long-term path here. We have existential proof with humans that that's not the only way to become intelligent. And so I think the conversation gets a little bit led astray by this because what really will matter in the future is particularly valuable data. People trust The Wall Street Journal and they want to see content from them. And the Wall Street Journal wants that too. And we find new models to make that work. But I think the conversation about data and the shape of all of this, because of the technological progress we're making, it's about to shift." ......

.......Joanna Stern: That's a big responsibility though. And you guys will be in sort of control of people's friends. Maybe it gets to being people's lovers. How do you guys think about that control?

Sam Altman: First of all, I think we're not going to be the only player here. There's going to be many people. So we get to put our nudge on the trajectory of this technological development. And we've got some opinions, but, A, we really think that the decisions belong to humanity, society as a whole, whatever you want to call it. And, B, we will be one of many actors building sophisticated systems here. So it's going to be a society-wide discussion, and there's going to be all the normal forces. There'll be competing products that offer different things. There will be different kind of societal embraces and pushbacks. There'll be regulatory stuff. It's going to be like the same complicated mess that any new technological birthing process goes through. And then we pretty soon we'll turn around and we'll all feel like we had smart AI in our lives forever. And that's the way of progress and I think that's awesome. I personally have deep misgivings about this vision of the future where everyone is super close to AI friends, more so than human friends or whatever. I personally don't want that. I accept that other people are going to want that. And some people are going to build that and if that's what the world wants and what we decide makes sense, we're going to get that. I personally think that personalization is great, personality is great, but it's important that it's not like person-ness and at least that when you're talking to an AI and when you're not. We named it ChatGPT and not, it's a long story behind that, but we named it ChatGPT and not a person's name very intentionally. And we do a bunch of subtle things in the way you use it to make it clear that you're not talking to a person. And I think what's going to happen is that in the same way that people have a lot of relationships with people, they're going to keep doing that. And then there'll also be these AIs in the world, but you kind of know they're just a different thing.

Kate Linebaugh: After the break, how to prevent the AI apocalypse...."

------------------------------------------

And this Washington Post Op-Ed by William D. Cohan

Opinions | AI is learning from stolen intellectual property. It needs to stop.

"Tech companies are getting even richer by vacuuming up the work of writers without permission." https://www.washingtonpost.com/opinions/2023/10/19/ai-large-language-writers-stealing/

Excerpt: “Some of us are starting to fight back. More should. One class-action lawsuit has been filed by authors Richard Kadrey, Sarah Silverman and Christopher Golden in federal court in California against Meta — what we used to call Facebook — seeking both an injunction against continuing to use the writers’ copywritten material and financial damages. The authors argue that to create Meta’s large language models, LLMs for short, which form the basis of Meta’s AI offerings, the LLMs are “trained” by copying text and extracting expressive information from it. Once the material has been “copied and ingested,” the LLMs are able “to emit convincing simulations of natural written language,” according to the lawsuit. “Much of the material in Meta’s training data set, however, comes from copyrighted works — including works written by Plaintiffs — that were copied by Meta without consent, without credit, and without compensation.” They filed a similar lawsuit against OpenAI, maker of ChatGPT. Author Michael Chabon has also filed a lawsuit against Meta for the same reasons. These lawsuits are in the early stages of the judicial process.”
