Artists criticize Apple’s lack of transparency on Apple Intelligence data


Later this year, millions of Apple devices will start using Apple Intelligence, Cupertino’s version of generative AI that can, among other things, create images from text prompts. But some in the creative community are unhappy with the company’s lack of transparency about the raw information that feeds the AI model that makes it possible.

“I wish Apple would have been more transparent with the public about how they collect their training data,” Jon Lam, a Vancouver-based game artist and creators’ rights activist, told Engadget. “I think their announcement couldn’t have come at a worse time.”

Creatives have always been among Apple’s most loyal customers, a company whose founder positioned it at “the intersection of technology and the liberal arts.” But photographers, concept artists, and sculptors who spoke to Engadget said they’re frustrated by Apple’s relative silence on how the company collects data for its AI models.

Generative AI is only as effective as the data its models are trained on. To that end, most companies have ingested pretty much everything they can find on the internet, without regard to consent or compensation. Several AI models were trained on LAION-5B, a dataset of nearly 6 billion images scraped from the internet. In an interview with Forbes, David Holz, CEO of Midjourney, said the company’s models were trained on “just a big snippet of the internet” and that “there’s no real way to get a hundred million images and know where they came from.”

Artists, authors, and musicians have accused AI companies of hoovering up their work for free and profiting from it, leading to more than a dozen lawsuits in 2023 alone. Last month, major record labels including Universal and Sony sued AI music generators Suno and Udio, startups valued at hundreds of millions of dollars, for copyright infringement. The tech companies have defended their actions even as, somewhat ironically, some have struck licensing deals with content providers, including news publishers.

Some creatives thought Apple could do better. “That’s why I wanted to give them a little benefit of the doubt,” Lam said. “I thought they would approach the ethics debate differently.”

Apple has revealed very little about the source of Apple Intelligence’s training data. In a post on the company’s machine learning research blog, the company wrote that, like other generative AI companies, it scrapes public data from the open web using Applebot, its purpose-built web crawler, something its executives also said on stage. Apple’s head of AI and machine learning, John Giannandrea, was also quoted as saying that “a lot of the training data was actually created by Apple,” but didn’t elaborate. Apple has also reportedly signed deals with Shutterstock and Photobucket to license training images, but hasn’t publicly confirmed those relationships. While Apple has tried to win praise for Apple Intelligence’s supposedly more privacy-focused approach, which relies on on-device processing and its custom Private Cloud Compute, the fundamentals behind its AI model appear little different from those of its competitors.

Apple did not respond to specific questions from Engadget.

In May, Andrew Leung, a Los Angeles-based artist who has worked on films such as Black Panther, The Lion King, and Mulan, called generative AI “the greatest theft of human intellect in the history of the world” in testimony before the California State Assembly about AI’s impact on the entertainment industry. “I want to emphasize that when they use the term ‘publicly available,’ it just doesn’t fly,” Leung said in an interview. “That doesn’t automatically translate to fair use.”

It’s also problematic for companies like Apple, Leung said, to only offer users the option to opt out after they’ve already trained AI models on data they haven’t consented to. “We never asked to be part of that.” Apple does allow websites to opt out of Applebot scraping their content for Apple Intelligence training data—the company says it respects robots.txt, a text file that any website can host to tell crawlers to stay away—but that is a partial remedy at best. It’s unclear when Applebot started scraping the web, or how anyone could have opted out before then. And, technologically, it’s an open question whether and how requests to remove information from trained generative models can even be honored.
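For readers unfamiliar with the mechanism, a robots.txt opt-out is just a plain text file served at a site’s root. As a rough sketch (assuming Apple’s crawler behaves as the company describes, and using the separate Applebot-Extended agent Apple has documented for AI-training opt-outs), a site that wants to stay in search results but keep its content out of model training might serve something like this:

```
# robots.txt — illustrative example, not Apple-endorsed configuration
# Allow Applebot to crawl the site for search features like Siri and Spotlight.
User-agent: Applebot
Allow: /

# Disallow Applebot-Extended, which Apple says governs use of
# crawled content for training its generative AI models.
User-agent: Applebot-Extended
Disallow: /
```

Note that this only expresses a request: robots.txt has no enforcement mechanism, and it does nothing about pages crawled before the rules were added, which is precisely the gap critics point to.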

It’s a sentiment shared even by blogs aimed at Apple enthusiasts. “It’s disappointing to see Apple sully a compelling feature set (some of which I’m really looking forward to trying) with practices that are no better than the rest of the industry,” wrote Federico Viticci, founder and editor of the Apple enthusiast blog MacStories.

Adam Beane, a Los Angeles-based sculptor who created a likeness of Steve Jobs for Esquire in 2011, has used Apple products exclusively for 25 years. But he said the company’s refusal to disclose the source of Apple Intelligence’s training data disappointed him.

“I’m getting more and more angry at Apple,” he told Engadget. “You have to be informed and savvy enough to know how to opt out of training Apple’s AI, and then you have to trust a company to respect your wishes. What’s more, the only choice I can see being offered is to opt out of having your data further train their AI.”

Karla Ortiz, a San Francisco-based illustrator, is one of the plaintiffs in a 2023 lawsuit against Stability AI and DeviantArt, the companies behind the image-generating models Stable Diffusion and DreamUp, respectively, and Midjourney. “Ultimately, we know that for generative AI to work as it does, it relies on massive overreach and violations of rights, both private and intellectual,” she wrote in a viral thread on Apple Intelligence. “This is true for all (generative) AI companies, and as Apple forces this technology upon us, it’s important to remember that they are no exception.”

The outrage against Apple is also part of a broader sense of betrayal among creative professionals toward the tech companies whose tools they rely on to do their jobs. In April, a Bloomberg report revealed that Adobe, which makes Photoshop and many other apps used by artists, designers, and photographers, used images of questionable origin to train Firefly, its own image-generating model that Adobe claimed was trained “ethically.” And earlier this month, the company was forced to update its terms of service, after customer outrage, to clarify that it would not use customer content to train generative AI models. “The entire creative community has been betrayed by every software company we’ve ever trusted,” Lam said. While ditching Apple products altogether isn’t an option for him, he’s trying to cut back on his spending—he plans to swap his iPhone for a Light Phone III.

“I think there’s a growing sense that Apple is becoming just like the rest of them,” Beane said. “A giant corporation that prioritizes its bottom line over the lives of the people who use its products.”



