Last month we looked at how generative AI works and how it can help bolster our creativity and productivity. I also made the claim that AI functions best as a supplement and not as a replacement for human creativity. Without a doubt, generative AI is a fantastic resource for creators, writers, and developers alike, but there are real reasons why we should also be cautious about relying on it too heavily. This month, let’s broaden our understanding of AI by looking at the potential drawbacks of this swiftly evolving technology.
Still Learning
Generative AIs are highly capable, but they’re not perfect. To mimic how we as humans communicate or produce art, generative AIs need an enormous body of existing human writing and art to use as reference, which means they’re exposed to a nearly infinite amount of variation. AIs themselves can’t differentiate between a “good” or “bad” result; they rely on human engineers to set hundreds of parameters and filters to fine-tune what they produce. This process is ongoing, as AIs continue to learn from the people who interact with them and provide feedback.
As of now, the stuff we get from generative AI can sometimes be pretty comical. For example, art AIs like DALL-E and Midjourney are known for having (ahem) issues producing human hands accurately. They also have trouble producing readable text like signs. Take a look:
Aside from these light-hearted examples, it’s also possible for generative AIs to accidentally produce content that is insensitive, sexist, racist, hateful, or even violent. Until AIs can determine for themselves the quality and purpose of their output, they are ultimately just tools, and like any tool, it’s up to the wielder to determine whether they’re used for good or evil.
Potential for Misuse
As with anything, there’s always the possibility that someone will deliberately misuse AIs to produce inappropriate content. Case in point: according to the Cyberbullying Research Center, generative AIs have had a significant impact on the frequency and intensity of cyberbullying in recent years:
Generative AI allows for both the automatic creation of harassing or threatening messages, emails, posts, or comments on a wide variety of platforms and interfaces, and its rapid dissemination. Its impact historically may have been limited since it takes at least some time, creativity, and effort to do this manually, with one attack after another occurring incrementally. That is no longer a limitation, since the entire process can be automated. Depending on the severity, these generated messages can then lead to significant harm to those targeted, who are left with little to no recourse to stem the voluminous tide of abuse, nor identify the person(s) behind it.
Less maliciously, there’s also the issue of using AIs to cheat. There are certain telltale signs that something was written or created using AI (see the hands in the images above), but more often than not it can be very difficult to distinguish AI output from a genuine human creation. As the technology continues to improve, it may one day be impossible to tell the difference. But even now, the content produced by generative AI can be so “lifelike” that it can trick people into thinking that a human being is actually responsible.
This is already causing problems in academia, where students are using generative AIs like ChatGPT to write term papers and reports. There’s also the story of an amateur winning the digital art contest at the 2022 Colorado State Fair by submitting a piece he created with the image AI Midjourney.
The Concept of Ownership
We know that generative AIs require training on vast amounts of data to be effective. But what exactly is this data? Where does it come from?
Well, it depends on the AI and what it’s being used for, but often the answer is simply “the internet.” In other words, the deep sea of content online – writings, speeches, art, poems, music, movies, all of it – has likely already been incorporated into one or more AI training datasets at this point. That may or may not seem like a big deal depending on your personal views on digital ownership, but many content creators, particularly artists, are taking serious issue with their works being used to train AIs without their permission.
Take DeviantArt as a recent example. DeviantArt (www.deviantart.com) is a website that serves as an online portfolio and community for artists, photographers, and videographers to store and share their works. In November of 2022, DeviantArt announced that it was launching a generative art AI called DreamUp that would incorporate its users’ works into its training data. User backlash was swift and merciless, with many railing against the site’s lack of clarity and the difficulty of opting out. What eventually came to light, however, was that DreamUp was actually built on Stable Diffusion, an existing image AI that was already known for training on artwork without creator permission. Essentially, DeviantArt users were protesting the idea of their work being included in AI training when, in reality, it had already happened.
But it’s not just artists and their works that are at risk. With the advent of AI deepfaking, actors and models now have to worry about their likeness and/or their voice being replicated without their knowledge. Voice actors are also concerned about audio AIs like Murf replicating or training on their performances without their consent, especially if it means losing out on potential jobs. Even we regular folk should be wary of the risk that deepfaking poses when it comes to online security and identity theft.
Situations like these undoubtedly raise some questions about where we draw the line between AI training and digital plagiarism or copyright infringement. Consider the following:
Should an AI be allowed to train on data without the original creator’s consent? Isn’t this what human creators do every day when they seek inspiration from other creators?
If an AI has already trained on data but the creator denies or withdraws consent, is it possible for the AI to “unlearn” that data?
Should an AI be required to list exactly what resources are used to generate its output each time and credit the creators?
Is it even possible to determine what those resources are? What if they number in the thousands, or tens of thousands, or more?
Should the extent to which the data was used be taken into account? For instance, what if the data used is as minute as a single letter or word, or even just one pixel from a piece of digital art?
It’s important to recognize that while generative AIs can be invaluable tools for creation, there’s quite an ethical gray area in terms of how they’re trained and who, if anyone, owns what they produce.
Steve Shannon has spent his entire professional career working in tech. He is the IT Director and Lead Developer at PromoCorner, where he joined in 2018. He is, at various times, a programmer, a game designer, a digital artist, and a musician. His monthly blog "Bits & Bytes" explores the ever-evolving realm of technology as it applies to both the promotional products industry and the world at large. You can contact him with questions at steve@getmooresolutions.com.