As a fan of nature documentaries, I was excited when I saw a video shared by programmer Charlie Holtz on social media in November 2023 declaring:
“David Attenborough is now narrating my life…here’s a GPT-4-vision + @elevenlabsio [ElevenLabs] Python script so you can star in your own Planet Earth.”
In the video, we see that Holtz has combined capabilities from two companies, OpenAI and ElevenLabs, to make a voice clone of Attenborough describing his movements in real time.
This is just one example of how generative AI systems enable new types of deepfakes beyond simple combinations of preexisting media. They also make deepfakes more convincing and far more convenient to produce. You no longer need to know how to use Photoshop to make a convincing deepfake, and you can make one in just a few seconds.
I was tempted to try Holtz’s demo myself, but as an ethicist, I was also worried about being respectful to Attenborough. He didn’t give consent for his voice to be used in this specific way, after all. I asked my Ethics and AI class for their advice, and the vast majority thought it was fine, based on some combination of the following three arguments:
A. The media creator is not using the representation to make profit (free use).
B. The media is not being used to do anything harmful (adverse impact).
C. The media represents a person who has previously given consent to appear in similar content (public view).
Some people might think any one of these alone is sufficient to justify the video, and all three together even more so. However, a week later, Attenborough made a statement to Business Insider in which he seemed to disapprove:
“…it is of the greatest concern to me that one day, and that day may now be very close, someone is going to use AI to deceive others into believing that I am saying things contrary to my beliefs or that misrepresent the wider concerns I have spent a lifetime trying to explain and promote.”
An advocate of arguments A-C could still respond: “Sorry Attenborough, but when you chose to put your voice in public, you gave implied consent for us to use it in ways that don’t do harm to others or make profit, whether you like it or not.” I am not one of the people who would respond in this way, and I’m ultimately glad that I didn’t experiment with Holtz’s demo. Attenborough’s statement reveals an important fact: This use is contrary to how he wants his identity represented, and people should have control over representations of their identity. Compare this with other celebrities like Grimes, who has given people qualified permission to use her voice. Even Will Smith seemed to take the famous spaghetti-eating videos well, although he didn’t originally comment on what he thought of them. (The actor is not famous for taking insults in stride.)
Celebrities are not the only ones dealing with the problem of more advanced deepfakes. Earlier this year, we saw prominent news stories about girls in a New Jersey high school whose images were used by their peers to make fake nudes, and a principal in a Maryland high school whose voice was used by a spiteful colleague to make a fake racist audio recording. As generative AI technology enables cheaper and more convincing deepfakes, the frequency of these events will increase, along with the volume of fake content compared with real content. We need to develop a clear set of social norms around what constitutes acceptable use of generative content that depicts real people.
Several U.S. states and many other countries have passed legislation to criminalize the production, sale or possession of fabricated media, but these laws are usually limited to specific purposes, like political and sexual content (especially the sexual depiction of minors). For example, California has two laws prohibiting deepfake content, but these are directed specifically at sexual deepfakes (AB 602) and political deepfakes (AB 730). However, what if I take a friend or colleague’s voice and make them narrate my life, in the same way that Holtz did for Attenborough? This doesn’t fall under any of the obvious categories of deepfake legislation, standard harassment or defamation.
Rather than relying on a legal lens alone, we need to think about the problem of deepfakes through an ethical lens. In addition to the standard ethical ideas that my class brought up about Attenborough (A-C), there are two ethical concepts that often get left out of the discussion:
D. The media is not reflecting how the person wants to be represented (respectful representation).
E. The media is deceptive to people who are viewing it (deception).
Both of these concepts involve respect in some way, either respecting the subject of the content or respecting the audience of the content. I think these ideas are crucial to thinking about deepfakes. What’s wrong with this sort of media is not only that it’s harmful or violates consent, but that it is also disrespectful and deceptive.
There are difficult nuances with respect. For example, we want to allow for parodies, which involve a representation that people don’t like, but are intended to entertain rather than deceive. Still, I’m not convinced that Holtz’s demo was respectful of Attenborough, even under the “parody” category. (The reasons for this are subtle and involve the difference between “making fun” of Attenborough and using his voice to accomplish something new and distinct.) In addition, parody can often be used to conceal misinformation; when the Ron DeSantis campaign posted a manipulated image of Donald Trump hugging Anthony Fauci, a representative from the campaign told The New York Times that the images were “obviously fake,” and thus contained no intent to deceive or expectation of deception. This is nothing new in politics; a smokescreen of “I was joking” can often be used to conceal lies and defamation.
Companies that design or disseminate generative AI tools may try to avoid responsibility by insisting that these are merely tools which can be used for good or bad purposes, or that people have always been able to create fake media representations. Both claims are true, but a long history of ethical debate about dual-use technologies gives us good reason to think that those who build technologies that make some bad activity easier have a proportional responsibility to mitigate these harms (despite legal protections like Section 230). This certainly applies to generative AI and deepfakes: Companies that make generative AI technology have both ethical and legal obligations to take strong measures to prevent this technology from being used to generate representations of people that violate A-E.
The most obvious mechanisms are watermarking AI-generated content or providing some information in the metadata that can be used to identify its origin. A recent whitepaper by Microsoft describes its efforts in this area. Another mechanism is verification of the data itself that is used by the model to generate content, which is what ElevenLabs attempts to do with voice data, partnering with companies like Reality Defender.
No security measures are ever going to be perfect, and we’ve seen that the ElevenLabs software was likely used to generate audio deepfakes of Joe Biden that were sent out to New Hampshire voters via robocall earlier this year. But the question that companies must address is: How much abuse of their technology is acceptable, and how much would demand stricter mechanisms or even a complete ban? This is not just a challenge for AI designers, but also for clients and even individual users who make use of these systems. A loss of control over how our identities are represented is not an inevitability and can be prevented through standards for technologies that might enable it.