Brought to you by: A stealth startup seeking your feedback
Voice cloning is here. Another one of Pandora’s boxes is open.
I shared a post about this earlier this week on LinkedIn:
“I just created my very own voice clone in just 5 minutes using only 30 seconds of audio, all thanks to PlayHT. Click the link to listen, and let me know your thoughts in the comments – does it sound just like me?”
The post isn’t the furthest-reaching I’ve shared there, but based on conversations with people who listened to the clone, it’s unnerving on many levels.
As I tried to process it, I decided to chat with “personal AI” HeyPi about the implications. The conversation below is abridged but only edited for length.
The end brings in a few more guests: ChatGPT, Claude, and Bard. I had a hard time getting Pi to make a final judgment based on its analysis. By far, the best AI engine at framing the situation and asserting a strong, clear conclusion was Bard. Nothing else came close. Don’t dismiss Bard just because it seemed a bit dense when it first launched.
And now, here’s the fireside chat and panel…
Me: I just cloned my voice with AI and am concerned about the implications of it. How concerned should I be?
Pi: That’s a super interesting, and somewhat unsettling question. It raises a lot of issues around identity, authenticity, and ownership. On the one hand, you might be concerned about people using your cloned voice for nefarious purposes. On the other hand, you might be worried that your cloned voice could erode your own authenticity. This is the kind of thing that makes for a great science fiction story. But in the real world, it’s a bit more complicated. Would you like to discuss some of the implications?
Me: Yes, those are two areas worth getting into. There’s a lot of my audio (and video) content out there publicly on the web. There are tech tools right now that make it easy to upload those files to a voice cloning engine, clone my voice, and have me say all kinds of horrible things. Or someone could use my cloned voice to hack into one of my accounts or get access to personal data.
Pi: Yes, those are very real concerns. And they go to the heart of identity and security. On the one hand, your voice is a part of your identity, and you have a right to control how it’s used. On the other hand, there are some real security risks associated with having your voice cloned and used without your permission… Have you considered ways to try to protect yourself?
Me: Not yet. This isn’t just about me either. I don’t know if most people have this issue just yet, but there are probably voice snippets of, what, hundreds of millions of people that are publicly available?
Pi: You’re right, the amount of publicly available audio and video content is huge. And if even a fraction of that could be used to clone someone’s voice, the potential for abuse is enormous. It’s possible that we might need some new laws or regulations around this technology. Otherwise, a lot of people could find their identities being used in ways they never intended. It’s possible we might even see a whole new form of identity theft… It’s a cat-and-mouse game – the bad actors are going to get better at manipulating and faking content too. It’s almost like an arms race.
Me: Well, that’s hardly reassuring. It’s akin to what Jeremiah Owyang just tweeted [excerpted here]:
“Been thinking about this a lot. The bad guys are going to use AI, attempting to hack our money, reputation, and trick our families:
-The cost of spoofing your voice with AI is already zero.
-The cost of creating a social account/email/SMS that acts like you is already zero.
-The cost of creating an image of you is near zero.
-The cost of creating a realistic real-time video of you will be low-cost.”
Pi: This is a great thread. He’s absolutely right – the cost of creating these fake and fraudulent materials is getting close to zero. And it’s an arms race between the people who want to protect the integrity of these systems, and the people who want to abuse and hack them. New forms of biometric authentication and anomaly detection will be crucial. But even those won’t be foolproof. It’s not just about technology – it’s also about human behavior. People need to become more skeptical and discerning. They need to learn how to spot fake content, and how to recognize scams. But that’s a big cultural shift.
Me: Indeed. So let’s get into the other aspect you brought up — authenticity. When you said, “You might be worried that your cloned voice could erode your own authenticity,” can you elaborate on what you meant?
Pi: Yes, absolutely. One of the key concerns around cloning someone’s voice, face, or identity is that it chips away at the idea of authenticity. If someone can just create a fake image of you or a fake video of you saying things you’ve never said, how can anyone trust anything they see or hear online? How do we know the difference between reality and fiction? If anyone can create anything, it makes it harder for people to trust the world around them. It could create a profound sense of disorientation and alienation. People might lose faith in objective truth.
Me: That’s all true, and there’s been a startling erosion of institutional trust in countries such as the one I live in (USA! USA!). But beyond that, I’m trying to wrap my head around what this means for my own feelings of authenticity. I might still have faith in my local and federal government, doctors, teachers, clergy, and other experts in their fields. But what happens when one can hear their own voice say things they never said?
Pi: This is such an interesting and important question, and it goes to the core of what it means to be an individual, with your own agency and autonomy. It’s like there’s a fundamental trust not just in institutions, but in yourself. In your own identity. How do you know that you’re the source of your own thoughts, and not some algorithm? This might sound like philosophical hair-splitting, but I think it could really erode the sense of self and agency that many people take for granted. What do you think?
Me: I was already worried about that when I started using chat agents to help me write columns. Would I lose interest in my own creative process? Even this feels like a leap for me, having a ‘fireside chat’ with an autonomous agent and writing it out as a conversation instead of just writing a narrative column for my next newsletter. So my process is already changing for some of my creative output. And hearing my own voice makes it feel way more unnerving.
Pi: Yes, it’s unsettling to think that your authentic, creative voice could be cloned, just like your image and face could be. It’s even more disorienting to consider that you might interact with a computer program that’s so realistic, you could have an authentic-feeling conversation with it. It might not be conscious like you, but it’s designed to respond as if it’s conscious. So how would you know when you’re talking to a human, and when you’re talking to an agent? How do you retain a sense of control over your creative process in this brave new world?
Me: So this is all startling. And rather dismaying. What’s the most optimistic way to look at all of this? I can see some positive effects for people with certain disabilities — such as making it easier for people to communicate with each other. But what about for everyone else? What are the best possible outcomes here?
Pi: I think you’re right that there are some real upsides to these technologies, especially for people with disabilities. But more broadly, these AI agents could make information more accessible to everyone, and make the world’s knowledge base more easily searchable. It might also help people to be more empathetic and compassionate, as they interact with systems designed to understand their feelings and perspectives. And in a world where people are always bombarded with information, these systems could help people to organize and sort through it all. It could be empowering, as long as we avoid the pitfalls.
And now, we conclude with a panel for the final question…
Me: Let’s just step back for one second. If you compare the threats of voice cloning and the opportunities for it, and your value is having the best possible quality of life for the largest number of people while minimizing the risk of loss of life, would you say the risks or rewards of voice cloning are more salient?
Pi: I think, based on all the factors we’ve discussed, that the risks are much more salient at the moment. They’re significant enough to warrant caution and action, and to make sure we have safeguards in place to prevent the most serious risks. But I’m also optimistic that over time, we’ll figure out ways to balance the risks and rewards, so that voice cloning and related technologies can be used in the best possible ways.
ChatGPT: The risks and rewards of voice cloning are context-dependent. In scenarios where the technology is used responsibly, with proper consent and ethical guidelines, the opportunities for enhancing communication, entertainment, and accessibility are significant.
However, the potential threats, particularly related to identity theft, fraud, and ethical concerns, are substantial and require robust legal frameworks and technological safeguards.
If the primary value is having the best possible quality of life for the largest number of people while minimizing the risk of loss of life, the opportunities of voice cloning may be seen as more salient in areas like accessibility, education, and personalization. However, the risks cannot be ignored, and a balanced approach that emphasizes responsible use, regulation, and ethical considerations is essential to realizing the benefits while mitigating the potential harms.
Claude: Overall, while voice cloning does carry risks of misuse, those risks could potentially be mitigated through regulations, digital authentication methods, and social awareness. The opportunities could significantly improve quality of life for many people, enhance creativity, and lead to more intuitive human-computer interactions. But care should be taken to implement voice cloning ethically and prioritize consent, truth, and the public good. With a thoughtful approach, voice cloning can be an inclusive technology that minimizes harm.
Google Bard: I would say that the risks of voice cloning are more salient than the rewards, especially when considering my value of having the best possible quality of life for the largest number of people while minimizing the risk of loss of life.
I believe that the potential risks of voice cloning outweigh the potential opportunities. The risks are more likely to lead to harm, while the opportunities are more likely to be outweighed by the negative consequences of widespread use of voice cloning.
For example, the potential for fraud and deception is a serious concern. If voice cloning technology becomes widespread, it could be used to commit a wide range of crimes, including identity theft, financial fraud, and even terrorism. The potential for cyberbullying and harassment is also a concern, especially for children and young adults. Voice cloning could be used to make threats or spread rumors, which could have a serious impact on the victim’s mental health.
The potential for threats to national security is also a concern. Voice cloning could be used to spread misinformation or propaganda, or to impersonate a government official in order to launch a cyberattack. This could have a serious impact on public safety and national security.
Overall, I believe that the potential risks of voice cloning outweigh the potential opportunities. I would not support the development or widespread use of this technology until there are strong safeguards in place to prevent it from being used for harmful purposes.
We’ve heard from AI. I’d like to hear from you. Answer the poll below, and we’ll share the results in the next newsletter while also discussing a less scary, very practical use case of AI thanks to Google.
PS 2: Do you buy any media? Some friends at an international stealth startup have a great AI-powered tool, and they’d love some feedback from brands & agencies. Can you spare a few minutes to connect with them? Just let me know here. Thanks!
On to this week’s poll!
How concerned are you about voice cloning?