OpenAI: Voice Cloning AI Can Mimic Any Voice From a 15-Second Sample

OpenAI has announced that it has completed initial small-scale testing of Voice Engine, a new tool that can reproduce any person's voice. What makes this both impressive and concerning is that Voice Engine needs only a 15-second voice sample to do so.

According to OpenAI, the technology has been in development since 2022 and already powers the preset voices in the company’s current text-to-speech API. The company has published numerous samples demonstrating its voice cloning capabilities on its website.

OpenAI sees the technology as useful for applications such as reading assistance, language translation, and helping people with sudden or degenerative speech impairments.

However, the company acknowledges the potential for misuse, especially with elections coming up in many regions of the world this year. For that reason, anyone using the technology will be required to disclose that the voices are AI-generated. Each generated voice will also be watermarked so that its origin can be traced.

OpenAI has not announced a launch date for Voice Engine, which may be commercialized at a cost of $15 per million characters.
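To put that price in perspective, the per-character arithmetic can be sketched in a few lines of Python. This is only an illustration: the function and constant names are hypothetical, and the rate is the possible $15 per million characters mentioned above, not a confirmed price.

```python
# Illustrative only: the rate is the possible price cited in the article,
# not a confirmed one. All names here are hypothetical.
PRICE_PER_MILLION_CHARS_USD = 15.0


def estimate_cost_usd(text: str,
                      price_per_million: float = PRICE_PER_MILLION_CHARS_USD) -> float:
    """Estimate the cost in USD of synthesizing `text` at a per-character rate."""
    return len(text) * price_per_million / 1_000_000


if __name__ == "__main__":
    # Stand-in for a 500,000-character manuscript.
    manuscript = "a" * 500_000
    print(f"Estimated cost: ${estimate_cost_usd(manuscript):.2f}")
```

At that rate, a 500,000-character manuscript would cost about $7.50 to convert to speech.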

While voice cloning has beneficial use cases, it also raises ethical concerns around consent, privacy, and misuse for disinformation, all of which will require careful consideration as the technology develops.

More info on the OpenAI blog: https://openai.com/blog/navigating-the-challenges-and-opportunities-of-synthetic-voices