The disease took away her voice. AI created a replica that she carries on her phone


PROVIDENCE, RI — The voice Alexis “Lexi” Bogan had before last summer was lush.

She loved singing Taylor Swift and Zach Bryan ballads in the car. She laughed all the time — even while corralling misbehaving preschoolers or debating politics with friends over a backyard bonfire. In high school, she was a soprano in the choir.

Then that voice disappeared.

Doctors in August removed a life-threatening tumor located near the back of her brain. When the breathing tube came out a month later, Bogan had trouble swallowing and struggled to say “hi” to her parents. Months of rehabilitation helped her recovery, but her speech is still impaired. Friends, strangers, and her own family struggle to understand what she is trying to tell them.

In April, the 21-year-old got her old voice back. Not the real thing, but an AI-generated voice clone that she can summon from a phone app. Trained on a 15-second time capsule of her teenage voice – sourced from a cooking demonstration video she recorded for a high school project – her synthetic but remarkably real-sounding AI voice can now say almost anything she wants.

She types a few words or phrases into her phone and the app instantly reads them out loud.

“Hi, can I please get a grande iced brown sugar oat milk shaken espresso,” said Bogan’s AI voice as she held her phone out the car window in a Starbucks drive-thru.

Experts have warned that rapidly improving AI voice cloning technology could amplify phone fraud, disrupt democratic elections and violate the dignity of people – living or dead – who never consented to having their voice recreated to say things they never said.

It has been used to produce fake robocalls to New Hampshire voters imitating President Joe Biden. In Maryland, authorities recently accused a high school athletic director of using AI to generate a fake audio clip of the school principal making racist comments.

But Bogan and a team of doctors from Rhode Island’s Lifespan hospital group believe they have found a use that justifies the risks. Bogan is one of the first people – the only one with her condition – who has managed to recreate a lost voice with OpenAI’s new Voice Engine. Some other AI providers, such as startup ElevenLabs, have tested similar technology for people with speech impairments and losses, including a lawyer who now uses her voice clone in court.

“We hope Lexi will be a pioneer as the technology develops,” said Dr. Rohaid Ali, a neurosurgery resident at Brown University School of Medicine and Rhode Island Hospital. Millions of people with debilitating strokes, throat cancer or neurodegenerative diseases could benefit, he said.

“We must be aware of the risks, but we cannot forget the patient and the social good,” said Dr. Fatima Mirza, another resident working on the pilot. “We are able to help give Lexi her true voice back and she is able to speak in terms that are truer to herself.”

Mirza and Ali, who are married, caught the attention of OpenAI, creator of ChatGPT, because of their previous research project at Lifespan using the AI chatbot to simplify medical consent forms for patients. The San Francisco company reached out earlier this year while searching for promising medical applications for its new AI voice generator.

Bogan was still slowly recovering from surgery. The illness began last summer with headaches, blurred vision and a droopy face, alarming doctors at Hasbro Children’s Hospital in Providence. They discovered a vascular tumor the size of a golf ball pressing against the brain stem and tangled with blood vessels and cranial nerves.

“It was a battle to control the bleeding and remove the tumor,” said pediatric neurosurgeon Dr. Konstantina Svokos.

The 10-hour duration of the surgery, along with the location and severity of the tumor, damaged Bogan’s tongue muscles and vocal cords, impeding her ability to eat and speak, Svokos said.

“It’s almost like a part of my identity was taken away when I lost my voice,” Bogan said.

Her feeding tube was removed this year. Speech therapy continues, enabling her to speak intelligibly in a quiet room, but there is no sign that she will recover the full clarity of her natural voice.

“At some point, I was starting to forget what I sounded like,” Bogan said. “I’m getting used to the way I sound now.”

Whenever the phone rang in the family home in the Providence suburb of North Smithfield, she handed it to her mother to answer her calls. She felt like she was burdening her friends whenever they went to a noisy restaurant. Her father, who has hearing loss, had difficulty understanding her.

Back at the hospital, doctors were looking for a pilot patient to try out the OpenAI technology.

“The first person that came to Dr. Svokos’ mind was Lexi,” said Ali. “We reached out to Lexi to see if she would be interested, not knowing what her response would be. She was willing to try it and see how it would work.”

Bogan had to go back a few years to find a suitable recording of her voice to “train” the AI system on how she spoke. It was a video in which she explained how to make pasta salad.

Her doctors intentionally fed the AI system just a 15-second clip. Kitchen sounds made other parts of the video unusable. It was also all that OpenAI needed: an improvement over previous technology, which required much longer samples.

They also knew that getting something useful in 15 seconds could be vital for any future patient who has no trace of a voice on the internet. A brief voicemail left for a relative may suffice.

When they tested it for the first time, everyone was surprised by the quality of the voice clone. Occasional flaws – a mispronounced word, a missing intonation – were mostly imperceptible. In April, doctors equipped Bogan with a personalized phone app that only she can use.

“I get so emotional every time I hear her voice,” said her mother, Pamela Bogan, with tears in her eyes.

“I think it’s amazing to be able to have that sound again,” Lexi Bogan added, saying it helped “boost my confidence to where I was before all of this happened.”

She now uses the app about 40 times a day and sends feedback that she hopes will help future patients. One of her first experiences was talking to the children at the preschool where she works as an assistant teacher. She typed “ha ha ha ha” expecting a robotic response. To her surprise, it sounded like her old laugh.

She has used it at Target and Marshall’s to ask where to find items. It has helped her reconnect with her father. And it has made it easier for her to order fast food.

Bogan’s doctors have begun cloning the voices of other Rhode Island patients and hope to bring the technology to hospitals around the world. OpenAI said it is proceeding cautiously in expanding the use of Voice Engine, which is not yet publicly available.

Several small AI startups already sell voice cloning services to entertainment studios or make them more widely available. Most voice generation vendors claim to prohibit impersonation or abuse, but they vary in how they enforce their terms of use.

“We want to make sure everyone whose voice is used in the service continually consents,” said Jeff Harris, product lead at OpenAI. “We want to make sure it’s not used in political contexts. So we take a very limited approach to who we provide the technology to.”

Harris said OpenAI’s next step involves developing a secure “voice authentication” tool so that users can replicate only their own voice. This can be “limiting for a patient like Lexi who has had a sudden loss of speech ability,” he said. “So we think we’re going to need to have high-trust relationships, especially with medical providers, to provide a little more unfettered access to the technology.”

Bogan impressed her doctors with her focus on thinking about how the technology could help others with similar or more severe speech problems.

“Part of what she did throughout this process was think of ways to adjust and change this,” Mirza said. “She has been a huge inspiration to us.”

While for now she needs to fiddle with her phone to get the voice engine to speak, Bogan envisions an AI voice engine that enhances older speech recovery remedies — like the robotic-sounding electrolarynx or a vocal prosthesis — by merging with the human body or translating words in real time.

She is less sure about what will happen as she gets older and her AI voice still sounds like it did when she was a teenager. Perhaps the technology could “age” her AI voice, she said.

For now, “even though I don’t have my voice fully recovered, I have something that helps me find my voice again,” she said.

___

The Associated Press and OpenAI have a technology and licensing agreement that allows OpenAI access to part of the AP’s text archives.



This story originally appeared on ABCNews.go.com.
