
Researchers Say AI Transcription Tool Used In Hospitals Invents Things (apnews.com)

(Tuesday October 29, 2024 @12:41PM (BeauHD) from the grave-consequences dept.)


Longtime Slashdot reader [1]AmiMoJo shares a report from the Associated Press:

> Tech behemoth OpenAI has touted its artificial intelligence-powered transcription tool Whisper as having near "human level robustness and accuracy." But Whisper has a major flaw: [2]It is prone to making up chunks of text or even entire sentences, according to interviews with more than a dozen software engineers, developers and academic researchers. Those experts said some of the invented text -- known in the industry as hallucinations -- can include racial commentary, violent rhetoric and even imagined medical treatments. Experts said that such fabrications are problematic because Whisper is being used in a slew of industries worldwide to translate and transcribe interviews, generate text in popular consumer technologies and create subtitles for videos.

>

> The full extent of the problem is difficult to discern, but researchers and engineers said they frequently have come across Whisper's hallucinations in their work. A University of Michigan researcher conducting a study of public meetings, for example, said he found hallucinations in eight out of every 10 audio transcriptions he inspected, before he started trying to improve the model. A machine learning engineer said he initially discovered hallucinations in about half of the over 100 hours of Whisper transcriptions he analyzed. A third developer said he found hallucinations in nearly every one of the 26,000 transcripts he created with Whisper. The problems persist even in well-recorded, short audio samples. A recent study by computer scientists uncovered 187 hallucinations in more than 13,000 clear audio snippets they examined. That trend would lead to tens of thousands of faulty transcriptions over millions of recordings, researchers said.
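
The AP report doesn't spell out how the researchers counted hallucinations, but the failure mode is straightforward to probe yourself. A minimal sketch, assuming the open-source openai-whisper Python package plus numpy and soundfile (the file name and model size are arbitrary choices, not anything from the studies): feed the model audio containing no speech at all and see whether it invents any.

```python
# Hypothetical probe, not the studies' actual methodology: transcribe
# pure silence and flag any text Whisper produces for it.
import numpy as np
import soundfile as sf
import whisper  # pip install openai-whisper

# Ten seconds of silence at Whisper's native 16 kHz sample rate.
sr = 16000
sf.write("silence.wav", np.zeros(10 * sr, dtype=np.float32), sr)

model = whisper.load_model("base")
result = model.transcribe("silence.wav")

# The input contains no speech, so any segment printed here was invented.
for seg in result["segments"]:
    print(f"{seg['start']:6.2f}-{seg['end']:6.2f} "
          f"no_speech_prob={seg['no_speech_prob']:.2f} {seg['text']!r}")
```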

Further reading: [3] AI Tool Cuts Unexpected Deaths In Hospital By 26%, Canadian Study Finds



[1] https://slashdot.org/~AmiMoJo

[2] https://apnews.com/article/ai-artificial-intelligence-health-business-90020cdf5fa16c79ca2e5b6c4c9bbb14?taid=671cde7444b38d00014b98db

[3] https://science.slashdot.org/story/24/09/18/0232214/ai-tool-cuts-unexpected-deaths-in-hospital-by-26-canadian-study-finds



Testing Methodology? (Score:5, Insightful)

by Drethon ( 1445051 )

So what testing methods did OpenAI use to ensure this product would meet the appropriate mean time between faults for a medical environment?

Validation Methodology. (Score:2)

by geekmux ( 1040042 )

> So what testing methods did OpenAI use to ensure this product would meet the appropriate mean time between faults for a medical environment?

What medical environment accepted this pathetic bullshit after finding the first three reports full of imaginary medical “problems”?

Fault the controlled environment that should never have accepted a P.T. Barnum-grade attempt at selling snake oil.

Not news. (Score:5, Funny)

by msauve ( 701917 )

> AI Transcription Tool Used In Hospitals Invents Things

They've been using that AI in the billing department for years.

setting a low bar (Score:2)

by Thud457 ( 234763 )

> near "human level robustness and accuracy."

That's damning with faint praise. Have you met some people?

Re: (Score:2)

by bugs2squash ( 1132591 )

I think they just badly punctuated "near-humans"

What, again? (Score:1)

by Anonymous Coward

Clearly they're also [1]posting dupes [slashdot.org].

[1] https://tech.slashdot.org/story/24/10/28/1510255/researchers-say-ai-tool-used-in-hospitals-invents-things-no-one-ever-said

Re: (Score:2)

by thrasher thetic ( 4566717 )

Clearly they're also posting dupes [slashdot.org].

Re: (Score:2)

by VeryFluffyBunny ( 5037285 )

Obviously they're also posting dupes [slashdot.org].

Not a dupe? (Score:2)

by billybob2001 ( 234675 )

This is not a dupe, it's a transcription of [1]https://tech.slashdot.org/stor... [slashdot.org]

[1] https://tech.slashdot.org/story/24/10/28/1510255/researchers-say-ai-tool-used-in-hospitals-invents-things-no-one-ever-said

Re: (Score:2)

by Culture20 ( 968837 )

> This is not a dupe, it's a transcription of [1]https://tech.slashdot.org/stor... [slashdot.org]

Might be interesting to play a game of telephone with these LLM transcription services and see what each new hallucination brings. Then perform a triple modular redundancy transcription and see if that can succeed without error (a toy sketch of the voting step follows below).

[1] https://tech.slashdot.org/story/24/10/28/1510255/researchers-say-ai-tool-used-in-hospitals-invents-things-no-one-ever-said
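
For what it's worth, the voting step of the parent's triple-modular-redundancy idea is easy to sketch. A toy Python example over hypothetical transcripts; the word-position alignment is deliberately naive, and real transcripts would want timestamp-based alignment first.

```python
# Toy two-of-three voter over three transcription passes of the same audio.
from collections import Counter
from itertools import zip_longest

def tmr_vote(t1: str, t2: str, t3: str) -> str:
    """Word-position majority vote; assumes roughly aligned outputs."""
    voted = []
    for words in zip_longest(t1.split(), t2.split(), t3.split()):
        winner, count = Counter(w for w in words if w is not None).most_common(1)[0]
        if count >= 2:          # two-of-three agreement
            voted.append(winner)
        # else: no majority -> drop the word rather than guess
    return " ".join(voted)

# Hypothetical outputs of three passes, one of which hallucinated:
print(tmr_vote(
    "take the medication twice daily",
    "take the medication twice daily",
    "take the hyperactivated antibiotics twice daily",
))
# -> "take the medication twice daily"
```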

Readers Say AI Editors Used In /. Reposts Things (Score:2)

by TigerPlish ( 174064 )

Seriously, it hasn't even been three days.

Dupe Dupe Dupe (Score:2)

by JustAnotherOldGuy ( 4145623 )

It's like deja vu all over again....

[1]https://tech.slashdot.org/stor... [slashdot.org]

[1] https://tech.slashdot.org/story/24/10/28/1510255/researchers-say-ai-tool-used-in-hospitals-invents-things-no-one-ever-said

You don't say? (Score:2)

by CEC-P ( 10248912 )

We tested Copilot for some reason. It mishears one word and goes off to some weird places in its meeting transcriptions, which are typically longer than the meeting itself. It has no idea what's important or what we're talking about. It pretty much just makes every sentence a bullet point and then invents a bunch of BS we didn't even say.

medical people are captive of asinine procedures (Score:2)

by Big Hairy Gorilla ( 9839972 )

Medical people are well-educated idiots. They spend all their time on iPads and computer screens trying to figure out which button to press or which field to fill in.

Doctors don't think; they just follow procedures now.

The medical administrators have "MBA'd" the operation: they outsource their brains and all operational functions to these all-in-one corporate systems, and when they get cryptojacked, all hospital staff are slack-jawed, wondering what to do.

Best advice is don't get sick, because your needs are not the priority.

Re: (Score:2)

by nightflameauto ( 6607976 )

Most doctors and nurses aren't all that happy about this situation either. The last time my mom went in for surgery was a day the board was going to walk through the hospital and inspect procedures. The doctors were absolutely puckered. Every little thing had to be perfect, or else. It was ridiculous seeing a hospital run like any other big business. It wasn't about taking care of people that day. It was about presenting well for the board. We're well past the point where patient care takes precedence.

Re: (Score:2)

by Bongo ( 13261 )

It's been said for a long time that machine-like thinking drains us of our intuition and other intelligences, especially the ones more in touch with contextual realities. Many things that are good in essence, like DEI movements, are done in a machine-like, blind, RoboCop "put down the weapon," self-defeating way, because people aren't allowed to express intuitive contextual perceptions.

Corporate BS Generator (Score:2)

by devslash0 ( 4203435 )

What did we expect? When the sound recording quality drops, the model just keeps going with the usual corporate BS narrative, because that's what it was trained on.

duplicate news (Score:1)

by Scythal ( 1488949 )

It's not interesting, and it has already been posted. [1]https://tech.slashdot.org/stor... [slashdot.org]

[1] https://tech.slashdot.org/story/24/10/28/1510255/researchers-say-ai-tool-used-in-hospitals-invents-things-no-one-ever-said

Malfunction (Score:2)

by StormReaver ( 59959 )

LLMs do not hallucinate, as that would require some kind of intelligence. LLMs malfunction, which is what is happening here.

I prefer hallucinations vs doctor’s handwriting (Score:1)

by denisko ( 5946738 )

Manual input is also prone to mistakes, especially since time per patient has been shrinking constantly with no improvement in sight. Freeing up medics’ attention for other things might help catch such mistakes. Medical service is at least 50% bureaucracy; any help there will work miracles.

Hang on a minute... (Score:2)

by VeryFluffyBunny ( 5037285 )

...are they complaining that a service based on a generative LLM, whose sole function is to make things up from its input, is making things up?

Well, I'll be touched by a BBC presenter! What a surprise!

Researchers = people who want attention (Score:2)

by WaffleMonster ( 969671 )

I still can't get over this excerpt:

"Researchers aren't certain why Whisper and similar tools hallucinate, but software developers said the fabrications tend to occur amid pauses, background sounds or music playing."

Did these "researchers" just ignore the confidence scores and turn the model's temperature up to 11? It is, after all, one of those articles cheerleading for regulation. I'm sure that will lead to perfect STT. (A sketch of those scores follows below.)

"The prevalence of such hallucinations has led experts, advocates and former OpenAI employee
