Researchers say an AI-powered transcription tool used in hospitals invents things no one ever said

Stopthatgirl7@lemmy.world · 19 days ago

rand_alpha19@moist.catsweat.com · 19 days ago

I’ve read that this is only going to continue to happen (and get worse) because we’re basically out of human-generated training data that’s publicly available on the internet, so models are being trained on content generated by other models. They literally make shit up constantly, and every generation gets dumber and dumber until they can’t even stay on topic or complete a coherent sentence anymore.

Edit: Here’s the post I was reading, written by Ed Zitron. It’s pretty well written and thoughtful, though it is an opinion article from some guy’s blog at the end of the day. Also, by “generation,” I mean generations of AI, not generations of people.

Eheran@lemmy.world · 19 days ago

Odd that GPT (and of course all the other LLMs too) has only gotten better so far…

TheBlackLounge@lemm.ee · edit-2 19 days ago

The architecture changed, and there is still progress to be made there. But LLMs will forever be stuck in 2021, because all data afterwards is tainted. Not a lot has been added.

In fact, Whisper was developed to transcribe videos for more training data, because they ran out of text data. These bad transcriptions are in the training data of newer models.