Em-dashes (—) come up a lot in online translations of books like the Bible and the Quran.
The normal keyboard “-” and a typed “--” are different from “—”, but Microsoft Office auto-formats “--” to that.
I kinda assumed it was ALL the Microsoft Word data in the training set that caused AI to pick that up.
I am only now realizing AI stole from even the religious texts and was influenced by them as well.
I had always thought it was a hangup from ‘borrowing’ pre-21st-century texts that had been OCR’d.
Maybe it’s changed, but my experience with OCR is that it is not great at detecting nuances of punctuation.