If an LLM can’t be trusted with a fast food order, I can’t imagine what it is reliable enough for. I really was expecting this to be the easy use case for the things.
It sounds like most orders still worked, so I guess we’ll see if other chains come to the same conclusion.
Surely if the person making the order sees 18,000 waters, they would think: hold on, this doesn’t seem right, maybe I should ask the customer if they really want 18,000 waters?
The same applies to the ice cream with bacon on it that was mentioned in the article. I believe a lot of these could be resolved with a bit of common sense.
If you think bacon on ice cream is weird enough to cancel an order, I can only imagine you’ve never worked a customer service job.
Does it, though? Unlike the 18,000 waters, if I were working a drive-through I wouldn’t even blink at an order for bacon ice cream. Heck, I might make a little extra to try it for myself!
Sure, in the most extreme cases it would be obvious to the crew. But simply making mistakes at a higher rate than humans will result in a lot of unhappy customers.
Sure, but how do you distill this into a rule a computer can follow? “Suspicious” is not an objectively measurable thing that a program can just check against.
I think the easiest way would be to collect order data for at least a good number of months, if not a couple of years, feed it in, and use that as a baseline for what a typical human order looks like. Anything that deviates too far from that baseline gets handled by a human until someone can validate it as a good order. Though I imagine you’d get false positives for new menu items, unless you set a reasonable instruction for items that have never appeared in the dataset before.
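A minimal sketch of that baseline idea in Python. Everything here is an assumption for illustration: orders as item-to-quantity dicts, a 4-sigma cutoff, and a lenient fixed cap for never-before-seen items; real order data and thresholds would need tuning.

```python
# Sketch: flag orders that deviate too far from a historical baseline.
# All thresholds and data formats here are made up for illustration.
from collections import defaultdict
from statistics import mean, stdev

def build_baseline(historical_orders):
    """Collect per-item quantities from months/years of past orders."""
    quantities = defaultdict(list)
    for order in historical_orders:
        for item, qty in order.items():
            quantities[item].append(qty)
    # Store mean and standard deviation per item as the "typical" baseline.
    return {
        item: (mean(qtys), stdev(qtys) if len(qtys) > 1 else 0.0)
        for item, qtys in quantities.items()
    }

def flag_for_review(order, baseline, max_z=4.0):
    """Return reasons this order should be routed to a human, if any."""
    reasons = []
    for item, qty in order.items():
        if item not in baseline:
            # Cold-start case: an item with no history (e.g. a new menu
            # launch) gets a lenient fixed cap instead of an automatic flag.
            if qty > 10:
                reasons.append(f"{qty}x unknown item '{item}'")
            continue
        avg, sd = baseline[item]
        if sd == 0.0:
            # No variance in history: fall back to a simple multiple of the mean.
            if qty > avg * 3:
                reasons.append(f"{qty}x '{item}' vs typical {avg:.1f}")
        elif (qty - avg) / sd > max_z:
            reasons.append(f"{qty}x '{item}' is {(qty - avg) / sd:.0f} sigma above normal")
    return reasons

# Toy usage: 18,000 waters trips the check; one bacon ice cream does not.
history = [{"water": 1}, {"water": 2}, {"water": 1}, {"burger": 3, "water": 4}]
baseline = build_baseline(history)
print(flag_for_review({"water": 18000}, baseline))      # flagged
print(flag_for_review({"bacon_ice_cream": 1}, baseline))  # passes
```

Note that this only catches quantity outliers like the 18,000 waters; a single bacon ice cream sails right through, which, per the comments above, is arguably the right behavior.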