Is the problem that you're handing out data for AI training? I guess you can’t fundamentally protect against that, except by limiting how much content is served to each address.
Or is it the resource strain it puts on your server? In that case I recommend limiting how much a single client / IP address can request per day.
What’s bothering you?
It's the strain. I mostly run instances and frontends, so the training is not a huge problem.
The keyword you need is “DDoS protection”, I guess.
It keeps the server from getting overloaded by too many requests.
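For what it's worth, here is a rough Python sketch of the per-IP daily limit mentioned earlier in the thread. It's a simple fixed-window counter, not a full DDoS defence, and everything in it (the `DailyLimiter` name, the default of 1000 requests, the in-memory dict) is made up for illustration, so treat it as a starting point only.

```python
import time


class DailyLimiter:
    """Fixed-window counter: at most `limit` requests per client per window.

    Hypothetical helper for illustration; state lives in memory only.
    """

    def __init__(self, limit=1000, window_seconds=86400, clock=time.time):
        self.limit = limit            # assumed quota, tune for your server
        self.window = window_seconds  # 86400 s = one day
        self.clock = clock            # injectable so it can be tested
        self.counts = {}              # ip -> (window_id, request_count)

    def allow(self, ip):
        """Return True if this request fits in the client's quota."""
        window_id = int(self.clock() // self.window)
        seen_window, count = self.counts.get(ip, (window_id, 0))
        if seen_window != window_id:
            count = 0                 # a new window started, reset the count
        if count >= self.limit:
            return False              # over quota, caller should reject (e.g. HTTP 429)
        self.counts[ip] = (window_id, count + 1)
        return True
```

You'd call `limiter.allow(client_ip)` at the top of each request handler and answer with a 429 when it returns False. In practice this kind of limit is often set at the reverse proxy instead (nginx has `limit_req` for it), which keeps the rejected requests away from the application entirely.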