Ask HN: Has everyone just given up on blocking bots?

12 points by pupppet 7 days ago

Bots account for a ridiculous amount of web traffic, tying up resources and bandwidth, and the go-to response to this is... basically nothing. You can maybe throw Cloudflare in front of your site, and that's it.

Within the last year it's become much worse with AI bots vacuuming my content on my dime.

Why is there not some community-driven attempt to block them? Is this problem just not solvable?

MattGaiser 7 days ago

I think the challenge is solving it in a way that users can still use the product.

I’ve done a fair bit of scraping for various things. Companies that attempt to defeat scraping often ruin the user experience along the way.

As an example I am aware of, there is an airline that keeps doing all sorts of things to defeat scrapers.

Problem is, the site now constantly throws errors for regular users doing searches and people are regularly getting banned for doing too many searches.

And they still haven’t achieved much beyond making scrapers increase their retry counts.
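
To make that failure mode concrete, here is a minimal sketch of a per-client sliding-window limit. It is not the airline's actual system (which I don't know); the `SlidingWindowLimiter` class, the limit of 3 searches per 60 seconds, and the client names are all assumptions for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most `limit` requests per client per `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        # client id -> timestamps of that client's allowed requests
        self._hits = defaultdict(deque)

    def allow(self, client: str, now: float) -> bool:
        hits = self._hits[client]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=3, window=60.0)

# A person comparing fares fires off five searches in under a minute:
# the last two are rejected.
human = [limiter.allow("human", t) for t in (0, 5, 10, 15, 20)]

# A scraper that gets rejected simply waits out the window and retries.
scraper = [limiter.allow("scraper", t) for t in (0, 1, 2, 3, 63)]
```

The impatient human loses two searches outright, while the scraper only has to add a delay and a retry, which is roughly the outcome described above.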

readyplayernull 6 days ago

It's time to level up in this arms race. Let's stop delivering HTML documents and instead use animated rendering of information positioned in a scene, so that the user has to move elements around for it to be recognizable, like a full-site captcha. It doesn't need to be overly complex for a user, who can intuitively navigate even a 3D world, but it will take 1000x more processing for OpenAI. Feel free to come up with your own creative designs to make automation more difficult.

  • dTal 6 days ago

    So, Flash websites? Please no...

    • wruza 5 days ago

      Why Flash? mp4/webm!

      Btw flash websites were cool and games were amazing. You just can’t write a game like that today. You may pretend that you can, but no.

truesign 5 days ago

I had those same problems some time ago and ended up creating https://truesign.ai

Still in beta but successfully protecting millions of requests daily.

During the beta I'm offering the service for free in exchange for user feedback. Do register if interested!

johng 7 days ago

We are considering blocking all the AI bots via Cloudflare but I wonder if there is any downside. Are we going to lose ranking in any search engines? Any downsides that I can't think of?
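
On the ranking question, one option is to be selective by user agent rather than blocking every bot. A rough sketch of the idea follows. This is not how Cloudflare implements its setting; `classify_user_agent` is a hypothetical helper, and the token lists are small illustrative samples, not complete lists. User-agent strings can also be spoofed, so this only filters crawlers that identify themselves honestly.

```python
# Illustrative, non-exhaustive samples of self-declared crawler tokens (assumption).
AI_CRAWLER_TOKENS = ("gptbot", "ccbot", "claudebot")
SEARCH_CRAWLER_TOKENS = ("googlebot", "bingbot", "duckduckbot")


def classify_user_agent(user_agent: str) -> str:
    """Return 'block', 'allow-search', or 'allow' for a User-Agent header value."""
    ua = user_agent.lower()
    # Check AI crawlers first so they are blocked even if other tokens appear.
    if any(token in ua for token in AI_CRAWLER_TOKENS):
        return "block"
    if any(token in ua for token in SEARCH_CRAWLER_TOKENS):
        return "allow-search"
    return "allow"


examples = [
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
]
decisions = [classify_user_agent(ua) for ua in examples]
```

As long as the search crawlers stay in the allow group, blocking the AI group by itself should not keep them from crawling the site, though that is an assumption worth verifying against each search engine's own documentation.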

carlosjobim 5 days ago

> You can maybe throw Cloudflare in front of your site, and that's it.

What's wrong with Cloudflare?

nejsjsjsbsb 6 days ago

Think about it. A human visitor is a human using a bot (their browser) to get information on their behalf. Distinguishing between a human using a bot in a good way vs. a human using a bot in a bad way is going to be cat and mouse ad nauseam.

Anything that a human can read can be used for AI training. You can maybe avoid that by paywalling. Or by limiting users to people you have spoken to. But maybe you can't trust em all.

There are commercial offerings to help solve this but there is a reason you need to pay.