<Stupidquestion>
What advantage does this software provide over simply banning bots via robots.txt?
</Stupidquestion>
The scrapers ignore robots.txt. It doesn’t really ban them; it just asks them not to access things, but they are programmed by assholes.
Robots.txt expects the client to respect the rules, for instance by identifying itself as a scraper.
AI scrapers don’t respect this trust, and thus robots.txt is meaningless.
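For reference, this is the entirety of what robots.txt can do; it’s a plain text file of requests with nothing enforcing them (GPTBot here is just one example of a crawler’s User-agent token):

```
# robots.txt is purely advisory: a client that ignores it hits no technical barrier.
User-agent: GPTBot
Disallow: /

# Well-behaved crawlers honor this too; misbehaving ones never read the file at all.
User-agent: *
Crawl-delay: 10
```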
I don’t understand how/why this got so popular out of nowhere… the same solution has already existed for years in the form of haproxy-protection and a couple others… but nobody seems to care about those.
Probably because the creator had a blog post that got shared around at a point in time where this exact problem was resonating with users.
It’s not always about being first but about marketing.
And one has a cute catgirl mascot, the other a website that looks like a blockchain techbro startup.
I’m even willing to bet the number of people who set up Anubis just to get the cute splash screen isn’t insignificant.
Compare and contrast:
High-performance traffic management and next-gen security with multi-cloud management and observability. Built for the enterprise — open source at heart.
Sounds like some overpriced, vacuous, do-everything solution. Looks and sounds like every other tech website. Looks like it is meant to appeal to the people who still say “cyber”. Looks and sounds like fauxpen source.
Weigh the soul of incoming HTTP requests to protect your website!
Cute. Adorable. Baby girl. Protect my website. Looks fun. Has one clear goal.
Ooh can this work with Lemmy without affecting federation?
Yeah, it’s already deployed on slrpnk.net. I see it momentarily every time I load the site.
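For anyone wondering how the “without affecting federation” part works in practice, here’s a rough sketch of one common approach, assuming Anubis runs as a reverse proxy in front of Lemmy; the ports, hostname, and the Accept-header rule are made up for illustration and not taken from slrpnk.net’s actual config:

```nginx
# Goes in the http{} block. Hostnames, ports, and the Accept-header rule
# are assumptions for illustration; adjust to your own deployment.
map $http_accept $backend {
    default                        http://127.0.0.1:8923;  # Anubis (challenges browsers)
    "~application/activity\+json"  http://127.0.0.1:8536;  # Lemmy backend (federation traffic)
}

server {
    listen 443 ssl;
    server_name example.social;

    location / {
        proxy_pass $backend;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

The idea is that federation traffic is machine-to-machine and asks for ActivityPub JSON, so it can be routed straight to the backend while ordinary page loads go through the challenge.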
This is fantastic and I appreciate that it scales well on the server side.
AI scraping is a scourge. I would love to know the collective amount of power wasted on countermeasures like this, added to the total already wasted by AI.
All this could be avoided by making people submit photo ID to log into an account.
I don’t think this would help.
By photo ID, I don’t mean just any photo, I mean “photo id” cryptographically signed by the state, certificates checked, database pinged, identity validated, the whole enchilada
That would have the same effect as just taking the site offline…
No one is giving a random site their photo ID.
You’d be surprised; many humans simply have no backbone, common sense, or self-respect, so I think they very probably still would, in large numbers. Facebook and Palantir are proof.
I get that website admins are desperate for a solution, but Anubis is fundamentally flawed.
It is hostile to the user, because it is very slow on older hardware and forces you to use JavaScript.
It is bad for the environment, because it wastes energy on useless computations similar to mining crypto. If more websites start using this, that really adds up.
But most importantly, it won’t work in the end. These scraping tech companies have much deeper pockets and can use specialized hardware that is much more efficient at solving these challenges than a normal web browser.
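For context on what these challenges actually are: the browser grinds through hashes until it finds a nonce meeting a difficulty target, and the server verifies the answer with a single hash. A minimal sketch of a generic SHA-256 proof of work in TypeScript, not Anubis’s actual code; the difficulty handling and encoding are assumptions:

```typescript
// Minimal sketch of a hash-based proof-of-work challenge: the kind of work
// a browser does before being let through. Not Anubis's actual implementation.
async function solveChallenge(challenge: string, difficultyBits: number): Promise<number> {
  const encoder = new TextEncoder();
  for (let nonce = 0; ; nonce++) {
    const data = encoder.encode(challenge + nonce);
    const digest = new Uint8Array(await crypto.subtle.digest("SHA-256", data));
    if (leadingZeroBits(digest) >= difficultyBits) {
      return nonce; // the server re-hashes once to verify, so checking is cheap
    }
  }
}

function leadingZeroBits(bytes: Uint8Array): number {
  let bits = 0;
  for (const b of bytes) {
    if (b === 0) { bits += 8; continue; }
    bits += Math.clz32(b) - 24; // clz32 counts over 32 bits; a byte only uses the low 8
    break;
  }
  return bits;
}
```

Nothing in that loop is browser-specific, which is the arms-race point: a GPU or ASIC does SHA-256 orders of magnitude faster and cheaper than a phone running JavaScript.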
She’s working on a non-cryptographic challenge so it taxes users’ CPUs less, and is also thinking about a version that doesn’t require JavaScript.
Sounds like the developer of Anubis is aware and working on these shortcomings.
Still, IMO these are minor short term issues compared to the scope of the AI problem it’s addressing.
To be clear, I am not minimizing the problems of scrapers. I am merely pointing out that this strategy of proof-of-work has nasty side effects and we need something better.
These issues are not short term. PoW means you are entering into an arms race against an adversary with bottomless pockets that inherently requires a ton of useless computations in the browser.
When it comes to moving towards something based on heuristics, which is what the developer was talking about there, that is much better. But that is basically what many others are already doing (like the “I am not a robot” checkmark) and fundamentally different from the PoW that I argue against.
Go do heuristics, not PoW.
You’re more than welcome to try and implement something better.
A JavaScript-less check was released recently; I just read about it. It uses an HTML refresh tag and a delay. It’s not the default though, since it’s new.
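If it’s the mechanism I think it is, it looks roughly like this; the delay, path, and token below are made up for illustration, and Anubis’s actual no-JS challenge may differ in the details:

```html
<!-- A minimal no-JS challenge page: the client is asked to wait and then
     follow a timed refresh; the path and token are placeholders. -->
<html>
  <head>
    <meta http-equiv="refresh" content="5; url=/challenge/verify?token=EXAMPLE">
  </head>
  <body>Checking your browser, please wait a few seconds…</body>
</html>
```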
I don’t like it either, because my preferred way to use the web is either through the terminal or a very stripped-down browser. I HATE tracking and JS.
I agree. When I run into a page that demands I turn on JavaScript for whatever purpose, I usually just leave. I wish there were some way to just not even see links to sites that require this.
I’d like to use Anubis, but the strange hentai character as a mascot is not too professional.
Honestly, good. Getting sick of the “professional” world being so goddamn stiff and boring. Push back against sanitized corporate aesthetics.