Thinking about how, by the time that textmode web browser I started writing actually works, you won't be able to use it on any sites, because without mouse entropy the AI-crawler rejecters will probably reject it.
@mcc What do they do for users who don’t use the mouse due to disability? I have a sneaking dread that the smaller ones will just reject them, but hopefully Cloudflare is big enough to account for that.
@mcc How hard could it be to create Gecko, Chromium, and WebKit extensions to send, and Apache and Nginx modules to require, HTTP headers containing some lawyer-crafted language binding the entities causing request initiation to not being part of AI companies and not disclosing the response to AI companies?
#Lawfedi, can you craft a statement that turns it into perjury, or fraud, or a CFAA (and non-US counterpart law) violation for an AI crawler or a person collecting data for an AI company to send a request with that statement, but not for anyone else? How about civil causes of action too, such as breach of contract?
Perhaps, for brevity, the HTTP header would have to refer to terms at a defined .well-known/legal/declaration-not-ai.txt location and a hash of that file's contents? And that file could have standard terms (like the GPL or the Creative Commons licenses, but for converting a false declaration into as many crimes and civil offences as possible), so the same boilerplate legal declaration can be replicated across services and domains?
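To make the header idea concrete, here is a rough Python sketch of the "path plus hash" part only. The header name `Declaration-Not-AI`, the `sha-256=` value format, and the function names are all made up for illustration; none of this is an existing standard, and it says nothing about whether the declaration would be legally binding.

```python
import hashlib

# Hypothetical names: neither this header nor this value format is a real standard.
HEADER_NAME = "Declaration-Not-AI"
# Path taken from the post above, written here with a leading slash.
WELL_KNOWN_PATH = "/.well-known/legal/declaration-not-ai.txt"


def terms_digest(terms_text: str) -> str:
    """Return the hex SHA-256 digest of the declaration file's contents."""
    return hashlib.sha256(terms_text.encode("utf-8")).hexdigest()


def build_header(terms_text: str) -> dict[str, str]:
    """Client side: reference the terms by location and by content hash."""
    value = f"{WELL_KNOWN_PATH}; sha-256={terms_digest(terms_text)}"
    return {HEADER_NAME: value}


def server_accepts(request_headers: dict[str, str], served_terms: str) -> bool:
    """Server side: accept only if the client cited the exact terms we serve."""
    expected = f"{WELL_KNOWN_PATH}; sha-256={terms_digest(served_terms)}"
    return request_headers.get(HEADER_NAME) == expected
```

Because the hash covers the whole file, a client that agreed to one version of the boilerplate is not silently treated as agreeing to an edited version.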
Just a whimsical idea.
@noisytoot @neia @mcc A browser might. Idk what APIs are relevant here, I presume it's not just a literal "mouse motion event"
@outfrost @noisytoot @neia Most interesting sources of browser entropy would probably require a prompt to explicitly request those things.
@deFractal @mcc I suppose you could extend HTTP to do some sort of terms negotiation before serving the resource, but if it’s entirely automated I’m not sure what good it would do; at best it’s still not enforceable unless you have more money for lawyers(/legislators) than the scrapers.
But I think you could do a variation of the interactive redirect-to-login except purely for a license agreement, so most scrapers only get the license agreement
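A minimal Python sketch of that redirect-to-agreement variation, written as a plain function rather than against any particular web framework. The path `/license-agreement`, the cookie name `license_accepted`, and the `Response` type are assumptions for illustration only.

```python
from typing import NamedTuple
from urllib.parse import quote

# Hypothetical names, chosen only for this sketch.
LICENSE_PATH = "/license-agreement"
ACCEPT_COOKIE = "license_accepted"


class Response(NamedTuple):
    status: int
    headers: dict
    body: str


def handle(path: str, cookies: dict, accepted_tokens: set) -> Response:
    """Serve content only to clients that carry a known acceptance token."""
    if path == LICENSE_PATH:
        # The agreement page itself is always reachable.
        return Response(200, {}, "License agreement text and an 'I agree' form")
    if cookies.get(ACCEPT_COOKIE) in accepted_tokens:
        return Response(200, {}, f"Content of {path}")
    # Everyone else, including most scrapers, is sent to the agreement.
    location = f"{LICENSE_PATH}?next={quote(path, safe='')}"
    return Response(302, {"Location": location}, "")
```

A scraper that follows the redirect without submitting the form only ever receives the agreement text, which is the effect described above.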
@ShadSterling @mcc That’s where the browser add-ons come in: the user could preselect which boilerplate terms they agree to. And then they’d be asked upon seeing unseen terms, but only if the hash of those terms is in some trusted directory of non-abusive terms.
“Automatically agree to legal commitment that: (checkbox list of summaries of terms, each linked to its legal definition, but only those in the non-abusive terms directory).”
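The add-on's decision logic could look roughly like this in Python. The `Decision` names and the `decide` function are hypothetical, and treating a hash that is missing from the trusted directory as an outright refusal (rather than silently ignoring it) is an assumption of this sketch.

```python
from enum import Enum


class Decision(Enum):
    AUTO_AGREE = "auto-agree"  # user already ticked these terms
    PROMPT = "prompt"          # unseen terms, but listed as non-abusive
    REFUSE = "refuse"          # not in the trusted directory: never ask


def decide(terms_hash: str, preselected: set, trusted_directory: set) -> Decision:
    """Decide what the add-on does when a site cites terms by hash."""
    if terms_hash not in trusted_directory:
        # Assumption: unlisted terms are refused without bothering the user.
        return Decision.REFUSE
    if terms_hash in preselected:
        return Decision.AUTO_AGREE
    return Decision.PROMPT
```

Checking the directory first means a preselected hash that is later delisted stops being agreed to automatically.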
If the website ToS is enforceable, then a legal declaration sent as an access credential should be enforceable too. But IANAL; I just like the idea of turning the tables on bad law the way the GPL did.