So I have now experienced development with both ActivityPub and ATProto and I feel like my 10,000 foot view is that ActivityPub is easy breezy to develop for but hell to sysadmin, whereas ATProto is a dream to sysadmin¹ but development is just digging through a giant fucking pile of garbage
¹ Because a giant, unaccountable corporation you just have to trust will never turn on you is hosting all the hard parts, natch
I do feel that pondering what does and doesn't work in ATProto is making me realize there is an Actual Issue with ActivityPub, which is that "Publish" and "Subscribe" are strongly coupled into a single server. I feel in the past I've been groping toward this problem by talking about "portable identity" or other things, but maybe it's just the coupling by itself. ATProto is a mess but you go through different services to publish content and subscribe to content, and that makes publishing easy
By contrast, what is it ActivityPub admins complain about? It's the subscribe side. Maintaining blocklists by hand. The ballooning server costs of images people saw in RTs. I run my own PDS for bluesky because it's only the publish side. I don't want to deal with administering the subscribe side. I want Mastodon gGmbH or Bluesky Social or somebody to handle that.
…but if Mastodon.social is how I *read* posts, Mastodon.social also has to be how I *publish* posts.
@cr1901 Well, my ATProto project appears to have succeeded, whereas my long-simmering LibP2P project failed, so I can say I tested both and found IPFS worse
@jj I dunno if it makes sense to distinguish because Bluesky itself provides so much of the tooling? And also because the biggest part of the problem is documentation
@mcc Theoretically your inbox and outbox could be on different servers
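To make the "different servers" idea concrete, here is a minimal sketch of an actor document whose `inbox` and `outbox` are just URLs on different hosts. All hostnames and the `endpoint_hosts` helper are made up for illustration, and whether existing implementations would actually cope with an actor like this is a separate question that this doesn't answer.

```python
from urllib.parse import urlparse

# Hypothetical actor document: the hosts below are invented for illustration.
# The inbox (subscribe side) and outbox (publish side) are plain URLs, so in
# principle they can point at different servers.
actor = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Person",
    "id": "https://id.example/users/alice",
    "inbox": "https://reader.example/users/alice/inbox",
    "outbox": "https://publisher.example/users/alice/outbox",
}


def endpoint_hosts(actor: dict) -> dict:
    """Return the hostname serving each of the actor's inbox and outbox."""
    return {key: urlparse(actor[key]).hostname for key in ("inbox", "outbox")}


if __name__ == "__main__":
    print(endpoint_hosts(actor))
```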
@thisismissem @mcc also the storage costs on the subscription side are entirely a consequence of how Mastodon in particular has decided to handle media and not a required feature
@erincandescent @thisismissem well, is the problem alleviated if I use GTS
@mcc @thisismissem i don’t think so, because gotosocial has decided to fairly comprehensively commit to copying many of mastodon’s worse ideas.
@erincandescent @mcc my experience with gotosocial is that it's the only major activitypub implementation that's 3 for 3 on "reasons a site won't get the good discord embeds":
- their HTML doesn't include a <link type="application/activity+json"> (so we can't even detect that it's AP in the first place)
- they reject unauthenticated activitypub read requests for public posts by default
- they implement a mostly-mastodon-compatible API, *but* require auth to read public posts (literally nothing else does this to my knowledge)
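For context on the first point, this is roughly what "detecting that it's AP" from a page's HTML could look like. It's a sketch using only Python's standard library; `APLinkFinder` and `find_activitypub_link` are illustrative names and not taken from any real embed service. A real consumer would then still need to fetch the returned URL with an ActivityPub `Accept` header, which is where the other two points bite.

```python
from html.parser import HTMLParser
from typing import Optional


class APLinkFinder(HTMLParser):
    """Remembers the href of the first <link type="application/activity+json">."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only look at <link> tags, and stop once one has been found.
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        # Attribute values can be None (valueless attributes), hence the `or ""`.
        link_type = (attributes.get("type") or "").strip().lower()
        if link_type == "application/activity+json":
            self.href = attributes.get("href")


def find_activitypub_link(html: str) -> Optional[str]:
    """Return the ActivityPub alternate URL advertised by a page, or None."""
    finder = APLinkFinder()
    finder.feed(html)
    return finder.href
```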
@mcc the ballooning server costs of images aren't a technical requirement though. It's an implementation detail of Mastodon. Mastodon downloads everything preemptively and stores forever. In Smithereen, I instead store remote images in a LRU cache of a configurable size and only download them when they're first requested by a client.
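As a rough illustration of that approach (not Smithereen's actual code, which isn't shown in this thread), a size-bounded LRU cache that only fetches on first request might look like the sketch below. `MediaCache`, `max_bytes` and the injected `fetch` callable are all assumptions made for the example, and it keeps everything in memory where a real server would store files on disk.

```python
from collections import OrderedDict
from typing import Callable


class MediaCache:
    """Size-limited, read-through LRU cache for remote media (illustrative)."""

    def __init__(self, max_bytes: int, fetch: Callable[[str], bytes]) -> None:
        self.max_bytes = max_bytes    # configurable cache size
        self.fetch = fetch            # hypothetical downloader: URL -> bytes
        self.used = 0                 # bytes currently held
        self.entries = OrderedDict()  # URL -> bytes, oldest first

    def get(self, url: str) -> bytes:
        # Cache hit: mark as most recently used and return it.
        if url in self.entries:
            self.entries.move_to_end(url)
            return self.entries[url]

        # Cache miss: download only now, when a client first asks for it.
        data = self.fetch(url)

        # Items larger than the whole cache are served but never stored.
        if len(data) <= self.max_bytes:
            self.entries[url] = data
            self.used += len(data)
            # Evict least recently used items until back under the limit.
            while self.used > self.max_bytes:
                _, evicted = self.entries.popitem(last=False)
                self.used -= len(evicted)
        return data
```

The point of the design is that disk use is capped by `max_bytes` no matter how much media federates in, and nothing is downloaded for posts nobody on the instance ever opens.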
@mcc atproto is easier to develop for actually because it has a spec that is real and doesn’t think you’re doing C2S with an RDF triple store or something
@easrng well the spec may be real, but that's not the same thing as it being well documented :(
@mcc and atproto hasn’t had a chance to develop ossified interop weirdness yet like activitypub has
@easrng i feel like everything useful i've learned about bsky admin/dev i've gotten from one single discord, and the people on that discord are *really nice* but i'd have just preferred to read a clearly thought out document
@mcc the bigger problem with atproto imo is if you want to do anything with bluesky posts that’s more complicated than fetching direct from someone’s PDS or using an existing appview you need to immediately be able to handle the full network’s scale! which sucks.
@mcc This is not a short term thing to do, but if I made a Kademlia RPC library in Rust (running on top of the Postcard serialization format), would that interest you in reviving your DHT shenanigans?
@cr1901 Absolutely yes, *although* I should note that the reason I was going with libp2p in the first place was that it let you do kademlia over webrtc, and if i can't run it in a web browser tab then it doesn't fit with any of my prior plans
@cr1901 like to me, kademlia is not the hard part, the hard part is webrtc
@mcc Yea, I chose Kademlia specifically because it's the only DHT paper that doesn't make my eyes glaze over _immediately_ when I read it. But I still found that it takes effort/mental bandwidth to understand why it works.
I totally believe WebRTC is more difficult, and would venture I'm not prepared to tackle that yet. At least I'm honest :P?
@cr1901 kademlia's not so hard! i've implemented it in what i remember being a startlingly small amount of code.
@noisytoot @erincandescent @thisismissem @mcc It's mandatory on GTS and they won't budge - https://codeberg.org/superseriousbusiness/gotosocial/issues/1405 One of my friends got bitten by that: I told her to try GTS since she was on a budget and thus resource constrained, but getting too much media in her TL, with cache controls that aren't size-based, left her instance unable to function properly due to lack of disk space.
Even Misskey with its reputation for heaviness lets you turn off remote image caching!
@erincandescent @thisismissem @mcc @noisytoot They do have a ticket relating to the disk space problem that's been dormant for 2 years: https://codeberg.org/superseriousbusiness/gotosocial/issues/1997
@kitten @thisismissem @mcc there’s a bunch of value to running a media proxy but IMO it really should be a size-limited read-through cache rather than Mastodon/GTS’ approach of “Store every image/video attached to a post your instance has wandered across for the following X days”
@erincandescent I have plans to fix this (with a remote media storage FASP), hopefully this will get selected by NLNet soon to get it funded
Loading from the remote origin has several other drawbacks like creating load on the remote origin, availability of content overall goes down quite a lot, some implems do not use stable URLs for media, can’t have strict CSP…
But we can definitely do better here, and will!
@renchap @erincandescent @kitten @thisismissem If I try to imagine how I'd solve the problem, I'd do it by groups of 4 or 5 tiny servers whose admins decide they trust each other forming a "local" image pool that is shared by only that small number of servers
@erincandescent also surprisingly when I ran a poll a few weeks ago, media storage was the most voted issue, but DB storage growth was predominant in the replies. Solving this is less clear to me, we would need to have garbage collection of objects in the DB (how to select the ones to purge? What about relationships?), and on-demand fetching of content (how to handle async fetches with the API, availability…)