I already get rate-limited like crazy on Lemmy and there are only like 60,000 users on my instance. Is each instance really just one server or are there multiple containers running across several hosts? I’m concerned that federation will mean an inconsistent user experience. Some instances may be beefy, others will be under-resourced… so the average person might think Lemmy overall is slow or error-prone.
Reddit has millions of users. How the hell is this going to scale? Does anyone have any information about Lemmy’s DB and architecture?
I found this post about Reddit’s DB from 2012. Not sure if Lemmy has a similar approach to ensure speed and reliability as the user base and traffic grows.
https://kevin.burke.dev/kevin/reddits-database-has-two-tables/
I love Lemmy but your question is legit. I just signed up with lemmy.world because lemmy.ml is slow/not responding.
Before making a post in lemmy.world guess what? lemmy.world isn’t responding. I know they have scheduled maintenance at 9 CET but it was 20 minutes before that.
Poorly. Lemmy will scale poorly.
I won’t be surprised if the larger instances start locking down more as a way to sustain themselves, like restricting communities or only allowing text posts.
Sometimes you just have to adapt to the situation and keep going until it settles down. The mistake here, I think, is assuming something can’t have flaws and issues, even more so when there’s no corporation behind it. And no one wants corporations.
It isn’t about accommodating to the situation, but planning for long term growth.
Right now, instances of Lemmy don’t have any way to fund server costs other than asking for donations. Outside of Wikipedia, that isn’t a sustainable business model. How is Lemmy supposed to survive if, every time a sub gains critical mass, it shuts down?
planning for long-term growth
Which is part of any scaling effort, and you can’t really guess your way through predicting and resolving bottlenecks; it takes some serious expertise. As far as I know, the Lemmy devs have never built a high-scale service before, and I think that is possibly the single biggest risk to the growth and success of the Lemmy project in general.
Source: that’s my job, I’ve been doing that for some of the most high-scale services in the world for about a decade. I absolutely could help, actually I’d love to, but I definitely won’t under current Lemmy leadership, for reasons: https://lemmy.world/comment/596235
How about helping Kbin?
I think Kbin is something good being built by good people, I get what they’re trying to do, but unfortunately I don’t have a lot of faith that it will turn out to be a successful project.
In terms of technical scaling, I’m puzzled that they went with an interpreted language if the goal is scale. I get that the basic usage of Kbin’s features may not require a ton of CPU-heavy operations or fine-grained memory handling; but once it reaches sufficient scale, there will be edge-case bottlenecks where you’ll want to step off the beaten path and get lower-level, so I’m a bit confused about why they chose a technology that will make those harder to get past rather than easier. PHP is great for rapid prototyping, but I’d argue that’s not what the vision should be here.
About community scale, I’m no expert, but they seem to really care about offering a karma system, and we’ve seen the karma-farming behavior that this has incentivized on Reddit. I don’t see why it would be any different here if enough people end up joining. Lemmy is intentionally not offering a karma system, and it really feels like the healthier move long-term.
I think all it would take would be for the Lemmy devs to admit that they’re in over their heads, and that their political affiliations have been a hindrance to the project, to the point that they transition the governance of it to other people. I really hope they do that. If they do soon enough, they’re so far ahead and built on so much more long-term thinking, that I think it would pretty much make Kbin kinda obsolete. I have no special information about this, so I could be wrong, and I hope for them that I am; but I can see that as a pretty likely outcome.
(That, and on the shorter-term, I wouldn’t contribute to a product I don’t use, and I can’t use it for now because my usage is 100% mobile, and the current lack of API means no native client. I wish the mobile web was better than it is as an application platform…)
Have you found that their political leanings have affected you in any way? Just curious if you have some sort of bias that’s making you think people on the left can’t produce efficient software.
It hasn’t. But letting terrible people have power affects the world by normalizing violence and hatred. It’s not about left or right; if they were American racists against Chinese people, I would have the exact same problem. I’m personally quite on the left, but without the hate.
I am living safe and not being targeted with hateful violence like the Uyghurs or North Koreans are, so this is far, far more important than what can affect me.
Lemmy is entirely open source, so you can see what their architecture looks like, etc… here: https://github.com/LemmyNet/lemmy.
Rate limits, as I understand them from the code, should only apply on a per-IP basis. So you should only be seeing rate limit errors if:
- you’re behind a CGNAT and multiple people who use your ISP are using Lemmy
- you’re sending A LOT of requests to your instance yourself
- the admin of your instance has significantly lowered the rate limits (viewable here: /api/v3/site)
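To make the per-IP point concrete, here is a minimal token-bucket sketch. This is not Lemmy's actual implementation; the class name `PerIpRateLimiter`, its parameters, and the example IPs are all made up for illustration. It only shows why everyone behind the same CGNAT address ends up sharing one budget.

```python
import time


class PerIpRateLimiter:
    """Hypothetical per-IP token bucket (illustrative, not Lemmy's code)."""

    def __init__(self, capacity, per_seconds, clock=time.monotonic):
        self.capacity = float(capacity)              # max burst per IP
        self.refill_per_sec = capacity / per_seconds  # steady refill rate
        self.clock = clock                            # injectable for testing
        self.buckets = {}                             # ip -> [tokens, last_seen]

    def allow(self, ip):
        """Return True if this request fits in the IP's budget."""
        now = self.clock()
        bucket = self.buckets.setdefault(ip, [self.capacity, now])
        tokens, last_seen = bucket
        # Refill in proportion to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last_seen) * self.refill_per_sec)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        bucket[0], bucket[1] = tokens, now
        return allowed  # a real server would answer HTTP 429 when this is False
```

Because the bucket is keyed only by IP, two different people behind the same CGNAT address drain the same bucket, while a user on a different address is unaffected.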
I’m not an expert, but I thought the issue was generally that big instances like lemmy.world were getting overloaded on the server side, not that they were enforcing a manufactured rate limit on individual IPs.
Also, someone else mentioned that on the fediverse even simple things like an upvote are slower and require more work than on centralized platforms because they must be sent to all the instances that are indexing that user/community. As I understand it, that’s inherent to the fediverse: a feature, not a bug, designed for redundancy and resilience.
Again uninformed, but Lemmy seems like it should scale fine. Bigger instances will monetize, driving prospective users to smaller instances, and then rate limiting and server lag won’t be so bad anymore.
I’m not an expert, but I thought the issue was generally that big instances like lemmy.world were getting overloaded on the server side, not that they were enforcing a manufactured rate limit on individual IPs.
From what I can see it’s both. lemmy.world and others are getting overloaded, but there is also an inherent, built-in rate limit in the code itself. You can see what those limits are via /api/v3/site. Now in theory, if you’re actually getting rate-limited you should be seeing HTTP 429 responses from the server. If the server is just overloaded, you’ll get a 5xx response, the request will just time out, or at best you’ll still get a response but after a significant delay (what most people are seeing).
Also, someone else mentioned that on the fediverse even simple things like an upvote are slower and require more work than on centralized platforms because they must be sent to all the instances that are indexing that user/community. As I understand it, that’s inherent to the fediverse: a feature, not a bug, designed for redundancy and resilience.
I don’t want to comment on this too much as I’m not an expert here, but here’s how federation / ActivityPub works from what I understand looking at the code:
Whenever you take any action (or activity), your browser first sends that message to your instance. If your instance owns the community, that message is then propagated out to EVERY linked instance listed at /instances (or /api/v3/federated_instances). If your instance doesn’t own the community, that message is forwarded to the instance that does, and they send it out to EVERYONE on their federated instances list. As you can see, this creates A LOT of network traffic.
This poses an interesting problem… the number of ActivityPub messages goes up as the number of instances increases. But at the same time, as more and more users join a single instance, that instance has to send more and more traffic to individual users’ browsers as they view and respond to posts. So the problem here is trying to find a good balance. And to top it off, the default behavior of most users is going to be to join the largest instances, making those instances incur more and more traffic to serve content.
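A rough way to put numbers on that fan-out, taking the description above at face value (one delivery per linked instance, plus one forwarding hop when the community lives elsewhere). The function names and the instance counts are made up for illustration and are not measured from any real instance.

```python
def deliveries_for_activity(home_owns_community: bool, federated_instances: int) -> int:
    """Server-to-server messages triggered by one activity (vote, comment, post).

    Assumes the simplified model described above: the community's instance
    sends the activity to every instance on its federated list.
    """
    forward_hops = 0 if home_owns_community else 1  # home -> community's instance
    return forward_hops + federated_instances


def total_deliveries(activities: int, home_owns_community: bool, federated_instances: int) -> int:
    """Total messages for a batch of activities under the same model."""
    return activities * deliveries_for_activity(home_owns_community, federated_instances)
```

For example, with a hypothetical 500 linked instances, a single upvote on a remote community comes to `deliveries_for_activity(False, 500)` = 501 messages, and 1,000 such upvotes come to 501,000.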
Again uninformed, but Lemmy seems like it should scale fine. Bigger instances will monetize, driving prospective users to smaller instances, and then rate limiting and server lag won’t be so bad anymore.
Will it though? How would an individual instance monetize? They would have to use donations. If an instance tries to add ads, users will leave for an instance that doesn’t, so it gets no income. They could charge a subscription fee, but again users would just leave and the admins get nothing.
The ideal configuration of the fediverse, as I see it, would be two types of servers: 1) content servers that only host communities but don’t have any real number of users, and 2) user servers that have no communities but most of the users. This way the number of API requests between instances is rather limited. When you end up with a server that has both most of the content and most of the userbase, the workload of that server appears to grow exponentially instead of linearly as the number of new instances rises.
Larger instances will have to monetize to stay afloat. I’ve gone so far as to buy a domain that is very appropriate for a business-oriented Lemmy instance (specifically for job hunting and career development), but don’t yet have time or resources to take it to the next level.
Who says any instances need to grow?
I said ‘afloat’, and was specific about larger instances.
Why should there be larger instances?
There are currently, and there inevitably will be. Instances like lemmy.ml, lemmy.world, and beehaw.org are already large.
Okay, and once those grow to the point where they need to monetize to stay afloat, people will just switch to other instances.
The ideal way that ActivityPub federation works IMO is a bunch of smaller nodes coming together to make a large network.
If you have a bunch of people all on one or two instances then you’ll have a “central hub” of the network that’s constantly overloaded.
That’s my advice to community builders on this platform… Spread out across smaller instances, don’t just all sign up to a big one.
Moving the scaling problem from a few instances to federation is likely to cause more harm than good.
I don’t see how syncing a post across a hundred instances is more efficient than having a hundred users see the post on one instance.
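A back-of-the-envelope way to compare the two cases, as a sketch only. It assumes every user reads the post exactly once, users are spread evenly across instances, the post's home instance sends one delivery to each other instance, and reads are served locally after that. The function name is made up for illustration.

```python
def busiest_server_load(users: int, instances: int) -> int:
    """Requests the post's home server handles for a single post.

    Local reads (its share of the users) plus one federation delivery
    to each other instance. Purely illustrative arithmetic.
    """
    local_reads = -(-users // instances)  # ceiling division
    deliveries = instances - 1
    return local_reads + deliveries
```

Under these assumptions, 100 users on one instance and 100 users on 100 instances both cost the home server 100 requests, which matches the objection above. The picture only changes when instances hold many users each: 10,000 users on one instance is 10,000 requests, while 10,000 users across 100 instances is 199 for the home server.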