Welcome to today’s daily kōrero!
Anyone can make the thread, first in first served. If you are here on a day and there’s no daily thread, feel free to create it!
Anyway, it’s just a chance to talk about your day, what you have planned, what you have done, etc.
So, how’s it going?
Anyone got big plans for the long weekend? Apparently schools across NZ are closed on Tuesday, a bit of a last minute surprise for poor planners like us!
We have family visiting, and I also hope to do an upgrade of the Lemmy image caching system “pictrs”. This will involve migrating the database into a postgres setup. It took 12 hours in my test run, and I expect it to perhaps take half that on the much beefier production server. In any case, if it’s not available it should only affect images, but in my test run the site remained perfectly functional somehow.
We also have a bit of a pre-fetching setup running that is pulling posts and comments from lemmy.world before they are sent to us. We hope this will help us get back in sync with lemmy.world, but at the very least it should help us get comments and posts through a bit quicker. We can’t pre-grab the votes so you may see a bunch of lemmy.world posts in New that have no votes.
We are currently tracking about 6 hours behind lemmy.world, but in theory people replying to you should now come through a lot quicker.
Got a whole heap of small projects in my head that I want to get done, but nothing’s been put on paper so I’m beginning to have some doubts as to what I’ll get done lol.
Ahh that Lemmy.world delay thing explains why I haven’t been seeing some alerts for replies etc. Still don’t fully understand the whole “hours behind” thing and how it all works.
It can be a little tricky to understand exactly how the issue presents itself, because different parts move at different speeds. I’ll try to explain a little more. I hope you were planning on reading a novel today because there’s one coming 😆
Every time someone takes an action (comment, create post, vote on post or comment) a little piece of information is created and added to a queue. This is called an “activity”.
This activity is sent to the instance of the community it happened under. For example, if you vote on a post in a community memes@lemmy.world then lemmy.nz sends this information to lemmy.world, then the lemmy.world server sends that vote to every instance that has at least one person subscribed to that community.
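As a rough illustration of that routing, here is a minimal Python sketch. It is a simplified model, not Lemmy's actual code: the `Community` class and `delivery_path` function are hypothetical names, and it assumes the community's home instance forwards to every subscribed instance other than itself.

```python
from dataclasses import dataclass, field


@dataclass
class Community:
    name: str   # e.g. "memes"
    home: str   # instance that hosts the community, e.g. "lemmy.world"
    subscriber_instances: set = field(default_factory=set)


def delivery_path(actor_instance: str, community: Community) -> list:
    """Return the (sender, receiver) hops for one activity, in order."""
    hops = []
    # Step 1: the acting user's instance sends the activity to the community's home.
    if actor_instance != community.home:
        hops.append((actor_instance, community.home))
    # Step 2: the home instance fans it out to every subscribed instance.
    for instance in sorted(community.subscriber_instances):
        if instance != community.home:
            hops.append((community.home, instance))
    return hops


memes = Community("memes", "lemmy.world", {"lemmy.nz", "aussie.zone"})
for sender, receiver in delivery_path("lemmy.nz", memes):
    print(f"{sender} -> {receiver}")
```

The point to notice is that every hop after the first has the community's home instance as the sender, so its outgoing queue is what decides how quickly everyone else sees the activity.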
Recently, Kbin.social has had issues where it has sent tens of thousands of activities to lemmy.world in a very short amount of time (specifically to the games@lemmy.world community). This is presumed to be a bug as it seems to be the same thing sent lots and lots. Now Lemmy.world tries to send these tens of thousands of activities to every instance that has at least one person subscribed to !games@lemmy.world.
As a result of this happening, a bug in lemmy was found. Lemmy.world sends the activity to lemmy.nz (or any other instance), then once that is complete it sends the next one. In theory it should just fire them all off in parallel, but it doesn’t because if the order is wrong then it might end up in an inconsistent state (say, if someone edited a post twice, you don’t want to get the second edit first then the first edit second).
Because they are sent one at a time, there is a limit to how many it can send. This limit is determined by the time it takes for lemmy.world to send something to lemmy.nz and then receive a response that it has been received. The round trip takes about 250ms, or 1/4 of a second. This limits us to receiving only about 4 activities per second. We can’t improve on this speed, because the latency is largely because we are hosted in NZ and they are hosted in Finland on the other side of the world. Aussie.zone also has issues, but can handle slightly more than us as they are slightly closer.
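The arithmetic behind that limit is simple enough to write down. This is just a back-of-the-envelope sketch (the function name is made up for illustration), assuming each activity has to finish its full round trip before the next one starts:

```python
def max_activities_per_second(round_trip_seconds: float) -> float:
    """Upper bound on throughput when activities are sent strictly one at a time."""
    return 1.0 / round_trip_seconds


# A round trip of about 250 ms, as measured between lemmy.world and lemmy.nz.
print(max_activities_per_second(0.250))  # 4.0
```

A shorter round trip raises the ceiling, which is why an instance closer to Finland can take in more per second.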
Now lemmy.world is creating activities at around 4 per second (posts, comments, votes, and a few others put together), or just under, on weekdays. That means we are barely taking on the activities as fast as they are being created. Then Kbin sends a couple of hundred thousand activities within an hour and lemmy.world is sitting there trying to send them, but can only send 4 per second, so it takes a long time to get through them. Add to that the fact that lemmy.world is generating new content at almost 4 per second, and there’s really no chance to catch up on that backlog. Luckily the weekend comes, and that backlog comes down. Over three weekends we are down to only about 40,000 activities behind.
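To get a feel for why the backlog barely moves on weekdays, here is a small sketch. The numbers are illustrative assumptions, not measurements: "just under 4 per second" is taken as 3.8, and the backlog as 200,000 activities. The function name is hypothetical.

```python
def hours_to_clear(backlog: int, send_rate: float, create_rate: float) -> float:
    """Hours until the queue is empty, given how fast activities are sent and created."""
    net_rate = send_rate - create_rate  # activities per second actually removed from the queue
    if net_rate <= 0:
        return float("inf")  # the queue never shrinks
    return backlog / net_rate / 3600


# Assumed weekday figures: sending 4/s while 3.8/s of new activity keeps arriving.
print(round(hours_to_clear(200_000, 4.0, 3.8)))  # roughly 278 hours, more than 11 days
```

With quieter weekend traffic the gap between the two rates grows, so the same backlog clears much faster.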
Then a couple of days ago lemmy.world got another spike of activities from Kbin and here we go again!
Now the “hours behind” thing. Someone posts to a lemmy.world community, and it gets added to the back of the queue at say 4pm. It slowly makes its way to the front of the queue as other activities are sent, and when it reaches the front it gets sent to Lemmy.nz and it’s now 10pm. It took 6 hours for that post to be sent to lemmy.nz after it was posted to lemmy.world, so that’s a 6 hour lag, 6 hours behind.
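If you want to turn a queue length into "hours behind", a rough sketch looks like this. It assumes the queue drains at a steady 4 activities per second, and the function name is made up for illustration:

```python
def lag_hours(queue_length: int, send_rate: float = 4.0) -> float:
    """Hours a newly queued activity waits before it reaches the front of the queue."""
    return queue_length / send_rate / 3600


print(lag_hours(86_400))            # 6.0, i.e. a 6 hour lag
print(round(lag_hours(40_000), 1))  # 2.8
```

So a 6 hour lag corresponds to something like 86,400 activities sitting in the queue ahead of a new post.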
To make things more confusing, federation is not just one way: when you post/comment/vote to a lemmy.world community, the lemmy.nz server needs to send that activity to lemmy.world. Our queue going to lemmy.world is fine, sitting empty because kbin never sent us 100k activities, so it gets sent to lemmy.world straight away. So you have a one way lag - if someone posts to lemmy.world it takes 6 hours before it gets sent to us, but if we post to lemmy.world it is sent straight away. That makes conversations a bit one sided!
Now I’m going to add in a further point of confusion! Yesterday, another instance admin offered to turn on something they have been setting up to try to band-aid the problem. This is a server hosted in Europe (for quick federation) that gets the lemmy.world posts and comments, and sends a request to lemmy.nz for that content. When lemmy.world sends lemmy.nz details of a post, lemmy.nz has to do some things like sending a request back to lemmy.world to get some info about the post, generate a thumbnail, etc. Sometimes this can take seconds to do, so this slows down the process (is the theory). By asking lemmy.nz to grab posts before lemmy.world tries to send them, the theory is that any lag caused by lemmy.nz fetching this info will be mitigated. We aren’t able to grab votes, but the hope is that it will be slightly faster getting through the backlog. Because lemmy.nz and aussie.zone are similar in their queue trends, we’re using them as a control to see if it helps (so far no obvious big change but we are slowly making ground on them so perhaps that’s because of this).
At the very least, with this pre-fetching of posts and comments, if someone replies to you on a lemmy.world community it should now come through almost immediately, though it will take some time to see the votes which are still delayed.
And just to clarify, it’s all based around communities. If a lemmy.world user votes on a post in a community on beehaw.org, it should come through straight away, as beehaw.org is responsible for sending the vote. If a beehaw.org user votes on something on a community on lemmy.world, it will be delayed, because lemmy.world is responsible for sending the activity.
I’m really hoping the Easter weekend will lead to a drop in lemmy.world activity that will let us catch up, though we are still only another kbin bug from being behind again, at least until a lemmy software update is done to resolve the issue.
Anyway, hope that made things clear 😑 (though I suspect it made you more confused 😆)
Solid description, you’re the greatest, Dave!
I know 😁
Here’s a graph of the number of “activities” we are behind lemmy.world, and aussie.zone as well. Last time we had the issue (a couple of weeks back) aussie.zone recovered much faster than us. Now we seem to be gaining on them, 30k worse than them down to 10k worse than them over the last 24 hours. So perhaps the prefetching is working!
The up and down is just normal variance over the day, it improves when the other side of the world are asleep, and gets worse when they all wake up.
WOW! Thank you for all that knowledge :) The relationships between the different instances, and how they work together suddenly makes sense - seems pretty simple when you break it down like that lol.
Do you think there could ever be a “perfect” solution to this problem? Or due to how this whole Federation thing works with all these different instances, is this just a limitation that can only get better to a point?
It’s fixable. Just that it wasn’t known to be a problem until recently.
Here’s the issue raised for it, where there is a bunch of conversation about ways to fix it.
The main issue is keeping the order right, so you can probably fix that by simply sending the date and time along with it and appropriately handling it on the receiving side. This has its own pros and cons that I think were discussed in that issue, but all in all, rest assured that top people are looking at it 🙂
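To show what "handling it on the receiving side" could look like, here is a minimal Python sketch. It is only one possible interpretation and not the fix being discussed in the issue: the `Edit` and `Receiver` classes and the `published` field are hypothetical, and the dates are arbitrary.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Edit:
    post_id: int
    body: str
    published: datetime  # hypothetical timestamp attached by the sending instance


class Receiver:
    def __init__(self):
        self.posts = {}  # post_id -> (body, published) of the newest edit applied so far

    def apply(self, edit: Edit) -> bool:
        """Apply an edit unless a newer (or equally new) one was already applied."""
        current = self.posts.get(edit.post_id)
        if current is not None and current[1] >= edit.published:
            return False  # stale edit arrived late, ignore it
        self.posts[edit.post_id] = (edit.body, edit.published)
        return True


first = Edit(1, "first edit", datetime(2024, 1, 1, 16, 0))
second = Edit(1, "second edit", datetime(2024, 1, 1, 16, 5))

receiver = Receiver()
receiver.apply(second)  # arrives first because it was sent in parallel
receiver.apply(first)   # arrives late and is ignored
print(receiver.posts[1][0])  # second edit
```

With a check like this the sender would no longer need to wait for each activity to be acknowledged before sending the next, at the cost of the receiver having to track timestamps for everything it stores.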
Though it seems major enough that I think it will be a while before we get a fix released, and will have to limp along in the meantime.
Oh that’s good to hear :)
Just skimmed through the issue, and there seems to be a lot of interesting discussion around it.
Pretty cool, really feels like we’re at the infancy of something, hopefully growing into something great.
For sure! The great thing with federation is you can like the idea but not the implementation, and decide to do it better but get to interact with all the users still.
So we end up with a bunch of similar software that can interact: lemmy, kbin, mbin, sublinks, piefed. No single point of failure, so even if lemmy development was completely abandoned we could just switch software to something else and continue.
Oh okay - I didn’t know about the interaction with different software. Had heard of Kbin before (haven’t yet heard of any of the others), and just assumed it was a completely different type of thing like Mastodon.
That’s pretty cool. How does the interaction with (for example) Kbin work? Can we see their communities, and interact with their posts and vice-versa?