It’s the new Lemmy update that is the issue. It broke a lot. Lots of people are abandoning Lemmy anyhow, which is why the servers are so empty now. I don’t expect many Lemmy servers to still be running six months from now.
the lemmy changes are causing excessive resource use on my 'bin instance. so yeah, not using lemmy, but being directly affected by the lemmy snafu.
my failed-message queue is filling up, and it has its own retry logic. that queue buildup also takes disk space. extra processing plus extra disk space leads to 'worker' slowdown, and then to system failures and timeouts.
Lastly, RabbitMQ allows message prioritisation, so you can drop the priority of messages the older they are or the more retries they have been through.
Most of this is either RabbitMQ policy or queue rules based on headers in the AMQP message. Depending on how kbin is generating messages, you might be able to do this as a system admin.
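A rough sketch of the priority idea, assuming a queue declared with RabbitMQ's x-max-priority argument and a retry counter carried in a message header. The header name retry_count and the function priority_for are made up for illustration; how kbin actually tracks retries isn't stated in this thread.

```python
MAX_PRIORITY = 10  # assumed value of the queue's x-max-priority argument


def priority_for(headers: dict, max_priority: int = MAX_PRIORITY) -> int:
    """Return a lower priority the more often a message has been retried.

    `retry_count` is a hypothetical header name used for illustration.
    """
    retries = int(headers.get("retry_count", 0))
    # Fresh messages get the top priority; each retry drops it by one,
    # never going below 0 (the lowest priority).
    return max(0, max_priority - retries)


if __name__ == "__main__":
    for retries in (0, 3, 25):
        print(retries, priority_for({"retry_count": retries}))
```

The result would go into the AMQP priority property when the message is republished, so fresh activity gets handled before messages that keep failing.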
Oh, interesting. My bad then, it's common for people to be unaware that kbin is a different thing from Lemmy and so I made an incorrect assumption.
I suppose this reveals some room for improvement in kbin, then. Other servers' problems shouldn't be impacting kbin as badly as this, likely indicating that kbin needs to add some robustness when it comes to dealing with stuff like this.
your thought process isn't completely off. if my server product was detecting the failures correctly, these resources wouldn't pile up.
i don't think people really understand just how brand new all this stuff is. 'the fediverse' is under active development. they call it the 'bleeding edge' of technology because it's painful. most fediverse servers are experiencing growing pains of some sort.
the Lemmy/kbin sides are still wet behind the ears. i just hope people don't give up!
The beginning of reddit was much the same. Things stopped working all the time. Weird bugs popped up. And there were people posting posts like this a lot trying to figure out what was going on.
A friend of mine tells me the instance mas.to (last night, 1/2/24) produced this error message for him:
Due to the technical issues they are having we are temporarily stopping delivery to kbin.social, as the failures and retries are impacting our site performance. We'll monitor the situation and unblock once they're back up and running again.
Figured you'd want to know, and thank you for the service you provide.
Edit: Appears to be unblocked now, but who knows if there are other instances that disconnected.
I'm the admin of mas.to. We didn't block in the moderation sense (suspend/limit), we just stopped delivery of content to kbin.social to prevent even more failures and retries piling up. It wasn't good for performance and probably not helping kbin.social either! I'll start delivery again on mas.to and my other instance!
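For anyone curious what pausing delivery could look like if it were automated, here is a minimal circuit-breaker style sketch. It is not how Mastodon or mas.to does it; the class name DeliveryPause, the thresholds, and the cooldown are all assumptions for illustration.

```python
import time


class DeliveryPause:
    """Pause delivery to a host after repeated failures (hypothetical helper)."""

    def __init__(self, max_failures=5, cooldown_seconds=3600.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.failures = {}      # host -> consecutive failure count
        self.paused_until = {}  # host -> time at which delivery may resume

    def record_failure(self, host):
        self.failures[host] = self.failures.get(host, 0) + 1
        if self.failures[host] >= self.max_failures:
            # Too many failures in a row: stop delivering for a while.
            self.paused_until[host] = self.clock() + self.cooldown_seconds

    def record_success(self, host):
        # Any success clears the failure history for that host.
        self.failures.pop(host, None)
        self.paused_until.pop(host, None)

    def should_deliver(self, host):
        until = self.paused_until.get(host)
        if until is None:
            return True
        if self.clock() >= until:
            # Cooldown is over: allow a trial delivery, but keep the count
            # one short of the limit so a single new failure pauses again.
            del self.paused_until[host]
            self.failures[host] = self.max_failures - 1
            return True
        return False
```

The point of the cooldown is the same as in the comment above: retries against a host that is down stop piling up on the sender, and the receiving site gets room to recover.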
Thank you so much, ernest. Today I was able to go to my subscriptions and notifications, view them under All, and do all the normal stuff. For a second there I thought there was something going on!!!
Yeah, it's true. Since Sunday, I've been noting errors that I'm still working on resolving. It doesn't make it easier that it's the post-holiday period, and due to travels and security measures, it's not the easiest task. I'm working to get everything back to normal as soon as possible.
Thanks for putting in all this work, especially over a period that's traditionally vacation time. Make sure you're striking a good work/life balance. If you can get the site basically functional (as it appears to be now), don't sweat the small stuff. :)
Not just me eh, sorry to hear! I had a Jellyfin upgrade go sideways (my fault) once during the holidays and that was bad enough - and all my users live with me! Sorry that you are pulling your hair out, and personally I'm more than content to wait it out until after your vacation.
I'm curious, when you say "on-site work", do you mean you need to travel onsite to do some work for kbin? At a host somewhere? Otherwise, when you say "security measures" for travel, how is that related? Maybe you just mean you are travelling and it is taking up your time...?
Can you predict when the situation will improve? Should we take a week off?
PS: I think I've seen too many devs in crisis after rolling out "minor" patches late on Fridays or before holidays; there should be an unwritten rule about it.