Hey All!
Before anything else: I have dealt with this issue for literally years and cannot figure it out, so I am hoping some of you gurus can shed some light on where to go with this.
NFO VDS (unmanaged)
OS: Debian 10
Kernel: 4.19.0-13-amd64
MetaMod: 1.11.0-dev+1130V
SourceMod: 1.11.0.6570
I know firsthand that regular server restarts are good practice, and I have no issue doing them, but I would also like to know the source of my issue. Like clockwork, my server will crash after 3-4 days of running, ALWAYS on map change (no particular map pattern), and ALWAYS with the same error:
"CUtlLinkedList overflow! (exhausted memory allocator)"
I use SourceMod/MetaMod, and ensure regular updates to the latest versions. I have used a few different tools to analyze the crash dumps, particularly the useful Throttle by one of the SM devs (asherkin). Here is a link to the most recent crash with this same error => https://crash.limetech.org/vmywqehtkcfi
Here is the console output of that crash on map changing:
https://pastebin.com/Ljq6RJ2T
Output of debug.log:
https://pastebin.com/gbLiJD4Q
Frame #0 in debug.log is particularly interesting to me, since the backtrace stops because memory cannot be accessed:
#0 0xf67366bb in ?? ()
Backtrace stopped: Cannot access memory at address 0xffc4f140
No symbol table info available.
While this is probably more related to the SRCDS process itself, I should note that memory usage on the VDS never exceeds 50-60% when the crashes happen. Swap usage is even lower, at around 10%. Currently the server is at 1.13G / 2.92G physical and 309M / 3194M swap.
This issue has perplexed me for a while now. I have scoured the internet and forums quite a bit, and the only suggestion I find is to just restart the server daily, which is fine, but I would also like to get to the bottom of the issue at hand. I thought plugins might be causing a slow memory leak, so for a while I used these commands in my 'server.cfg' to unload/reload them all at every map change:
sm plugins unload_all
sm plugins refresh
The result was the same: after 3-4 days, the exact same memory allocator crash. It happens without those two commands as well, which makes them a moot point. I have also used:
sm_dump_handles
Output:
-- Approximately 700610 bytes of memory are in use by Handles.
To clarify memory usage: at worst my plugins use about 700K of memory, and I see no drastic spikes past that point.
I am not sure how else to troubleshoot this issue or where to look. Is there a specific Linux kernel I should try? A system/srcds memory setting? I know those things are a stretch, but again, I am not sure where else to look or what to do.
If any of you gurus have any ideas whatsoever, I am all ears and appreciate your time. No rush in response. If any more details are needed, please let me know. Thanks for your time and any possible help you could provide.
Hope everyone has a Happy New Year!
SRCDS crashing after 2-3 days on Map Changes (CUtlLinkedList overflow!)
Re: SRCDS crashing after 2-3 days on Map Changes (CUtlLinkedList overflow!)
I am a bit confused by your message here. You said that you restart your server nightly, but this crash occurs after 3 days of it not being restarted?
Re: SRCDS crashing after 2-3 days on Map Changes (CUtlLinkedList overflow!)
Clarification: it's good practice to restart a server daily. However, if I do not restart my server, it will always crash within that 2-3 day window. I would like to find the source of the crashes regardless.
Re: SRCDS crashing after 2-3 days on Map Changes (CUtlLinkedList overflow!)
There are a couple of things here I can suggest or ask.
SRCDS has had this problem off and on for years, and generally the longer it runs without a restart, the more stability and performance can suffer. That said, I would expect it to last more than 3-4 days, but even small things can make a large difference by that point, and I suspect that is what is happening here.
SRCDS has some internal limits even with extra resources on the host VDS. Have you seen a trend of CPU or memory usage increasing over time, or does it stay pretty level after the first day? How long do your maps run before a changelevel happens?
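One way to answer the trend question is to sample the SRCDS process itself rather than the whole VDS. Below is a minimal Python sketch, assuming a Linux host where /proc/<pid>/status exposes the VmRSS and VmSize fields. The function names, the PID, and the CSV path are all hypothetical and only for illustration.

```python
import csv
import time
from pathlib import Path


def parse_status(text):
    """Return VmRSS and VmSize (in kB) found in the text of /proc/<pid>/status."""
    wanted = {"VmRSS", "VmSize"}
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Lines look like "VmRSS:     402112 kB"; keep the number only.
            values[key] = int(rest.split()[0])
    return values


def log_sample(pid, csv_path):
    """Append one timestamped memory sample for the given PID to a CSV file."""
    status = Path(f"/proc/{pid}/status").read_text()
    sample = parse_status(status)
    with open(csv_path, "a", newline="") as fh:
        csv.writer(fh).writerow(
            [int(time.time()), sample.get("VmRSS"), sample.get("VmSize")]
        )
    return sample
```

Calling something like log_sample(srcds_pid, "srcds_mem.csv") every few minutes (from cron, for example) over the 3-4 day window should show whether the process's own memory climbs steadily or stays level after the first day.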
The commands you used to unload plugins stop them from functioning, but that isn't always the same as temporarily uninstalling them. Additionally, it still leaves SourceMod itself and Metamod:Source running. Have you tried running with all addons disabled? An easy way to do that is to rename the addons folder to "addons-off" while the server is shut down.
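If you want to script the rename so it is easy to flip back and forth between tests, here is a rough Python sketch. The function name and the game directory argument are hypothetical. It assumes the server is already stopped and that the folder is called "addons" inside your game directory.

```python
from pathlib import Path


def set_addons_enabled(game_dir, enabled):
    """Rename addons <-> addons-off inside game_dir. Run only while the server is stopped."""
    game_dir = Path(game_dir)
    on = game_dir / "addons"
    off = game_dir / "addons-off"
    src, dst = (off, on) if enabled else (on, off)
    if dst.exists():
        # Already in the requested state; nothing to do.
        return dst
    if not src.exists():
        raise FileNotFoundError(f"neither {on} nor {off} exists")
    src.rename(dst)
    return dst
```

Calling set_addons_enabled(game_dir, False) before a test run and set_addons_enabled(game_dir, True) afterwards keeps the plugin files untouched, so nothing has to be reinstalled.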
TimeX