Pedro-NF wrote: ↑Sat Jul 21, 2018 6:58 pm
Unfortunately, the mass timeouts continue. Like I did before when I had this issue in February, right after my previous post I increased the number of "hyperthreaded cores", going from 4 to 6 this time. That stopped the timeouts for some time, but we just had one with fewer than 10 players on the server (21 Jul 2018, 22:13 EST), with 3 "hyperthreaded cores" barely registering any activity, and a 4th with ~20% usage. I ran an experiment last week, going back to 4 "hyperthreaded cores", and the timeouts became much more frequent, so I switched to 6 again.
By all means, have me take another look at the machine, just to be safe. On your end, take another look at the OS and game server logs to make sure that nothing obvious and unexpected is going on, such as an OOM condition or a clear DoS attack.
Does this happen when there's a lot of I/O? If you're using Windows as your OS, I'll want to check to see if you're on an all-SSD host.
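For the log check on a Linux guest, a rough sketch like the one below could help spot OOM events. It assumes the usual kernel OOM-killer wording ("invoked oom-killer", "Out of memory: Kill..."), which can vary between kernel versions, and the function name is just illustrative. On Windows you would look in the Event Viewer instead.

```python
import re

# Assumed patterns: typical Linux kernel OOM-killer messages.
# The exact wording differs between kernel versions, so treat this as a starting point.
OOM_PATTERN = re.compile(r"invoked oom-killer|Out of memory: Kill", re.IGNORECASE)


def find_oom_lines(lines):
    """Return (line_number, text) pairs for lines that look like OOM-killer events."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if OOM_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

You could feed it an open log file (for example the kernel or syslog file on your distribution; the exact path depends on your setup) and compare the timestamps of any hits against the times of the timeouts.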
This issue is not related to whatever is running on my VDS, or the OP's VDS, or the other people having these issues. It is caused by a physical core running threads from different VDSs. That's simply a disaster waiting to happen. No matter how fast the hypervisor juggles those threads around, eventually that won't happen fast enough, especially when it comes to game servers. Adding more "HT cores", even when they are not needed, decreases the probability of "thread crashing". The problem naturally starts happening more frequently as a physical machine becomes more populated and the probability of "thread crashing" increases. And it will keep happening for as long as a physical core is allowed to run threads from different VDSs.
Thankfully, no, that's not the case.
For clients to start timing out in-game, most games require several seconds without a response from the server.
Hyperthreading doesn't mean that one thread will grind to a halt for multiple seconds, or crash, while another thread runs on the same physical core. Instead, both threads effectively run all the time, with very small additional delays (sub-microsecond) added as their instructions are split into micro-operations and interleaved within the processor to take better advantage of its internal resources. To the OS, these delays simply show up as higher CPU usage.
Scheduling inside Xen also happens with microsecond resolution, at least with the settings that we use.
Multiple-second delays would have another cause, such as a design problem on the software side, an I/O delay, or an attack of some sort.
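If you want to see whether the guest itself ever stalls for seconds at a time, one option is a small watchdog that wakes up every few milliseconds and records any unusually long gap between wake-ups. The sketch below is only an illustration of that idea; the function names, the 10 ms interval, and the thresholds are assumptions, not anything from our tooling.

```python
import time


def find_stalls(timestamps, threshold):
    """Return (start, gap) pairs where consecutive timestamps are more than
    `threshold` seconds apart."""
    stalls = []
    for earlier, later in zip(timestamps, timestamps[1:]):
        gap = later - earlier
        if gap > threshold:
            stalls.append((earlier, gap))
    return stalls


def sample_clock(duration, interval=0.01):
    """Sleep in short steps for about `duration` seconds and record the
    monotonic time of each wake-up."""
    stamps = [time.monotonic()]
    end = stamps[0] + duration
    while stamps[-1] < end:
        time.sleep(interval)
        stamps.append(time.monotonic())
    return stamps
```

For example, `find_stalls(sample_clock(3600), threshold=1.0)` would list any gaps longer than one second over an hour. If a timeout occurs and no such gap is recorded, the process was being scheduled normally at that time, which would point toward the network, I/O, or the game server software.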
If your problem is not related to the other customer's, I should split this into a separate thread.