Good day,
This may be long, as it is a problem I have been fighting for a long time, but I desperately need help and I hope someone out there has some ideas for me. My server technicians have few ideas beyond changing the software.
Summary:
We have a dedicated server with around 150 active websites (all WordPress) and e-mail addresses for some of them. Every day around 4pm, all websites slow down, emails are not delivered on time (my tests show that they arrive about an hour late), and when I look at the server over SSH, it looks like almost every individual website is hitting the server at the same time (even development sites where no traffic should actually take place). We've been dealing with it for a while, trying out a bunch of things that we thought might be the culprit.
Each site has the same plugin suite, including Wordfence for security and BackupBuddy for backups. In our first round of changes, we found that Wordfence was running scans at this time of day. We went to every single website and turned off Wordfence's scanning feature (not ideal). We thought this had helped, but that may have been a coincidence, because the problem started again.
Then we took a closer look at the output of the top command and found that what seemed to be killing the server every day was all the sites running cron jobs, some of which seemed to be connected to the backup plugin. That's why we disabled the backup plugin on every website. That seemed to help for a day, but it may also have been a coincidence, because the problem still exists and has gotten worse in the last few days.
Now my server people are saying again and again that they are just random processes and the server is just overloaded, but why every day at the same time? You can see how the running processes jump from about 14 to 50, 60, 80, and so on.
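One thing I could do to pin down exactly when the jump starts is to log the process count once a minute and compare it with the access logs afterwards. A minimal sketch in Python, assuming a Linux server where /proc/loadavg has its usual five fields (three load averages, runnable/total scheduling entities, last PID). The function names and the sample line are made up for illustration, and the "runnable" figure may not match what top reports exactly:

```python
from datetime import datetime


def parse_loadavg(text):
    """Parse one line in /proc/loadavg format.

    Returns (1-minute load, runnable entities, total entities).
    """
    fields = text.split()
    load1 = float(fields[0])
    running, total = fields[3].split("/")
    return load1, int(running), int(total)


def snapshot(path="/proc/loadavg"):
    """Return a timestamped one-line summary that could be appended to a log."""
    with open(path) as handle:
        load1, running, total = parse_loadavg(handle.read())
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    return f"{stamp} load1={load1:.2f} running={running} total={total}"


# Illustrative sample line, not real data from our server.
print(parse_loadavg("3.75 2.10 1.05 12/480 23456"))
```

Calling snapshot() from a once-a-minute cron job and appending the result to a file should give a timeline showing whether the spike really starts at the same minute every day.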
Now, looking at the output of the top command, it does not seem to be a cron job; it just shows ordinary requests like /index.php and /wp-login.php. That is the normal stuff you'd expect, but the volume gets much worse when the problem happens, even on sites that I know are not getting visitors.
My server people suggest switching to CloudLinux, which I am not opposed to, but I wanted to get some advice first, as apparently no one has a clue and everyone is just shooting in the dark.
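To check where those /index.php and /wp-login.php requests come from, something like the following could tally requests per client IP and per path from the access logs around 4pm. This is only a sketch: it assumes the logs are in Apache "combined" format, and the sample lines are invented (using documentation IP ranges), not taken from our server.

```python
import re
from collections import Counter

# Matches the start of an Apache "combined" log line (assumed format).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)'
)


def tally(lines):
    """Count requests per client IP and per (method, path)."""
    by_ip = Counter()
    by_request = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        by_ip[match["ip"]] += 1
        by_request[(match["method"], match["path"])] += 1
    return by_ip, by_request


# Made-up sample lines, not real traffic.
sample = [
    '203.0.113.7 - - [10/Oct/2018:16:00:01 +0000] "POST /wp-login.php HTTP/1.1" 200 1234 "-" "-"',
    '203.0.113.7 - - [10/Oct/2018:16:00:02 +0000] "POST /wp-login.php HTTP/1.1" 200 1234 "-" "-"',
    '198.51.100.4 - - [10/Oct/2018:16:00:02 +0000] "GET /index.php HTTP/1.1" 200 5678 "-" "-"',
]

by_ip, by_request = tally(sample)
print(by_ip.most_common(5))
print(by_request.most_common(5))
```

If most of the hits turned out to be POSTs to /wp-login.php from a handful of addresses across many sites, that might suggest outside traffic rather than our own cron jobs, but I have not confirmed this.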
Has anyone ever seen such a thing? My knowledge of all this is limited and self-taught, but any help is greatly appreciated.
Restarting HTTP and SQL resolves the issue, but in most cases the problem returns after a few seconds.
I will try to describe our specific situation more precisely.
The 150 or so sites running WordPress are not all up to date (some have custom plugins or obsolete themes and would require work to update, and the clients are unwilling to make those changes). Most are running on PHP 5.6 (another thing I need to fix, but that takes time we do not have).
The drive is 90% full, which I know is also not ideal, but it is expensive to fix.
Server specifications are:
Intel Xeon E3-1270 V2, 3.5 GHz, quad core
16 GB RAM
800 GB SSD
Mirrored drive for backup
cPanel / WHM / MySQL
I can provide any other information needed to get to the bottom of this, because my clients are not happy.
Should I have this kind of problem with these server specifications and so many sites? Or is something going on?
Is the problem the outdated PHP, the sites themselves, or the lack of storage space? I just do not know what would cause more than 100 websites to hit the server at the same time and bring it to a crawl.
Thanks in advance for any help you can provide.