Frequent Sympl service test reports

Problem Description

Sympl service test reports with Apache2 temporary failures seem to be happening more frequently than usual. The pattern is always the same: an emailed report that includes
* apache2: FAILED: Temporary failure
then a detailed report with subject “sympl-monit detected service failure”
(contents below under “Error Messages”)
then a third email with
* apache2: PASSED
I’m used to getting a handful of these per month, but recently they have become much more frequent, e.g. 9 sets of the above three messages in the last 3 days.

Is there any kind of resource problem that could be making these temporary failures more frequent?

The system has 2.5GB of RAM and 2GB of swap, of which only half is currently in use. It serves a number of mostly low-traffic web sites and some email. When I look, CPU load is typically less than 1.0. Is there a way to log peaks in CPU, memory or other resource usage?

Any Error Messages

Typical detailed report from this morning

Started sympl-monit.service - Sympl service monitor.
 INFO Runner: apache2: Checking service is enabled
 INFO Runner: apache2: Checking process
 INFO Runner: apache2: Testing connection to 85.119.83.133:http
 INFO Runner: apache2: > OPTIONS / HTTP/1.0
 INFO Runner: apache2: > Host: localhost
 INFO Runner: apache2: >
 INFO Runner: apache2: < HTTP/1.1 200 OK
 INFO Runner: apache2: Connection test OK
 INFO Runner: apache2: Testing connection to 85.119.83.133:https
 INFO Runner: apache2: Connection test temporarily failed: Connection timed out - execution expired
 INFO Runner: apache2: Attempting to stop apache2
 INFO Runner: apache2: Attempting to start apache2
 WARN Runner: apache2: RETRYING (following Temporary failure)
 INFO Runner: apache2: Checking service is enabled
 INFO Runner: apache2: Checking process
 INFO Runner: apache2: Testing connection to 85.119.83.133:http
 INFO Runner: apache2: > OPTIONS / HTTP/1.0
 INFO Runner: apache2: > Host: localhost
 INFO Runner: apache2: >
 INFO Runner: apache2: Connection test temporarily failed: Connection timed out - execution expired
 INFO Runner: apache2: Testing connection to 85.119.83.133:https
 INFO Runner: apache2: Connection test temporarily failed: Connection timed out - execution expired
 INFO Runner: apache2: Attempting to stop apache2
 INFO Runner: apache2: Attempting to start apache2
 WARN Runner: apache2: FAILED: Temporary failure
 INFO Runner: RESULT: 10/11 passed.
sympl-monit.service: Main process exited, code=exited, status=1/FAILURE
sympl-monit.service: Failed with result 'exit-code'.
sympl-monit.service: Triggering OnFailure= dependencies.
sympl-monit.service: Consumed 10.948s CPU time.

Environment

  • Sympl Version: 12
  • Sympl Testing Version: no
  • Debian Version: 12.9
  • Hardware Type: VPS
  • Hosted With: Bytemark

As is, Sympl is detecting that Apache has run out of resources and restarting it, so it’s somewhat expected, but the question is what’s causing it.

At Mythic Beasts, we’ve been seeing a lot of very aggressive AI-related scraping traffic recently, hitting sites with hundreds of connections at once, until they saturate all the Apache processes.

This has led to us implementing some ‘adaptive’ filtering to detect and block these on busy servers which host a lot of sites, although that probably wouldn’t scale down to Sympl at the moment.

Have a look at your logs for each of the sites and see if you can identify if there’s a burst in traffic on one or more of the sites when this happens, and if so, you should be able to blacklist the relevant addresses/networks.
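As a rough way of spotting such a burst, here is a minimal Python sketch that counts requests per client address per minute in an access log. It assumes the log uses Apache's common/combined format; the function name busiest_minutes is just an illustrative name, not part of Sympl.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the start of a Common/Combined Log Format line:
# client address, two ignored fields, then [day/Mon/year:HH:MM:SS zone]
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')

def busiest_minutes(lines, top=5):
    """Count requests per (minute, client address) and return the busiest pairs."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        client, stamp = m.groups()
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        counts[(when.strftime("%Y-%m-%d %H:%M"), client)] += 1
    return counts.most_common(top)
```

You could call it on an open log file, e.g. busiest_minutes(open(path)) with path pointing at one site's access log, and compare the busiest minutes with the times of the failure emails. Any address making hundreds of requests in a single minute is a likely candidate for blocking.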

You can also look in /var/log/apache2 for messages about it running out of resources at the same time, which would confirm that this is what’s going on.
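With the prefork MPM, the message to look for normally mentions MaxRequestWorkers (Apache logs "server reached MaxRequestWorkers setting" when every worker is busy). A minimal Python sketch for pulling out the timestamps of those lines, assuming the default error log format where each line starts with a bracketed timestamp; worker_limit_hits is an illustrative name:

```python
import re

# Apache error-log lines normally start with a bracketed timestamp, e.g.
# [Sat Feb 01 06:25:14.123456 2025] [mpm_prefork:error] [pid 1234] AH00161: ...
STAMP_RE = re.compile(r'^\[([^\]]+)\]')

def worker_limit_hits(lines):
    """Return the timestamps of lines that mention MaxRequestWorkers."""
    hits = []
    for line in lines:
        if "MaxRequestWorkers" not in line:
            continue
        m = STAMP_RE.match(line)
        hits.append(m.group(1) if m else "(no timestamp)")
    return hits
```

Running this over the error log and comparing the timestamps with the sympl-monit reports should show whether the failures line up with Apache hitting its worker limit.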

This is a bit advanced and potentially dangerous, but if you are seeing a burst in traffic, you might want to adjust the values in /etc/apache2/mods-enabled/mpm_prefork.conf (MaxRequestWorkers) and see if that helps, though you may then just find you’re running into other limits.
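One of those other limits is memory: each prefork worker is a separate process, so MaxRequestWorkers multiplied by the per-process memory has to fit in RAM alongside everything else. A back-of-the-envelope sketch in Python; the function name and the figures in the example (1024MB reserved for everything other than Apache, 40MB per Apache process) are assumptions for illustration only and would need to be measured on the real system:

```python
def estimate_max_workers(ram_mb, reserved_mb, per_process_mb):
    """Rough upper bound on MaxRequestWorkers that fits in RAM without swapping."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    # Memory left for Apache after everything else (mail, database, OS) is accounted for
    usable_mb = max(ram_mb - reserved_mb, 0)
    return int(usable_mb // per_process_mb)

# Hypothetical figures: 2.5GB RAM, 1024MB reserved, 40MB per Apache process
print(estimate_max_workers(2560, 1024, 40))  # 38
```

If the current setting is already near or above an estimate like this, raising it is likely to push the machine further into swap rather than help.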