calcyman wrote:Someone is trying to upload overlarge (many megabyte) hauls. Judging by IP address, I think it might be Apple Bottom, possibly running Day & Night (or a relative thereof).
Aye, and I'm sorry the site suffered problems. This is actually due to an issue with apgmera's design[1] that I've been meaning to report.
If an upload fails, the searcher keeps running and tries to upload a bigger haul the next time the configured number of soups has been completed. Normally that's fine and dandy, but Day & Night is special in that it creates very big hauls very quickly, and by the time another attempt is made, the haul may already be too large to upload at all. When that happens, the upload fails again, apgmera continues searching, and then tries to upload an even larger haul next time. Lather, rinse, repeat.
I always test new rules I want to investigate with smaller hauls first to gauge haul sizes and search rates. Usually there's lots of wiggle room, but Day & Night, when searched with apgmera, is special.
My Day & Night searcher is tuned to use hauls of 8 million soups; this is a compromise to ensure hauls don't hit the 1 MiB haul size limit (the biggest 8m haul I've had was 928 KiB), while also ensuring that Catagolue won't get hammered with a new submission every 30 seconds (one 8m haul takes about 4 to 5 minutes).
The downside is that any time a connection fails, the above issue gets triggered. It's actually happened a few times, but I usually catch it very quickly and terminate and restart the searcher. This time it happened overnight and continued for a few hours; I caught it after I got up and checked on my searchers' health. I didn't expect it would actually cause trouble server-wise -- I was just miffed that a few hours' worth of effort had been wasted.
There are several things that could be done.
- Since there's a haul size limit of 1 MiB, apgmera could be taught not to try and upload bigger hauls to begin with.
- The server could return a meaningful status indicating that the haul size limit has been exceeded; apgmera could recognize this and stop souping (or start from a clean slate) instead of adding even more to what is already too much data for the server to handle.
- The current mechanism of adding new data to an existing haul after a failed connection could be overhauled; instead apgmera could start a new search while also retaining the results of the previous search, and then attempt to submit two (or more) hauls the next time.
- apgmera could also make more than one attempt to submit a haul if a connection fails (for a reason other than the haul being too big). For instance, it could make three further attempts, waiting for (say) 15 seconds, 1 minute and 5 minutes in between respectively, and only give up if they all fail. That should work around intermittent connectivity blips.
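To make the first and last suggestions a bit more concrete, here's a rough Python sketch (not apgmera's actual code, which I haven't based this on). The names `submit_haul` and `send` are made up for illustration; `send` stands in for whatever actually performs the upload and is assumed to raise `ConnectionError` on a connectivity problem. I'm also assuming the limit means "strictly more than 1 MiB is rejected", which may not match what the server really does.

```python
import time

MAX_HAUL_BYTES = 1024 * 1024   # the 1 MiB limit; exact server semantics are assumed
RETRY_DELAYS = (15, 60, 300)   # waits before the three further attempts, in seconds


def submit_haul(payload, send, delays=RETRY_DELAYS, sleep=time.sleep):
    """Try to upload one haul; return "ok", "too_large" or "failed".

    `send` is a hypothetical uploader that raises ConnectionError
    when the connection fails.
    """
    # Never bother the server with a haul it is going to reject anyway.
    if len(payload) > MAX_HAUL_BYTES:
        return "too_large"

    # One initial attempt (no wait), then one further attempt per delay.
    for delay in (0,) + tuple(delays):
        if delay:
            sleep(delay)
        try:
            send(payload)
            return "ok"
        except ConnectionError:
            continue  # connectivity blip: fall through to the next attempt

    return "failed"
```

A caller would keep souping on "ok", and on "too_large" or "failed" could either stop or start from a clean slate rather than letting the haul grow further.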
For now I've implemented a workaround -- my Day & Night searcher will run for at most one haul and then quit. A wrapper script will then restart it, so there'll be no change in practice if the upload was successful; if it wasn't, the search will start from a clean slate, and the next haul won't be overly large.
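For anyone wanting to do something similar, the wrapper idea boils down to roughly the following Python sketch. This is an illustration rather than my actual script: `restart_loop` and its parameters are hypothetical names, and the command line that makes the searcher quit after a single haul is deliberately left to the caller.

```python
import subprocess
import time


def restart_loop(command, max_restarts=None, pause=1.0,
                 run=subprocess.run, sleep=time.sleep):
    """Run `command` again and again; return how many times it was run.

    Each run is expected to process one haul and then exit, so every
    restart begins with an empty haul whether or not the upload worked.
    `max_restarts=None` means loop until interrupted.
    """
    runs = 0
    while max_restarts is None or runs < max_restarts:
        run(command)   # blocks until the searcher exits
        runs += 1
        sleep(pause)   # short breather so a crashing searcher can't spin
    return runs
```

The `run` and `sleep` parameters only exist so the loop can be exercised without launching a real process; in normal use you'd leave them at their defaults.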
Again, I'm sorry that the site suffered problems.
[1] The same issue exists in apgnano, but since hauls don't get that big that quickly in Conway Life, it's not an issue in practice.