I've written a PHP script that runs via SSH and nohup, meant to process records from a database and do stuff with them (e.g. process some images, update some rows).
It works fine with small loads, up to maybe 10k records. I have some larger datasets that process around 40k records (not a lot, I realize, but it adds up to a lot of work when each record requires the download and processing of up to 50 images).
The larger datasets can take days to process. Sometimes I'll see memory errors in my debug logs, which are clear enough -- but sometimes the script just appears to die or go zombie on me. The tail of the debug log just stops with no error message, the tail of the nohup log ends with no error, and the process still shows up in a ps listing, looking like this --
26075 pts/0 S 745:01 /usr/bin/php ./import.php
but no work is getting done.
Can anyone give me some ideas on why a process would just quit? The obvious things (like a PHP script timeout and memory issues) are not a factor, as far as I can tell.
Thanks for any tips
PS -- this is hosted on a GoDaddy VDS (not my choice). I sort of suspect that GoDaddy has some kind of limits that might kick in on me despite whatever overrides I put in the code (such as set_time_limit(0);).
One likely suspect is the OOM killer. You can lower the odds of the kernel picking your process by writing to /proc/self/oom_adj. Caution: the kernel usually knows better. Evading the OOM killer can actually cripple the same RDBMS that you are trying to query. What a vicious cycle that would be :). You probably (instead) want to stagger queries based on what you read from /proc/meminfo. If load or swap usage increases exponentially, you need to back off, especially as a background process :).

Additionally, monitor iowait while you run. It has to be averaged from /proc/stat against the time the system booted. Note it when you start and as you progress.

Unfortunately, the serial killer known as the OOM killer does not maintain a body count that is accessible beyond parsing kernel messages. Or, your cron job keeps hitting its ulimit-ed amount of allocated heap. Either way, your job needs to back off when appropriate, or prevent its own demise (as noted above) prior to doing any work.

As a side note, you probably should not be doing what you are doing on shared hosting. If it's this big, it's time to get a VPS (at least) where you have some control over which processes get to do what.
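To make the "stagger based on /proc/meminfo" idea concrete, here is a minimal sketch (in Python for brevity -- the same file reads translate directly to PHP's file_get_contents). The helper names, the 256 MB threshold, and the retry counts are my own assumptions, not anything GoDaddy-specific; MemAvailable also assumes a reasonably recent kernel (3.14+).

```python
import time

def mem_available_kb():
    """Return the MemAvailable figure from /proc/meminfo, in kB (or None)."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                # line looks like: "MemAvailable:  1234567 kB"
                return int(line.split()[1])
    return None  # very old kernels lack MemAvailable

def wait_for_memory(min_kb=262144, delay=5.0, max_tries=10):
    """Block until at least min_kb is available, doubling the delay each try."""
    for _ in range(max_tries):
        avail = mem_available_kb()
        if avail is None or avail >= min_kb:
            return True
        time.sleep(delay)
        delay *= 2  # exponential back-off, as suggested above
    return False
```

Calling something like wait_for_memory() before each batch of image downloads is the "back off when appropriate" part: the job yields instead of growing until the OOM killer notices it.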
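The iowait averaging works the same way: sample the aggregate "cpu" line of /proc/stat twice and compare the deltas. Again a Python sketch with hypothetical helper names; the only assumption about the file format is the standard one, where the fifth numeric field of the cpu line is iowait jiffies.

```python
import time

def cpu_times():
    """Parse the aggregate 'cpu' line of /proc/stat into jiffy counters."""
    with open("/proc/stat") as f:
        fields = f.readline().split()
    assert fields[0] == "cpu"
    return [int(v) for v in fields[1:]]

def iowait_fraction(interval=1.0):
    """Fraction of CPU time spent in iowait over the sampling interval."""
    before = cpu_times()
    time.sleep(interval)
    after = cpu_times()
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    return deltas[4] / total if total else 0.0  # index 4 = iowait
```

Note it when the import starts and as it progresses, as the answer says; if the fraction climbs while you hammer the disk with 50 images per record, that is another signal to throttle.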