[GE users] Capping logfile size

Reuti reuti at staff.uni-marburg.de
Mon Oct 20 16:42:48 BST 2008


On 20.10.2008 at 17:17, Parikh, Neal wrote:

> Thanks. This is close to what I want to do but not quite the same. I
> want to send an alert email about logfile size even if the job is still
> running; it seems like this would only allow me to send the email after
> the job is complete. Is there a way of doing that?

No, not in a straightforward setup.

Nevertheless, you can a) write a custom load sensor and abuse it to  
monitor the files or b) use a cron job. But only the two standard out  
and err files will be checked. If you have the spool directories  
still in $SGE_ROOT, you can check on each machine for the file:

$SGE_ROOT/default/spool/$HOST/active_jobs/*/config

and grep the stdout_path and stderr_path lines out of it. If the value
is a filename, you are lucky and can check it immediately. Otherwise,
i.e. if it's just a directory, you have to build the filenames as
$job_name.o$job_id and $job_name.e$job_id and append them to the path.
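
The lookup described above can be sketched in Python. This is a minimal sketch, not a tested recipe: it assumes the config file consists of plain key=value lines and that it contains job_name and job_id entries next to stdout_path and stderr_path. The function names parse_config and log_paths are made up for illustration.

```python
import os


def parse_config(text):
    """Parse 'key=value' lines (assumed file format) into a dict."""
    cfg = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without an '='
            cfg[key.strip()] = value.strip()
    return cfg


def log_paths(cfg, is_dir=os.path.isdir):
    """Return (stdout_file, stderr_file) for one job's config dict.

    If a configured path is a directory, the default file name
    $job_name.o$job_id or $job_name.e$job_id is appended to it.
    The job_name and job_id keys are assumptions about the file.
    """
    name, job_id = cfg["job_name"], cfg["job_id"]
    result = []
    for key, letter in (("stdout_path", "o"), ("stderr_path", "e")):
        path = cfg[key]
        if is_dir(path):
            path = os.path.join(path, f"{name}.{letter}{job_id}")
        result.append(path)
    return tuple(result)


if __name__ == "__main__":
    import glob

    # $SGE_ROOT and $HOST must be set in the environment (e.g. in the crontab).
    pattern = os.path.expandvars(
        "$SGE_ROOT/default/spool/$HOST/active_jobs/*/config")
    for config_file in glob.glob(pattern):
        with open(config_file) as fh:
            print(log_paths(parse_config(fh.read())))
```

The is_dir argument is only there so the path logic can be tried out without touching a real filesystem.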

If someone redirects the output with a simple >, you have no chance  
to catch this though.
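
For the cron job variant (b), the size check itself could look roughly like the following. Again only a sketch under assumptions: the helper names oversized and build_alert, the addresses, and the idea of handing the message to a local mail server are illustrative and not anything SGE provides.

```python
import os
from email.message import EmailMessage


def oversized(paths, limit_bytes):
    """Return (path, size) pairs for existing files larger than limit_bytes."""
    big = []
    for path in paths:
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # file not created yet or already removed
        if size > limit_bytes:
            big.append((path, size))
    return big


def build_alert(big, sender, recipients):
    """Build a plain-text alert mail listing the oversized logfiles."""
    msg = EmailMessage()
    msg["Subject"] = f"{len(big)} job logfile(s) over size limit"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content("\n".join(f"{path}: {size} bytes" for path, size in big))
    return msg
```

Sending is then a matter of something like smtplib.SMTP("localhost").send_message(msg), assuming the execution host runs a local mail server. The cron job has to remember which files it already reported, otherwise it will send the same mail on every run.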

HTH - Reuti


>
> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de]
> Sent: Monday, October 20, 2008 10:45 AM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Capping logfile size
>
> Hi,
>
> On 20.10.2008 at 15:44, Parikh, Neal wrote:
>
>> Is it possible to include some automatic monitoring that generates an
>> email alert (to some pre-specified addresses, not just the job owner)
>
> If you would just like to kill the job, you could set s_fsize in the
> queue configuration. Then any further write would fail. But this will
> affect all file accesses of the job, not only the logfile. Reading
> bigger files should be possible though.
>
>> when a job's logfile goes over a certain file size? I want to monitor
>> stdout_path_list and stderr_path_list and would prefer to do it directly
>> within SGE.
>
> If you only want to send a warning mail after the job, you could put
> it in a queue or global epilog and check $SGE_STDOUT_PATH and
> $SGE_STDERR_PATH there.
>
> -- Reuti
>
>> Thanks,
>> Neal
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>>
>
>





