[GE users] Capping logfile size

Parikh, Neal Neal.Parikh at gs.com
Tue Oct 21 14:50:52 BST 2008


To clarify, I don't care about the user filling up his own directory,
since that doesn't really happen in practice (plus it only affects that
user, rather than all the other users). What happens is that they don't
write code properly, and stderr or stdout gets completely flooded,
unintentionally, by some infinite loop. So there is only one directory,
the common logfile directory, that I am concerned about. Watching that
one directory should be much lower overhead than what you were
describing.

Yes, I will open an issue.
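
As a stopgap outside SGE, the check on the common logfile directory could
be sketched roughly as below. This is only an illustration under
assumptions: the function name oversized_logs and the idea of running it
periodically (for example from cron) are not part of SGE, and the
directory and threshold are whatever the administrator picks.

```python
import os


def oversized_logs(log_dir, limit_bytes):
    """Return (path, size) pairs for regular files in log_dir over limit_bytes."""
    hits = []
    with os.scandir(log_dir) as entries:
        for entry in entries:
            # Only plain files count; subdirectories and symlinks are skipped.
            if not entry.is_file(follow_symlinks=False):
                continue
            size = entry.stat(follow_symlinks=False).st_size
            if size > limit_bytes:
                hits.append((entry.path, size))
    # Largest first, so the worst offender leads the report.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Because it only stats files in one directory, this works while jobs are
still running; what to do with the result (mail it, log it) is left open.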

-----Original Message-----
From: Reuti [mailto:reuti at staff.uni-marburg.de] 
Sent: Tuesday, October 21, 2008 7:11 AM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] Capping logfile size

On 21.10.2008 at 06:51, Ron Chen wrote:

> --- On Tue, 10/21/08, Parikh, Neal <Neal.Parikh at gs.com> wrote:
>> (2) Hard limits. If the issue mentioned above continues, and some
>> users aren't good about fixing their code to not produce such huge
>> logs, then I was hoping to find some way to just limit the size of
>> the log files, either by having SGE just stop updating those files
>> after a certain point, compressing the logfile and rotating it, or
>> something like that.
>
> Note that the user can submit a job that fills up the temp  
> directory or the user's home directory, and SGE won't be able to  
> detect that! In order to make it 100% loophole free, SGE will need  
> to trap all the system calls performed by the job, and that's high  
> overhead IMO.
>
> I think the truncate(2) system call can be used to reduce the file
> size, but we will need to discuss how everything fits in --
> as each time SGE truncates the files, new data is also being written
> to those files at the same time.
>
> Neal, can you open an issue so that we can track this feature  
> request? Otherwise after a week or two we will forget all this  
> discussion.
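
To make the truncate(2) concern concrete, here is a minimal sketch using
Python's os.truncate, which wraps that system call. The helper name
cap_file is hypothetical and this is not how SGE does anything today. The
interaction with a running writer depends on how the file was opened: a
writer using O_APPEND continues at the new end of the file, while a
writer without it keeps its old offset and leaves a hole of zero bytes
behind the truncation point. How a given job's stdout/stderr is opened is
an assumption to verify.

```python
import os


def cap_file(path, limit_bytes):
    """Truncate path to zero length once it has grown past limit_bytes.

    Returns True if the file was truncated, False otherwise.
    """
    if os.path.getsize(path) <= limit_bytes:
        return False
    # Same effect as truncate(2) with length 0; data written so far is lost.
    os.truncate(path, 0)
    return True
```

Note the race between the size check and the truncation: anything written
in between is discarded along with the rest.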

This also depends on the OS. AFAIK in NEC's Super-UX you can set the  
user limit "fspace" for the space allocated by all files of a process  
in total (in addition to fsize where it's per file).

-- Reuti
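
On POSIX systems the per-file fsize limit is RLIMIT_FSIZE. The sketch
below shows what a process sees when it writes past that limit; the
function name and parameters are made up for the illustration, and it
says nothing about the Super-UX "fspace" limit. It ignores SIGXFSZ so
that the write fails with EFBIG; with the default signal action the
process would be terminated instead.

```python
import errno
import os
import resource
import signal


def write_with_fsize_limit(path, limit_bytes, chunk, count):
    """Write up to `count` chunks to path under a soft RLIMIT_FSIZE.

    Returns (bytes_written, hit_limit).
    """
    # Ignore SIGXFSZ so the kernel reports EFBIG instead of killing us.
    signal.signal(signal.SIGXFSZ, signal.SIG_IGN)
    soft, hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    resource.setrlimit(resource.RLIMIT_FSIZE, (limit_bytes, hard))
    written, hit_limit = 0, False
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            for _ in range(count):
                try:
                    # A write that crosses the limit may be cut short.
                    written += os.write(fd, chunk)
                except OSError as exc:
                    if exc.errno != errno.EFBIG:
                        raise
                    hit_limit = True
                    break
        finally:
            os.close(fd)
    finally:
        # Put the previous soft limit back for the rest of the process.
        resource.setrlimit(resource.RLIMIT_FSIZE, (soft, hard))
    return written, hit_limit
```

The limit applies to every file the process writes, not just its log
files.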


>  -Ron
>
>
>>
>> In both cases, it will definitely be a cluster administrator setting;
>> I don't want users setting any of this at submission time.
>>
>> If there is no simple way to do this, I'll find some workaround
>> outside SGE, but it would have been nice to have the capability.
>>
>> Thanks,
>> Neal
>>
>> -----Original Message-----
>> From: Rayson Ho [mailto:rayrayson at gmail.com]
>> Sent: Monday, October 20, 2008 12:21 PM
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] Capping logfile size
>>
>> First of all, I would like to find out how you want to limit the
>> file size: do you want the job owner to set the limit at job
>> submission time, or do you want the limit to be set globally by the
>> cluster administrator?
>>
>> Currently we don't have a direct way to do this, but you can use an
>> external load sensor (suggested by Reuti).
>>
>> However, if you open an issue (see
>> http://gridengine.sunsource.net/issues/ ), then we may be able to add
>> this feature inside SGE in a future version.
>>
>> I just read the code; one way to implement this feature is to add
>> some code in the main loop of execd. We then iterate through the list
>> of jobs and check the size of each job's out/err file
>> (JB_stdout_path_list and JB_stderr_path_list). This should be really
>> simple to do (maybe a few hours of work), but if we implement it this
>> way, only the cluster administrator can set the threshold limit.
>>
>> Rayson
>>
>>
>>
>> On 10/20/08, Parikh, Neal <Neal.Parikh at gs.com> wrote:
>>> Thanks. This is close to what I want to do but not quite the same. I
>>> want to send an alert email about logfile size even if the job is
>>> still running; it seems like this would only allow me to send the
>>> email after the job is complete. Is there a way of doing that?
>>>
>>> -----Original Message-----
>>> From: Reuti [mailto:reuti at staff.uni-marburg.de]
>>> Sent: Monday, October 20, 2008 10:45 AM
>>> To: users at gridengine.sunsource.net
>>> Subject: Re: [GE users] Capping logfile size
>>>
>>> Hi,
>>>
>>> On 20.10.2008 at 15:44, Parikh, Neal wrote:
>>>
>>>> Is it possible to include some automatic monitoring that generates
>>>> an email alert (to some pre-specified addresses, not just the job
>>>> owner)
>>>
>>> if you would just like to kill the job, you could set s_fsize in the
>>> queue configuration. Then any further write would fail. But this
>>> will affect all file accesses of the job, not only the logfile.
>>> Reading bigger files should be possible though.
>>>
>>>> when a job's logfile goes over a certain file size? I want to
>>>> monitor stdout_path_list and stderr_path_list and would prefer to
>>>> do it directly within SGE.
>>>
>>> If you only want to write a warning mail after the job, you could
>>> put it in queue or global epilog and check therein $SGE_STDOUT_PATH
>>> and $SGE_STDERR_PATH
>>>
>>> -- Reuti
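
The epilog check described above might look roughly like this. It is a
sketch under assumptions: it takes SGE_STDOUT_PATH and SGE_STDERR_PATH
from the environment passed in, the function names and the job_id
argument are invented for the example, and actually sending the mail is
left out.

```python
import os

LOG_VARS = ("SGE_STDOUT_PATH", "SGE_STDERR_PATH")


def oversized_job_logs(environ, limit_bytes):
    """Return (variable, path, size) for each job log file over limit_bytes."""
    findings, seen = [], set()
    for var in LOG_VARS:
        path = environ.get(var)
        # stdout and stderr may point at the same file; report it once.
        if not path or path in seen or not os.path.isfile(path):
            continue
        seen.add(path)
        size = os.path.getsize(path)
        if size > limit_bytes:
            findings.append((var, path, size))
    return findings


def warning_text(job_id, findings):
    """Build the body of a warning mail from the findings."""
    lines = [f"Job {job_id}: log file(s) over the size limit"]
    lines += [f"  {var}: {path} ({size} bytes)" for var, path, size in findings]
    return "\n".join(lines)
```

In an epilog, oversized_job_logs(os.environ, limit) would be called once
the job has finished, so this only reports after the fact.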
>>>
>>>> Thanks,
>>>> Neal
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>>
>>>>
>>>

