[GE users] Capping logfile size

Parikh, Neal Neal.Parikh at gs.com
Fri Oct 24 20:18:05 BST 2008


This article looks interesting, thanks for the link. 

-----Original Message-----
From: Reuti [mailto:reuti at staff.uni-marburg.de] 
Sent: Thursday, October 23, 2008 6:08 AM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] Capping logfile size

On 21.10.2008 at 16:02, Reuti wrote:

> On 21.10.2008 at 15:50, Parikh, Neal wrote:
>
>> To clarify, I don't care about the user filling up his own
>> directory, since that doesn't really happen in practice (plus it only
>> affects that user, rather than all the other users). What happens is
>> that they don't write code properly, and stderr or stdout gets
>> completely flooded, unintentionally, by some infinite loop. So there
>> is only one directory, the common logfile directory, that I am
>> concerned about. That should be much lower overhead than what you
>> were asking about.

The concept of hard and soft limits is (unfortunately) different in
the underlying OS. Only one limit is in effect there, the soft limit,
and the user can set its value to anything in the range 0 to the hard
limit. It is not used as a warning there (as SGE does for some values
like h_rt/s_rt).
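
A minimal sketch of that OS behaviour, assuming a Unix host and Python's standard `resource` module. The helper name `set_soft_fsize` and the 1 GiB value are made up for illustration and are not part of SGE. The process reads its own file-size limit pair and moves the soft limit anywhere up to the hard limit:

```python
import resource


def set_soft_fsize(limit_bytes):
    """Set this process's soft RLIMIT_FSIZE, capped at the hard limit.

    Hypothetical helper for illustration only.
    Returns the (soft, hard) pair that was in effect before the change.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process may not raise the soft limit above the hard one.
        if limit_bytes == resource.RLIM_INFINITY or limit_bytes > hard:
            limit_bytes = hard
    resource.setrlimit(resource.RLIMIT_FSIZE, (limit_bytes, hard))
    return old_soft, hard


if __name__ == "__main__":
    before = set_soft_fsize(1024 ** 3)  # assumed example value: 1 GiB
    print("before:", before)
    print("after: ", resource.getrlimit(resource.RLIMIT_FSIZE))
```

Once the soft limit is in place, a write that would grow a file past it raises SIGXFSZ in the writing process (or fails with EFBIG if that signal is ignored). The limit covers every file the process writes, not just its logfile.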

Just yesterday I read this:

http://www.ibm.com/developerworks/linux/library/l-ubuntu-inotify/index.html

and maybe it can be used in some way to implement the check on the
size of the logfiles that you would like to have.
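
A rough sketch of such a size check, with one caveat: Python's standard library has no inotify binding, so this version simply polls with `os.stat` instead of reacting to inotify events. The function names, the polling interval and the threshold are all assumptions for illustration. What to do with a hit (for example, sending the alert mail) is left to the caller:

```python
import os
import time


def oversized_logs(paths, limit_bytes):
    """Return (path, size) pairs for existing files larger than limit_bytes."""
    hits = []
    for path in paths:
        try:
            size = os.stat(path).st_size
        except OSError:
            continue  # file not created yet, or already removed
        if size > limit_bytes:
            hits.append((path, size))
    return hits


def watch(paths, limit_bytes, interval=60, rounds=None):
    """Poll the files and yield each oversized (path, size) pair once.

    rounds=None polls forever; a number limits the polling passes.
    """
    reported = set()
    done = 0
    while rounds is None or done < rounds:
        for path, size in oversized_logs(paths, limit_bytes):
            if path not in reported:
                reported.add(path)
                yield path, size
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

An inotify-based version would replace the sleep-and-stat loop with a wait on modify events for the logfile directory; the size comparison itself would stay the same.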

-- Reuti


> If your workflow is to put all logfiles into one directory, you can
> even set up a disk quota for this partition with limits different
> from those on /home and avoid affecting other users. A disk quota
> does not check the size of a directory; it adds up the sizes of all
> files belonging to each user.
>
> -- Reuti
>
>
>>
>> Yes, I will open an issue.
>>
>> -----Original Message-----
>> From: Reuti [mailto:reuti at staff.uni-marburg.de]
>> Sent: Tuesday, October 21, 2008 7:11 AM
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] Capping logfile size
>>
>> On 21.10.2008 at 06:51, Ron Chen wrote:
>>
>>> --- On Tue, 10/21/08, Parikh, Neal <Neal.Parikh at gs.com> wrote:
>>>> (2) Hard limits. If the issue mentioned above continues,
>>>> and some users aren't good about fixing their code to not
>>>> produce such huge logs, then
>>>> I was hoping to find some way to just limit the size of the
>>>> log files, either by having SGE just stop updating those
>>>> files after a certain point, compressing the logfile and
>>>> rotating it, or something like that.
>>>
>>> Note that the user can submit a job that fills up the temp
>>> directory or the user's home directory, and SGE won't be able to
>>> detect that! In order to make it 100% loophole free, SGE will need
>>> to trap all the system calls performed by the job, and that's high
>>> overhead IMO.
>>>
>>> I think the truncate(2) system call can be used to reduce the file
>>> size, but we will need to discuss how everything fits in --
>>> as each time SGE truncates the files, new data are also written to
>>> those files at the same time.
>>>
>>> Neal, can you open an issue so that we can track this feature
>>> request? Otherwise after a week or two we will forget all this
>>> discussion.
>>
>> This also depends on the OS. AFAIK in NEC's Super-UX you can set the
>> user limit "fspace" for the space allocated by all files of a process
>> in total (in addition to fsize where it's per file).
>>
>> -- Reuti
>>
>>
>>>  -Ron
>>>
>>>
>>>>
>>>> In both cases, it will definitely be a cluster administrator
>>>> setting; I don't want users setting any of this at submission time.
>>>>
>>>> If there is no simple way to do this, I'll find some workaround
>>>> outside SGE, but it would have been nice to have the capability.
>>>>
>>>> Thanks,
>>>> Neal
>>>>
>>>> -----Original Message-----
>>>> From: Rayson Ho [mailto:rayrayson at gmail.com]
>>>> Sent: Monday, October 20, 2008 12:21 PM
>>>> To: users at gridengine.sunsource.net
>>>> Subject: Re: [GE users] Capping logfile size
>>>>
>>>> First of all, I would like to find out how you want to limit the
>>>> file size: do you want the job owner to set the limit at job
>>>> submission time, or do you want the limit to be set globally by the
>>>> cluster administrator?
>>>>
>>>> Currently we don't have a direct way to do this, but you can use an
>>>> external load sensor (suggested by Reuti).
>>>>
>>>> However, if you open an issue (see
>>>> http://gridengine.sunsource.net/issues/ ), then we may be able to add
>>>> this feature inside SGE in a future version.
>>>>
>>>> I just read the code; one way to implement this feature is to add
>>>> some code in the main loop of execd. We would iterate through the
>>>> list of jobs and check the size of each job's out/err files
>>>> (JB_stdout_path_list and JB_stderr_path_list). This should be really
>>>> simple to do (maybe a few hours of work), but implemented this way,
>>>> the threshold can only be set by the cluster administrator.
>>>>
>>>> Rayson
>>>>
>>>>
>>>>
>>>> On 10/20/08, Parikh, Neal <Neal.Parikh at gs.com> wrote:
>>>>> Thanks. This is close to what I want to do but not quite the same. I
>>>>> want to send an alert email about logfile size even if the job is
>>>>> still running; it seems like this would only allow me to send the
>>>>> email after the job is complete. Is there a way of doing that?
>>>>>
>>>>> -----Original Message-----
>>>>> From: Reuti [mailto:reuti at staff.uni-marburg.de]
>>>>> Sent: Monday, October 20, 2008 10:45 AM
>>>>> To: users at gridengine.sunsource.net
>>>>> Subject: Re: [GE users] Capping logfile size
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 20.10.2008 at 15:44, Parikh, Neal wrote:
>>>>>
>>>>>> Is it possible to include some automatic monitoring that generates
>>>>>> an email alert (to some pre-specified addresses, not just the job
>>>>>> owner)
>>>>>
>>>>> If you would just like to kill the job, you could set s_fsize in the
>>>>> queue configuration. Then any further write would fail. But this
>>>>> will affect all file accesses of the job, not only the logfile.
>>>>> Reading bigger files should still be possible, though.
>>>>>
>>>>>> when a job's logfile goes over a certain file size? I want to
>>>>>> monitor stdout_path_list and stderr_path_list and would prefer to
>>>>>> do it directly within SGE.
>>>>>
>>>>> If you only want to write a warning mail after the job, you could
>>>>> put the check in a queue or global epilog and look at
>>>>> $SGE_STDOUT_PATH and $SGE_STDERR_PATH there.
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>> Thanks,
>>>>>> Neal
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>>>>
>>>
>>
>>





