[GE users] accounting and parallel jobs

reuti reuti at staff.uni-marburg.de
Tue Nov 10 10:17:16 GMT 2009


Hi,

On 10.11.2009 at 09:50, mlmersel wrote:

> Hi Reuti:
>
>  I am using 6.1U4, tight integration.

Can you be more specific? Which parallel library are you using, with
which startup method, and what did you do to achieve tight integration?
Did you monitor the running job on the nodes to confirm that all of its
processes got the additional group ID attached? Did you also check a
single job with "qacct -j <id>"?

-- Reuti


>                          Best,
>                            Jerry
>
> <quote who="reuti">
>> On 09.11.2009 at 12:59, mlmersel wrote:
>>
>>> and the cpu time?
>>
>> For tightly integrated jobs you will get several entries in `qacct`,
>> unless you specify "accounting_summary TRUE" in the PE configuration.
>>
>> The cpu time is the recorded CPU usage. It can be changed to report
>> the reserved time instead (in `qconf -mconf`).
>>
>> There was a bug in 6.2 which was fixed in 6.2u1, when the builtin
>> method killed the slaves too early and their entries were completely
>> missing. Which version are you using and which method to invoke the
>> slaves?
>>
>> -- Reuti
>>
>>
>>>
>>> <quote who="reuti">
>>>> On 09.11.2009 at 09:20, mlmersel wrote:
>>>>
>>>>> It is tightly integrated.
>>>>>
>>>>> <quote who="fy">
>>>>>> Jerry
>>>>>>
>>>>>> Is your parallel environment tightly integrated?
>>>>>> Loose integration is one reason for low CPU usage in accounting.
>>>>>> See:
>>>>>> http://gridengine.sunsource.net/howto/howto.html#Tight%20Integration%20of%20Parallel%20Libraries
>>>>
>>>> Wallclock is just the wallclock time, without multiplication by slots.
>>>>
>>>> -- Reuti
>>>>
>>>>
>>>>>>
>>>>>> cheers
>>>>>> Fred Youhanaie
>>>>>>
>>>>>>
>>>>>> On 08/11/09 13:09, mlmersel wrote:
>>>>>>> Hi:
>>>>>>>
>>>>>>>   We have a group of users who have their own queue and run
>>>>>>> almost exclusively parallel jobs. The problem is that when I
>>>>>>> calculate the utilization per month (wall clock time / (secs in
>>>>>>> month * cores)), I get ridiculously small numbers: 1%, 2%, 3%.
>>>>>>> I know this can't be correct.
>>>>>>>
>>>>>>> Is there a problem with the accounting when running parallel
>>>>>>> jobs? I am using gridengine 6.1u4.
>>>>>>>
>>>>>>>                         Thanks,
>>>>>>>                           Jerry
>>>>>>>
>>>>>>> ------------------------------------------------------
>>>>>>> http://gridengine.sunsource.net/ds/viewMessage.do?
>>>>>>> dsForumId=38&dsMessageId=225643
>>>>>>>
>>>>>>> To unsubscribe from this discussion, e-mail:
>>>>>>> [users-unsubscribe at gridengine.sunsource.net].
>>>>>>>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=225971

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


