[GE users] selection criteria for automatic job suspension

reuti reuti at staff.uni-marburg.de
Thu Jan 14 11:54:20 GMT 2010


Hi,

On 11.01.2010 at 16:57, massot wrote:

> Hello,
>
> According to what I understand after reading the source code (the man
> page isn't fully informative), when suspend_thresholds is reached on a
> host, the job selected for suspension is the one that has been running
> for the shortest time, and there is no way to change that.
> Did I get that right? If so, I think it would be nice to have the
> option to tell the scheduler to select the job that has the highest
> load instead of the one with the shortest run time.

There was a similar discussion about slot-wise suspension on
subordination: which job should be suspended? The best would of course
be to let the user decide whether it should be the one with the shortest
or the longest runtime (and maybe to correct the h_rt).
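
To make the idea of a user-selectable policy concrete, here is a minimal
Python sketch. It is not Grid Engine code: the Job record, the policy
names and pick_job_to_suspend() are all made up for illustration, and
"highest_cpu" is only one possible reading of "highest load".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Job:
    """Hypothetical record for a job running on one host."""
    job_id: int
    start_time: float  # seconds since the epoch
    cpu_share: float   # recent CPU usage, 1.0 = one full core


# Each policy maps (job, now) to a sort key; the smallest key is selected.
POLICIES: Dict[str, Callable[[Job, float], float]] = {
    "shortest_runtime": lambda job, now: now - job.start_time,
    "longest_runtime": lambda job, now: -(now - job.start_time),
    "highest_cpu": lambda job, now: -job.cpu_share,
}


def pick_job_to_suspend(jobs: List[Job], policy: str, now: float) -> Optional[Job]:
    """Return the job the given policy would suspend, or None if no jobs run."""
    if not jobs:
        return None
    key = POLICIES[policy]
    return min(jobs, key=lambda job: key(job, now))
```

With a table like this, the choice between shortest and longest runtime
becomes a single configuration value instead of fixed behaviour.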

What do you mean by "highest load"? Every process that is running or
eligible to run contributes 1 to the load. Do you mean parallel jobs on
a node?


> What often happens, I guess, is that suspend_thresholds is reached only
> when a job "goes mad", so it would make more sense to suspend that job
> rather than another one that has been running normally for a longer time.

Did you define more slots than installed cores? Nowadays the load is
a little bit misleading, as uninterruptible kernel tasks also increase
the load, although they are only waiting for the disk or the like
(state "D"). Maybe the suspend_thresholds feature isn't suited to
modern Linux systems at all.
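
To see how much of the load on a Linux node comes from tasks in state
"D" rather than from tasks actually using a CPU, something like the
following Python sketch can help. The function names are my own, and it
only looks at the main thread of each process in /proc, so treat the
numbers as an approximation. per_core_load() assumes that a per-core
figure is what you want to compare against the threshold.

```python
import os
from collections import Counter
from typing import Dict, Iterable, List


def parse_state(stat_text: str) -> str:
    """Return the one-letter state from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the last ')'.
    """
    rest = stat_text[stat_text.rindex(")") + 1:]
    return rest.split()[0]


def load_contributors(stat_texts: Iterable[str]) -> Dict[str, int]:
    """Count processes in the two states that add to the Linux load."""
    states = Counter(parse_state(text) for text in stat_texts)
    return {"runnable": states["R"], "uninterruptible": states["D"]}


def read_stat_texts(proc_root: str = "/proc") -> List[str]:
    """Read the stat file of every process currently listed under /proc."""
    texts = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat"), errors="replace") as fh:
                texts.append(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
    return texts


def per_core_load(load_avg: float, cores: int) -> float:
    """Load average divided by the number of cores."""
    return load_avg / cores
```

Calling load_contributors(read_stat_texts()) on a node that has hit the
threshold shows whether the load comes from runnable processes or from
processes waiting on I/O.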

-- Reuti


>
> What's your opinion?
> -- 
> Bernard Massot - Bureau D4 - Département de physique
> École Normale Supérieure
> 24 rue Lhomond - 75005 Paris
> Tél: +33 1 44 32 25 89
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=238125
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=238737
