[GE users] Problems submitting with fixed allocation rule

Shannon V. Davidson svdavidson at swbell.net
Tue Aug 31 14:35:34 BST 2004


Thomas,

If you want to schedule jobs solely based on slots, make sure your queue 
load_thresholds attribute is set to NONE.  To see some of the scheduling 
decision details, check out the -tsm option on the qconf(1) man page.
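
One possible reason that only bash4 is rejected on the single-processor
hosts can be sketched in Python. The sketch rests on assumptions that are
not stated in this thread: the queues still carry a load threshold of
np_load_avg=1.75, the scheduler configuration uses job_load_adjustments
np_load_avg=0.50, and that adjustment is divided by the host's processor
count. The function names are hypothetical; this illustrates the
arithmetic only and is not Grid Engine code.

```python
# Illustration only: projected normalized load after placing PE slots on a host.
# ASSUMED values (not from the thread): threshold 1.75, adjustment 0.50 per slot,
# with the adjustment scaled by the number of processors on the host.

def projected_np_load(slots, cpus, base_load=0.0, adjustment=0.50):
    """Normalized load assumed after placing `slots` tasks on a `cpus`-CPU host."""
    return base_load + slots * adjustment / cpus


def fits(slots, cpus, threshold=1.75, base_load=0.0, adjustment=0.50):
    """True if the projected load stays below the assumed load threshold."""
    return projected_np_load(slots, cpus, base_load, adjustment) < threshold


if __name__ == "__main__":
    for cpus in (1, 2):
        for slots in (1, 2, 3, 4):
            load = projected_np_load(slots, cpus)
            verdict = "ok" if fits(slots, cpus) else "over threshold"
            print(f"cpus={cpus} slots={slots} projected={load:.2f} {verdict}")
```

Under these assumptions, 3 slots on a single-processor host project to
1.5, which is below the threshold, while 4 slots project to 2.0, which is
above it. On a dual-processor host, 4 slots project to only 1.0. With
load_thresholds set to NONE this check no longer applies.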

Cheers,
Shannon


Thomas Neumann wrote:

> Hello !
>
> My cluster is configured in 2 parts. The first part consists of dual 
> processor machines, to which I have assigned 200 slots each. The 
> second part consists of single processor machines, which are 
> configured with 100 slots each. I have created 4 parallel environments 
> with fixed allocation rules (1, 2, 3 and 4 slots on each machine). 
> They are called bash*, where * equals the number of slots; all other 
> settings are identical.
> Submitting on the dual processor machines works fine. On the single 
> processor machines the following situation appears:
> Submitting with bash1 to bash3 causes no problems, but when submitting 
> with bash4 the job just stays pending. When I ask why, I receive the 
> message "cannot run because available slots combined under pe bash4 
> are not in range of job". The exact configuration for bash4 is:
>
> Slots 999
> Users SGEAdmins
> XUsers None
> Start Proc Args /bin/true
> Stop Proc Args /bin/true
> Allocation Rule 4
> Urgency Slots min
> Control Slaves true (I also tried false)
> Job is first task true (I also tried false)
>
> Increasing the slots does not work. I checked all privileges for 
> users, projects, etc., but there are no differences between the 
> machines in the cluster. To test the PE, I decreased the slots of 
> bash4 to 3 slots on each machine and the job started immediately. 
> Looking in qstat after submitting with bash3, there are 3/100 slots 
> used. All other complex_values are also available and identical for 
> all machines in the whole cluster. Has anybody got an idea where the 
> problem could be?
>
> Thomas
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>


-- 
___________________________________________

Shannon V. Davidson <svdavidson at swbell.net>
Senior Software Engineer           Raytheon
636-479-7465 office        443-383-0331 fax
___________________________________________



