[GE issues] [Issue 3250] New - Unable to submit job using advance reservation if h_rt is longer than 32999 seconds

mhanby mhanby at uab.edu
Fri Mar 12 21:37:11 GMT 2010


http://gridengine.sunsource.net/issues/show_bug.cgi?id=3250
                 Issue #|3250
                 Summary|Unable to submit job using advance reservation if h_rt
                        | is longer than 32999 seconds 
               Component|gridengine
                 Version|6.2u5
                Platform|All
                     URL|http://gridengine.sunsource.net/ds/viewMessage.do?dsFo
                        |rumId=38&dsMessageId=248081
              OS/Version|Linux
                  Status|NEW
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|DEFECT
                Priority|P3
            Subcomponent|scheduling
             Assigned to|andreas
             Reported by|mhanby

------- Additional comments from mhanby at sunsource.net Fri Mar 12 13:37:06 -0800 2010 -------
See the discussion here:
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=248081

GE 6.2u5
Linux x86_64
CentOS 5.4

I have created an advance reservation with 64 slots and a duration of 225 hours and 59 minutes. Users in the reservation's ACL
can submit jobs using the reservation ID as long as their h_rt is 32999 seconds (9 hours, 9 minutes and 59 seconds) or less. With any
longer h_rt, qsub fails with "Unable to run job: error: no suitable queues".

$ qrstat -ar 15
--------------------------------------------------------------------------------
id                             15
name                           testAR
owner                          mikeh
state                          r
start_time                     03/12/2010 13:00:00
end_time                       03/21/2010 23:59:00
duration                       225:59:00
submission_time                03/12/2010 11:36:36
group                          sge
account                        sge
granted_slots_list             all.q@compute-1-4.local=8,all.q@compute-0-8.local=8,all.q@compute-0-7.local=8,all.q@compute-0-3.local=8,all.q@compute-0-12.local=8,all.q@compute-0-10.local=8,all.q@compute-0-5.local=3,all.q@compute-0-6.local=8,all.q@compute-0-14.local=5
granted_parallel_environment   lam_loose_rsh slots 64
mail_options                   abe
acl_list                       mikeh,jdoe

Next we can see the 64 slots in the reservation:
$ qstat -g c
CLUSTER QUEUE                   CQLOAD   USED    RES  AVAIL  TOTAL aoACDS  cdsuE  
--------------------------------------------------------------------------------
all.q                             0.66    128     64     64    192      0      0

Now try to submit two jobs using the reservation:

$ echo `/bin/hostname` | qsub -ar 15 -pe lam_loose_rsh 32 -l h_rt=09:09:59
Your job 111005 ("STDIN") has been submitted

$ echo `/bin/hostname` | qsub -ar 15 -pe lam_loose_rsh 32 -l h_rt=09:10:59
Unable to run job: error: no suitable queues.
Exiting.

It seems that I can submit jobs using this AR as long as the requested h_rt is 32999 seconds or less. Any job submission requesting
33000 seconds or more fails.
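
For reference, the h_rt values above can be converted to seconds with a small helper. This is only a sketch: the function name hrt_to_seconds is made up for illustration and is not part of Grid Engine, and it assumes the value is always given as HH:MM:SS.

```shell
# Convert an h_rt value given as HH:MM:SS into seconds.
# awk parses "09" as decimal 9, which avoids octal surprises
# in shell arithmetic.
hrt_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

hrt_to_seconds 09:09:59    # accepted request:  32999
hrt_to_seconds 09:10:59    # rejected request:  33059
hrt_to_seconds 225:59:00   # AR duration:       813540
```

The rejected request is still far shorter than the 225:59:00 duration of the reservation, so the request does not exceed the reservation's length.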

If I submit the same jobs without specifying a reservation, they will both submit and run properly.

Reuti also confirmed this behavior, although he found a slightly different max:

=========================================
> I also tried using "h_rt=32999" and "h_rt=33000" with the same  
> results.

Yep, I must confirm this. But for me the limit is 9:09:00, i.e. 32940.

-- Reuti
=========================================
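
Because the two observed limits differ (32999 here, 32940 for Reuti), it may help to locate the exact cutoff on a given cluster. The following is a rough sketch of a binary search, not a tested reproduction script. The names find_max_ok and probe_ar are hypothetical, and it assumes the behaviour is monotonic (every h_rt at or below the cutoff is accepted, every h_rt above it is rejected), which has not been verified.

```shell
# find_max_ok LOW HIGH PROBE...
# Prints the largest value v in [LOW, HIGH) for which "PROBE v" succeeds.
# Assumes LOW succeeds, HIGH fails, and results are monotonic in between.
find_max_ok() {
    lo=$1; hi=$2; shift 2
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$@" "$mid"; then
            lo=$mid      # mid was accepted, cutoff is at or above mid
        else
            hi=$mid      # mid was rejected, cutoff is below mid
        fi
    done
    echo "$lo"
}

# Hypothetical probe built from the qsub command in this report.
# Every accepted probe leaves a real job in the AR that has to be
# removed afterwards with qdel.
probe_ar() {
    echo /bin/hostname | qsub -ar 15 -pe lam_loose_rsh 32 -l h_rt="$1" >/dev/null 2>&1
}

# Example call (commented out because it submits real jobs):
# find_max_ok 60 86400 probe_ar
```

The probe is passed in as a command so the search logic can be checked against a stand-in function before it is pointed at a live cluster.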

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=36&dsMessageId=248241
