[GE users] when a queue is full

Peiran Song peirans at cs.uoregon.edu
Thu Dec 8 00:29:28 GMT 2005



Hi All,

We have an Apple cluster running Grid Engine. We observed a much longer 
execution time when two sub-jobs are scheduled by Grid Engine to the same 
dual-CPU node, compared to sending the same two sub-jobs directly to that 
node at about the same time with dsh. The difference is two minutes versus 
15 seconds. When I ran qstat -j during execution, in the first case I got 
the "queue is full" message shown below, but not in the second case.

usage    2:                 cpu=00:00:00, mem=0.00000 GBs, io=0.00000, vmem=N/A, maxvmem=N/A
usage    3:                 cpu=00:00:00, mem=0.00000 GBs, io=0.00000, vmem=N/A, maxvmem=N/A
scheduling info:            queue instance "all.q@node005.cluster.private" dropped because it is full
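For anyone who wants to check for the same message across several jobs, here is a minimal Python sketch that pulls the "dropped because it is full" queue instances out of saved qstat -j text. It only parses text and does not call Grid Engine. The function name full_queue_instances is made up for illustration, and the " at " to "@" conversion assumes the list archive rewrote the address-like queue name.

```python
import re

# Matches the scheduling-info message quoted above, once whitespace is collapsed.
FULL_QUEUE = re.compile(r'queue instance "([^"]+)" dropped because it is full')


def full_queue_instances(qstat_text):
    """Return the queue instances that the scheduling info reports as full.

    Hypothetical helper: it expects the text of `qstat -j` output as a string.
    """
    # Collapse line wraps and runs of spaces so a message that was
    # split across two lines still matches the pattern.
    flat = " ".join(qstat_text.split())
    # Assumption: " at " inside the name is an archive rewrite of "@".
    return [name.replace(" at ", "@") for name in FULL_QUEUE.findall(flat)]


sample = '''scheduling info:            queue instance 
"all.q at node005.cluster.private" dropped because it is full'''

print(full_queue_instances(sample))  # ['all.q@node005.cluster.private']
```

An empty result only means the message was absent from the text that was passed in; it says nothing about why a queue instance was or was not considered full.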

I am wondering under what circumstances a queue would be deemed full (no 
spare CPU, no spare memory?). Is it truly full, or is that an 
estimate? It seems that when the queue is deemed full, the job takes much 
longer to finish. Could the configuration parameters be tweaked 
to limit or avoid this? Here is our current configuration:

algorithm                         default
schedule_interval                 0:0:1
maxujobs                          0
queue_sort_method                 load
job_load_adjustments              NONE     --- should we adjust?
load_adjustment_decay_time        0:0:0
load_formula                      np_load_avg
schedd_job_info                   true
flush_submit_sec                  0
flush_finish_sec                  0
params                            none
reprioritize_interval             0:0:0
halftime                          168
usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
compensation_factor               5.000000
weight_user                       0.250000
weight_project                    0.250000
weight_department                 0.250000
weight_job                        0.250000
weight_tickets_functional         0
weight_tickets_share              0
share_override_tickets            TRUE
share_functional_shares           TRUE
max_functional_jobs_to_schedule   200
report_pjob_tickets               TRUE
max_pending_tasks_per_job         50
halflife_decay_list               none
policy_hierarchy                  OFS
weight_ticket                     0.010000
weight_waiting_time               0.000000
weight_deadline                   3600000.000000
weight_urgency                    0.100000
weight_priority                   1.000000
max_reservation                   0
default_duration                  0:10:0
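
To make it easier to compare this listing with another cluster's settings, here is a small Python sketch that reads a "name value" listing like the one above into a dictionary. The helper name parse_sched_conf is hypothetical, and treating anything after "---" as a remark rather than part of the value is my own assumption, based on the note I added to the job_load_adjustments line.

```python
def parse_sched_conf(text):
    """Parse 'name value' lines into a dict (hypothetical helper).

    Assumption: anything after '---' on a line is a human remark,
    not part of the setting's value.
    """
    conf = {}
    for line in text.splitlines():
        # Drop a trailing remark and surrounding whitespace.
        line = line.split("---", 1)[0].strip()
        if not line:
            continue
        # First token is the parameter name, the rest is its value.
        parts = line.split(None, 1)
        conf[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return conf


sample = """\
schedule_interval                 0:0:1
queue_sort_method                 load
job_load_adjustments              NONE     --- should we adjust?
load_adjustment_decay_time        0:0:0
load_formula                      np_load_avg
"""

conf = parse_sched_conf(sample)
print(conf["job_load_adjustments"])  # NONE
print(conf["load_formula"])          # np_load_avg
```

Two such dictionaries can then be compared key by key to spot settings that differ between configurations.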

Any comments and ideas would be very much appreciated!

Regards,
Peiran Song




