[GE users] when a queue is full

Peiran Song peirans at cs.uoregon.edu
Thu Dec 8 01:54:36 GMT 2005



Hi Ron,

>1) for the job execution time difference, can you use top,
>vmstat, and/or iostat to find out what is going on in the
>system? Are any of the running SGE daemons consuming the
>processor?
>  
>
vmstat is not available on our system, and iostat doesn't show any 
difference. In top, the only difference I saw between two simultaneous 
jobs submitted by dsh and two jobs scheduled through SGE is the VSIZE. 
The two jobs submitted through dsh, which execute much faster, each 
consume 50M less virtual memory. That doesn't seem big enough to 
explain it, does it?

>2) if a queue instance (host) is full, it means all the job
>slots are used up. If you have 2 CPUs, and 2 jobs are running on
>the host, then SGE tells you that it is full.
>  
>
So the reason I didn't see the queue-full message when submitting jobs 
by dsh is that dsh bypasses SGE?
I am still new to SGE and sys admin...
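
If I understand Ron's point correctly, "full" is just slot bookkeeping inside SGE, and jobs started by dsh are never counted. A rough Python sketch of that reading follows; the QueueInstance class and its method names are made up for illustration and are not SGE's actual code or API:

```python
from dataclasses import dataclass, field


@dataclass
class QueueInstance:
    """Toy model of one queue instance (hypothetical, not SGE's data structure)."""
    name: str
    slots: int  # configured job slots, e.g. 2 on a dual-CPU node
    running: list = field(default_factory=list)  # jobs started by the scheduler

    def is_full(self) -> bool:
        # "Full" only compares used slots with configured slots;
        # no CPU or memory measurement is involved in this model.
        return len(self.running) >= self.slots

    def dispatch(self, job: str) -> bool:
        # A full queue instance is skipped rather than oversubscribed.
        if self.is_full():
            return False
        self.running.append(job)
        return True


node = QueueInstance("all.q@node005.cluster.private", slots=2)
results = [node.dispatch(job) for job in ("subjob1", "subjob2", "subjob3")]
print(results, node.is_full())
```

With 2 slots, the third dispatch is refused, which matches the situation qstat -j reports as "dropped because it is full". A job launched with dsh never goes through dispatch, so it would not appear in this accounting at all.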

Thanks,
Peiran


> -Ron
>
>
>--- Peiran Song <peirans at cs.uoregon.edu> wrote:
>  
>
>>Hi All,
>>
>>We have an Apple cluster running Grid Engine. We observed a much
>>longer execution time when two subjobs are scheduled to the same
>>dual-CPU node than when the same two subjobs are sent directly to
>>that node by dsh at about the same time. The difference is two
>>minutes versus 15 seconds. When I ran qstat -j during the
>>executions, I got the "queue is full" info below in the first case,
>>but not in the second.
>>
>>usage    2:                 cpu=00:00:00, mem=0.00000 GBs, io=0.00000, vmem=N/A, maxvmem=N/A
>>usage    3:                 cpu=00:00:00, mem=0.00000 GBs, io=0.00000, vmem=N/A, maxvmem=N/A
>>scheduling info:            queue instance "all.q@node005.cluster.private" dropped because it is full
>>
>>I am wondering under what circumstances a queue is deemed full (no
>>spare CPU, no spare memory?). Is it truly full, or is that an
>>estimate? It seems that when the queue is deemed full, the job takes
>>much longer to finish. Could the configuration parameters be tweaked
>>to limit or avoid this? Here is our current configuration:
>>
>>algorithm                         default
>>schedule_interval                 0:0:1
>>maxujobs                          0
>>queue_sort_method                 load
>>job_load_adjustments              NONE     --- should we adjust?
>>load_adjustment_decay_time        0:0:0
>>load_formula                      np_load_avg
>>schedd_job_info                   true
>>flush_submit_sec                  0
>>flush_finish_sec                  0
>>params                            none
>>reprioritize_interval             0:0:0
>>halftime                          168
>>usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
>>compensation_factor               5.000000
>>weight_user                       0.250000
>>weight_project                    0.250000
>>weight_department                 0.250000
>>weight_job                        0.250000
>>weight_tickets_functional         0
>>weight_tickets_share              0
>>share_override_tickets            TRUE
>>share_functional_shares           TRUE
>>max_functional_jobs_to_schedule   200
>>report_pjob_tickets               TRUE
>>max_pending_tasks_per_job         50
>>halflife_decay_list               none
>>policy_hierarchy                  OFS
>>weight_ticket                     0.010000
>>weight_waiting_time               0.000000
>>weight_deadline                   3600000.000000
>>weight_urgency                    0.100000
>>weight_priority                   1.000000
>>max_reservation                   0
>>default_duration                  0:10:0
>>
>>Any comments and ideas would be very much appreciated!
>>
>>Regards,
>>Peiran Song
>>
>---------------------------------------------------------------------
>  
>
>>To unsubscribe, e-mail:
>>users-unsubscribe at gridengine.sunsource.net
>>For additional commands, e-mail:
>>users-help at gridengine.sunsource.net
>>
>>
>>    
>>


