[GE users] Long delay when submitting large jobs

Stephan Grell stephan.grell at sun.com
Tue Jan 18 16:11:04 GMT 2005


Hello,

I did some testing myself and got some surprising times (reported via
qping). I started 3 MPI jobs of different sizes. The job start-up times
were:
- pe mpi 300 : 41s
- pe mpi 340 : 53s
- pe mpi 350 : 57s

During this time the qmaster is not processing any requests. I have not
done any debugging yet, but it looks worse than I expected.
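
As a rough illustration of how the start-up time scales, here is a small Python sketch that works out the average and the marginal cost per slot from the three measurements above. The function and variable names are mine, and three data points are far too few to draw firm conclusions from.

```python
# Start-up times reported above: (PE slots requested, seconds until started).
measurements = [(300, 41.0), (340, 53.0), (350, 57.0)]


def average_cost(slots, seconds):
    """Average start-up time per slot for one measurement."""
    return seconds / slots


def marginal_costs(points):
    """Extra seconds per extra slot between consecutive measurements."""
    costs = []
    for (s0, t0), (s1, t1) in zip(points, points[1:]):
        costs.append((t1 - t0) / (s1 - s0))
    return costs


if __name__ == "__main__":
    for slots, seconds in measurements:
        avg = average_cost(slots, seconds)
        print(f"{slots} slots: {avg:.3f} s/slot on average")
    for cost in marginal_costs(measurements):
        print(f"marginal cost: {cost:.2f} s per additional slot")
```

If these numbers are representative, each additional slot costs more (about 0.3 to 0.4 s) than the average slot does (roughly 0.14 to 0.16 s), which would suggest the start-up time grows faster than linearly with job size.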

The test machine was a dual SPARC III (~1 GHz) with BDB spooling and 175
exec hosts.

Stephan

-

Eric Whiting wrote:

> Similar setup here. Similar problem.
>
> From a usability standpoint it is a 'bad thing' to see these very long
> qstat/qsub hangs. qstat is something users expect to type and get an
> immediate response from. When it takes more than 10s they think it is
> sick. When it takes more than 60s they think it is broken. We are
> seeing qstat hang for longer than 60s.
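
A minimal sketch of how one might measure this from a script rather than by eye. It times an arbitrary command with Python's `subprocess.run` and `time.perf_counter`, and labels the result using the 10 s and 60 s thresholds mentioned above. The function names and labels are my own; `qstat` is only the default argument, and nothing here is specific to Grid Engine.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (wall-clock seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode


def classify(seconds):
    """Map a response time to the user perception described above."""
    if seconds > 60:
        return "broken"
    if seconds > 10:
        return "sick"
    return "ok"


if __name__ == "__main__":
    # Time the command given on the command line, or qstat by default.
    argv = sys.argv[1:] or ["qstat"]
    elapsed, code = time_command(argv)
    label = classify(elapsed)
    print(f"{' '.join(argv)}: {elapsed:.1f} s (exit {code}) -> {label}")
```

Running this in a loop while a large job is being dispatched would give a series of response times that can be lined up against the scheduler log.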
>
> I've turned on scheduler profiling. I've noticed that the schedd time 
> numbers are sometimes less than the actual 'qstat hang time'.  I'll 
> qsub a job. I'll wait for a few seconds and then I'll do a 
> date;qstat;date. Then I wait. The time diff is often more than the 
> schedd run time reported in the log file.   Here are two recent 
> entries from the logs.
>
> 01/18/2005 07:33:55|schedd|master|I|PROF: schedd run took: 98.970 s (init: 0.000 s, copy: 0.130 s, run:98.750, free: 0.090 s, jobs: 16, categories: 4)
> 01/18/2005 07:46:16|schedd|master|I|PROF: schedd run took: 99.630 s (init: 0.000 s, copy: 0.130 s, run:99.410, free: 0.090 s, jobs: 19, categories: 4)
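
To compare these entries against measured qstat times, the PROF lines can be pulled apart with a regular expression. The following is a sketch only: the pattern is inferred from the two entries quoted above, other Grid Engine versions may format the line differently, and the names `PROF_RE` and `parse_prof` are made up for this example.

```python
import re

# Pattern inferred from the two PROF entries quoted above (an assumption,
# not a documented format). Note that the "run" field has no " s" suffix.
PROF_RE = re.compile(
    r"PROF: schedd run took:\s*(?P<total>[\d.]+) s\s*"
    r"\(init:\s*(?P<init>[\d.]+) s,\s*copy:\s*(?P<copy>[\d.]+) s,\s*"
    r"run:\s*(?P<run>[\d.]+)(?: s)?,\s*free:\s*(?P<free>[\d.]+) s,\s*"
    r"jobs:\s*(?P<jobs>\d+),\s*categories:\s*(?P<categories>\d+)\)"
)

SAMPLE_LINES = [
    "01/18/2005 07:33:55|schedd|master|I|PROF: schedd run took: 98.970 s "
    "(init: 0.000 s, copy: 0.130 s, run:98.750, free: 0.090 s, jobs: 16, "
    "categories: 4)",
    "01/18/2005 07:46:16|schedd|master|I|PROF: schedd run took: 99.630 s "
    "(init: 0.000 s, copy: 0.130 s, run:99.410, free: 0.090 s, jobs: 19, "
    "categories: 4)",
]


def parse_prof(line):
    """Return the timing fields of one PROF line, or None if it does not match."""
    match = PROF_RE.search(line)
    if match is None:
        return None
    record = {
        name: float(match.group(name))
        for name in ("total", "init", "copy", "run", "free")
    }
    record["jobs"] = int(match.group("jobs"))
    record["categories"] = int(match.group("categories"))
    return record


if __name__ == "__main__":
    for line in SAMPLE_LINES:
        rec = parse_prof(line)
        share = 100.0 * rec["run"] / rec["total"]
        print(f"total {rec['total']:.2f} s, run phase {share:.1f}% of total")
```

In both entries nearly all of the time is spent in the run phase; init, copy and free together account for well under a second.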
>
> I'm watching NFS traffic and don't see much of it, so this might be
> something else.
>
> qmaster: sun 4800, solaris 9
> NFS spooling
> N1GE 6.0U2
> 230 linux execd hosts (v20z)
>
> eric
>
> Craig Tierney wrote:
>
>>
>>
>> Qmaster has to talk to 128+ nodes.  On systems like this, it is probably
>> done linearly and just takes a bunch of time.  The issue is, why does the
>> multi-threaded qmaster block when doing this?  Why can't another thread
>> respond to requests?  If not for all operations, at least for actions
>> that read the database.
>>
>> Thanks,
>> Craig
>>
>>
>>  
>>
>>> Cheers,
>>> Stephan
>>>
>>> Craig Tierney wrote:
>>>
>>>   
>>>
>>>> I have been running SGE6.0u1 for a few months now on a new system.
>>>> I have noticed very long delays, or even SGE hangs, when starting
>>>> large jobs.  I just tried this on the latest CVS source and
>>>> the problem persists.
>>>>
>>>> It appears that the hang occurs while the job is moved from 'qw' to 't'.
>>>> In general the system does continue to operate normally.  However
>>>> the delays can be large, 30-60 seconds.  By 'hang' I mean that
>>>> commands like qsub and qstat block until the job has finished
>>>> moving to the 't' state.  Sometimes the delays are long enough to
>>>> cause GDI failures.  Since qmaster is threaded, I wonder why I get
>>>> the hangs.
>>>>
>>>> I have tried debugging the situation.  Either the hang is in qmaster,
>>>> or sge_schedd is not printing enough information.
>>>>
>>>> Here is some of the text from the sge_schedd debug for a 256 cpu job
>>>> using a cluster queue.
>>>>
>>>> 79347   7886 16384     J=179999.1 T=STARTING S=1105726988 D=43200 L=Q O=qecomp.q@e0129 R=slots U=2.000000
>>>> 79348   7886 16384     J=179999.1 T=STARTING S=1105726988 D=43200 L=Q O=qecomp.q@e0130 R=slots U=2.000000
>>>> 79349   7886 16384     J=179999.1 T=STARTING S=1105726988 D=43200 L=Q O=qecomp.q@e0131 R=slots U=2.000000
>>>> 79350   7886 16384     Found NOW assignment
>>>> 79351   7886 16384     reresolve port timeout in 536
>>>> 79352   7886 16384     returning cached port value: 536
>>>> scheduler tries to schedule job 179999.1 twice
>>>> 79353   7886 16384        added 0 ticket orders for queued jobs
>>>> 79354   7886 16384     SENDING 10 ORDERS TO QMASTER
>>>> 79355   7886 16384     RESETTING BUSY STATE OF EVENT CLIENT
>>>> 79356   7886 16384     reresolve port timeout in 536
>>>> 79357   7886 16384     returning cached port value: 536
>>>> 79358   7886 16384     ec_get retrieving events - will do max 3 fetches
>>>>
>>>> The hang happens after line 79352.  In this instance the message
>>>> indicates the scheduler tried twice.  Other times, I get a timeout
>>>> at this point.  In either case, the output pauses in the same
>>>> manner that a call to qsub or qstat would.
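
To get a feel for how many queue instances a single dispatch touches, the STARTING records in a trace like the one above can be tallied per job. This is only a sketch: I am assuming that `J` is the job/task id and that `U` on an `R=slots` record is the number of slots taken on that queue instance, which is my reading of the trace rather than something confirmed in this thread. The function names are made up for the example.

```python
import re
from collections import defaultdict

# Single-letter KEY=value fields as they appear in the trace records.
FIELD_RE = re.compile(r"\b([A-Z])=(\S+)")

# Three records from the trace above. The queue instance separator may
# show up as "@" or " at " depending on how the trace was archived; only
# the J, T, R and U fields are used here, so either form parses the same.
SAMPLE_TRACE = [
    "79347   7886 16384     J=179999.1 T=STARTING S=1105726988 D=43200 "
    "L=Q O=qecomp.q@e0129 R=slots U=2.000000",
    "79348   7886 16384     J=179999.1 T=STARTING S=1105726988 D=43200 "
    "L=Q O=qecomp.q@e0130 R=slots U=2.000000",
    "79349   7886 16384     J=179999.1 T=STARTING S=1105726988 D=43200 "
    "L=Q O=qecomp.q@e0131 R=slots U=2.000000",
]


def parse_record(line):
    """Return the KEY=value fields of one trace record as a dict."""
    return dict(FIELD_RE.findall(line))


def slots_per_job(lines):
    """Sum the U values of STARTING slot records, grouped by job id."""
    totals = defaultdict(float)
    for line in lines:
        rec = parse_record(line)
        if rec.get("T") == "STARTING" and rec.get("R") == "slots":
            totals[rec["J"]] += float(rec["U"])
    return dict(totals)


if __name__ == "__main__":
    for job, slots in slots_per_job(SAMPLE_TRACE).items():
        print(f"job {job}: {slots:.0f} slots in {len(SAMPLE_TRACE)} records")
```

If every record for the 256 cpu job carries U=2 as these three do, the full trace would contain 128 such records, one per queue instance.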
>>>>
>>>> I have followed the optimization procedures listed on the website
>>>> and they didn't seem to help (might have missed some though).
>>>>
>>>> I don't have any information from sge_qmaster.  I tried several
>>>> different SGE_DEBUG_LEVEL settings, but sge_qmaster would always
>>>> stop providing information after daemonizing.
>>>>
>>>> System configuration:
>>>>
>>>> Qmaster runs on Fedora Core 2, x86 (2.2 GHz Xeon)
>>>> clients (execd) run on SUSE 9.1 x86_64 (3.2 GHz EM64T)
>>>> SGE is configured to use old style spooling over NFS
>>>>
>>>> I can provide more info; I just don't know where to go from here.
>>>>
>>>> Thanks,
>>>> Craig
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net



