[GE users] SGE 5.3p6 - jobs being submitted, going into 't' state, then disappearing

Robert Griffiths Robert.Griffiths at int.sc.mufg.jp
Tue Dec 4 18:38:05 GMT 2007


How long do you expect your jobs to run for?

If they're only a few seconds each, then in all likelihood you won't
ever see the "r" state - just the "t" state. The qmaster just won't
update fast enough (the problem gets worse as the grid gets larger).

This may well sound like stating the bleeding obvious - but you've gotta
ask ;-)

Another of the "bleeding obvious" questions is "Do your jobs run
interactively on the nodes to which you've submitted them?"

Do your jobs have any output or errors that can be trapped? Use the "-j
y" option to put standard out and standard err into the same stream.
The output file will be called something like JobName.o{$JOB_NUMBER},
with a separate JobName.e{$JOB_NUMBER} for errors if you don't join the
output streams.

These files could tell you more about what's going on.
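
If there are a lot of jobs, something like the following rough Python
sketch could help collect those files in one go. It assumes the default
JobName.o<id> / JobName.e<id> naming described above and that you point
it at the directory where the files actually land; the function names
and the "myjob" job name are made up for illustration, and array-job
task suffixes are not handled.

```python
import re
from pathlib import Path

# Matches default-style output names such as "myjob.o1234" or "myjob.e1234".
OUTPUT_NAME = re.compile(r"^(?P<name>.+)\.(?P<stream>[oe])(?P<job_id>\d+)$")


def find_job_output(directory, job_name):
    """Return {job_id: {"o": path, "e": path}} for job_name.o<id> / job_name.e<id> files."""
    found = {}
    for path in Path(directory).iterdir():
        match = OUTPUT_NAME.match(path.name)
        if path.is_file() and match and match.group("name") == job_name:
            job_id = int(match.group("job_id"))
            found.setdefault(job_id, {})[match.group("stream")] = path
    return found


def report(directory, job_name):
    """Print each output file with its size, so empty or missing files stand out."""
    for job_id, streams in sorted(find_job_output(directory, job_name).items()):
        for stream in ("o", "e"):
            path = streams.get(stream)
            if path is None:
                print(f"job {job_id}: no .{stream} file")
            else:
                print(f"job {job_id}: {path.name} ({path.stat().st_size} bytes)")


if __name__ == "__main__":
    # Hypothetical job name; "." means the current directory.
    report(".", "myjob")
```

With "-j y" there is only the .o file, so a "no .e file" line is
expected in that case. No files at all for a job that has disappeared
would suggest it never got as far as starting on the exec host.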

It's a start...

Rob

-----Original Message-----
From: Richard Hobbs [mailto:richard.hobbs at crl.toshiba.co.uk] 
Sent: 04 December 2007 18:15
To: users at gridengine.sunsource.net
Subject: [GE users] SGE 5.3p6 - jobs being submitted, going into 't'
state, then disappearing


Hello,

When I submit jobs to our queue (5.3p6 qmaster running on Solaris 10
with 5.3p6 exec hosts running Linux), the jobs go into 'qw', then get
assigned to a queue, as state 't', and after a few seconds they
disappear.

I also cannot find anything in the logs - does anyone have any idea
where to look?

Thanks in advance!

Richard.

-- 
Richard Hobbs (Systems Administrator)
Toshiba Research Europe Ltd. - Cambridge Research Laboratory
Email: richard.hobbs at crl.toshiba.co.uk
Web: http://www.toshiba-europe.com/research/
Tel: +44 1223 436999        Mobile: +44 7811 803377

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net







