[GE users] possible SGE 6.2u2_1 bug

crei crei at sun.com
Mon Jun 22 16:08:31 BST 2009


I think your "T" is a job state, but the "-qs" switch refers to queue states.
Your example also filters by job state!


Anyhow:
qstat -help:

  [-s {p|r|s|z|hu|ho|hs|hd|hj|ha|h|a}] show pending, running, suspended, zombie jobs,
                                           jobs with a user/operator/system/array-dependency hold,
                                           jobs with a start time in future or any combination only.
                                           h is an abbreviation for huhohshdhjha
                                           a is an abbreviation for prsh


Does "qstat -s s" also work for you?
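
If it does not list the threshold-suspended jobs, a small filter on the state
column of the plain qstat listing may serve as a stopgap. This is only a rough
sketch: it assumes the default job listing, where the state is the fifth
whitespace-separated field (as in the output quoted further down), and the
function name count_state is made up for illustration.

```shell
# count_state STATE: read a qstat job listing on stdin and print how many
# job lines contain the letter STATE in the state column (assumed: field 5).
count_state() {
    awk -v st="$1" '
        # Job lines start with a numeric job ID; header and rule lines do not.
        $1 ~ /^[0-9]+$/ && index($5, st) > 0 { n++ }
        END { print n + 0 }
    '
}

# Usage on a live cluster (not run here):
#   qstat -u '*' | count_state T
```

The match is case-sensitive, so "T" and "t" are counted separately.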


The following state symbols are defined:

(see file symbols.h (source/common))

#define ALARM_SYM                          'a'
#define SUSPEND_ALARM_SYM                  'A'
#define SUSPEND_ON_COMP_SYM                'c'  /* NOT PART OF P1003.15D12! */
#define SUSPENDED_ON_CALENDAR_SYM          'C'  /* NOT PART OF P1003.15D12! */
#define DISABLED_SYM                       'd'
#define DISABLED_ON_CALENDAR_SYM           'D'  /* NOT PART OF P1003.15D12! */
#define ENABLED_SYM                        'e'
#define HELD_SYM                           'h'
#define MIGRATING_SYM                      'm'  /* NOT PART OF P1003.15D12! */
#define QUEUED_SYM                         'q'
#define RESTARTING_SYM                     'R'  /* NOT PART OF P1003.15D12! */
#define RUNNING_SYM                        'r'
#define SUSPENDED_SYM                      's'  /* NOT PART OF P1003.15D12! */
#define SUSPENDED_ON_SUBORDINATE_SYM       'S'
#define SUSPENDED_ON_THRESHOLD_SYM         'T' /* NOT PART OF P1003.15D12! */
#define TRANSISTING_SYM                    't'
#define UNKNOWN_SYM                        'u'
#define WAITING_SYM                        'w'
#define EXITING_SYM                        'x'  /* NOT P1003.15D12 compliant! 'e' */
#define ERROR_SYM                          'E'
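
To read combined state strings in qstat output, the symbols above can be turned
into a small lookup. This is just an illustrative sketch: the descriptions are
the macro names written out in lower case, not official wording, and the
function name describe_state is hypothetical.

```shell
# describe_state STRING: print one line per state symbol in STRING.
# Descriptions are the symbols.h macro names in lower case (an assumption,
# not official SGE terminology).
describe_state() {
    s=$1
    while [ -n "$s" ]; do
        rest=${s#?}        # everything after the first character
        c=${s%"$rest"}     # the first character itself
        s=$rest
        case $c in
            a) echo "alarm" ;;
            A) echo "suspend alarm" ;;
            c) echo "suspend on comp" ;;
            C) echo "suspended on calendar" ;;
            d) echo "disabled" ;;
            D) echo "disabled on calendar" ;;
            e) echo "enabled" ;;
            h) echo "held" ;;
            m) echo "migrating" ;;
            q) echo "queued" ;;
            R) echo "restarting" ;;
            r) echo "running" ;;
            s) echo "suspended" ;;
            S) echo "suspended on subordinate" ;;
            T) echo "suspended on threshold" ;;
            t) echo "transiting" ;;
            u) echo "unknown" ;;
            w) echo "waiting" ;;
            x) echo "exiting" ;;
            E) echo "error" ;;
            *) echo "unrecognized symbol: $c" ;;
        esac
    done
}

# Example: describe_state qw
```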


If you want to enhance the qstat -s switch, please file an enhancement issue at
http://gridengine.sunsource.net/


Thanks,

Christian





On 06/22/09 16:18, adary wrote:
> Actually, T is a valid state. A job that is suspended due to load on the host is in state T:
> 
> adary@adary-lnx:~$ qstat -q bulk -u \* | grep " T "
> 1964980 0.50000 runtst     henri        T     06/22/2009 16:37:20 bulk@lnx83.il.marvell.com          1
> 1965489 0.50000 runtst.exe ilanmf       T     06/22/2009 16:52:31 bulk@lnx142.il.marvell.com         1
> 1965494 0.50000 runtst.exe ilanmf       T     06/22/2009 16:52:31 bulk@lnx142.il.marvell.com         1
> 1966036 0.50000 runtst     yehudab      T     06/22/2009 17:14:52 bulk@lnx156.il.marvell.com         1
> 1965612 0.50000 remote_run mohammad     T     06/22/2009 16:58:09 bulk@lnx172.il.marvell.com         1
> 1965844 0.50000 runtst     davidp       T     06/22/2009 17:06:39 bulk@lnx172.il.marvell.com         1
> 1965883 0.50000 runtst     davidp       T     06/22/2009 17:09:09 bulk@lnx69.il.marvell.com          1
> 1965013 0.50000 runtst     henri        T     06/22/2009 16:38:20 bulk@lnx103.il.marvell.com         1
> 1965019 0.50000 runtst     henri        T     06/22/2009 16:38:34 bulk@lnx103.il.marvell.com         1
> 1965022 0.50000 runtst     henri        T     06/22/2009 16:38:34 bulk@lnx103.il.marvell.com         1
> 1965550 0.50000 runtst     reem         T     06/22/2009 16:54:19 bulk@lnx99.il.marvell.com          1
> 1945064 0.50000 runtst     sami         T     06/21/2009 23:29:20 bulk@lnx88.il.marvell.com          1
> 1963798 0.50000 remote_run mohammad     T     06/22/2009 15:42:21 bulk@lnx88.il.marvell.com          1
> 1965903 0.50000 remote_run mohammad     T     06/22/2009 17:10:25 bulk@lnx109.il.marvell.com         1
> 1965972 0.50000 remote_run mohammad     T     06/22/2009 17:12:27 bulk@lnx109.il.marvell.com         1
> 1965973 0.50000 remote_run mohammad     T     06/22/2009 17:12:27 bulk@lnx109.il.marvell.com         1
> 1965875 0.50000 remote_run mohammad     T     06/22/2009 17:08:54 bulk@lnx130.il.marvell.com         1
> 1964967 0.50000 remote_run mohammad     T     06/22/2009 16:37:20 bulk@lnx188.il.marvell.com         1
> 1964979 0.50000 remote_run mohammad     T     06/22/2009 16:37:20 bulk@lnx188.il.marvell.com         1
> 1965219 0.50000 runtst     henri        T     06/22/2009 16:44:29 bulk@lnx189.il.marvell.com         1
> 1965223 0.50000 runtst     henri        T     06/22/2009 16:44:29 bulk@lnx189.il.marvell.com         1
> 1965028 0.50000 runtst     henri        T     06/22/2009 16:39:00 bulk@lnx151.il.marvell.com         1
> 1965042 0.50000 runtst     henri        T     06/22/2009 16:39:15 bulk@lnx152.il.marvell.com         1
> 1965556 0.50000 remote_run mohammad     T     06/22/2009 16:54:19 bulk@lnx174.il.marvell.com         1
> 1965720 0.50000 remote_run mohammad     T     06/22/2009 17:02:04 bulk@lnx176.il.marvell.com         1
> 1965727 0.50000 remote_run mohammad     T     06/22/2009 17:02:04 bulk@lnx176.il.marvell.com         1
> 1965073 0.50000 runtst     henri        T     06/22/2009 16:40:00 bulk@lnx191.il.marvell.com         1
> 1965101 0.50000 runtst     henri        T     06/22/2009 16:40:54 bulk@lnx191.il.marvell.com         1
> 1965516 0.50000 runtst     reem         T     06/22/2009 16:53:16 bulk@lnx196.il.marvell.com         1
> 1965519 0.50000 runtst.exe ilanmf       T     06/22/2009 16:53:16 bulk@lnx196.il.marvell.com         1
> 1965523 0.50000 runtst.exe ilanmf       T     06/22/2009 16:53:16 bulk@lnx196.il.marvell.com         1
> 
> The question is why qselect (or qstat) doesn't recognize T as a valid job state.
> 
> -----Original Message-----
> From: Christian.Reissmann at Sun.COM [mailto:Christian.Reissmann at Sun.COM] On Behalf Of crei
> Sent: Monday, June 22, 2009 5:06 PM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] possible SGE 6.2u2_1 bug
> 
> Hi,
> 
>  > qselect -help
> SGE 6.2
> usage: qselect [options]
>          [-help]                           print this help
>          [-l resource_list]                request the given resources
>          [-pe pe_list]                     select only queues with one of these parallel environments
>          [-q wc_queue_list]                print information on given queue
>          [-qs {a|c|d|o|s|u|A|C|D|E|S}]     selects queues, which are in the given state(s)
>          [-U user_list]                    select only queues where these users have access
> 
> => "T" is not a valid queue state. Please describe your problem in a bit more detail.
> 
> 
> Christian
> 
> 
> On 06/22/09 13:58, adary wrote:
>> I don't think I ever checked this in a previous version, but I can't do
>> qselect -qs T
>>
>>
>>
>> This can be very useful in my setup to see how many jobs are suspended
>> at any given time.
>>
>>
>>
>> I have a workaround, of course, but it would be nice to have this feature.
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> Yuval Adar, Marvell Israel - Senior UNIX System Administrator
>> 6 Hamada Street
>>
>> Mordot HaCarmel Industrial Park
>>
>> Yokneam, 20692, Israel
>> Email: adary at marvell.com <mailto:adary at marvell.com>
>> Office:  +972.4.9091188 - OnNet: 704.1188
>>
>> Fax:      +972.4.9091501
>> Mobile: +972.54.2493958
>> Web site: http://www.marvell.com <http://www.marvell.com/>
>>
>>
>> ------------------------------------------------------------------------
>>
>>
>>
> 

-- 
Sun Microsystems GmbH             Christian Reissmann
Dr.-Leo-Ritter-Str. 7             Software Engineer
D-93049 Regensburg                Phone: +49 (0)941 3075 112
Germany                           Fax:   +49 (0)941 3075 222
http://www.sun.de                 mailto: Christian.Reissmann at sun.com
                                   http://www.sun.com/gridengine
Sitz der Gesellschaft:
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel
Vorsitzender des Aufsichtsrates: Martin Haering

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=202931

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


