[GE users] qstat bug found?

Chris Dagdigian dag at sonsorol.org
Fri Apr 15 03:31:01 BST 2005


Can someone replicate this just to make sure I'm not seeing things? I'll 
file the bugzilla report if this is real.

Basically qstat with the "-xml" option works great in all invocations 
except seemingly when there is any sort of job in state "E".

-Chris


I'm writing XSL translators to render qstat output as HTML and was just 
about ready to put my demo CGI online when I found something interesting:

On 2 systems I tested (Suse SLES on SGI Altix & Apple OS X Server) the 
following seems to occur:

  $ qstat -f -xml

Will fail with a segfault (on SGI Altix) or a malloc error (on OS X) if 
any running or pending job is in error state E.

The quickest way for me to build a test case with a job in E state was 
just to submit a job as a user that exists on the sgemaster but not on 
the exec hosts.

Without "-xml":

> dag at xxx:~/public_html/xml-qstat> qstat -f
> queuename                      qtype used/tot. load_avg arch          states
> ----------------------------------------------------------------------------
> all.q at node249.cluster.private  BIP   0/2       0.00     lx26-amd64    
> ----------------------------------------------------------------------------
> all.q at xxx.cluster.private  BIP   2/2       0.00     lx26-amd64    
>     126 0.55500 Sleeper    dag          r     04/14/2005 15:19:53     1        
>     129 0.55500 Sleeper    dag          r     04/14/2005 15:19:53     1        
> 
> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>     127 0.00000 Sleeper    dag          Eqw   04/14/2005 15:19:50     1        
>     128 0.00000 Sleeper    dag          Eqw   04/14/2005 15:19:51     1        
>     130 0.00000 Sleeper    dag          Eqw   04/14/2005 15:19:52     1        
>     131 0.00000 Sleeper    dag          Eqw   04/14/2005 15:19:53     1        
>     132 0.00000 Sleeper    dag          Eqw   04/14/2005 15:19:53     1        
> dag at xxx:~/public_html/xml-qstat> 

With the "-xml" switch:

> dag at xxx:~/public_html/xml-qstat> qstat -f -xml
> Segmentation fault
> dag at xxx:~/public_html/xml-qstat> 

The same problem occurs on an OS X system, but the error is different:

> workgroupcluster:~ bioteam$ qstat -f -xml
> *** malloc[4996]: Deallocation of a pointer not malloced: 0x8840; This could be a double free(), or free() called with the middle of an allocated block; Try setting environment variable MallocHelp to see tools to help debug
> *** malloc[4996]: Deallocation of a pointer not malloced: 0x21e94; This could be a double free(), or free() called with the middle of an allocated block; Try setting environment variable MallocHelp to see tools to help debug
> Bus error
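
Until this is fixed, anything that feeds "qstat -xml" into an XSL 
transform (like my CGI) probably wants to notice when qstat dies instead of 
handing over empty or truncated XML. Below is a rough sketch of that 
check, not a tested fix. The helper names classify_exit and run_qstat_xml 
are made up for illustration; it assumes a POSIX system, Python 3.5 or 
later, and that qstat is on the PATH.

```python
import signal
import subprocess
import xml.etree.ElementTree as ET


def classify_exit(returncode):
    """Turn a subprocess return code into a short description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


def run_qstat_xml(cmd=("qstat", "-f", "-xml")):
    """Run qstat; return (parsed XML root or None, status string)."""
    try:
        proc = subprocess.run(
            list(cmd), stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
    except FileNotFoundError:
        return None, "command not found"
    status = classify_exit(proc.returncode)
    if status != "ok":
        # qstat crashed or failed: do not trust whatever it printed.
        return None, status
    try:
        return ET.fromstring(proc.stdout), status
    except ET.ParseError as exc:
        # Exit status was fine but the output is not well-formed XML.
        return None, "malformed XML: %s" % exc


if __name__ == "__main__":
    root, status = run_qstat_xml()
    print(status)
```

With a job in state E on the affected systems, I would expect the status 
to report death by a signal (SIGSEGV or SIGBUS) rather than "ok", so the 
CGI can show an error page instead of running the transform.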



-Chris

