[GE users] Strange qstat XML output

juby justin.ottley at gmail.com
Tue Feb 16 20:04:14 GMT 2010

Hey all,

I'm running SGE 6.1u4, with the qmaster on Linux x86.
I've recently started seeing strange output in the detailed job info 
returned by qstat -xml: a tag "<JATASK: 123456.>", where 123456 is a 
job number, instead of the tag I'd normally expect at that position, 
"<ulong_sublist>".

An example XML snippet would look like:

        <JATASK:  376272.>
          </JATASK:  376272.>

Apart from the fact that the output is different and is not consistent 
(doesn't happen for all jobs), the XML fails to parse.
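
As a stopgap, the malformed tags can be rewritten before the output 
reaches an XML parser. Below is a minimal Python sketch, assuming the bad 
tags always have the form shown above. The function names are 
hypothetical, and substituting "ulong_sublist" is only a guess based on 
what normally appears at that position:

```python
import re
import xml.etree.ElementTree as ET

# Matches the opening and closing forms of the malformed tag,
# e.g. "<JATASK:  376272.>" and "</JATASK:  376272.>".
_BAD_TAG = re.compile(r"<(/?)JATASK:\s*\d+\.>")


def sanitize_qstat_xml(text, replacement="ulong_sublist"):
    """Rewrite malformed JATASK tags to a well-formed tag name.

    The replacement name is an assumption, not something SGE documents.
    """
    return _BAD_TAG.sub(lambda m: "<%s%s>" % (m.group(1), replacement), text)


def parse_qstat_xml(text):
    """Sanitize captured qstat -xml output, then parse it."""
    return ET.fromstring(sanitize_qstat_xml(text))
```

This only hides the symptom so that the rest of the output can be 
parsed; it says nothing about why the tag is emitted.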

The most common (though not perfectly consistent) qmaster log message 
relating to an affected job seems to be:
|W|scheduler tries to change tickets of a non running job 377129 task 
97(state 0)

Another, rarer qmaster error for a problem job is:
|E|JEXITING report for job 377129.688: which is in status 0
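
To list which jobs and tasks are affected, the two messages above can be 
pulled out of the qmaster log. A rough Python sketch, assuming each 
message sits on a single line in the log and matches the wording quoted 
above (the function name is made up for illustration, and nothing is 
assumed about the rest of the line layout):

```python
import re

# Patterns built only from the two message texts quoted above.
_TICKETS = re.compile(
    r"scheduler tries to change tickets of a non running job (\d+)"
    r"\s+task\s+(\d+)\s*\(state (\d+)\)"
)
_JEXITING = re.compile(
    r"JEXITING report for job (\d+)\.(\d+): which is in status (\d+)"
)


def affected_tasks(lines):
    """Return a set of (job_id, task_id, state) tuples found in log lines."""
    found = set()
    for line in lines:
        for pattern in (_TICKETS, _JEXITING):
            match = pattern.search(line)
            if match:
                found.add(tuple(int(g) for g in match.groups()))
    return found
```

Feeding it the lines of the qmaster messages file gives a quick list of 
job/task pairs to compare against the jobs whose XML output is broken.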

What is this "state 0", and might it be related to the qstat -xml 
output? Perhaps a damaged BDB job entry?
There doesn't seem to be any mention of a tag "JATASK" in the schema 
definition at $SGE_ROOT/util/resources/schemas/qstat/detailed_job_info.xsd.

My google-fu on this particular phenomenon hasn't turned up much so far 
(apart from what appears to be a ganglia script that filters the output 
for this problem).

The first time I noticed this, sge_schedd was also not running, 
presumably crashed (I'm not yet sure whether that is related); the second 
time was after a scheduled shutdown with no obvious problems. There have 
been no problem jobs since then.

Any info appreciated. Thanks,

