[GE users] Strange qstat XML output
justin.ottley at gmail.com
Tue Feb 16 20:04:14 GMT 2010
I'm running SGE 6.1u4; qmaster on Linux x86.
I've recently started seeing strange output being returned from detailed
job info of qstat -xml; what I'm seeing is a tag "<JATASK: 123456.>"
where 123456 is a job number, instead of the usual tag I'd expect at
that position: "<ulong_sublist>".
An example XML snippet would look like:
Apart from the fact that the output is different and is not consistent
(doesn't happen for all jobs), the XML fails to parse.
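
As a stopgap while the cause is unknown, the malformed tag could be rewritten before parsing. The following is only a minimal sketch, not a fix for the underlying problem. It assumes that the closing tag either has the same malformed form ("</JATASK: 123456.>") or is the normal "</ulong_sublist>"; only the opening tag is shown above, so the closing form is a guess. The function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches the malformed tag, e.g. "<JATASK: 123456.>".
# The closing form "</JATASK: 123456.>" is an assumption; only the
# opening tag was actually observed.
BAD_TAG = re.compile(r"<(/?)JATASK:\s*\d+\.>")


def sanitize_qstat_xml(text):
    """Replace malformed JATASK tags with the expected ulong_sublist tag."""
    return BAD_TAG.sub(r"<\1ulong_sublist>", text)


def parse_qstat_xml(text):
    """Parse qstat -xml output, retrying once after sanitizing the bad tags."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return ET.fromstring(sanitize_qstat_xml(text))
```

The input would be the captured text of the detailed job info from qstat -xml. Well-formed output passes through untouched, so the wrapper is safe to apply to every job.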
It seems the most common (but not perfectly consistent) qmaster log
message relating to an affected job is:
|W|scheduler tries to change tickets of a non running job 377129 task
Another, rarer qmaster error for a problem job is:
|E|JEXITING report for job 377129.688: which is in status 0
What is this "status 0", and might it be related to the qstat -xml
output? Perhaps a damaged BDB job entry?
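
To check how consistently the two qmaster messages quoted above line up with the jobs whose XML is malformed, the qmaster log could be scanned for them. This is a rough sketch that matches only the message fragments quoted in this post; the rest of the log line format is not assumed, and the function name is hypothetical.

```python
import re

# Patterns built only from the message fragments quoted above.
TICKETS_RE = re.compile(
    r"\|W\|scheduler tries to change tickets of a non running job (\d+)"
)
JEXITING_RE = re.compile(r"\|E\|JEXITING report for job (\d+)\.(\d+)")


def affected_jobs(lines):
    """Map each job number to the set of message kinds logged for it."""
    jobs = {}
    for line in lines:
        for label, pattern in (("tickets", TICKETS_RE), ("jexiting", JEXITING_RE)):
            match = pattern.search(line)
            if match:
                jobs.setdefault(int(match.group(1)), set()).add(label)
    return jobs
```

Passing an open file handle for the qmaster messages file as "lines" returns a dictionary that can be compared against the job numbers seen in the bad JATASK tags.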
There doesn't seem to be any mention of a tag "JATASK" in the schema
definition at $SGE_ROOT/util/resources/schemas/qstat/detailed_job_info.xsd.
My google-fu on this particular phenomenon hasn't turned up much so far
(minus what appears to be a ganglia script that filters the output for
The first time I noticed this the sge_schedd was also not running,
presumed crashed (not sure if related or not yet), the second time was
after a scheduled shutdown with no obvious problems. No problem jobs
Any info appreciated. Thanks,