Opened 19 years ago

Last modified 11 years ago

#29 new enhancement

IZ224: qstat -j redesign

Reported by: ernst Owned by:
Priority: low Milestone:
Component: sge Version: 5.3
Severity: Keywords: clients


[Imported from gridengine issuezilla]

        Issue #:      224              Platform:     All           Reporter: ernst (ernst)
       Component:     gridengine          OS:        All
     Subcomponent:    clients          Version:      5.3              CC:
                                                                             [_] nmm
                                                                             [_] uddeborg
        Status:       NEW              Priority:     P4
      Resolution:                     Issue type:    ENHANCEMENT
                                   Target milestone: ---
      Assigned to:    andreas (andreas)
      QA Contact:     roland
       * Summary:     qstat -j redesign
   Status whiteboard:

     Issue 224 blocks:
   Votes for issue 224:

   Opened: Tue Apr 9 00:30:00 -0700 2002 

Several support requests and messages in the support forum
indicate that information about jobs is lacking during all phases of their existence. Some of these requests could be met by redesigning the "qstat -j" command.

- job logging to trace the full life cycle of a job
- missing attributes in qstat -j output
- qstat -j and qmon output seem to be out of sync in SGEEE
  clusters. According to the qstat -j output, a job X seems to
  be running even if X is shown as a finished job within qmon
- currently we have no job state information in the
  qstat -j output
- Global scheduling messages (information about queues, hosts,...)
  should not be shown if a user asks for a specific job
  (qstat -j <jobid>)
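
The last point above can be illustrated with a minimal sketch. It assumes
a hypothetical data model in which every scheduling message carries an
optional job id, with global messages (queues, hosts, ...) carrying none.
The names SchedulerMessage and messages_for_job are invented for
illustration and are not part of Grid Engine.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SchedulerMessage:
    """Hypothetical record for one scheduling message."""
    text: str
    # None marks a global message (queue, host, ...) not tied to a job.
    job_id: Optional[int] = None


def messages_for_job(messages: List[SchedulerMessage],
                     job_id: int) -> List[SchedulerMessage]:
    """Keep only messages attached to the requested job.

    Global messages and messages for other jobs are dropped, which is
    the behaviour requested for "qstat -j <jobid>".
    """
    return [m for m in messages if m.job_id == job_id]


if __name__ == "__main__":
    sample = [
        SchedulerMessage("queue instance is disabled"),       # global
        SchedulerMessage("cannot run: no free slots", 42),    # job 42
        SchedulerMessage("cannot run: missing resource", 7),  # job 7
    ]
    for msg in messages_for_job(sample, 42):
        print(msg.text)
```

Under this assumption a request for one job would show only the messages
recorded for that job, while a plain "qstat -j" could still list the
global ones.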

   ------- Additional comments from sgrell Tue Dec 6 08:37:48 -0700 2005 -------
Changed subcomponent.


   ------- Additional comments from ernst Tue Aug 19 02:45:30 -0700 2008 -------
User nmm reported:

I don't know what 'qstat -j' is intended for, but its use for finding
out why a job won't run appears minimal.  Here, it prints 32 lines of
irrelevant messages and none of the important ones.  What it seems to
do is to tell you every queue that is disabled, not available or full,
IRRESPECTIVE of whether it meets the requirements of the job.

But it does NOT tell you about why a job will not run if it tries a
queue and the job gets into error state.  You have to look in the
qmaster messages file for that.  We had some very confused users and
it took me over an hour to work out why their jobs would not work
but mine did.

Now, if it is intended for user use, this is precisely what it should
NOT be doing.  If the failure to run is clearly job-related (as a job
being in error state is), it should NOT display what it might have
done were the configuration different and it should display WHY the
job got into error state.

   ------- Additional comments from ernst Tue Aug 19 02:47:32 -0700 2008 -------
*** Issue 491 has been marked as a duplicate of this issue. ***

   ------- Additional comments from ernst Thu Aug 21 01:49:28 -0700 2008 -------
When I do "qstat -j" on a pending job, I get information why this job could not
be scheduled last time.  When I do it on a running job some scheduler
information is still shown.  I think all scheduler information should be
suppressed for running jobs, since it isn't relevant to the job any more.

(A short thread on the subject can be found here:

   ------- Additional comments from ernst Thu Aug 21 01:51:55 -0700 2008 -------
Added CC from # 1756

   ------- Additional comments from ernst Thu Aug 21 01:52:11 -0700 2008 -------
*** Issue 1756 has been marked as a duplicate of this issue. ***

