Opened 17 years ago

Last modified 8 years ago

#63 new enhancement

IZ341: Wrong message when interactive job fails to start

Reported by: andy Owned by:
Priority: low Milestone:
Component: sge Version: 5.3
Severity: Keywords: www
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=341]

        Issue #:      341              Platform:     All           Reporter: andy (andy)
       Component:     gridengine          OS:        All
     Subcomponent:    www              Version:      5.3              CC:    None defined
        Status:       NEW              Priority:     P4
      Resolution:                     Issue type:    ENHANCEMENT
                                   Target milestone: ---
      Assigned to:    issues@gridengine
      QA Contact:     issues@gridengine
          URL:
       * Summary:     Wrong message when interactive job fails to start
   Status whiteboard:
      Attachments:

     Issue 341 blocks:
   Votes for issue 341:


   Opened: Mon Aug 5 01:52:00 -0700 2002 
------------------------


When an interactive job fails to start, the client
may print the error message

"No free slots for interactive job <jobid>"

However, the error message printed when there is
a job start problem should be more specific about
the root of the problem.

A HOWTO should describe how a user can find out
what the problem was. See the description below:


When an interactive job is started, the client side
polls qmaster at an interval to check whether the
job is already running. If an interactive job cannot
be scheduled, qmaster removes it. The client takes
this removal as the indication that the job could
not be scheduled.

However, there is a second case in which the job is
removed from the job list very quickly: if the job
fails immediately (technically, it just exits), it is
reported to qmaster as a finished job, and qmaster
removes it from its job list. The next time the
client queries qmaster, the job no longer exists,
and the client prints the message about the
non-schedulable job.
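
The ambiguity described above can be sketched with a toy model. This is
only an illustration, not the actual client code: the function name
client_message and the dictionary standing in for qmaster's job list are
hypothetical. It shows that a single poll sees the same thing in both
cases, so the client prints the same message.

```python
def client_message(job_list, jobid):
    """Hypothetical model of one client poll against qmaster's job list."""
    state = job_list.get(jobid)
    if state is None:
        # The job is gone; the client cannot tell why it was removed.
        return f"No free slots for interactive job {jobid}"
    return f"job {jobid} is {state}"


# Case 1: qmaster removed the job because it could not be scheduled.
unschedulable = {}
# Case 2: the job started, exited at once, and was removed as finished.
exited_at_once = {}

print(client_message(unschedulable, 42))
print(client_message(exited_at_once, 42))
```

Both calls print the identical "No free slots" text, which is why the
message is misleading in the second case.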

The workaround for finding out why the job failed
is qacct: "qacct -j <jobid>" returns no accounting
entry if the job could not be scheduled, and it
returns an accounting entry if the job was started.
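
The qacct check above could be wrapped in a small helper. This is a
minimal sketch under stated assumptions: it assumes that "qacct -j
<jobid>" exits with a non-zero status or prints nothing on standard
output when no accounting entry exists, which should be verified against
the installed version. The helper names has_accounting_entry and
diagnose are hypothetical, and the "run" parameter exists only so the
command can be replaced in tests.

```python
import subprocess


def has_accounting_entry(jobid, run=subprocess.run):
    """Return True if `qacct -j <jobid>` appears to report a record.

    Assumption: a missing entry yields a non-zero exit status or
    empty standard output.
    """
    result = run(["qacct", "-j", str(jobid)],
                 capture_output=True, text=True)
    return result.returncode == 0 and bool(result.stdout.strip())


def diagnose(jobid, run=subprocess.run):
    """Map the qacct result to the two cases from the description."""
    if has_accounting_entry(jobid, run):
        return "job was started and exited; see its accounting entry"
    return "job could not be scheduled"
```

On a submit host with qacct in the PATH, diagnose(<jobid>) would then
tell the two cases apart after the "No free slots" message appears.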

   ------- Additional comments from andreas Thu Oct 21 08:35:19 -0700 2004 -------
Should be mentioned in troubleshooting guide.
But it is a web content issue.
