[GE users] Determining the failure states of completed jobs in SGE 5.3

Dennis Williams dennis.williams at bjss.co.uk
Thu Jun 7 10:46:27 BST 2007

My team is building an application that submits jobs to clusters running SGE 5.3. One of the requirements is to monitor the status of each submitted job throughout its lifecycle.
Using the "qstat" command, it is possible to determine whether a job is currently waiting in a queue or running on a node. Once the job has completed, however, I would like to be able to determine whether it completed successfully or with errors. I understand that when a job completes, two files containing its stdout and stderr are written on the compute node, but our application will not have access to these nodes because they are on private networks.
So my questions are:
1) Does SGE 5.3 provide commands (or techniques) that would enable clients to determine if a job has completed with or without errors?
2) Does SGE 5.3 provide commands (or techniques) that would enable clients to access the stdout and stderr files for jobs that have completed?
Further to my question:
The documentation suggests that the "qacct" command provides the client with information about jobs that have completed. However, one of our cluster administrators has explained that this command can only be run on the "head node", which is not an acceptable option for us.
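To make the requirement concrete, here is a minimal Python sketch of the check a client would want to perform if it could obtain the text of one "qacct -j" record. It is only an illustration under assumptions: the field names "failed" and "exit_status" and the "name value" line layout are taken from qacct output as I understand it and should be verified against SGE 5.3, and the sample record, queue name, host name and function names are all hypothetical. The sketch does not call qacct itself; it only parses text that has already been retrieved.

```python
# Hypothetical sample of one "qacct -j" record (values invented for illustration).
SAMPLE_RECORD = """\
==============================================================
qname        compute.q
hostname     node01
jobnumber    1234
failed       0
exit_status  0
"""


def parse_qacct_record(text):
    """Turn one 'name value' record into a dict, skipping separator lines."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("="):
            continue
        parts = line.split(None, 1)
        record[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return record


def leading_int(value):
    """Return the integer at the start of value, or None if there is none."""
    tokens = value.split()
    if not tokens:
        return None
    try:
        return int(tokens[0])
    except ValueError:
        return None


def job_succeeded(record):
    """Assumption: a job succeeded only if 'failed' and 'exit_status' are both 0."""
    failed = leading_int(record.get("failed", ""))
    exit_status = leading_int(record.get("exit_status", ""))
    return failed == 0 and exit_status == 0


if __name__ == "__main__":
    print(job_succeeded(parse_qacct_record(SAMPLE_RECORD)))
```

A record with a missing or non-numeric field is treated as "not succeeded" rather than guessed at, since the client cannot inspect the job's stderr to tell the difference.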
Many Thanks

BJSS Limited, 1st Floor Coronet House, Queen Street, Leeds LS1 2TW.
Registered in England with company number 2777575.
