[GE users] qsub/qstat/qacct information sequencing

Daniel Templeton Dan.Templeton at Sun.COM
Tue Nov 15 22:44:51 GMT 2005



David Pinsky 288-3739 wrote On 11/15/05 23:25,:

>>>From: Daniel Templeton <Dan.Templeton at Sun.COM>
>>>Subject: Re: [GE users] qsub/qstat/qacct information sequencing
>
>I've seen some other suggestions since you wrote this that tried
>to use the qsub return code.  As others have pointed out, that does not
>work (job numbers do not fit in the 0-255 range of a Unix return code).
>
>The behavior we had used for a long time (in a proprietary DRM) is:
>  Given a certain qsub commandline option (e.g. -B or -batch or ???),
>  output the job number, and only the job number, on standard output
>
>e.g.  (if % is the prompt)
>
>    % qsub -o LOG -j y ~/bin/mycmd		# normal standard output
>    Your job 181615 ("mycmd") has been submitted.
>    % qsub -o LOG -j y -batch ~/bin/mycmd	# *only* job no
>    181616
>    # Allows for easy capture of the job number
>    % jobno=$(qsub -o LOG -j y -batch ~/bin/mycmd)
>    % echo $jobno
>    181617
>
>If qsub itself has an error, I would expect *no* standard output,
>an Error message on standard error, and a non-zero return code.
>Without an error, a return code of zero would be desired.
>
>I think that captures the entirety of the 'little fix' useful 'to so many'
>to which I referred.
>
That sounds like a reasonable feature definition.  If I see no further
contributions to the discussion, I will submit your summary as an RFE.
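
Until such an option exists, the requested behavior can be approximated
with a small wrapper around the normal qsub message.  The following is
only a sketch: parse_jobno and submit_batch are hypothetical names, and
it assumes the message format shown above, with any ".suffix" after the
job number dropped the same way the cut -d. pipeline in this thread
does.

```shell
# Hypothetical helper: read qsub's normal message on stdin and print
# only the job number (assumes: Your job <number> ("name") has been submitted.)
parse_jobno() {
    awk '$1 ~ /^[Yy]our$/ && $2 ~ /^job/ {
             sub(/\..*/, "", $3)   # drop anything after a "." in the job id
             print $3
             exit
         }'
}

# Hypothetical wrapper: job number only on stdout, nothing on stdout
# and a non-zero return code if qsub fails or the message is not understood.
submit_batch() {
    out=$(qsub "$@") || return 1   # qsub's own error text stays on stderr
    jobno=$(printf '%s\n' "$out" | parse_jobno)
    case $jobno in
        ''|*[!0-9]*)
            echo "submit_batch: cannot parse qsub output: $out" >&2
            return 1 ;;
    esac
    printf '%s\n' "$jobno"
}
```

With these defined, jobno=$(submit_batch -o LOG -j y ~/bin/mycmd) would
capture the job number in the way described above.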

>
>This leads nicely into an additional capability that I have to construct
>because it does not seem that SGE provides it...
>
>This capturing of job numbers is very useful (at the script level)
>when there is 'wait-on-job' functionality (also at the script level).
>I have constructed this capability (in a somewhat crude fashion),
>and will be looking into "cleaner" implementation using DRMAA.
>
>But I have to be able to call drmaa_wait and drmaa_wexitstatus based solely
>on the job number *string* (i.e., the job number passed as a shell
>command-line argument).  In other words, I want to be able to write a small
>program using drmaa_wait/drmaa_wexitstatus that itself does *not* submit
>the job, but is passed the job number, e.g.:
>
>    % jobno=$(qsub -o LOG -j y -batch mycmd)
>    % mydrmaa_cmd -v $jobno	# *ONLY* when jobno completes, output is:
>    181618 Task: 0 SGE: 0 <executionHostName> <DateOfJobCompletion>
>    #  Where 'Task: 0' is the return code of mycmd
>    #  Where 'SGE: 0' is the 'failed' code reported by qacct
>
>David
>

Here is where you'll run into a limitation of DRMAA.  drmaa_wait() and
drmaa_synchronize() only operate on jobs submitted during the current
DRMAA session.  That means that your utility cannot work because the
jobs in which you're interested are always submitted outside of DRMAA. 
Unfortunately, the drmaa_wait() function is specified to work that way. 
drmaa_synchronize() allows reference to outside job ids, but it cannot
give you any exit status information.  Your best bet is either to poll
with qstat -xml or to extend JAPI so that it is not bound by session
limits and build your custom app on that.
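
A polling version of the utility could look roughly like the sketch
below.  It is not a DRMAA solution.  Instead of parsing qstat -xml it
takes the simpler route from the pseudo-code quoted further down and
watches the plain qstat -j output for "Following jobs do not exist",
then reads the exit_status, failed, hostname and end_time fields from
qacct -j.  The function names are hypothetical, and the sketch assumes
that qacct returns non-zero until the accounting record has been
written.

```shell
# Hypothetical helper: turn qacct -j output (stdin) into the one-line
# summary "<jobno> Task: <exit_status> SGE: <failed> <host> <end time>".
summarize_acct() {
    awk -v job="$1" '
        $1 == "exit_status" { task = $2 }
        $1 == "failed"      { sge  = $2 }
        $1 == "hostname"    { host = $2 }
        # keep this rule last: it rewrites $0 to strip the field name
        $1 == "end_time"    { $1 = ""; sub(/^ +/, ""); end = $0 }
        END { printf "%s Task: %s SGE: %s %s %s\n", job, task, sge, host, end }'
}

# Hypothetical wrapper: block until the job has left qstat and its
# accounting record is readable, then print the summary line.
wait_and_report() {
    jobno=$1
    interval=${2:-10}   # seconds between polls
    # The job is considered finished once qstat no longer knows it.
    until qstat -j "$jobno" 2>&1 | grep -q 'Following jobs do not exist'; do
        sleep "$interval"
    done
    # The accounting record may appear a little later; retry until it does.
    until acct=$(qacct -j "$jobno" 2>/dev/null); do
        sleep "$interval"
    done
    printf '%s\n' "$acct" | summarize_acct "$jobno"
}
```

Called as wait_and_report $jobno, this prints a line in the format David
asked for.  The polling interval trades qmaster load against latency.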

Daniel


>>Well, if we can clearly define a simple fix that is generally agreed to
>>be useful, we might be able to slip it into u8.  In any case I will
>>submit an RFE for it.  What behavior did you have in mind?
>>
>>Daniel
>>
>>David Pinsky 288-3739 wrote On 11/14/05 18:17,:
>>
>>>>>From: "Olesen, Mark" <Mark.Olesen at arvinmeritor.com>
>>>>>Subject: RE: [GE users] qsub/qstat/qacct information sequencing
>>>>>The (pseudo-code) sequence I am trying to work with is:
>>>>>
>>>>>jobno = qsub cmd
>>>>>qstat -j $jobno
>>>>>when [qstat reports "Following jobs do not exist: $jobno"]
>>>>> qacct -j $jobno
>>>>Not an answer, but a question:
>>>>Is there a clean way of capturing the job number from a qsub?
>>>>In the past I've resorted to catching and parsing stderr output, but this is
>>>>not exactly 'clean'.
>>>>
>>>>/mark
>>>>
>>>Like you, we have to process the output, a la (bash-ish):
>>>	jobno=$(qsub [options] cmds | cut -f3 -d" "|cut -f1 -d.)
>>>
>>>Seems like such a little fix would be so useful to so many.
>>>
>>>And much thanks to Daniel Templeton for providing the info
>>>on "Immediate Accounting Data Flushing".
>>>
>>>David
>>>
>>>David Pinsky                                    david_pinsky at agilent.com
>>>Agilent Technologies, ISD                       970-288-3739
>>>4380 Ziegler Road                               970-288-6580 (fax)
>>>Fort Collins, CO  80525-9790                    Mailstop 72
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>
>>-- 
>>***************************************************
>>*        Daniel Templeton   ERGB01 x60220         *
>>*       Staff Engineer, Sun N1 Grid Engine        *
>>***************************************************
>>* "So let the sunshine in.  Face it with a grin.  *
>>*  Smilers never lose, and frowners never win."   *
>>*      -Let the Sunshine In, Pebbles Flintstone   *
>>***************************************************



