[GE users] qsub/qstat/qacct information sequencing

Daniel Templeton Dan.Templeton at Sun.COM
Wed Nov 16 19:17:15 GMT 2005


See Issue 1899:

http://gridengine.sunsource.net/issues/show_bug.cgi?id=1899

Daniel

David Pinsky 288-3739 wrote On 11/15/05 23:25,:

>>>From: Daniel Templeton <Dan.Templeton at Sun.COM>
>>>Subject: Re: [GE users] qsub/qstat/qacct information sequencing
>
>I've seen some other suggestions since you wrote this that tried
>to use the qsub return code.  As others have pointed out, that does not
>work (job numbers exceed the 0-255 range of a Unix return code).
>
>The behavior we had used for a long time (in a proprietary DRM) is:
>  Given a certain qsub commandline option (e.g. -B or -batch or ???),
>  output the job number, and only the job number, on standard output
>
>e.g.  (if % is the prompt)
>
>    % qsub -o LOG -j y ~/bin/mycmd		# normal standard output
>    Your job 181615 ("mycmd") has been submitted.
>    % qsub -o LOG -j y -batch ~/bin/mycmd	# *only* job no
>    181616
>    # Allows for easy capture of the job number
>    % jobno=$(qsub -o LOG -j y -batch ~/bin/mycmd)
>    % echo $jobno
>    181617
>
>If qsub itself has an error, I would expect *no* standard output,
>an Error message on standard error, and a non-zero return code.
>Without an error, a return code of zero would be desired.
>
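Until qsub has such an option, the behavior described above can be roughly approximated with a wrapper function. This is only a sketch under assumptions: the name qsub_jobno is made up for illustration, and it assumes qsub prints the 'Your job N ("name") has been submitted.' line on standard output (array jobs are assumed to print 'Your job-array N.<range> ...').

```shell
#!/bin/sh
# Hypothetical wrapper (not part of SGE): print only the job number.
# Assumes the submit message goes to stdout in the form
#   Your job 181615 ("mycmd") has been submitted.
qsub_jobno() {
    # Capture stdout only; qsub's stderr passes through unchanged.
    # On failure, print nothing and hand back qsub's own return code.
    out=$(qsub "$@") || return $?
    # Keep the leading digits of the job id (drops ".1-10:1" on array jobs).
    jobno=$(printf '%s\n' "$out" |
        sed -n 's/^Your job\(-array\)\{0,1\} \([0-9][0-9]*\).*/\2/p')
    if [ -z "$jobno" ]; then
        echo "qsub_jobno: cannot parse job number from: $out" >&2
        return 1
    fi
    printf '%s\n' "$jobno"
}
```

Usage would then look like: jobno=$(qsub_jobno -o LOG -j y ~/bin/mycmd) || exit 1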
>I think that captures the entirety of the 'little fix' useful 'to so many'
>to which I referred.
>
>This leads nicely into an additional capability that I have to construct
>because it does not seem that SGE provides it...
>
>This capturing of job numbers is very useful (at the script level)
>when there is 'wait-on-job' functionality (also at the script level).
>I have constructed this capability (in a somewhat crude fashion),
>and will be looking into "cleaner" implementation using DRMAA.
>
>But I have to be able to call drmaa_wait and drmaa_wexitstatus based solely
>on the job number *string* (i.e. the jobnum as a shell cmdline argument).  In
>other words, I want to be able to write a small program using
>drmaa_wait/drmaa_wexitstatus that does *not* submit the job itself, but is
>passed the job number, e.g.:
>
>    % jobno=$(qsub -o LOG -j y -batch mycmd)
>    % mydrmaa_cmd -v $jobno	# *ONLY* when jobno completes, output is:
>    181618 Task: 0 SGE: 0 <executionHostName> <DateOfJobCompletion>
>    #  Where 'Task: 0' is the return code of mycmd
>    #  Where 'SGE: 0' is the 'failed' code reported by qacct
>
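The one-line summary described above could also be produced at the script level from qacct output, without DRMAA. A minimal sketch, assuming the qacct -j record contains the fields hostname, end_time, failed and exit_status; the function name format_completion is hypothetical:

```shell
#!/bin/sh
# Hypothetical formatter: reads `qacct -j <jobno>` output on stdin and
# prints "<jobno> Task: <exit_status> SGE: <failed> <hostname> <end_time>".
# If several records are present (e.g. array tasks), the last one wins.
format_completion() {
    awk -v jobno="$1" '
        $1 == "exit_status" { task = $2 }
        $1 == "failed"      { sge = $2 }   # numeric code only
        $1 == "hostname"    { host = $2 }
        $1 == "end_time"    { $1 = ""; sub(/^ +/, ""); done = $0 }
        END {
            printf "%s Task: %s SGE: %s %s %s\n", jobno, task, sge, host, done
        }
    '
}
```

It would be called as: qacct -j $jobno | format_completion $jobno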
>David
>
>>Well, if we can clearly define a simple fix that is generally agreed to
>>be useful, we might be able to slip it into u8.  In any case I will
>>submit an RFE for it.  What behavior did you have in mind?
>>
>>Daniel
>>
>>David Pinsky 288-3739 wrote On 11/14/05 18:17,:
>>
>>>>>From: "Olesen, Mark" <Mark.Olesen at arvinmeritor.com>
>>>>>Subject: RE: [GE users] qsub/qstat/qacct information sequencing
>>>>>
>>>>>The (pseudo-code) sequence I am trying to work with is:
>>>>>
>>>>>jobno = qsub cmd
>>>>>qstat -j $jobno
>>>>>when [qstat reports "Following jobs do not exist: $jobno"]
>>>>> qacct -j $jobno
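The pseudo-code sequence above might be written as a polling loop along these lines. It is a sketch, not a tested recipe: the function name wait_and_account and the POLL/TRIES variables are invented for illustration, it relies on the "Following jobs do not exist" message quoted above, and it assumes qacct -j returns non-zero while the accounting record has not been written yet.

```shell
#!/bin/sh
# Hypothetical helper: wait until job $1 has left qstat, then print its
# qacct record.  POLL = seconds between checks, TRIES = qacct attempts.
wait_and_account() {
    jobno=$1
    poll=${POLL:-10}
    tries=${TRIES:-6}
    # Phase 1: poll qstat until it reports that the job no longer exists.
    # Note: this loops forever if qstat keeps failing for another reason.
    while :; do
        status=$(qstat -j "$jobno" 2>&1) || true
        case $status in
            *"jobs do not exist"*) break ;;
        esac
        sleep "$poll"
    done
    # Phase 2: the accounting record may lag behind, so retry qacct.
    while [ "$tries" -gt 0 ]; do
        if qacct -j "$jobno" 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$poll"
    done
    echo "wait_and_account: no accounting record for job $jobno" >&2
    return 1
}
```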
>>>>
>>>>Not an answer, but a question:
>>>>Is there a clean way of capturing the job number from a qsub?
>>>>In the past I've resorted to catching and parsing stderr output, but this is
>>>>not exactly 'clean'.
>>>>
>>>>/mark
>>>>
>>>Like you, we have to process the output, a la (bash-ish):
>>>	jobno=$(qsub [options] cmds | cut -f3 -d" "|cut -f1 -d.)
>>>
>>>Seems like such a little fix would be so useful to so many.
>>>
>>>And much thanks to Daniel Templeton for providing the info
>>>on "Immediate Accounting Data Flushing".
>>>
>>>David
>>>
>>>David Pinsky                                    david_pinsky at agilent.com
>>>Agilent Technologies, ISD                       970-288-3739
>>>4380 Ziegler Road                               970-288-6580 (fax)
>>>Fort Collins, CO  80525-9790                    Mailstop 72
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>
>>-- 
>>***************************************************
>>*        Daniel Templeton   ERGB01 x60220         *
>>*       Staff Engineer, Sun N1 Grid Engine        *
>>***************************************************
>>* "So let the sunshine in.  Face it with a grin.  *
>>*  Smilers never lose, and frowners never win."   *
>>*      -Let the Sunshine In, Pebbles Flintstone   *
>>***************************************************