[GE users] SGE capability question

Andreas.Haas at Sun.COM Andreas.Haas at Sun.COM
Fri Dec 14 13:45:28 GMT 2007


Hi Dan,

On Fri, 14 Dec 2007, Dan McMahill wrote:

> I've been reading various docs and am a bit overwhelmed.  Before I go 
> further, I have a basic question about capability.
>
> We have SGE deployed and I use qsub to submit jobs all the time.  qstat to 
> check on jobs, and qdel to remove them from the queue.  So far the jobs have 
> been fairly self contained.  But now I'm interested in writing a short 
> program probably in perl (but ruby or maybe just /bin/sh would be ok) that 
> has a way of submitting jobs and monitoring their status. Is this something 
> SGE can easily do?  If so, what documents should I be reading to get going?

This is what the DRMAA API is for. It provides operations for job submission, 
status control and monitoring, plus a wait operation that mimics waitpid(2). Under

    http://www.ogf.org/documents/GFD.22.pdf

you will find the standard, language-independent interface specification. As for 
language bindings, you can choose between C, Java, Perl, Python 
and Ruby. Online manuals for C and Java, and links to the scripting language 
wrappers, can be found under "Programming interfaces" at

    http://gridengine.sunsource.net/documentation.html

plus tutorials for C and Java under

    http://gridengine.sunsource.net/howto/drmaa.html
    http://gridengine.sunsource.net/howto/drmaa_java.html
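
To give a feel for the submit and wait operations, here is a minimal sketch
in Python. It assumes the drmaa-python binding and its Session methods
(createJobTemplate, runJob, wait, deleteJobTemplate); the helper name
submit_and_wait and the /bin/sleep command are only illustrations. The session
is passed in as an argument so the helper can be tried without a cluster.

```python
WAIT_FOREVER = -1  # same value as drmaa.Session.TIMEOUT_WAIT_FOREVER


def submit_and_wait(session, command, args):
    """Submit one job through a DRMAA session and block until it finishes."""
    jt = session.createJobTemplate()
    jt.remoteCommand = command      # program to run on the execution host
    jt.args = list(args)            # its command line arguments
    try:
        job_id = session.runJob(jt)
    finally:
        session.deleteJobTemplate(jt)   # the template is not needed after submission
    info = session.wait(job_id, WAIT_FOREVER)   # behaves much like waitpid(2)
    return job_id, info.hasExited, info.exitStatus


def main():
    # Imported here because the drmaa package needs libdrmaa from the
    # Grid Engine installation (assumption: drmaa-python is installed).
    import drmaa
    with drmaa.Session() as s:
        print(submit_and_wait(s, "/bin/sleep", ["10"]))
```

Calling main() on a submit host should run one job and print its id and exit
status.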

> If I were just writing my script for a multi-processor machine, I'd just use 
> fork/join to keep N jobs running at a time and monitor their status with the 
> parent process, but I'm not sure how to do that with grid engine.  In other 
> words, I can make my program submit all of these jobs to the queue, but I 
> don't know how to automatically monitor the results of the job short of some 
> hacks where I have a file in a shared file system that I monitor.

Status monitoring can be done with drmaa_job_ps(3)

    http://gridengine.sunsource.net/nonav/source/browse/~checkout~/gridengine/doc/htmlman/htmlman3/drmaa_job_ps.html

or drmaa_wait(3)

    http://gridengine.sunsource.net/nonav/source/browse/~checkout~/gridengine/doc/htmlman/htmlman3/drmaa_wait.html

if you just need to synchronize with job finishing.
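
As a rough sketch of the polling variant in Python: the loop below asks for
each job's state until all of them are finished. It assumes the drmaa-python
binding, where Session.jobStatus() wraps drmaa_job_ps and reports finished
jobs as 'done' or 'failed'. The function name and the polling interval are
made up for this example.

```python
import time

# Assumed to match drmaa.JobState.DONE and drmaa.JobState.FAILED in drmaa-python.
TERMINAL_STATES = ("done", "failed")


def poll_until_finished(session, job_ids, interval=30.0, sleep=time.sleep):
    """Poll job states until every job reports a terminal state.

    Returns a dict mapping each job id to its last reported state.
    """
    states = {}
    pending = list(job_ids)
    while pending:
        for job_id in list(pending):
            states[job_id] = session.jobStatus(job_id)
            if states[job_id] in TERMINAL_STATES:
                pending.remove(job_id)
        if pending:
            sleep(interval)     # do not hammer the qmaster with requests
    return states
```

If you only care about the end of the jobs, blocking in the wait operation is
cheaper than polling like this.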

In case you are looking for a rather comprehensive DRMAA sample application that 
includes submitting new jobs as earlier jobs finish, flow.rb could be interesting:

    http://drmaa4ruby.sunsource.net/source/browse/drmaa4ruby/src/samples/flow/flow.rb?rev=1.1&view=markup
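
The fork/join pattern from your question maps onto DRMAA fairly directly. The
following Python sketch is not taken from flow.rb; it is a simplified
illustration that assumes the drmaa-python Session interface and that the
session submits no other jobs. The name run_throttled and its parameters are
hypothetical.

```python
from collections import deque

WAIT_FOREVER = -1                       # drmaa.Session.TIMEOUT_WAIT_FOREVER
ANY_JOB = "DRMAA_JOB_IDS_SESSION_ANY"   # drmaa.Session.JOB_IDS_SESSION_ANY


def run_throttled(session, command, arg_sets, max_running):
    """Run command once per argument list with at most max_running jobs submitted.

    Returns (argument list, exit status or None) pairs in completion order.
    """
    todo = deque(arg_sets)
    running = {}    # job id -> argument list
    results = []
    while todo or running:
        # Top up the pool, like fork() in a local fork/join loop.
        while todo and len(running) < max_running:
            args = todo.popleft()
            jt = session.createJobTemplate()
            jt.remoteCommand = command
            jt.args = list(args)
            try:
                running[session.runJob(jt)] = args
            finally:
                session.deleteJobTemplate(jt)
        # Block until any job of this session finishes, like waitpid(-1, ...).
        info = session.wait(ANY_JOB, WAIT_FOREVER)
        args = running.pop(info.jobId)
        results.append((args, info.exitStatus if info.hasExited else None))
    return results
```

An exit status of None here means the job did not terminate normally, for
example because it was killed by a signal or aborted.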

> The application is running the same calculation (which is done via an 
> external program that may run for hours at a time) for different sets of 
> input parameters and after all jobs have completed, assembling the results. 
> Actually it would be neat if results were assembled as each piece completed 
> so I could get a partial picture along the way.

Should be doable. Under

    http://gridengine.sunsource.net/source/browse/gridengine/source/libs/japi/test_drmaa_issue1832.c?view=markup

there is a sample that illustrates how DRMAA can be used to implement a kind
of progress bar for compound jobs.
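
The same idea can be sketched in Python. This is not a port of the C sample
above, just an illustration that assumes the drmaa-python Session.wait() call;
completed_jobs and progress_line are hypothetical helper names. Each time a
job finishes, the caller gets a chance to assemble that piece of the result
and to print how far along things are.

```python
WAIT_FOREVER = -1                       # drmaa.Session.TIMEOUT_WAIT_FOREVER
ANY_JOB = "DRMAA_JOB_IDS_SESSION_ANY"   # drmaa.Session.JOB_IDS_SESSION_ANY


def completed_jobs(session, job_ids):
    """Yield (done_count, total, job_info) as each job in job_ids finishes."""
    remaining = set(job_ids)
    total = len(remaining)
    while remaining:
        info = session.wait(ANY_JOB, WAIT_FOREVER)
        if info.jobId in remaining:     # skip jobs that are not ours
            remaining.discard(info.jobId)
            yield total - len(remaining), total, info


def progress_line(done, total, width=20):
    """Render a plain-text progress bar for done out of total jobs."""
    filled = width * done // total
    return "[" + "#" * filled + "." * (width - filled) + "] %d/%d" % (done, total)
```

A caller would loop over completed_jobs(session, ids), merge the output of
info.jobId into the partial result, and print progress_line(done, total).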

Regards,
Andreas

http://gridengine.info/

Sitz der Gesellschaft: Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering
