[GE users] debugging tight integration
harald.pollinger at sun.com
Thu Dec 17 17:07:08 GMT 2009
> pollinger <harald.pollinger at sun.com> writes:
>> I don't know if this was already answered, but when a process was
>> started by SGE and is not a child of the shepherd while it's running,
>> then it has detached itself from the shepherd.
>> What's the parent process ID of these processes?
> They (`gamess.64.x' in the pstree below) are just children of init:
> | `-hald-addon-inpu
> | `-qmgr
> [I realize there's junk running as I haven't properly purged our
> so-called integrator's setup yet.]
> I don't see anything odd in the source from a quick look, and was
> particularly interested in typical things that might cause this from
> experience, in the hope of avoiding all the work of debugging it
So the process chain from the sge_execd to the qrsh_starter is fine, but
the job itself (gamess.64.x) is not a child of the qrsh_starter, but a
child of init. And I'm missing a shell at the end of the process chain.
Did you specify the "-shell no" option to qrsh?
It seems either the job script exited/died or gamess daemonized itself.
But then I'm wondering why the qrsh_starter doesn't quit.
You could replace gamess by a script like this:
and start it with exactly the same command line. If it works fine and is
a child (or a child of a child) of the qrsh_starter, gamess itself does
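The script suggested above is not preserved in this copy of the message. As a minimal sketch of what such a stand-in could look like (not the original script), the following records its own PID, its parent PID and its arguments, then stays in the foreground so the process tree can be inspected. The names STANDIN_LOG, STANDIN_HOLD, describe_self and the log path are assumptions made for illustration, as is the idea of installing it under the same name as the real binary.

```shell
#!/bin/sh
# Hypothetical stand-in for gamess.64.x (not the script from the original mail).
# Assumption: it is installed under the same name and path as the real binary,
# so the job starts it with exactly the same command line.

LOG="${STANDIN_LOG:-/tmp/gamess-standin.$$.log}"   # assumed log location
HOLD="${STANDIN_HOLD:-600}"                        # seconds to stay alive

# Print who we are, who started us, and the arguments we were given.
describe_self() {
    echo "host=$(uname -n) pid=$$ ppid=$PPID"
    echo "args=$*"
}

main() {
    describe_self "$@" >> "$LOG"
    # Record the parent's own entry; it should be a shell or the
    # qrsh_starter, not init (PID 1).
    ps -o pid= -o ppid= -o comm= -p "$PPID" >> "$LOG" 2>&1
    # Stay in the foreground (no fork, no setsid) so pstree can be checked.
    sleep "$HOLD"
}

# Run only when invoked under the stand-in's name; sourcing the file or
# running it under another name just defines the functions.
case "${0##*/}" in
    gamess*) main "$@" ;;
esac
```

While it sleeps, pstree on the execution host should show it below the qrsh_starter. If it does, and the real gamess.64.x started the same way ends up under init, the difference lies in what gamess does after startup.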
Sun Microsystems GmbH Harald Pollinger
Dr.-Leo-Ritter-Str. 7 Sun Grid Engine Engineering
D-93049 Regensburg Phone: +49 (0)941 3075-209 (x60209)
Germany Fax: +49 (0)941 3075-222 (x60222)
mailto:harald.pollinger at sun.com