[GE users] Questions about log file: $SGE_ROOT/default/spool/qmaster
reuti at staff.uni-marburg.de
Tue May 31 19:08:05 BST 2005
Viktor Oudovenko wrote:
>>- There is no need for using the $SGE_ROOT/mpi/mpirun any longer. The
>>default $SGE_ROOT/mpi should handle this as well.
> Could you explain this better?
> I use only mpistart.sh and mpistop.sh from $SGE_ROOT/mpi, and of course I
> used mpich.template to create my own.
Typo: I meant $SGE_ROOT/mpi/myrinet. The only thing I know (was told)
is that in former times you had to reserve a port on the card, because
they were very limited; today they provide enough
>>It really seems that in some versions the "exec", which I
>>put into the Perl script, is already included. Please have a look at
>>"Tight Integration of MPICH and SGE" for explanations.
> I'll take a look at http://gridengine.sunsource.net/, right?
Yes, in detail:
>>- info2.txt looks good. It should be possible to stop this with the
>>built-in terminate_method. Will the jobs just continue to run if you
>>don't set SIGTERM? Are they bound to the shepherd, or do they have a
>>ppid of 1?
> All the processes are bound to the shepherd. At least they do not have
> parent pid = 1.
> Processes could be killed with the NONE option, but it was not always clean.
> With SIGTERM it works 100%.
> For now I decided to keep it as it is.
This way, I think, you invoke the MPICH built-in shutdown, like
pressing CTRL-C in interactive mode, so the processes shut down on their own.
But this may not work under all circumstances.
With it set to NONE, it has to work if the jobs are tightly integrated.
Besides still being children of the shepherd, they must also have the
same process group as the first child of qrsh_starter. Otherwise the kill
of the process group by SGE will not kill all children, but only the
first, and the survivors will then get a ppid of 1.
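
To illustrate the process group point, here is a minimal Python sketch. It is only a model, not SGE code: the Proc record, the function kill_process_group and all pids are hypothetical, and it assumes orphans are re-parented to pid 1 (on some systems a subreaper may adopt them instead). It shows which processes a kill of one process group would miss.

```python
from typing import List, NamedTuple


class Proc(NamedTuple):
    """Hypothetical process record: pid, parent pid, process group id."""
    pid: int
    ppid: int
    pgid: int


def kill_process_group(procs: List[Proc], target_pgid: int) -> List[Proc]:
    """Model a kill of one process group.

    Every process whose pgid equals target_pgid is removed. A survivor
    whose parent was removed is re-parented to pid 1 (assumption: no
    subreaper is involved).
    """
    killed = {p.pid for p in procs if p.pgid == target_pgid}
    survivors = []
    for p in procs:
        if p.pid in killed:
            continue
        new_ppid = 1 if p.ppid in killed else p.ppid
        survivors.append(p._replace(ppid=new_ppid))
    return survivors


# Made-up example tree:
#   100  shepherd (own group)
#   101  first child of qrsh_starter, leader of group 101
#   102  MPI worker, same group as 101
#   103  MPI worker that moved into its own group 103
EXAMPLE = [
    Proc(pid=100, ppid=50, pgid=100),
    Proc(pid=101, ppid=100, pgid=101),
    Proc(pid=102, ppid=101, pgid=101),
    Proc(pid=103, ppid=101, pgid=103),
]

if __name__ == "__main__":
    for p in kill_process_group(EXAMPLE, target_pgid=101):
        print(p)
```

In this model, killing group 101 removes 101 and 102, while 103 survives with a ppid of 1. On a real node you could compare the PGID column of something like "ps -o pid,ppid,pgid,args" for the job's processes to check whether they all share one group.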
Best greetings - Reuti