[GE users] Questions about log file: $SGE_ROOT/default/spool/qmaster

Reuti reuti at staff.uni-marburg.de
Tue May 31 19:08:05 BST 2005


Hi Viktor!

Viktor Oudovenko wrote:
<snip>
>>- There is no need for using the $SGE_ROOT/mpi/mpirun any longer. The
>>default $SGE_ROOT/mpi should handle this as well.
> 
> 
> Could you explain this better? 
> I use only mpistart.sh and mpistop.sh from $SGE_ROOT/mpi, and of course I
> used mpich.template to create my own.

Typo: I meant $SGE_ROOT/mpi/myrinet. The only thing I know (I was told 
this) is that in former times you had to reserve a port on the card, 
because ports were very limited; today they provide enough.

>>It really seems that in some versions the "exec", which I suggested to
>>put into the Perl script, is already in. Please have a look at the Howto
>>"Tight Integration of MPICH and SGE" for explanations.
> 
> I'll take a look at http://gridengine.sunsource.net/, right? 

Yes, in detail:

http://gridengine.sunsource.net/howto/mpich-integration.html

>>- info2.txt looks good. It should be possible to stop this with the
>>default built-in terminate_method. Do the jobs just continue to run if
>>you don't set SIGTERM? Are they bound to the shepherd, or do they have
>>a ppid of 1?
> 
> 
> All the processes are bound to the shepherd. At least they do not have
> a parent pid of 1.
> The processes could be killed with the NONE option, but it was not always
> clean. With SIGTERM it works 100%.
> For now I have decided to keep it as it is.

This way I think you will invoke the MPICH built-in shutdown, like 
pressing CTRL-C in interactive mode, so the processes will shut down on 
their own. But it may happen that this does not work under all 
circumstances. With it set to NONE, it has to work if the jobs are 
tightly integrated.

Besides still being children of the shepherd, they must also have the 
same process group as the first child of qrsh_starter. Otherwise the kill 
of the process group by SGE will not kill all children but only the 
first, and the survivors will then get a ppid of 1.
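
One way to check this on a node is to compare the PGID column of 
`ps -o pid,ppid,pgid,comm` for the job's processes. The Python sketch below 
parses such a listing and reports which processes sit outside the job's 
process group. The sample listing and all PIDs in it are invented for 
illustration, and parse_ps and classify are hypothetical helper names.

```python
def parse_ps(text):
    """Parse `ps -o pid,ppid,pgid,comm` style output into a list of dicts."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        pid, ppid, pgid, comm = line.split(None, 3)
        rows.append({"pid": int(pid), "ppid": int(ppid),
                     "pgid": int(pgid), "comm": comm})
    return rows


def classify(rows, job_pgid):
    """Return (pids in the job's process group, pids outside it, pids with ppid 1)."""
    in_group = [r["pid"] for r in rows if r["pgid"] == job_pgid]
    escaped = [r["pid"] for r in rows if r["pgid"] != job_pgid]
    orphans = [r["pid"] for r in rows if r["ppid"] == 1]
    return in_group, escaped, orphans


# Invented sample: 4103 has started its own process group.
SAMPLE = """
  PID  PPID  PGID COMMAND
 4101  4100  4101 mpirun
 4102  4101  4101 a.out
 4103  4101  4103 a.out
"""

in_group, escaped, orphans = classify(parse_ps(SAMPLE), job_pgid=4101)
print("in job process group:", in_group)
print("outside job process group:", escaped)
print("already reparented to pid 1:", orphans)
```

In this invented listing, a kill of process group 4101 would not reach 
4103; once its parent is gone, it would show up with a ppid of 1.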

Best greetings - Reuti

