[GE users] ge6 mpi job startup trouble

Jason Crane Jason.Crane at mrsc.ucsf.edu
Wed Nov 3 00:14:38 GMT 2004


Hi,  

Yes, this is apparently just a hostname resolution problem.  
'hostname' returns the FQDN on our nodes, though the pe_hostfile 
contains only short names.  David S. suggested making the 
following change to PeHostfile2MachineFile() in startmpi.sh so 
that FQDNs are written to the machine file:

<   #host=`echo $line|cut -f1 -d" "|cut -f1 -d"."`
<    host=`echo $line|cut -f1 -d" "`

Our pe_hostfile contains only short names to begin with, 
though, so appending the domain where the host name is echoed 
in the same function seems to resolve the problem as well:
   
   #echo $host
   echo $host.domain.org

This looks to be essentially the same as the solution you 
suggest for MPI_HOST in mpich.args.  In the near future we 
will be spanning multiple domains, so I'll need to devise a 
solution other than hard-coding a domain name.  This works 
for now, though.
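
A rough sketch of one way to avoid the hard-coded domain: ask the 
resolver for each host's canonical name and fall back to the name 
from the pe_hostfile if the lookup returns nothing.  The function 
names resolve_fqdn and pe_hostfile_to_machinefile are made up for 
illustration, and the sketch assumes that getent is available and 
that the resolver reports the FQDN as the canonical name (second 
field of the getent output) on every node.

```shell
#!/bin/sh
# Hypothetical helper: print the canonical name the resolver reports for
# the given host, or the name unchanged if the lookup yields nothing.
resolve_fqdn() {
    fqdn=`getent hosts "$1" 2>/dev/null | awk 'NR==1 {print $2}'`
    if [ -n "$fqdn" ]; then
        echo "$fqdn"
    else
        echo "$1"
    fi
}

# Hypothetical variant of PeHostfile2MachineFile(): reads pe_hostfile
# lines ("host nslots ...") on stdin and prints one line per slot.
pe_hostfile_to_machinefile() {
    while read host nslots rest; do
        # Skip empty lines.
        [ -n "$host" ] || continue
        name=`resolve_fqdn "$host"`
        i=1
        while [ "$i" -le "$nslots" ]; do
            echo "$name"
            i=`expr $i + 1`
        done
    done
}
```

It would be called as pe_hostfile_to_machinefile < $pe_hostfile, 
redirecting the output to the machine file.  Whether the names it 
prints match what 'hostname' returns on each node still depends on 
how /etc/hosts and DNS are set up there.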

Thanks for all the help!
-Jason
 
 

>From: Reuti <reuti at staff.uni-marburg.de>
>Hi again,
>
>> host1.domain.org 0 test_mpich
>> host2 1 test_mpich
>> host3 1 test_mpich
>> host4 1 test_mpich
>> host5 1 test_mpich
>> host6 1 test_mpich
>> host7 1 test_mpich
>> host2 1 test_mpich
>
>what's in /etc/hosts? We set only the hostname without any domain and it's 
>working this way for us.
>
>MPICH does the following: use the `hostname` and skip one of these entries 
>during the first scan of the machinefile. What is `hostname` on host1? It seems 
>to be completely skipped during the further scan (hence gives you a second 
>host2).
>
>If necessary, you can also tell MPICH that it's running on node1 (without the 
>domain) by inserting one line at the beginning of mpich.args like:
>
>MPI_HOST=`hostname | sed "s/.domain.org//"`
>
>
>Cheers - Reuti
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>For additional commands, e-mail: users-help at gridengine.sunsource.net
>


