[GE users] pe_hostfile contains wrong hosts

John Hearns john.hearns at streamline-computing.com
Fri Mar 23 08:41:29 GMT 2007



mac muff wrote:
> I submitted an MPICH job with a machine file that lists only a subset of
> my available compute nodes. mpirun runs the job properly on that list of
> nodes, but the PE_HOSTFILE created by Grid Engine wrongly lists other
> nodes, which are not in my machine file, as being used. qstat also wrongly
> shows the job in the queues named in the generated PE_HOSTFILE.
> 

I think you have it the wrong way round.
SGE allocates slots (and therefore nodes) for your parallel job.
It assembles a machines file for you: the file that $PE_HOSTFILE points to.
You then pass that host list to mpirun.
If you choose your own set of hosts instead, SGE has no way of knowing about them.
It keeps reporting the hosts it allocated, which is what you see in PE_HOSTFILE and qstat.
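
As a rough sketch of going from the SGE allocation to something mpirun can
read: the snippet below assumes each line of the PE hostfile has the form
"hostname slots queue processor-range" and expands it to one hostname per
slot. The function name pe_hostfile_to_machines and the node names in the
usage note are made up for illustration; this is not the script SGE ships.

```python
import os


def pe_hostfile_to_machines(text):
    """Expand PE hostfile content into one hostname per allocated slot.

    Assumes each line looks like: hostname slots queue processor-range
    """
    machines = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, slots = fields[0], int(fields[1])
        machines.extend([host] * slots)
    return machines


if __name__ == "__main__":
    # Inside a parallel job, SGE sets PE_HOSTFILE to the path of the file.
    path = os.environ.get("PE_HOSTFILE")
    if path:
        with open(path) as fh:
            print("\n".join(pe_hostfile_to_machines(fh.read())))
```

For a hostfile containing "node01 2 all.q@node01 UNDEFINED" and
"node02 1 all.q@node02 UNDEFINED" (hypothetical hosts), this gives node01
twice and node02 once. The output could be written to a file and handed to
mpirun with -machinefile, with -np set from $NSLOTS, so that the job runs on
exactly the hosts SGE reports.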
