[GE users] MPI problems persist

heine heine at sun.ac.za
Mon Nov 15 06:04:34 GMT 2010


Reuti,

No, the firewall does not restrict traffic on the 'private' network, and I do not have SELinux enabled. As I already mentioned, I can run mpiexec successfully with any program using -np (x) and --hostfile (xxxx). It only fails when I submit the job to Grid Engine. I can also run it successfully in a test configuration using Torque.
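
One way to narrow this down is to rerun the manual mpiexec test with exactly the hosts Grid Engine granted to the job. The following is a minimal sketch, assuming the usual $PE_HOSTFILE layout (host name, slot count, queue, processor range, one host per line) and the Open MPI "host slots=N" hostfile syntax. The function name pe_hostfile_to_openmpi is hypothetical and only for illustration.

```python
def pe_hostfile_to_openmpi(pe_text):
    """Convert the text of a Grid Engine $PE_HOSTFILE into Open MPI hostfile text.

    Assumes each input line starts with "<host> <slots> ..."; any further
    columns (queue name, processor range) are ignored.
    """
    lines = []
    for raw in pe_text.splitlines():
        fields = raw.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, slots = fields[0], int(fields[1])
        lines.append(f"{host} slots={slots}")
    # End with a newline only when there is something to write.
    return ("\n".join(lines) + "\n") if lines else ""
```

Inside a job script, the file named by the PE_HOSTFILE environment variable could be read, passed through this function, and the result saved for a later manual run with --hostfile outside Grid Engine.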

Thank you
Heine

On Sun, 2010-11-14 at 17:21 +0200, reuti wrote:


On 10.11.2010 at 07:09, heine wrote:

> <snip>
> The statement is 'This may be because the daemon was unable to find all the needed shared libraries on the remote node.' I guess that could have been the cause, but the failure of something as simple as the hostname command seems to point to something else?

Possibly. Did you move your Open MPI installation after you compiled it to a different location?
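
A quick way to test the shared-library theory is to run ldd against the Open MPI daemon binary from inside a Grid Engine job on the remote node and look for unresolved entries. The following is a minimal sketch, assuming the standard ldd output format in which an unresolved library is printed as "name => not found". The function name missing_libraries is hypothetical.

```python
def missing_libraries(ldd_output):
    """Return the names of libraries that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # The library name is everything before the "=>" arrow.
            missing.append(line.split("=>")[0].strip())
    return missing
```

The input would be the captured output of ldd on the orted binary (assuming the daemon is installed under that name), collected within the job's own environment rather than an interactive login shell, since the two can differ in LD_LIBRARY_PATH.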

Do you have any firewall on the machines or SELinux enabled?

-- Reuti


>>> location of the shared libraries on the remote nodes and this will
>>> automatically be forwarded to the remote nodes.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> mpirun noticed that the job aborted, but has no info as to the process
>>> that caused that situation.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> mpirun was unable to cleanly terminate the daemons on the nodes shown
>>> below. Additional manual cleanup may be required - please refer to
>>> the "orte-clean" tool for assistance.
>>> --------------------------------------------------------------------------
>>>         comp019 - daemon did not report back when launched
>>>         comp017 - daemon did not report back when launched
>>>
>>> Thanks
>>> Heine
>>>
>>>
>
>
> --
>
> Heine de Jager  * Stelsel Administrateur * Universiteit Stellenbosch * Tel: 021 808 4989


--

Heine de Jager  * Stelsel Administrateur * Universiteit Stellenbosch * Tel: 021 808 4989


