[GE users] OpenMPI job on stay on one node

sgexav xaviercouvelard at gmail.com
Mon Sep 7 14:18:34 BST 2009



reuti wrote:
> On 07.09.2009 at 13:32, sgexav wrote:
>
>   
>>> <snip>
>>> as Lydia wrote: you don't need this argument, just leave the option
>>> -machinefile ... out. Open MPI will detect the granted nodes on its
>>> own from the original pe_hostfile. The $TMPDIR/machines file would be
>>> created by the start_proc_args for other MPI libraries, but they can
>>> be left out here, so the file won't be created.
>>>
>>>       
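To see which nodes were actually granted, the pe_hostfile mentioned above can be summarised from inside the job script. This is only a minimal sketch: the function name summarize_pe_hostfile is made up for illustration, and it assumes the usual pe_hostfile layout of one line per entry with host name, slot count, queue instance and processor range.

```shell
#!/bin/sh
# Hypothetical helper: summarise an SGE pe_hostfile.
# Assumed line layout: <hostname> <slots> <queue instance> <processor range>
summarize_pe_hostfile() {
    awk '{ printf "%s %d\n", $1, $2; total += $2 }
         END { printf "total %d slots in %d entries\n", total, NR }' "$1"
}

# Inside a running parallel job SGE sets PE_HOSTFILE; outside a job this does nothing.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    summarize_pe_hostfile "$PE_HOSTFILE"
fi
```

If only one host shows up in the summary, the job was granted slots on a single node, which is a scheduling matter rather than something mpirun decided.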
>> OK, doing it that way, with "pe orte" and without -machinefile in the
>> mpirun command, I see my run starting on the nodes, but I get this
>> error:
>> error: error: ending connection before all data received
>> error:
>> error reading job context from "qlogin_starter"
>> error: error: ending connection before all data received
>> error: error: ending connection before all data received
>> error:
>> error reading job context from "qlogin_starter"
>> error: error: ending connection before all data received
>> error:
>> error reading job context from "qlogin_starter"
>> error: error: ending connection before all data received
>> error:
>> error reading job context from "qlogin_starter"
>> error: error: ending connection before all data received
>> error:
>> error reading job context from "qlogin_starter"
>> error:
>> error reading job context from "qlogin_starter"
>> --------------------------------------------------------------------------
>> A daemon (pid 11082) died unexpectedly with status 1 while attempting
>> to launch so we are aborting.
>>
>> What does it mean?
>>     
>
> Did you redefine the following settings? (Shown here is the 6.2u3 setup
> with the builtin method; in former versions it was different.)
>
> $ qconf -sconf
> #global:
> ...
> qlogin_command               builtin
> qlogin_daemon                builtin
> rlogin_command               builtin
> rlogin_daemon                builtin
> rsh_command                  builtin
> rsh_daemon                   builtin
>
> -- Reuti
>
>   
I am using 6.2u2, as delivered with Rocks 5.2.
qconf -sconf gave:

qlogin_command               builtin
qlogin_daemon                builtin
rlogin_command               builtin
rlogin_daemon                builtin
rsh_command                  builtin
rsh_daemon                   builtin
but also:
qrsh_command                 /usr/bin/ssh
rsh_command                  /usr/bin/ssh
rlogin_command               /usr/bin/ssh

I tried to remove those last 3 lines, without any success.
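A quick way to spot parameters that are listed more than once, such as rsh_command and rlogin_command in the output above, is to filter the qconf output. The following is just a sketch: find_duplicate_params is a made-up helper name, and it assumes every setting is printed as a name followed by its value on one line. It does not tell which of the values SGE actually uses.

```shell
#!/bin/sh
# Hypothetical helper: list parameters that occur more than once in
# "qconf -sconf"-style output read from stdin ("<name> <value>" per line).
find_duplicate_params() {
    awk '!/^#/ && NF >= 2 {
             val = $0
             sub(/^[ \t]*[^ \t]+[ \t]+/, "", val)   # drop the parameter name
             sub(/[ \t]+$/, "", val)                # drop trailing blanks
             count[$1]++
             values[$1] = values[$1] " [" val "]"
         }
         END {
             for (name in count)
                 if (count[name] > 1)
                     print name ":" values[name]
         }' | sort
}

# Usage on the cluster (needs qconf in PATH):
#   qconf -sconf | find_duplicate_params
```

Each output line names one repeated parameter followed by all the values found for it, in the order they were printed.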
X.
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=216240
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=216250
