[GE users] again on PVM and integration with sge

davide cittaro daweonline at gmail.com
Mon Apr 10 13:17:01 BST 2006


On 4/10/06, Reuti <reuti at staff.uni-marburg.de> wrote:
> To the other mail: as I wrote, I meant of course an execution host
> when I mentioned the "headnode of your parallel job".

Then it happened because pvmd started on the head node but then failed
on the slave: PVM_ROOT was not passed. Maybe I misunderstood why you
were confused :-)
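
To illustrate why the quoted log below complains about "/lib/pvmd": as
far as I know, the slave pvmd is looked up under $PVM_ROOT/lib/pvmd by
default, so an empty PVM_ROOT collapses the path. A minimal sketch of
that expansion (the helper name pvmd_path is made up for this example
and is not part of PVM or SGE):

```shell
# Hypothetical helper: print the path a shell would build for pvmd
# from the current value of PVM_ROOT (assumed default: $PVM_ROOT/lib/pvmd).
pvmd_path() {
    printf '%s\n' "${PVM_ROOT:-}/lib/pvmd"
}

# With PVM_ROOT empty, the prefix disappears entirely.
PVM_ROOT= pvmd_path

# With PVM_ROOT set to the value from this thread, the full path appears.
PVM_ROOT=/usr/share/pvm3 pvmd_path
```

Running the same kind of check through the non-interactive remote shell
that starts pvmd on the slave should show whether the variable reaches
it at all.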

>
> On 10.04.2006 at 13:41, davide cittaro wrote:
>
> > whoops!
> >  cat /tmp/pvml.2486
> > [t80040000] 04/10 13:33:00 node1.sge.ifom-ieo-campus.it
> > (85.239.175.37:35452) LINUX64 3.4.5
> > [t80040000] 04/10 13:33:00 ready Mon Apr 10 13:33:00 2006
> >
> > [t80000000] 04/10 13:33:00 stderr at node2.sge.ifom-ieo-campus.it:
> > localshell: /lib/pvmd: No such file or directory
> > [t80000000] 04/10 13:33:00 stdout at node2.sge.ifom-ieo-campus.it: EOF
> > [t80000000] 04/10 13:33:00 stderr at node2.sge.ifom-ieo-campus.it:
> > localshell: line 0: fg: no job control
> > [t80000000] 04/10 13:33:00 stdout at node2.sge.ifom-ieo-campus.it: EOF
> > [t80040000] 04/10 13:33:00 startack() host
> > node2.sge.ifom-ieo-campus.it expected version, got "PvmCantStart"
> > [t80040000] 04/10 13:34:00 dm_halt() from
> > (node1.sge.ifom-ieo-campus.it), halting...
> > [t80040000] 04/10 13:34:00 work() pvmd halting
> > [t80040000] 04/10 13:34:00 pvmbailout(0)
> >
> >
> > it seems that PVM_ROOT is not passed to the PE, even if
>
> With a loose integration, this must be defined in a file that is
> sourced during a non-interactive login, e.g. .bashrc.
>

Does this mean that in a tight integration (which I would like to have)
this is no longer necessary?
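
For the loose-integration case described above, a minimal sketch of
what could go into ~/.bashrc on every execution host. The path
/usr/share/pvm3 is the value from my own environment in this thread;
other installations will differ. I am assuming the lines sit near the
top of the file, before any "return if not interactive" guard that
some distributions ship, otherwise a non-interactive login would never
reach them.

```shell
# Sketch for ~/.bashrc: make PVM_ROOT available to non-interactive shells.
# Only set it when it is not already defined, so a value handed in by the
# parallel environment is not overwritten.
if [ -z "${PVM_ROOT:-}" ]; then
    PVM_ROOT=/usr/share/pvm3   # assumption: adjust to the local PVM install
    export PVM_ROOT
fi
```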

> Inside the PE start_proc_args, it is set to the value of the third
> parameter.

Yeah, but then why was it not passed? Is it true that startpvm.sh is
launched on only one node and that PVM then takes care of starting the
other nodes?

>
> HTH - Reuti
>
> BTW: is your preferred shell csh? Maybe it's also advisable to change
> to "shell /bin/bash" and "shell_start_mode unix_behavior" in your queue
> definition.

This is a long story. My shell is a symlink to bash, but it is named
"localshell" (and it is defined in LDAP...). Of course I can execute
and use /bin/bash.

Thanks again.

d

>
>
> > $ echo $PVM_ROOT
> > /usr/share/pvm3
> >
> > Mmm...
> >
> > d
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> >
>
>


--
dawe
http://dawe.ilbello.com
---
"Prediction is very difficult, especially if it's about the future." -
Niels Bohr
