[GE users] PVM tight integration

Reuti reuti at staff.uni-marburg.de
Tue Apr 11 13:51:43 BST 2006


Below...

On 11.04.2006 at 14:48, davide cittaro wrote:

> I can remove it, but then it happens that a program I will use (hmmer)
> can't find another executable (hmmersearch-pvm).
>
> Let's see...
> [10 minutes later]
>
> No, it doesn't work. It seems to me that PVM_DPATH cannot be
> passed correctly... slave nodes cannot start pvm. How can I solve this
> if even "env" in the rsh wrapper doesn't work?
>
> d
>
> On 4/11/06, Reuti <reuti at staff.uni-marburg.de> wrote:
>> Could it be that
>>
>> -ep /usr/bin
>>
>> bypasses the rsh-wrapper? Can you remove it?
>>
>> ...more below
>>
>> On 11.04.2006 at 14:03, davide cittaro wrote:
>>
>>> On 4/11/06, Reuti <reuti at staff.uni-marburg.de> wrote:
>>>> I put everything in .bashrc, adjusted to where my files are located:
>>>>
>>>> export PVM_ROOT=$HOME/pvm3
>>>> export PATH=$PVM_ROOT/bin/LINUX:$PVM_ROOT/lib/LINUX:$PATH
>>>
>>> I'm going to try.
>>>
>>>>
>>>> What is the output in the .po and .pe files? Are the started pvmds
>>>> counted correctly?
>>>
>>> $ cat tester_tight.sh.po39641
>>> -ep /usr/bin -catch_rsh
>>> /opt/n1ge6/omix/spool/node6/active_jobs/39641.1/pe_hostfile
>>> node6.sge.ifom-ieo-campus.it /usr/share/pvm3
>>> /tmp/pvmtmp017402.0
>>> startpvm.sh: startup failed - invoking cleanup script
>>> -catch_rsh /opt/n1ge6/omix/spool/node6/active_jobs/39641.1/pe_hostfile
>>> node6.sge.ifom-ieo-campus.it
>>> -catch_rsh /opt/n1ge6/omix/spool/node6/active_jobs/39641.1/pe_hostfile
>>> node6.sge.ifom-ieo-campus.it
>>>
>>> $ cat tester_tight.sh.pe39641
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399]: pvm_mytid(): Can't contact local daemon
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399]: pvm_mytid(): Can't contact local daemon
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399]: pvm_mytid(): Can't contact local daemon
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399]: pvm_mytid(): Can't contact local daemon
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399]: pvm_mytid(): Can't contact local daemon
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399]: pvm_mytid(): Can't contact local daemon
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399]: pvm_mytid(): Can't contact local daemon
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399]: pvm_mytid(): Can't contact local daemon
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399]: pvm_mytid(): Can't contact local daemon
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17399]: pvm_mytid(): Can't contact local daemon
>>> start_pvm: Couldn't enroll to pvm
>>> libpvm [pid17430] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17430] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17430]: pvm_halt(): Can't contact local daemon
>>> rm: cannot remove `/tmp/39641.1.bofh.q/hostfile': No such file or
>>> directory
>>> rm: cannot remove `/tmp/39641.1.bofh.q/rsh': No such file or  
>>> directory
>>
>> So, here is the problem: why are the files "hostfile" and (the link)
>> "rsh" not created in the intended location? Something is not working
>> with your start_proc_args. Are you setting TMPDIR by hand in any of
>> your login scripts, overriding the value SGE sets? - Reuti

Did you also check this ^^^ ? - Reuti
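
One way to check this, as a rough sketch only: scan the usual login files for lines that assign TMPDIR. The function name and the list of files are assumptions for illustration; adjust them to the shells actually used on the cluster. Note that the pattern also matches variables whose name merely ends in TMPDIR.

```shell
#!/bin/sh
# Sketch: print every non-comment line that assigns TMPDIR in the given files.
# find_tmpdir_overrides is a hypothetical helper name, not part of SGE or PVM.
find_tmpdir_overrides() {
    for f in "$@"; do
        # Skip files that do not exist or cannot be read
        [ -r "$f" ] || continue
        # Match "TMPDIR=..." (sh/bash) and "setenv TMPDIR ..." (csh/tcsh),
        # ignoring lines where a '#' appears before the assignment.
        # /dev/null is added so grep always prints the file name.
        grep -n -E '^[^#]*(TMPDIR=|setenv[[:space:]]+TMPDIR)' "$f" /dev/null
    done
}

# Assumed set of login files; extend as needed.
find_tmpdir_overrides "$HOME/.bashrc" "$HOME/.bash_profile" "$HOME/.profile" \
                      "$HOME/.cshrc" "$HOME/.login" /etc/profile
```

Any line it prints is a candidate for overriding the per-job directory (the /tmp/39641.1.bofh.q seen in the .pe output above). Putting echo "$TMPDIR" and ls -l "$TMPDIR" at the top of the job script would additionally show what the job itself sees.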

>>
>>> libpvm [pid17435] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17435] /tmp/39641.1.bofh.q/pvmd.2486: No such file or
>>> directory
>>> libpvm [pid17435]: pvm_halt(): Can't contact local daemon
>>>
>>>
>>>>
>>>>> but still the headnode can't start slavenodes...
>>>>> Also, if I delete a job with qdel, pvm doesn't stop on the head
>>>>> node,
>>>>> and I have to enter the pvm console and halt it manually.
>>>>> Any hint on this?
>>>>>
>>>>
>>>> This shouldn't work unless you also set PVM_TMP by hand before
>>>> entering the console, as the special files are created inside
>>>> PVM_TMP. Where are they placed on your cluster?
>>>
>>> the default is in /tmp (pvmd.ID, pvml.ID and the socket)
>>>
>>> d
>>>
>>>>
>>>> -- Reuti
>>>>
>>>>
>>>>> Thanks again
>>>>>
>>>>> --
>>>>> dawe
>>>>> http://dawe.ilbello.com
>>>>> ---
>>>>> "Prediction is very difficult, especially if it's about the
>>>>> future." -
>>>>> Niels Bohr
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>>>




