[GE users] Tight integration with PVM

JONATHAN SELANDER S026655 at utb.hb.se
Fri Apr 15 13:48:45 BST 2005


I don't have any files like that in SGE_ROOT or in TMPDIR.
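
For reference, Grid Engine names the PE start/stop output of a parallel job <jobname>.po<jobid> (stdout) and <jobname>.pe<jobid> (stderr). Job output normally lands in the job owner's home directory, or in the submit directory when -cwd is used, so those may be better places to look than SGE_ROOT. A minimal sketch of such a search follows; find_pe_output is a hypothetical helper name, not an SGE command, and it only looks at the top level of each directory given.

```shell
#!/bin/sh
# Sketch: list the PE start/stop output files of one job.
# find_pe_output is a hypothetical helper, not part of SGE.
# Usage: find_pe_output JOB_ID DIR...
find_pe_output() {
    job_id=$1
    shift
    for dir in "$@"; do
        # Skip arguments that are not existing directories.
        [ -d "$dir" ] || continue
        # <jobname>.po<jobid> is PE stdout, <jobname>.pe<jobid> is PE stderr.
        for f in "$dir"/*.po"$job_id" "$dir"/*.pe"$job_id"; do
            # An unmatched glob stays literal, so test that the file exists.
            if [ -f "$f" ]; then
                echo "$f"
            fi
        done
    done
}
```

For job 102 from this thread, a call could look like: find_pe_output 102 "$HOME" /opt/sge/tmp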

---

# cat tester_tight.sh
#!/bin/sh

export PVM_TMP=/opt/sge/tmp

./hello

exit 0

---

# ls -ld /opt/sge/tmp
drwxrwxrwt   2 root     root         512 Apr 15 13:21 /opt/sge/tmp

---

-----Original Message-----
From: Reuti <reuti at staff.uni-marburg.de>
To: users at gridengine.sunsource.net
Date: Fri, 15 Apr 2005 14:44:01 +0200
Subject: Re: [GE users] Tight integration with PVM

Is there anything in the .po or .pe files, or don't they exist at all?

JONATHAN SELANDER wrote:
> Adding the PE to a queue fixed that error message. However, one node seems to fail each time I run the job (its queue shows state E in qstat -f), and it is not the same node each time.
> 
> ---
> 
> # tail -2 /opt/sge/default/spool/brasnod-2/messages
> 04/15/2005 22:08:06|execd|brasnod-2|E|shepherd of job 102.1 exited with exit status = 10
> 04/15/2005 22:08:06|execd|brasnod-2|W|reaping job "102" ptf complains: Job does not exist
> 
> ---
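
The execd messages lines quoted above use the layout date time|component|host|level|text. A minimal sketch for pulling only the error (E) and warning (W) entries of one job out of such a file is below. job_messages is a hypothetical helper name; the sketch assumes an awk that supports -v (on Solaris that may mean nawk or /usr/xpg4/bin/awk) and that the text mentions the job as "job <id>", as in the two lines above.

```shell
#!/bin/sh
# Sketch: filter an execd messages file (read from stdin) for E/W
# entries that mention a given job id.
# job_messages is a hypothetical helper, not part of SGE.
# Usage: job_messages JOB_ID < messages
job_messages() {
    awk -F'|' -v job="$1" '
        # Field 4 is the level, field 5 the message text.
        # Match "job 102", "job 102.1" or job "102", but not job 1020.
        ($4 == "E" || $4 == "W") && $5 ~ ("job \"?" job "([^0-9]|$)") {
            print
        }'
}
```

Example: job_messages 102 < /opt/sge/default/spool/brasnod-2/messages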
> 
> # qstat -explain E
> queuename                      qtype used/tot. load_avg arch          states
> ----------------------------------------------------------------------------
> all.q@brasnod-2                BIP   0/1       0.02     sol-sparc64   E
>         queue all.q marked QERROR as result of job 102's failure at host brasnod-2
> ----------------------------------------------------------------------------
> all.q@brasnod-3                BIP   0/1       0.02     sol-sparc64
> ----------------------------------------------------------------------------
> all.q@brasnod-4                BIP   0/1       0.01     sol-sparc64
> 
> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>     102 0.55500 tester_tig root         qw    04/15/2005 14:08:35     3
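
Since the failing node changes from run to run, it can help to list the queue instances currently in error state before resubmitting. The sketch below reads qstat -f output on stdin and prints every queue instance whose states column contains E. error_queues is a hypothetical helper name, and the sketch assumes the six-column layout shown above with queue instances written as queue@host.

```shell
#!/bin/sh
# Sketch: print queue instances in error state from `qstat -f` output.
# error_queues is a hypothetical helper, not part of SGE.
# Usage: qstat -f | error_queues
error_queues() {
    # Queue lines have 6 fields only when the states column is non-empty;
    # the header, separator, explanation and pending-job lines lack the
    # "@" in field 1 or have a different field count.
    awk 'NF == 6 && $1 ~ /@/ && $6 ~ /E/ { print $1 }'
}
```

Once the cause of the failure is fixed, the error state of a listed queue instance can be cleared with qmod -c, for example: qmod -c all.q@brasnod-2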
> 
> 
> 
> 
> -----Original Message-----
> From: Reuti <reuti at staff.uni-marburg.de>
> To: users at gridengine.sunsource.net
> Date: Fri, 15 Apr 2005 14:02:23 +0200
> Subject: Re: [GE users] Tight integration with PVM
> 
> Hi,
> 
> did you add the PE to the queue definition (qconf -mq <queue>) like:
> 
> pe_list    pvm
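
To confirm that the PE really ended up in the queue definition, the pe_list line of qconf -sq output can be checked from a script. A minimal sketch follows; has_pe is a hypothetical helper name, it assumes the list is on a single pe_list line with entries separated by spaces or commas, and it needs an awk that supports -v (nawk or /usr/xpg4/bin/awk on Solaris).

```shell
#!/bin/sh
# Sketch: succeed if a PE name appears in the pe_list line of
# `qconf -sq <queue>` output read from stdin.
# has_pe is a hypothetical helper, not part of SGE.
# Usage: qconf -sq all.q | has_pe pvm
has_pe() {
    awk -v pe="$1" '
        $1 == "pe_list" {
            # Entries may be separated by blanks or commas.
            n = split($0, entry, /[ \t,]+/)
            for (i = 2; i <= n; i++)
                if (entry[i] == pe) found = 1
        }
        END { exit (found ? 0 : 1) }'
}
```

The PE can also be added without the editor via qconf -aattr queue pe_list pvm all.q, where all.q is the queue name used in this thread.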
> 
> CU - Reuti
> 
> 
> JONATHAN SELANDER wrote:
> 
>>I followed the howto at http://gridengine.sunsource.net/howto/pvm-integration/pvm-integration.html for setting up PVM integration with SGE, after compiling PVM 3 and building and installing the utilities in the SGE_ROOT/pvm directory (aimk and install.sh).
>>
>>However, when I try the example tester_tight.sh from the howto, I get these scheduling errors in the logs:
>>
>>---
>>
>>cannot run in queue instance "all.q@brasnod-2" because PE "pvm" is not in pe list
>>cannot run in queue instance "all.q@brasnod-4" because PE "pvm" is not in pe list
>>cannot run because resources requested are not available for parallel job
>>cannot run because available slots combined under PE "pvm" are not in range of job
>>
>>---
>>
>># qconf -sp pvm
>>pe_name           pvm
>>slots             100
>>user_lists        NONE
>>xuser_lists       NONE
>>start_proc_args   /opt/sge/pvm/startpvm.sh -catch_rsh $pe_hostfile $host \
>>                  /opt/sge/pvm
>>stop_proc_args    /opt/sge/pvm/stoppvm.sh -catch_rsh $pe_hostfile $host
>>allocation_rule   1
>>control_slaves    TRUE
>>job_is_first_task FALSE
>>urgency_slots     min
>>
>>---
>>
>>
>>What does this mean? brasnod-2, brasnod-3 and brasnod-4 are execution hosts that work correctly when I run ordinary jobs.
>>
>>J
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
> 
