[GE users] tight integration problem

Reuti reuti at staff.uni-marburg.de
Fri Jan 27 10:31:56 GMT 2006


Hi,

On 27.01.2006 at 11:11, Jean-Paul Minet wrote:

> <snip>
>
> Now, the qstat -j usage line is updated with proper values:
>
> lemaitre /home/pan/minet/abinit/parallel_eth # qstat -j 2488
> ...
> parallel environment:  mpich range: 2
> usage    1:                 cpu=00:10:08, mem=169.88418 GBs, io=0.00000, vmem=671.281M, maxvmem=671.309M
> scheduling info:            queue instance "all.q at lmexec-66" dropped because it is full
> ...
>
> but qstat -ext reports a wrong value:
>
> 2488 0.02271 0.02271 Test_abini root         NA                defaultdep r 0:00:06:17 105.69577 0.00000 11289     0     0    27 11261 0.01  all.q at lmexec-64                    2
>

If you look at all tasks:

$ qstat -g t -ext

it might give you more information. But for parallel jobs that often use
qrsh for one subtask after another, only the summed values are useful.

> Now, issuing a qdel on this running job properly stops the slave
> processes, but a defunct process remains on the master node:
>
> root      5699     1 99 09:14 ?        00:13:11 /home/pan/minet/abinit/parallel_eth/abinip_eth -p4pg /home/pan/minet/abinit/parallel_eth/PI5615 -p4wd /home
> root      5700  5699  0 09:14 ?        00:00:00 /home/pan/minet/abinit/parallel_eth/abinip_eth -p4pg /home/pan/minet/abinit/parallel_eth/PI5615 -p4wd /home
> root      5701  5699  0 09:14 ?        00:00:00 [qrsh] <defunct>
>
> Do you have any idea where this comes from? mpich?
>

Did you set, as mentioned in the Howto:

export MPICH_PROCESS_GROUP=no

in your job script (or as a default request to SGE in
$SGE_ROOT/default/common/sge_request: -v MPICH_PROCESS_GROUP=no), and
did you set -V in the rsh wrapper?
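
As a minimal sketch of the job script variant (the PE name mpich and the
two slots are taken from the qstat output above; the launch line is purely
hypothetical and left commented out, so adapt it to your own setup):

```shell
#!/bin/sh
# Sketch of a job script fragment, not a tested production script.
# The "#$" lines are SGE directives; to a plain shell they are comments.
#$ -pe mpich 2
#$ -cwd

# As I understand the Howto, this stops MPICH from moving its processes
# into a process group of their own, so that they stay under SGE's control.
export MPICH_PROCESS_GROUP=no

# Print the setting into the job's output file so it can be checked later.
echo "MPICH_PROCESS_GROUP=$MPICH_PROCESS_GROUP"

# Hypothetical launch line (assumption, adjust binary and options):
# mpirun -np $NSLOTS -machinefile $TMPDIR/machines ./abinip_eth
```

The variable only reaches the slave tasks if the rsh wrapper passes the
environment on, which is why the -V there matters as well.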

-- Reuti

> jp
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net




