[GE users] failed receiving gdi request

Reuti reuti at staff.uni-marburg.de
Fri May 25 13:12:07 BST 2007


Hi,

On 25.05.2007 at 11:12, geno wrote:

> We freshly set up a GE, version N1GE 6.0u9
> qmaster on a Xeon, with 2.6.9-42.ELsmp i686
> sgeexecd  on Opteron nodes, with 2.6.9-42.ELsmp x86_64
>
> Our first jobs seemed to run fine.
> Parallel jobs did not run because MPI wasn't (and maybe isn't) set  
> up properly.
> So we got errors like "cannot run in PE "mpi" because it only  
> offers 0 slots"

Did you set the number of slots in the PE definition to a sensible value,
and also attach the PE to a cluster queue of your choice?
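
Both points can be checked from the command line. The following is only a rough sketch, not an official procedure: the PE name "mpi" is taken from your error message, the queue name "all.q" is an assumption (substitute your own), and the helper names pe_slots and queue_has_pe are made up for illustration. The helpers just filter the text that `qconf -sp` and `qconf -sq` print.

```shell
# Sketch only: helpers that read `qconf` output on stdin.
# PE name "mpi" comes from the error message; queue name "all.q" is an assumption.

# Print the slots value from `qconf -sp <pe>` output.
pe_slots() {
    awk '$1 == "slots" { print $2 }'
}

# Succeed if the pe_list line of `qconf -sq <queue>` output names the given PE.
# Simplification: ignores host-specific [host=...] overrides and
# pe_list entries continued over several lines.
queue_has_pe() {
    awk -F'[ \t,]+' -v pe="$1" '
        $1 == "pe_list" {
            for (i = 2; i <= NF; i++) if ($i == pe) found = 1
        }
        END { exit(found ? 0 : 1) }'
}

# On a host where the Grid Engine commands are available:
#   qconf -sp mpi   | pe_slots
#   qconf -sq all.q | queue_has_pe mpi && echo "mpi is attached to all.q"
```

If the slots value is 0, `qconf -mp mpi` opens the PE definition for editing. If the PE is missing from the queue, it can be added with `qconf -mq all.q` or `qconf -aattr queue pe_list mpi all.q`.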

> By adding lamboot and lamhalt in the script, and adding some  
> changes to the PE environment, these PE related errors disappeared.
> Now we got a new error :
>    error: can't unpack gdi request
>    error: error unpacking gdi request: bad argument
>    failed receiving gdi request

For a proper LAM/MPI integration, this might help:

http://gridengine.sunsource.net/howto/lam-integration/lam-integration.html

> In your mailing list archive, this error was related to:
> - having different GE versions. we don't.
> - having too many messages in the read buffer. we don't (0).
>
> The gdi error prevents us now from starting new jobs, parallel or not.
> I have no idea what gdi is. Does anyone know what is happening?
> geno

Can you please check whether any queues are in state E (error), and
if so clear that state using qmod?
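
As a rough sketch of that check: the helper below (error_queues is a made-up name) filters `qstat -f` output for queue instances whose state column contains "E". It assumes the usual column layout of queuename, qtype, used/tot., load_avg, arch, states; the queue instance name in the comments is hypothetical.

```shell
# Sketch only: list queue instances whose state column contains "E".
# Reads `qstat -f` output on stdin. Queue lines are recognised by their
# slot-count column (e.g. 0/4); job lines and headers are skipped.
error_queues() {
    awk 'NF >= 6 && $3 ~ /^[0-9]+(\/[0-9]+)+$/ && $6 ~ /E/ { print $1 }'
}

# On a host where the Grid Engine commands are available:
#   qstat -f | error_queues
#   qstat -f -explain E      # reason for the error state, if your qstat supports it
#   qmod -c all.q@node02     # clear the error state (hypothetical queue instance)
```

Clearing the state only helps once the underlying cause is fixed, so it is worth reading the reason given for the error before running qmod -c.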

-- Reuti
