[GE users] All OpenMPI process run on same node

reuti reuti at staff.uni-marburg.de
Tue Oct 26 23:41:24 BST 2010


On 25.10.2010 at 20:55, bwillems <bwi565 at gmail.com> wrote:

> I did some searching on the Rocks discussion list as well. Adding  
> "unset SGE_ROOT" to the job submission script fixed the problem.  
> Just out of curiosity, do you have any ideas why unsetting this  
> environment variable matters?

When you do this, you switch off the tight integration of your
parallel job into SGE. This means no job control of the slave
processes and wrong accounting. The command:

$ ps -e f

(f without a leading -) should then show, in your setup, that the
processes are no longer children of SGE's execd on the slave nodes. I
also wonder whether the slave slot allocation actually used is the one
granted by SGE.
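
To check the allocation question, the job script can print what SGE
actually granted. The following is only a sketch: it assumes the usual
$PE_HOSTFILE layout of one line per host (hostname, slots, queue,
processor range), and the function name summarize_pe_hostfile is made
up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print "host slots" for every host SGE granted,
# followed by the total slot count.
# Assumed $PE_HOSTFILE line format: hostname slots queue processor-range
summarize_pe_hostfile() {
    awk '{ printf "%s %d\n", $1, $2; total += $2 }
         END { printf "total %d\n", total }' "$1"
}

# SGE sets PE_HOSTFILE only inside a running parallel job;
# outside of a job this block does nothing.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    summarize_pe_hostfile "$PE_HOSTFILE"
fi
```

Comparing this list with the hosts on which "ps -e f" really shows the
MPI processes tells you whether the granted allocation was honored.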

It's better to investigate the real cause of the failed startup.

-- Reuti

> Thanks,
> Bart
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=289999
> To unsubscribe from this discussion, e-mail:
> [users-unsubscribe at gridengine.sunsource.net].

