[GE users] SGE admin issue

reuti reuti at staff.uni-marburg.de
Fri Nov 6 17:35:12 GMT 2009


Hi,

On 06.11.2009 at 17:25, fgarret wrote:

> I've just installed a cluster with 7 execution nodes (56 cores) plus
> an extra node as master. This node runs sge_qmaster, has the shared HDD
> and is the only one with a direct connection to the Internet. All the
> others are connected only to the master node. The cluster is working
> pretty well, but I'm having difficulties with a few issues:
>
> - Sending mail
> 	I've managed to install sendmail on the master node and tested it
> OK. However, "-m be -M user@host" doesn't work. Who sends the mail
> on job start/end? The master node, the submission node, or the
> execution node? If it is the execution node that sends the emails,
> is it possible to have the master/submission node send them instead?

The exec host sends the emails (the ones for the jobs). Some admin
emails are also generated on the master node. So you need to use the
master node as a relay, and maybe change the sender's name
(which is root@node01 or similar) to that of the master node, as
many email servers refuse to accept emails with an unresolvable
sender address.

Pitfall: root (which is the sender of the emails) won't be
masqueraded by default; there is a default rule which you must
comment out:

dnl EXPOSED_USER(`root')dnl
define(`SMART_HOST',`smtp:myheadnode.ub.edu')dnl
MASQUERADE_AS(`myheadnode.ub.edu')dnl

Any reason why you use sendmail? Nowadays it's often replaced by
postfix or exim.


> - MPI
> 	I've installed OpenMPI and it is also working OK. The only thing
> is that jobs are not removed from the queue when they finish. They
> just stay there forever, and the only way to remove them is for the
> root user to run "qdel -f". Any way to fix this?
>

Did you compile OpenMPI with SGE support?
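
One way to check this, as a rough sketch: ompi_info lists the components
that were compiled in, and a build with SGE support shows a "gridengine"
entry. The helper name has_sge_support is made up for illustration, and
the --with-sge configure option applies to Open MPI 1.3 and later.

```shell
#!/bin/sh
# Succeeds if the ompi_info output read from stdin lists a gridengine
# component. (has_sge_support is a hypothetical helper name.)
has_sge_support() {
    grep -q gridengine
}

# Only query the local installation if ompi_info is in the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_sge_support; then
        echo "Open MPI was built with SGE support"
    else
        echo "No gridengine component found; rebuild with ./configure --with-sge"
    fi
fi
```

Run it on an exec host with the same environment the jobs get, as the
Open MPI installation found there is the one that matters.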


> - Reserving nodes
> 	When I want to run a job with threads, it will occupy one slot
> but will in fact be using more processors. Any way to reserve slots?

You will need to create a PE with "allocation_rule $pe_slots", named
for example "smp", which you then request with "qsub -pe smp 4 ...".
For convenience, use this in the job script:

export OMP_NUM_THREADS=$NSLOTS
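
Creating the PE itself could look roughly like the sketch below. The
queue name all.q, the slot count of 56 (taken from the core count
mentioned above) and the exact field list are assumptions; the fields
a PE definition accepts differ between SGE versions, so compare with
the output of "qconf -sp <some existing PE>" on your installation first.

```shell
#!/bin/sh
# Sketch: define an "smp" parallel environment and attach it to a queue.
# Assumed values: queue all.q, 56 slots, SGE 6.2-style field list.

pe_file=$(mktemp)

# The quoted 'EOF' keeps $pe_slots literal instead of letting the
# shell expand it.
cat > "$pe_file" <<'EOF'
pe_name            smp
slots              56
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
EOF

if command -v qconf >/dev/null 2>&1; then
    qconf -Ap "$pe_file"                  # add the PE from the file
    qconf -aattr queue pe_list smp all.q  # make the PE usable in all.q
else
    echo "qconf not found; PE definition left in $pe_file"
fi
```

With $pe_slots all requested slots are granted on a single host, so a
job submitted with "qsub -pe smp 4 ..." gets 4 slots on one node and
$NSLOTS is 4 inside the job script.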

-- Reuti


>
> Thanks in advance,
> FG
>
> -- 
> Filipe G. Vieira
> Departament de Genetica
> Universitat de Barcelona
> Av. Diagonal, 645
> 08028 Barcelona
> SPAIN
> Phone: +34 934 035 306
> Fax: +34 934 034 420
> fgarret at ub.edu
> http://www.ub.edu/molevol/
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=225402
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=225419
