[GE users] SGE admin issue

fgarret fgarret at ub.edu
Fri Nov 13 20:38:05 GMT 2009


Thanks for your quick help.

Yes, I've compiled OpenMPI with SGE support.
Any idea of what may be the problem?

thanks,
FG

PS - Haven't tried the other issues...



reuti wrote:
> Hi,
> 
> Am 06.11.2009 um 17:25 schrieb fgarret:
> 
>> I've just installed a cluster with 7 execution nodes (56 cores) +  
>> an extra node as master. This node
>> runs sge_master, has the shared HDD and is the only one with a  
>> direct connection with the Internet.
>> All the others only have connection to the master node. The cluster  
>> is working pretty ok but I'm
>> having some difficulties with some issues:
>>
>> - Sending mail
>> 	I've managed to install sendmail on the master node and tested it  
>> OK. However, the "-m be -M
>> user at host" doesn't work. Who sends the mail on job start/end? The  
>> master node? submission node?
>> execution node? If it is the execution node that sends the emails,
>> is it possible to have the master/submission node send them instead?
> 
> The exec host (the one running the job) sends the job emails. Some
> admin emails are also generated on the master node. So you need to use
> the master node as a relay, and maybe change the name of the sender
> (which is root at node01 or similar) to that of the master node, as
> many email servers refuse to accept emails with an unresolvable
> sender address.
> 
> Pitfall: root (which is the sender of the emails) won't be
> masqueraded by default; there is a default rule which you must
> comment out:
> 
> dnl EXPOSED_USER(`root')dnl
> define(`SMART_HOST',`smtp:myheadnode.ub.edu')dnl
> MASQUERADE_AS(`myheadnode.ub.edu')dnl
> 
> Any reason why you use sendmail? Nowadays it's often replaced with
> postfix or exim.
> 
> 
>> - MPI
>> 	I've installed OpenMPI and it is also working OK. The only thing  
>> is that jobs are not removed from the queue when they finish. They
>> just stay there forever, and the only way to remove them is for the
>> root user to run "qdel -f". Any way to fix this?
>>
> 
> Did you compile OpenMPI with SGE support?
> 
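One way to check the question above is to look for the gridengine component in the output of ompi_info. The following is only a sketch: it assumes ompi_info from the Open MPI build in question is first on the PATH, and the helper name has_sge_support is made up for illustration.

```shell
#!/bin/sh
# Sketch: check whether the Open MPI build on PATH lists a Grid Engine component.
# has_sge_support is a hypothetical helper name, not part of Open MPI or SGE.

has_sge_support() {
    # Reads ompi_info-style output on stdin; succeeds if "gridengine" appears.
    grep -qi 'gridengine'
}

if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_sge_support; then
        echo "gridengine component found"
    else
        echo "no gridengine component listed; Open MPI may need a rebuild with --with-sge"
    fi
else
    echo "ompi_info not found on PATH"
fi
```

The check should be run on an execution node as well as on the master, since a different Open MPI installation may be picked up there.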
> 
>> - Reserving nodes
>> 	When I want to run some job with threads it will occupy one slot  
>> but will be in fact using more
>> processors. Any way to reserve slots?
> 
> You will need to create a PE with "allocation_rule $pe_slots", named
> e.g. "smp", which you then request with "qsub -pe smp 4 ...". For
> simplicity, use this in the jobscript:
> 
> export OMP_NUM_THREADS=$NSLOTS
> 
> -- Reuti
> 
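The PE setup described above could look roughly like the sketch below. It only writes two files and prints the registration commands instead of running them. The slot count 56, the queue name all.q, the program name my_threaded_program and the working directory are assumptions for illustration, and the exact list of PE fields differs between SGE versions, so compare with the output of "qconf -sp" for an existing PE before using it.

```shell
#!/bin/sh
# Sketch only: generate a PE definition for threaded jobs and a matching job script.
# Assumed for illustration: PE name "smp", 56 slots, queue "all.q", program name.
workdir="${TMPDIR:-/tmp}/smp_pe_sketch"
mkdir -p "$workdir"

# PE definition. "$pe_slots" keeps all slots of a job on a single host.
# The quoted 'EOF' stops the shell from expanding $pe_slots here.
cat > "$workdir/smp.pe" <<'EOF'
pe_name            smp
slots              56
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
EOF

# Job script requesting 4 slots; SGE sets NSLOTS to the granted slot count.
cat > "$workdir/threaded_job.sh" <<'EOF'
#!/bin/sh
#$ -pe smp 4
#$ -cwd
export OMP_NUM_THREADS=$NSLOTS
./my_threaded_program
EOF

# Printed rather than executed; these would be run as an SGE admin user.
echo "qconf -Ap $workdir/smp.pe"
echo "qconf -aattr queue pe_list smp all.q"
echo "qsub $workdir/threaded_job.sh"
```

With this rule a 4-slot request is only scheduled on a host that has 4 free slots, so the threads no longer share a single slot.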
> 
>> thanks in adv,
>> FG
>>
>> -- 
>> Filipe G. Vieira
>> Departament de Genetica
>> Universitat de Barcelona
>> Av. Diagonal, 645
>> 08028 Barcelona
>> SPAIN
>> Phone: +34 934 035 306
>> Fax: +34 934 034 420
>> fgarret at ub.edu
>> http://www.ub.edu/molevol/
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do? 
>> dsForumId=38&dsMessageId=225402
>>
>> To unsubscribe from this discussion, e-mail: [users- 
>> unsubscribe at gridengine.sunsource.net].
> 

