[GE users] SGE and OpenMPI 1.3.2

doraz dobre.razvan at gmail.com
Thu Jan 21 16:22:23 GMT 2010


Sorry, my bad.

On our cluster at school, the openmpi PE is configured like this, and
jobs run without problems.
We use SGE 6.2u3 and Open MPI 1.3.2.

[rdobre@fep-53-2 ~]$ qconf -sp openmpi
pe_name            openmpi
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
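
To compare a single setting (for example control_slaves or
job_is_first_task) across clusters, a small helper along these lines may be
handy. This is only a sketch: pe_setting and sample_pe are made-up names, not
part of SGE, and the sample text is copied from the output above so the
snippet runs without a cluster.

```shell
#!/bin/sh
# pe_setting KEY: read `qconf -sp <pe>` output on stdin and print KEY's value.
# (Hypothetical helper; the name is an assumption, not an SGE command.)
pe_setting() {
    awk -v key="$1" '$1 == key { $1 = ""; sub(/^[ \t]+/, ""); print; exit }'
}

# Sample input copied from the PE listing above (stands in for real qconf output).
sample_pe() {
    cat <<'EOF'
pe_name            openmpi
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
EOF
}

sample_pe | pe_setting control_slaves      # prints TRUE
sample_pe | pe_setting job_is_first_task   # prints TRUE
```

On a real cluster the equivalent would presumably be
`qconf -sp openmpi | pe_setting control_slaves`.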



-----Original Message-----
From: Dan.Templeton at Sun.COM [mailto:Dan.Templeton at Sun.COM] 
Sent: Thursday, January 21, 2010 5:07 PM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] SGE and OpenMPI 1.3.2

olesen wrote:
>> This is what I'm using for my openmpi PE for a reference:
>>
>> $ qconf -sp openmpi
>> pe_name            openmpi
>> slots              9999
>> user_lists         NONE
>> xuser_lists        NONE
>> start_proc_args    /bin/true
>> stop_proc_args     /bin/true
>> allocation_rule    $fill_up
>> control_slaves     TRUE
>> job_is_first_task  FALSE
>> urgency_slots      min
>> accounting_summary FALSE
>>     
>
>
> Is '/bin/true' correct? I have
>
>   start_proc_args    NONE
>   stop_proc_args     NONE
>
>   

NONE is equivalent to /bin/true.  Both are just no-ops.

>> I haven't quite decided whether or not to use "accounting_summary 
>> TRUE" yet, as it doesn't seem to account properly for parallel jobs.
>>     
>
> I don't bother with accounting there either. Instead I parse the
> accounting file and count the slots/walltime.
> For our system the overall time that machines and licenses are occupied
> is the primary accounting factor.
>   

There was a bug in u4 that prevented correct accounting of PE jobs if 
accounting_summary was TRUE.  That's fixed in u5.  accounting_summary 
tells the qmaster whether to aggregate the accounting information for 
parallel jobs into a single accounting file entry.  If you're running 
huge parallel jobs, you want it set to TRUE.
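
The "parse the accounting file and count the slots/walltime" approach
mentioned in the quoted text could be sketched roughly as below. The field
positions (owner = 4, ru_wallclock = 14, slots = 35) are taken from my reading
of accounting(5) and should be checked against your SGE version; slot_seconds
and mk_record are made-up helper names, and mk_record only fabricates
placeholder records for the demo. With accounting_summary FALSE, a tightly
integrated job can write several records, and this sketch does not
deduplicate them.

```shell
#!/bin/sh
# slot_seconds: read colon-separated accounting records on stdin and print
# "owner total" where total = sum of ru_wallclock * slots for that owner.
# Assumed field positions: 4 = owner, 14 = ru_wallclock, 35 = slots.
slot_seconds() {
    awk -F: '
        /^#/     { next }                      # skip comment lines
        NF >= 35 { total[$4] += $14 * $35 }    # accumulate slot-seconds per owner
        END      { for (o in total) printf "%s %d\n", o, total[o] }
    ' | sort
}

# mk_record OWNER WALLCLOCK SLOTS: build a synthetic 35-field record for the
# demo; every other field is a placeholder 0 (not real accounting data).
mk_record() {
    awk -v owner="$1" -v wall="$2" -v slots="$3" 'BEGIN {
        for (i = 1; i <= 35; i++) f[i] = 0
        f[4] = owner; f[14] = wall; f[35] = slots
        line = f[1]
        for (i = 2; i <= 35; i++) line = line ":" f[i]
        print line
    }'
}

{ mk_record alice 3600 8; mk_record bob 100 2; mk_record alice 60 4; } | slot_seconds
```

Against a real installation the input would be the cell's accounting file
rather than mk_record output.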

Daniel

> /mark
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=240200

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
