[GE users] Help: Slots Problem

Sean Davis sdavis2 at mail.nih.gov
Wed Sep 17 02:13:30 BST 2008



On Tue, Sep 16, 2008 at 9:03 PM, Lee Amy <openlinuxsource at gmail.com> wrote:
>
>
> 2008/9/17 Reuti <reuti at staff.uni-marburg.de>
>>
>> On 16.09.2008 at 14:03, Lee Amy wrote:
>>
>>> 2008/9/15 Sean Davis <sdavis2 at mail.nih.gov>
>>> On Sun, Sep 14, 2008 at 9:06 AM, Lee Amy <openlinuxsource at gmail.com>
>>> wrote:
>>> >
>>> >
>>> > 2008/9/14 Ravi Chandra Nallan <Ravichandra.Nallan at sun.com>
>>> >>
>>> >> Lee Amy wrote:
>>> >>>
>>> >>> Hello,
>>> >>>
>>> >>> I built a small cluster with 5 nodes; each node has 2 Opteron 270 HE
>>> >>> processors (4 cores). There's a program called "TGICL" that I use
>>> >>> which can use all of the processors on one machine. For example, when
>>> >>> I specify a CPU count, tgicl runs that many threads. However, SGE
>>> >>> still shows the job as taking up 1 slot, even though I have set it to
>>> >>> run on 4 cores of one node.
>>> >>>
>>> >>> So my question is: how do I let SGE know the "real" number of slots
>>> >>> the job is taking up?
>>> >>>
>>> >>> Thank you very much~
>>> >>>
>>> >>> Regards,
>>> >>>
>>> >>> Amy Lee
>>> >>>
>>> >> If you want SGE to know the number of CPUs used by tgicl, you could
>>> >> create a new complex (qconf -sc). To restrict the number of CPUs each
>>> >> tgicl run uses, you can force jobs to request that complex. Then set
>>> >> the complex value at the host level and make it consumable; that
>>> >> should do the job.
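>>> >>
>>> >> As a rough sketch (the complex name "tgicl_cpu" is just an example),
>>> >> the entry added via "qconf -mc" could look like:
>>> >>
>>> >>   #name      shortcut  type  relop  requestable  consumable  default  urgency
>>> >>   tgicl_cpu  tc        INT   <=     FORCED       YES         0        0
>>> >>
>>> >> Then set it per host and let jobs request it:
>>> >>
>>> >>   qconf -me node01            (add "complex_values tgicl_cpu=4")
>>> >>   qsub -l tgicl_cpu=4 run_tgicl.sh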
>>>
>>> Just to come back to this: using a parallel environment with $pe_slots
>>> as the allocation rule is an exact fit for the original post. I don't
>>> think there is any need to involve complexes, etc.
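>>>
>>> For example (just a sketch; the PE name "smp" is illustrative):
>>>
>>>   qconf -ap smp
>>>     slots            40
>>>     allocation_rule  $pe_slots
>>>
>>>   qsub -pe smp 4 run_tgicl.sh
>>>
>>> With $pe_slots, all 4 requested slots are granted on a single host.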
>>>
>>> Sean
>>>
>>> Should I use loose integration instead of tight integration? Because I
>>> have only enabled the corresponding slots.
>>
>> What do you mean by this? If you have e.g. 4 cores, you would need 4
>> slots on this machine, and with a PE whose allocation_rule is $pe_slots
>> you would always get all 4 slots from one and the same machine.
>>
>> This would be a tight integration, which is always worth implementing.
>> If you want to avoid serial jobs on the nodes, set "qtype NONE" in the
>> queue definition and attach only the PE to it.
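>>
>> As a sketch (the queue and PE names are illustrative), the relevant
>> lines in "qconf -mq parallel.q" would be:
>>
>>   qtype     NONE
>>   pe_list   smp
>>   slots     4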
>>
>> I agree with Sean that you won't need any complexes for your setup.
>>
>> -- Reuti
>>
> Thank you for the reply. What I mean is: if I choose a tight-integration
> PE, what kind of start and stop scripts should I use? I don't know
> whether this is right, but I have heard that a tight-integration PE uses
> "qrsh -inherit" to start jobs.
>
> Or should I just fill in the scripts with /bin/true?

/bin/true will do it, I think.
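
As a sketch, assuming a PE like the "smp" one mentioned above (the names
are illustrative), the definition could be:

  pe_name            smp
  slots              40
  start_proc_args    /bin/true
  stop_proc_args     /bin/true
  allocation_rule    $pe_slots
  control_slaves     TRUE
  job_is_first_task  TRUE

A threaded program confined to one host never spawns slave tasks, so
"qrsh -inherit" doesn't come into play, and /bin/true as the start and
stop procedure is all you need.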

Sean

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



