[GE users] SGE installation on Altix 4700 query

Shannon V. Davidson svdavidson at charter.net
Wed Oct 15 17:56:51 BST 2008


Rayson Ho wrote:
> On 10/13/08, Shannon V. Davidson <svdavidson at charter.net> wrote:
>   
>> The latest cpuset version is implemented in the Linux kernel, making it
>> useful not just for Altix computers, but for other linux-based SMPs as well.
>>  I recently developed a new Grid Engine / cpuset integration based on the
>> kernel cpuset implementation. The integration is implemented as a single C
>> program configured as a queue starter_method which takes care of allocating
>> the cpuset, starting the job, and cleaning up cpusets after job failures.
>>     
>
> Shannon,
>
> Does your code keep track of cpuset membership of each processor?? Or
> does it handle it at job start time by checking which processors are
> free and creates a cpuset on the fly??
>   

At job start, the starter_method runs, checks which CPUs are free, and
dynamically creates the cpuset.  All information about the active
cpusets lives in the file system mounted at /dev/cpuset, so the
integration itself maintains no state.  Each cpuset is configured to be
deleted automatically when its last process exits.  If a cpuset is only
partially created (e.g. due to a qdel on the job while the
starter_method is running), it is deleted the next time the
starter_method runs.  A simple POSIX file lock protects the cpuset
allocation and guarantees that no CPU is given to more than one cpuset.
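
For illustration only, here is a rough Python sketch of the allocation
step described above. The integration itself is a C program, and this
is not its code. The sketch assumes the legacy cpuset file system
layout, in which every cpuset directory contains "cpus" and
"notify_on_release" files. The function names, the cpuset names, and
the lock file path are hypothetical. Writing "mems" and attaching the
job's PID to "tasks" are left out to keep it short.

```python
import fcntl
import os


def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def read_cpus(cpuset_dir):
    """Return the CPUs listed in a cpuset directory's "cpus" file."""
    try:
        with open(os.path.join(cpuset_dir, "cpus")) as f:
            return parse_cpu_list(f.read())
    except FileNotFoundError:
        return set()


def free_cpus(root):
    """CPUs of the root cpuset that no direct child cpuset has claimed.

    All state is read from the file system; nothing is cached.
    Nested cpusets are ignored in this sketch.
    """
    used = set()
    for entry in os.scandir(root):
        if entry.is_dir():
            used |= read_cpus(entry.path)
    return read_cpus(root) - used


def allocate_cpuset(root, name, ncpus, lock_path):
    """Create cpuset `name` with `ncpus` free CPUs while holding a POSIX lock."""
    with open(lock_path, "w") as lock:
        # Exclusive POSIX (fcntl) lock: only one allocator scans and
        # claims CPUs at a time. It is released when the file is closed.
        fcntl.lockf(lock, fcntl.LOCK_EX)
        free = sorted(free_cpus(root))
        if len(free) < ncpus:
            raise RuntimeError("not enough free CPUs")
        chosen = free[:ncpus]
        path = os.path.join(root, name)
        os.mkdir(path)
        with open(os.path.join(path, "cpus"), "w") as f:
            f.write(",".join(str(c) for c in chosen))
        # Assumption: notify_on_release is the mechanism used to have the
        # cpuset removed once its last process exits.
        with open(os.path.join(path, "notify_on_release"), "w") as f:
            f.write("1")
        return chosen
```

On a real system the call would look something like
allocate_cpuset("/dev/cpuset", "sge_15.1", 2, "/tmp/sge_cpuset.lock")
and would need root privileges. Pointing `root` at an ordinary
directory that contains a "cpus" file is enough to try out the logic.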

- Shannon

> Rayson
>
>
>
>   
>> Most of the real work is handled by the kernel-based cpuset implementation.
>> This integration is running at a production Altix site.  I'm currently
>> making a few more updates to ensure that it works without propack - for
>> example, on dual-socket or quad-socket quad-core Intel or AMD x86_64
>> machines running Red Hat 5.1 or CentOS 5.1.  Once I'm happy with the
>> updates, I plan to make it available as a download on
>> gridengine.sunsource.net.  I believe the implementation is modular enough to
>> consider integrating as part of the shepherd daemon in a future Grid Engine
>> release.
>>
>> -Shannon
>>
>>
>>
>> On Sun, 2008-10-12 at 13:10 +0300, Walid wrote:
>>
>> 2008/10/11 Rayson Ho <rayrayson at gmail.com>
>>
>> We have:
>>
>> http://gridengine.sunsource.net/files/documents/7/3/sge_cpuset.tar.gz
>>
>> Note that the arch name for Linux on IA64 is "ia64linux" for SGE 5.3,
>> and for SGE 6.0 the arch name is "lx24-ia64" and "lx26-ia64". So you
>> need to replace all ia64linux with the correct arch name in the
>> cpuset_clean and cpuset_epilog script.
>>
>> I have installed it, however on SLES10U2 SP6r2 cpuset have different
>> flag options, and rpm package than the older Altix with RHEL3U4 SP3U6
>> as can be seen from the cpuset.log file under the spool dir :
>>
>> Sun Oct 12 12:32:39 zzz 2008 cpuset_prolog[15.1] ERROR: cpuset
>> creation failure - /usr/bin/cpuset: invalid option -- q cpuset failed:
>> unrecognized option Try 'cpuset -h' for more information.
>> .......
>> Sun Oct 12 12:32:50 zzz 2008 cpuset_prolog[16.1] ERROR: cpuset
>> creation failure - /usr/bin/cpuset: invalid option -- q cpuset failed:
>> unrecognized option Try 'cpuset -h' for more information.
>> .....
>> Sun Oct 12 12:33:29 zzz 2008 cpuset_prolog[15.1] ERROR: cpuset
>> allocation of sge1 in /tmp/sgelocks/cpuset_15.1 failed
>>
>> regards
>>
>> Walid
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
>   

-- 
_________________________________________

Shannon V. Davidson <sdavidson at appro.com>
Software Engineer     Appro International
636-633-0380 (office)  443-383-0331 (fax)
_________________________________________




