[GE users] Reserving a queue for memory usage a <= b

Atle Rudshaug atle at numericalrocks.com
Tue Sep 23 11:30:57 BST 2008


Reuti wrote:
> Hi Aaron,
>
> On 23.09.2008 at 11:24, Aaron Turner wrote:
>
>> Hello SGE users,
>>
>> What I am looking to achieve is reserving some machines for higher 
>> memory jobs.
>>
>> The idea would be for:
>> 1. Jobs capable of being scheduled on the smaller memory machines 
>> (memory usage <= a) going to those machines.
>
> You can set up a consumable complex (better: make virtual_free or
> h_vmem consumable) and set a sensible value for each node in the
> exechost definition. You then have to request this complex when you
> submit the job.
>
>> 2. Jobs of greater memory usage than a but less than b going to the 
>> high memory machines.
>> 3. Very high memory jobs we can't accommodate get rejected and the
>> user alerted.
>
> See below.
>
>> 4. The high memory machines being kept as busy as possible.
>
> You can sort queue instances by setting a sequence number and set the 
> scheduler to sort by seqno:
>
> seq_no 0,[@big_machines=10],[@small_machines=20]
>
>> 5. Any user with a job of memory usage greater than a and less than b 
>> having the minimum wait possible before their job starts running.
>>
>> In an ideal world all jobs would be checkpointable and submitted as 
>> such so I could simply reduce the time slice down to a shortish time 
>> and simply get the jobs rescheduled for additional processing. This 
>> would also make the fair sharing a bit less clumpy. But I am still 
>> not convinced the checkpointing issue is fully solved for arbitrary 
>> code.
>
> There is no "issue" with SGE regarding checkpointing; it is simply not
> designed to do it on its own. SGE will support checkpointing if it's
> built into the application or provided by a third-party library. It's
> not the intention of SGE to offer checkpointing facilities.
>
>> So given this, what is the best way to approach it? I tried setting
>> up a series of queue subordinations, with a shorter queue to absorb
>> excess jobs without locking up the high memory machine for a long
>> time, but it doesn't operate quite as I had hoped. Is there a better
>> approach, such as adding a complex? The base complex relations offer
>> both >= and <= but not a <= b <= c!
>
> You can submit jobs with:
>
> -w e
>
> and you will get an error message if there aren't any queues/hosts at 
> all to satisfy the request.
>
> -- Reuti
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
>

Hi!

I would like to do the same (reserve different machines according to job
memory needs), but through DRMAA. I am especially interested in the
-w e option, which gives the user feedback when no suitable hardware is
available. Is this possible through DRMAA?
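
To make the question concrete, here is a minimal sketch of what I have
in mind, using the Python DRMAA binding purely for illustration. It
assumes that the native specification passes qsub options such as -w e
and -l h_vmem through unchanged, which I have not confirmed.
native_spec() and submit() are hypothetical helper names.

```python
def native_spec(mem_gb):
    """Build qsub-style options: request h_vmem and ask for '-w e' validation."""
    if mem_gb <= 0:
        raise ValueError("memory request must be positive")
    # Assumption: these options are honoured when given as a DRMAA
    # native specification, the same way qsub would treat them.
    return "-w e -l h_vmem=%dG" % mem_gb


def submit(command, mem_gb):
    """Submit one job via DRMAA (needs the 'drmaa' package and an SGE libdrmaa)."""
    import drmaa  # imported here so native_spec() works without DRMAA installed

    with drmaa.Session() as session:
        jt = session.createJobTemplate()
        try:
            jt.remoteCommand = command
            jt.nativeSpecification = native_spec(mem_gb)
            # Assumption: a request no host can satisfy makes runJob() raise.
            return session.runJob(jt)
        finally:
            session.deleteJobTemplate(jt)
```

If the assumption holds, I would expect a request that no host can
satisfy to be rejected when runJob() is called, and that error is the
feedback I want to pass on to the user.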

- Atle
