[GE users] Newbie: @group and fractional usage
dkulp+sge at cs.umass.edu
Wed May 17 06:03:25 BST 2006
I have two questions as a new grid engine user.
First, I'm running on Linux, and my attempts to create a userlist with
the @unixgroup notation don't seem to work. qmon accepts it, but
subsequent commands don't recognize it. For example, I added
"@group" to the deadlineusers userset, but when I try to submit
deadline jobs I get an error 'job rejected: the user "dkulp" is no
deadline initiation user'. Deadline job submission only works when I
add my explicit username in the deadlineusers userset. But I don't
want to do that for every new user.
Second, I would like to implement a usage policy that removes
(reschedules/migrates) a user's jobs from running queues if the user
is currently exceeding his fractional share and there is a demand for
resources. I've set up a share tree, which works well when all
running jobs are short. However, we want a policy that preempts
running programs according to that share tree policy.
I would think that our scenario is common, but I haven't found
anything on this. Our compute cluster is fractionally owned by
multiple groups; that is, different groups have contributed nodes.
Usage is bursty, but jobs can sometimes run for days. Suppose Alice
and Bob each own 50% of the cluster. Initially the cluster is idle,
so when Alice submits her jobs they fill up all the queues for 100%
utilization. Then Bob wants to run his jobs. If Alice's jobs are
short, then the share tree policy would quickly balance out the
resource usage to 50-50. But if Alice's jobs run for days, then Bob
is stuck waiting. Alice and Bob would both prefer that Alice's jobs simply be
terminated (or checkpointed) and rescheduled.
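
To make the Alice and Bob example concrete, here is a minimal Python sketch of the decision being asked for: given each owner's share, running slots, and pending demand, how many running slots should be taken back from whoever is over their share. This is not something Grid Engine provides; the function name, the dictionary inputs, and the slot counts are all hypothetical and only illustrate the desired policy.

```python
def slots_to_reclaim(shares, running, pending, total_slots):
    """Return {user: slots to preempt} under a simple fractional-share rule.

    Hypothetical illustration only; not a Grid Engine API.
    """
    total_share = sum(shares.values())
    # Slots each user is entitled to according to their share.
    entitled = {u: total_slots * s / total_share for u, s in shares.items()}

    # Idle slots can satisfy waiting users without preempting anyone.
    free = total_slots - sum(running.values())

    # Demand from users who are below their entitlement and have jobs waiting.
    deficit = sum(
        max(0, min(pending.get(u, 0), entitled[u] - running.get(u, 0)))
        for u in shares
    )
    needed = max(0, deficit - free)

    reclaim = {}
    # Take slots from the users who exceed their entitlement the most first.
    by_excess = sorted(
        shares, key=lambda u: running.get(u, 0) - entitled[u], reverse=True
    )
    for u in by_excess:
        excess = running.get(u, 0) - entitled[u]
        if excess <= 0 or needed <= 0:
            break
        take = min(excess, needed)
        reclaim[u] = int(take)
        needed -= take
    return reclaim


if __name__ == "__main__":
    # Alice fills an idle 100-slot cluster, then Bob submits 80 jobs.
    print(slots_to_reclaim(
        shares={"alice": 1, "bob": 1},
        running={"alice": 100, "bob": 0},
        pending={"bob": 80},
        total_slots=100,
    ))
```

Nothing is reclaimed while there is no competing demand, so Alice keeps the whole cluster until Bob actually submits work.
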
The only solution that I can think of is to create two queues for
every host, one queue for Alice and one for Bob. On 50% of the hosts
the Alice queue will be subordinate to Bob. Vice versa on the other
half. But this requires a lot of manual queue configuration as the
number of cluster owners increases. It would be nice if there were
some more general scheme like the share tree. In other words, I
would like the share tree to drive the preemption policy. Any ideas?
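
The per-host queue layout described above can be sketched as follows, to show how quickly the configuration grows with the number of owners. This is a minimal Python sketch under stated assumptions: the `<owner>.q` queue names and the host names are hypothetical, and the output is just a plan to be entered by hand (for example into each queue's `subordinate_list`), not something that talks to Grid Engine. Note also that a subordinated queue is normally suspended rather than having its jobs rescheduled, so this only approximates the policy wanted here.

```python
def subordinate_plan(contributed):
    """Build one queue entry per (host, owner) pair.

    contributed maps an owner name to the hosts that owner contributed.
    On a host, the contributing owner's queue lists every other owner's
    queue as subordinate. Queue names are hypothetical.
    """
    owners = sorted(contributed)
    plan = []
    for owner in owners:
        for host in contributed[owner]:
            others = [f"{o}.q" for o in owners if o != owner]
            # The host owner's queue takes priority over all other queues.
            plan.append(
                {"host": host, "queue": f"{owner}.q", "subordinate_list": others}
            )
            # Every other owner still gets a queue on this host, with no
            # subordinates of its own.
            for other in owners:
                if other != owner:
                    plan.append(
                        {"host": host, "queue": f"{other}.q", "subordinate_list": []}
                    )
    return plan


if __name__ == "__main__":
    layout = subordinate_plan({"alice": ["n1", "n2"], "bob": ["n3", "n4"]})
    for entry in layout:
        print(entry)
    # The number of entries is (number of hosts) * (number of owners).
    print(len(layout), "queue entries")
```

With two owners and four hosts this already yields eight entries, which is the manual overhead that a share-tree-driven scheme would avoid.
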
Thanks in advance.