[GE users] SGE Configuration

ms mark.sprenger at gmx.de
Wed Feb 18 10:19:43 GMT 2009


Hello Gridengine users,

I have several questions about translating our business rules into a
Gridengine configuration. I hope you can help me a little.

Infrastructure:
12 Nodes with 8 Cores  (2 Quad-Xeon)
4 Nodes with 16 GB, 8 Nodes with 4 GB

Business rules:
a)
Because Gridengine will be used by only a few people, they can manage the
overall queuing offline. So the best way is the default way: every user
submits their jobs to the default queue all.q, and Gridengine starts these
jobs in FIFO order.

But sometimes users want to test their jobs, so these test jobs have to start
immediately, regardless of other jobs currently running.

I think the best way to implement this rule is to set up a fast.q with all.q
as its subordinate queue, isn't it?
But how can I configure Gridengine to suspend only one job (from all.q) for
each job started from fast.q? Sometimes it starts one fast.q job and suspends
all other jobs on the node, which is obviously not necessary.
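
The pairing asked for above can be written down as a small rule. The Python
sketch below only illustrates the desired behaviour; it is not Gridengine
configuration or code. The function name and its parameters are hypothetical,
and it assumes that an all.q job needs to be suspended only when a node would
otherwise run more jobs than it has cores.

```python
def jobs_to_suspend(running_all_q, new_fast_q, cores=8):
    """Number of all.q jobs to suspend on one node (hypothetical helper).

    Assumption: suspension is only needed when the running all.q jobs plus
    the newly started fast.q jobs would exceed the number of cores.
    """
    # Jobs beyond the core count after the fast.q jobs have started.
    overload = running_all_q + new_fast_q - cores
    # Never suspend fewer than zero or more than are actually running.
    return max(0, min(running_all_q, overload))


if __name__ == "__main__":
    # Full node, one test job arrives: exactly one all.q job is suspended.
    print(jobs_to_suspend(running_all_q=8, new_fast_q=1))
    # Partly filled node: the test job fits, nothing is suspended.
    print(jobs_to_suspend(running_all_q=5, new_fast_q=1))
```

With a full node of 8 all.q jobs, each fast.q job started corresponds to
exactly one suspended all.q job, instead of all jobs on the node.
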
b)
We want maximum utilization of our nodes, so generally each node should be
filled with 8 jobs; most of our jobs are non-parallel. Users should be able
to attach memory-usage information to their jobs. But this must be optional,
because sometimes users don't know this information.

I think the mem_free value covers this rule.
But if I start an array job with mem_free=4G, Gridengine ignores the requested
memory and fills the nodes with 8 jobs per node in the first scheduling run.
I think it only compares the currently free memory with the amount of memory
the user requested for the array job, but doesn't decrement the value with
each job it starts.
How can I handle this problem?
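
The suspected behaviour can be made concrete with a toy model. The Python
sketch below is not how Gridengine is implemented; the function and its
parameters are hypothetical. It compares a scheduler that checks every task
against the same measured free-memory value with one that subtracts each
request after starting a task.

```python
def tasks_started(free_gb, request_gb, slots, tasks, decrement):
    """Count array tasks started on one node in a single scheduling run.

    Hypothetical model: 'free_gb' is the free memory measured before the
    run, 'request_gb' is the per-task mem_free request.
    """
    remaining = free_gb
    started = 0
    for _ in range(tasks):
        if started >= slots:
            break  # no free slot left on this node
        if remaining < request_gb:
            break  # request no longer fits
        started += 1
        if decrement:
            # Book the request so the next task sees less free memory.
            remaining -= request_gb
    return started


if __name__ == "__main__":
    # 16 GB node, 4 GB per task, 8 slots, 20 array tasks waiting.
    print(tasks_started(16, 4, 8, 20, decrement=False))  # fills all slots
    print(tasks_started(16, 4, 8, 20, decrement=True))   # stops when memory is booked
```

Without the decrement, the check passes for every task and all 8 slots are
filled; with it, only 4 tasks fit into 16 GB, which is the behaviour we want.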

c)
Sometimes jobs need a huge amount of memory, but only at the start. The
operations themselves need only little memory bandwidth, so the OS can swap
out the main part. So we want to be able to say "start only X jobs on one
node". Does Gridengine support that kind of scheduling, and how can I
implement it?

d)
We've got 2 resources from which we can acquire CPLEX licenses. A load sensor
script that reports the current number of free licenses exists. Gridengine
runs the script on a node and reports the current value properly.
But here the same problem as with the mem_free value occurs: if any CPLEX
license is available, it fills up the nodes and doesn't remember that it has
already started jobs.

I hope you can give me some hints!
Best wishes,

Mark

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=108838
