[GE users] some startup questions....
dan.templeton at sun.com
Mon Mar 16 19:30:34 GMT 2009
> SGE Users -
> new SGE admin here, coming up to speed on a system that I've inherited
> from a guy that "moved on"..... Nice thing is that those that are left
> are somewhat confused about how the system is set up, so I will probably
> be able to just configure it the way I want, in conjunction with a
> version upgrade (to 6.2)....
> I've read the user's manual, admin manual, install manual, and a few of
> the other things on the website, including the very-helpful
> "SCHEDULER POLICIES FOR JOB PRIORITIZATION IN THE SUN N1™ GRID ENGINE 6 SYSTEM"
> whitepaper by Charu Chaubal. I've played around with some of the commands
> on the existing system. But, I have some questions, some probably stupid,
> so be nice.... :)
> 1) what the heck does the "N1" in "N1 Grid Engine" mean ?!
Don't ask. It's gone now, so let's forget that it ever existed.
> (BTW, in the following questions, I'm talking about CLUSTER queues unless
> I specifically say otherwise, which I never do....)
> 2) I'm confused about the states of a job. When it's submitted, using
> qsub, is the job immediately and always sent to a queue ? If not,
> where is it, and how would I see it ?
When a job is in the queued and waiting state (qw), it is still in the
pending job list waiting to be assigned to a queue. When a job is in
the running state (r), it is assigned to a queue. I don't see how
that's confusing. ;) qstat will show you jobs in both (all) states.
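To make the distinction concrete, here is a minimal Python sketch of the two states mentioned above. The job records and the split_by_state helper are made up for illustration; only the state codes qw and r come from the explanation, and this does not parse real qstat output.

```python
# Toy model: a pending (qw) job sits in the pending list with no queue;
# a running (r) job has been assigned to a queue. All job data is invented.
from collections import namedtuple

Job = namedtuple("Job", ["job_id", "state", "queue"])

def split_by_state(jobs):
    """Return (pending, running) lists based on the job state code."""
    pending = [j for j in jobs if j.state == "qw"]
    running = [j for j in jobs if j.state == "r"]
    return pending, running

jobs = [
    Job(101, "r", "all.q"),   # running: assigned to a queue
    Job(102, "qw", None),     # queued and waiting: no queue yet
    Job(103, "qw", None),
]

pending, running = split_by_state(jobs)
print("pending:", [j.job_id for j in pending])
print("running:", [(j.job_id, j.queue) for j in running])
```

The point of the sketch is only that a waiting job has no queue attached to it yet; the queue shows up once the scheduler has dispatched the job.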
> 3) This question kind of depends on the answer to the one above, but I'll
> ask it anyway... when a job is in a queue, does that mean it's running ?
> If not, which I assume is the answer, then can more than one job in a
> queue be running at the same time ?
> 4) The jobs in a queue are re-prioritized at each scheduling interval, correct ?
> So it's possible that a job that's not running (in a queue) could all of
> a sudden get a higher priority (say due to some override tickets assigned
> to it) than a running job, and so the running job is suspended - is that
> how it works ?
Nope. SGE is not natively preemptive. Once a job is scheduled to a
queue, it runs to completion, unless it fails or is canceled. The
exception to that rule is queue subordination, which introduces a sort
of aftermarket preemption.
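A rough Python sketch of how subordination behaves, as I understand it: a subordinate queue instance is suspended once a threshold number of slots in the superordinate queue instance on the same host are in use. The function name, the threshold value, and the slot counts below are hypothetical, and this is a toy model rather than SGE's actual logic.

```python
# Toy model of queue subordination on a single host.
# Assumption: the subordinate queue instance is suspended when the number
# of used slots in the superordinate queue instance reaches the threshold.

def subordinate_suspended(superordinate_used_slots, threshold):
    """Return True if the subordinate queue instance should be suspended."""
    return superordinate_used_slots >= threshold

THRESHOLD = 2  # hypothetical value chosen for the example

for used in range(4):
    state = "suspended" if subordinate_suspended(used, THRESHOLD) else "active"
    print("superordinate slots in use: %d -> subordinate queue %s" % (used, state))
```

Note that nothing here depends on job priority: the suspension is triggered by slot usage in another queue, which is why it only looks like preemption.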
> 5) somewhat related to the previous question, maybe, but in Charu's whitepaper
> he talks about a "dispatch priority" - is this something different than
> the priority of the jobs in a queue ?
I'd have to go read the paper again to know what he meant. There's only
one priority that's relevant, and that's the priority the job is
assigned while waiting to be scheduled.
> 6) I'm searching for a "good" way to visualize in my mind, if not on paper,
> what the SGE queueing system looks like - does anyone have such a thing ?
> For instance, can a queue be represented by a vertical tube, where jobs are
> dropped into the top, and come out the bottom when they are ready to be
> run ? (probably not, eh ?!) Or do they not come out of the tube until their
> run is completed, and more than one can be running at once ? (getting back
> to a previous question)
See the attached slide.
> and now for something that has nothing to do with queues, I think -
> 7) how do you handle clusters that are made up of many types of machines, some
> of which are quad-core, some of which dual-core, and some single-core ? If
> a job only requires a single core, does that mean that SGE can/will submit
> 4 separate jobs to a quad-core machine ?
Yep. SGE schedules jobs to job slots. You can assign however many
job slots to a machine you'd like. The default assumption is that
slots = cores.
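Here is a small Python sketch of the slot idea, assuming slots = cores and single-slot jobs. The host names, job names, and the dispatch helper are invented for the example, and filling hosts in dictionary order is a simplification, not how the real scheduler chooses hosts.

```python
# Toy model: each host offers a number of job slots (assumed equal to its
# core count), and each single-core job occupies exactly one slot.

def dispatch(pending_jobs, host_slots):
    """Assign single-slot jobs to hosts until every slot is taken.

    host_slots maps a host name to its slot count.
    Returns (placement, waiting): placement maps host -> list of jobs,
    waiting is the list of jobs that did not fit anywhere.
    """
    placement = {host: [] for host in host_slots}
    waiting = list(pending_jobs)
    for host, slots in host_slots.items():
        # Keep placing jobs on this host while it still has a free slot.
        while waiting and len(placement[host]) < slots:
            placement[host].append(waiting.pop(0))
    return placement, waiting

hosts = {"quad": 4, "dual": 2, "single": 1}
job_names = ["job%d" % i for i in range(1, 9)]

placement, waiting = dispatch(job_names, hosts)
for host, assigned in placement.items():
    print(host, assigned)
print("still waiting:", waiting)
```

With eight single-core jobs and seven slots in total, the quad-core host ends up with four jobs and one job is left in the pending list.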
> Thanks for the help !!!
You should also have a look at
[ Attachment: "ExampleConfig.pdf", application/pdf, 241 KB ]