[GE users] newbie questions

Liudvikas Bukys bukys at cs.rochester.edu
Fri Apr 30 22:20:30 BST 2004

Subject: newbie questions

I am new to SGE, so please forgive my naivete.

We have some basic requirements here, and I'm not
sure how well SGE supports them.  I would also
welcome advice on which mechanisms to use.

(1) First, we have a possibly above-average need
    for machine reservation of whole clusters or
    subclusters (for people to run performance
    benchmarks).  It does look like "parallel
    environment" is suited to this.  If the machines
    run most of the time with a dynamic flexible
    load, but with an occasional need to kick everybody
    off some subset, is the kicking-off process handled
    smoothly by the scheduler, either forced and
    immediate, or gradual, letting low-priority jobs
    terminate until the subcluster is free enough?
    Is there a preferred mechanism for making
    either priority-based or calendar-based changes in use?

(2) Similar question: Can downtime for specific
    components be scheduled in the same way, and
    is it handled well?
    (Or is the model "kill things, let the application
    recover via restart or checkpoint"?)

(3) Are there any tie-ins to other exclusive-use mechanisms
    so that an SGE-managed system cuts off other methods of
    entry (ssh, rsh, via nologin, pam, or other)?
    Any support for killing processes that "don't belong"
    on my reserved machine?

Naive questions, I know, but I'd appreciate advice.
Reply to me and I'll summarize to the list.


