A default Grid Engine installation (without CSP) is highly insecure, and demands trusting all users who have access to it. For instance, any exec host can be owned using something like:
$ fakeroot qrsh id -u
0
$
That can be defeated for root by setting
However, any user with access to an admin host (e.g. with qmaster
running on the login node) can just run
fakeroot qconf -mconf
to change it. In that case, if they can write executables to a file
system visible from the qmaster, they can also own the qmaster, say by
Thus, to protect exec hosts and the qmaster with default “security”,
it is at least necessary to set
gidmin and run the qmaster on
a separate host with no access to non-admin users.
[Obviously even in the absence of separate hardware, a virtual machine can provide a separate host for a cluster head node. Having a separate head with only admin access also helps with admin tools like PowerMan which lack authentication.]
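As a rough aid for the point about ID limits, the following sketch checks a dump of the global configuration for both limits. It assumes the parameters are spelled min_uid and min_gid, as in sge_conf(5), and that each appears on its own line as a name followed by a value. The function name check_min_ids and the floor value are illustrative only and are not part of SGE.

```shell
# check_min_ids [FLOOR]
# Reads configuration text on stdin (for instance saved output of
# "qconf -sconf") and reports whether min_uid and min_gid are both
# set to at least FLOOR (default 1, i.e. root is excluded).
# Assumption: parameters appear as "name value" lines.
check_min_ids() {
    floor=${1:-1}
    awk -v floor="$floor" '
        $1 == "min_uid" { uid = $2; have_uid = 1 }
        $1 == "min_gid" { gid = $2; have_gid = 1 }
        END {
            bad = 0
            if (!have_uid || uid + 0 < floor + 0) {
                print "min_uid is unset or below " floor; bad = 1
            }
            if (!have_gid || gid + 0 < floor + 0) {
                print "min_gid is unset or below " floor; bad = 1
            }
            if (!bad) print "ok"
            exit bad
        }'
}
```

It could be fed from the live configuration, e.g. qconf -sconf | check_min_ids 100, run from a host that only admins can reach. It only reports the current values and does nothing to stop them being changed from an admin host.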
(If that can’t be done, it might be possible to restrict access to the qmaster socket, perhaps with
With admin host restrictions and uid limits in place, it is still
possible to submit jobs as any allowed user with qsub and an
LD_PRELOAD trick, as with
fakeroot or otherwise—with a simple DRMAA
client, for instance.
In most environments you will want to use either CSP or MUNGE, as below.
CSP security is the original method. It prevents job submission as another
user, and configuration changes by a non-admin/operator user. Thus it does allow
admin hosts to be more safely used also as submission hosts if required.
It also secures the daemon communication channels, but that will usually
be a secondary consideration.
[A single certificate covers the qmaster and execd daemons, and so is potentially vulnerable to compromise of an execution host in case man-in-the-middle-type threats are a concern.]
There are some limitations of using CSP:
Users must be explicitly added (no
enforce_user true) and have keys generated for them (
The keys must be distributed to the relevant hosts, though you can have selective authorization of users by submission host according to which keys are distributed where;
Keys must be renewed (using
util/sgeCA/renew_all_certs.sh) after the set expiry time (a year by default) and redistributed;
Currently (SGE 8.1.2) only a single security method can be configured, so using CSP excludes AFS support, for instance (see below).
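Since keys expire and must be renewed and redistributed, it may help to get warning of an approaching expiry. The sketch below assumes the certificate in question is a PEM-encoded X.509 file that the openssl command can read. The function name cert_expires_within is made up for illustration, and the certificate path is left to the caller because it depends on the installation.

```shell
# cert_expires_within DAYS CERT
# Exit 0 if CERT expires within DAYS days (or cannot be read at all,
# which is treated as needing attention); exit 1 if it stays valid.
cert_expires_within() {
    days=$1
    cert=$2
    # "openssl x509 -checkend N" exits 0 when the certificate is still
    # valid N seconds from now, and non-zero otherwise.
    if openssl x509 -checkend "$((days * 86400))" -noout -in "$cert" \
        >/dev/null 2>&1; then
        return 1
    fi
    return 0
}
```

A periodic job might call it as cert_expires_within 30 /path/to/cert.pem and notify an administrator on success, leaving time to run the renewal script and redistribute the keys.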
It is not necessary to re-install to turn on CSP, just:
Stop all the SGE daemons;
Generate certificates with
Distribute the certificates as appropriate;
Restart the daemons.
Note that CSP isn’t really a public key system as used for https, but is
basically relying on shared secrets distributed within a single
administrative domain. The security of a user’s key is dependent on the
security of all hosts to which it is distributed—a privileged user on a
submit host can impersonate any user whose keys are on that host.
Typically keys will only be distributed en masse to submit hosts which
are secure login nodes. Users can copy their own keys to another submit
host, such as a personal workstation (with
sge_ca -copy, assuming the
home directory is secure).
for more usage information. The real security of CSP is unclear;
there is no known audit of it.
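Because a copied key is only as safe as the host and directory that hold it, a user might at least confirm that a key file is not readable by group or others. This is a minimal sketch under that assumption. The name key_is_private is hypothetical, it looks only at the traditional permission bits (not ACLs), and it cannot protect against a privileged user on the host.

```shell
# key_is_private FILE
# Exit 0 only if FILE exists and grants no permissions to group or others.
# Exit 2 if FILE does not exist; exit 1 if it is too permissive.
key_is_private() {
    [ -e "$1" ] || return 2
    # In the "ls -l" mode string, characters 5-10 are the group and
    # other permission bits.
    perms=$(ls -ld "$1" | cut -c5-10)
    [ "$perms" = "------" ]
}
```

For example, key_is_private "$keyfile" || echo "key permissions too open", where keyfile is whatever location the key was copied to.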
Authentication with MUNGE was introduced
in SGE 8.1.9, and may be most convenient in an HPC cluster. However,
it has not been well tested at the time of writing. It is probably
more convenient than CSP since it only requires a secret shared by
daemons running on each host. It also allows operation with
However, it provides authentication, not encryption of the
communication channels, and is probably only appropriate in a
tightly-coupled security domain like an HPC cluster.
To use it, SGE must be built against the MUNGE library, e.g. the
GNU/Linux packaged versions. Then MUNGE must be set up (see the
on all the SGE hosts, i.e. with the daemon running against
the shared key for the cluster. Then the SGE daemons can be started
munge configured as the
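Before starting the SGE daemons it may be worth confirming that MUNGE itself works on a host. The sketch below uses the munge and unmunge commands shipped with MUNGE to encode and decode an empty credential through the local daemon. The function name munge_local_check and its exit codes are illustrative choices, not anything defined by SGE or MUNGE.

```shell
# munge_local_check
# Exit 0: a credential was created and decoded locally.
# Exit 1: the commands exist but the round trip failed
#         (for instance the daemon is not running).
# Exit 2: the munge or unmunge command was not found.
munge_local_check() {
    if ! command -v munge >/dev/null 2>&1 ||
       ! command -v unmunge >/dev/null 2>&1; then
        echo "munge/unmunge not found" >&2
        return 2
    fi
    # "munge -n" encodes a credential with no payload; unmunge
    # validates it against the local daemon and key.
    if munge -n 2>/dev/null | unmunge >/dev/null 2>&1; then
        echo "local MUNGE round trip ok"
        return 0
    fi
    echo "local MUNGE round trip failed" >&2
    return 1
}
```

To check that two hosts really share the key, the same credential can be decoded remotely, along the lines of munge -n | ssh otherhost unmunge, where otherhost is a placeholder for another SGE host.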
The only other security method which currently (SGE 8.1.9) works
properly is the
afs one. However, as implemented, it doesn’t provide
authentication of users submitting jobs. Without that it is possible to
submit a job as another user (as above) and steal credentials from
another job of that user running on the same host, so that it could
actually facilitate security breaches. It may be possible to use
AUKS with CSP as an
alternative, but there’s currently no setup recipe published.
dce) method would work for authenticating job submission
and passing (but not renewing) Kerberos tickets, but the mechanism for
calling the sub-programs involved is partially broken and needs
Some largely historical information on security is available. The security framework should be re-done with hooks to allow arbitrary, composable methods.
SGE no longer runs external remote startup programs (see
in the user-defined environment, and so is not vulnerable to that part
of CVE-2012-0208 (see the
Other published responses to the CVE pass the
environment with some sanitization, but fail to remove all the known
Although the environment passed to other external methods, such as the
prolog, is sanitized when they are invoked with privileges (user
the sanitization may not be foolproof. See the SECURITY section of
Obviously these concerns are moot without restrictions imposed by the
uidmin limit or user authentication, as above.
Copyright © 2012, 2013, 2016 Dave Love, University of Liverpool