[GE users] "locking down" grid machines
dag at sonsorol.org
Wed May 12 23:11:36 BST 2004
I've dealt with these situations occasionally on various consulting
projects and I've come to the conclusion that the solution is management
and policy, not technical tricks.
There will always be users trying to game the system or people who want
to bypass the scheduler and hop onto nodes directly. The effort required
to lock people out becomes a non-productive spiral of
response/counter-response between the cluster managers and users.
The fact of the matter is that gaming the system is effectively taking
resources away from colleagues or projects. In a business or research
institutional setting these other colleagues and projects may actually
have a documented higher business or scientific priority than the work
the 'gamer' is trying to get done.
The issue should be treated as a management/administrative issue. If
users violate the acceptable use policy for a cluster then they should
be dealt with by their manager or group leader. It is far easier, I
think, to frame this issue as one of "personal conduct" rather than
trying to erect fences and technical traps within the grid.
This may sound like it only works in theory, but it can work in practice
as well. Some cluster admins I know use "personal conduct / acceptable
use policy" as the first approach and then resort to scripts or
monitoring tools that will kill any user process that was not launched
by, for instance, an sge_shepherd process. The combination works
well -- if a user has been taught the rules then they can't really
complain when their attempts to bypass the system end up with killed
jobs and missing data.
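
For illustration only, here is a rough Python sketch of the kind of
monitoring script I mean. It is not anything shipped with Grid Engine.
It assumes a Linux node with /proc, that job processes descend from a
process whose name starts with "sge_shepherd", and that ordinary users
have UIDs of 1000 or higher; all of the names in it (Proc,
find_rogue_pids, MIN_USER_UID and so on) are made up for this example.
It only reports candidate PIDs and leaves any killing to the admin.

```python
import os
from typing import Dict, List, NamedTuple


class Proc(NamedTuple):
    ppid: int   # parent process ID
    uid: int    # owner of the process
    comm: str   # short command name


SHEPHERD_PREFIX = "sge_shepherd"  # assumption: name of the shepherd process
MIN_USER_UID = 1000               # assumption: UIDs below this are system accounts


def read_proc_table() -> Dict[int, Proc]:
    """Build a pid -> Proc table from /proc (Linux only)."""
    table: Dict[int, Proc] = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as fh:
                stat = fh.read()
            uid = os.stat(f"/proc/{entry}").st_uid
        except OSError:
            continue  # process exited while we were looking at it
        # The command name sits in parentheses and may contain spaces,
        # so split on the first "(" and the last ")".
        comm = stat[stat.index("(") + 1:stat.rindex(")")]
        fields = stat[stat.rindex(")") + 2:].split()
        table[int(entry)] = Proc(ppid=int(fields[1]), uid=uid, comm=comm)
    return table


def has_shepherd_ancestor(pid: int, table: Dict[int, Proc]) -> bool:
    """Walk up the parent chain looking for a shepherd process."""
    seen = set()
    while pid in table and pid not in seen:
        seen.add(pid)  # guards against a malformed, cyclic table
        proc = table[pid]
        if proc.comm.startswith(SHEPHERD_PREFIX):
            return True
        pid = proc.ppid
    return False


def find_rogue_pids(table: Dict[int, Proc],
                    min_uid: int = MIN_USER_UID) -> List[int]:
    """Return PIDs of user processes that were not started under a shepherd."""
    return sorted(
        pid for pid, proc in table.items()
        if proc.uid >= min_uid and not has_shepherd_ancestor(pid, table)
    )
```

On a node you would call find_rogue_pids(read_proc_table()) from cron
or a small daemon and decide what to do with the result. A real
deployment would also need a whitelist for admins and for any
interactive logins that are legitimately allowed.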
Boone J. Severson wrote:
> Since implementing SGE on our compute servers we've had a few cases
> where people think they're just too busy to learn the command line
> switches to qsub and qrsh so they just directly "ssh" into the machine,
> bypassing the grid submit host and our complexes/hard resources configs
> that we've got. It was ok in the beginning because it was just a few
> known users, but now qmon is noting that several queues are being
> disabled due to processor load when the grid is unaware of any users
> being assigned to those queues. >:(
> Is there a method for locking down non-superuser access to a (SuSE Linux
> 9.0) machine except for qrsh/qsub? I'm guessing our IS/IT group won't
> enjoy creating customized /etc/passwd files but if that's the only
> option we'll have to consider it.
> Any and all input would be appreciated since I can't imagine this is the
> first time "grid abuse" has happened.
> Boone Severson
Chris Dagdigian, <dag at sonsorol.org>
BioTeam - Independent life science IT & informatics consulting
Office: 617-665-6088, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E iChat/AIM: bioteamdag Web: http://bioteam.net