[GE users] Educating users

Chris Dagdigian dag at sonsorol.org
Thu Mar 31 18:58:15 BST 2005

When doing Grid Engine training I usually sit down and give customized 
talks and examples to the following "types" of people:

o IT staff & cluster operators
o SGE Admins (specifically the folks who have to maintain policies)
o "power user" end users
o "basic" I-just-need-my-results-back end users

Each group needs to hear about different things.

For power users and people comfortable with unix shell scripting and 
perl it is often *very* helpful to spend a few hours teaching them how 
to write array jobs and other complex workflows. These people really 
need to hear about job dependencies, array jobs and resource requests.
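
To make those three topics concrete, here is a minimal sketch that only 
builds qsub command lines (it does not submit anything). It uses the 
standard qsub options -N, -cwd, -t, -l and -hold_jid. The script names 
(run_case.sh, summarize.sh), the job names and the h_rt values are 
made up for illustration, and run_case.sh is assumed to pick its case 
from the $SGE_TASK_ID variable that Grid Engine sets for each array task.

```python
import shlex


def build_qsub(script, name, tasks=None, resources=None, hold_jid=None):
    """Build (but do not run) a qsub command line as a list of arguments."""
    cmd = ["qsub", "-N", name, "-cwd"]
    if tasks is not None:
        # Array job: -t first-last; each task sees its index in $SGE_TASK_ID.
        first, last = tasks
        cmd += ["-t", f"{first}-{last}"]
    if resources:
        # Resource request: -l takes comma-separated name=value pairs.
        cmd += ["-l", ",".join(f"{k}={v}" for k, v in resources.items())]
    if hold_jid:
        # Job dependency: wait until the listed jobs (IDs or names) finish.
        cmd += ["-hold_jid", ",".join(hold_jid)]
    cmd.append(script)
    return cmd


# Hypothetical workflow: try 3 cases first, then the full 500-case sweep,
# then a summary job that waits for the sweep to finish.
smoke = build_qsub("run_case.sh", "smoke", tasks=(1, 3),
                   resources={"h_rt": "00:10:00"})
sweep = build_qsub("run_case.sh", "sweep", tasks=(1, 500),
                   resources={"h_rt": "02:00:00"})
summary = build_qsub("summarize.sh", "summary", hold_jid=["sweep"])

for cmd in (smoke, sweep, summary):
    print(shlex.join(cmd))  # shlex.join needs Python 3.8 or later
```

Submitting the small "smoke" array first and checking its output before 
sending the full sweep is a cheap way to catch script errors early.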

For "basic" users we share pre-written simple job scripts and spend most 
of the time teaching simple usage (qsub, qstat & qrsh) along with simple 
job failure troubleshooting techniques.
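
As one example of simple failure troubleshooting, the sketch below 
parses the text printed by "qacct -j <jobid>" and lists the tasks whose 
exit_status or failed field is non-zero. qacct is not mentioned above; 
it is the standard Grid Engine accounting command, brought in here as 
an assumption about what such troubleshooting might use. The SAMPLE 
text is an invented, heavily abbreviated record (real output has many 
more fields), and the host and job names in it are hypothetical.

```python
def parse_qacct(text):
    """Return one dict per accounting record in `qacct -j` style output."""
    records, current = [], {}
    for line in text.splitlines():
        if line.startswith("====="):
            # A line of '=' characters starts a new record.
            if current:
                records.append(current)
            current = {}
            continue
        parts = line.split(None, 1)  # field name, then the rest of the line
        if len(parts) == 2:
            current[parts[0]] = parts[1].strip()
    if current:
        records.append(current)
    return records


def failed_records(records):
    """Keep records whose exit_status or failed code is not 0."""
    bad = []
    for rec in records:
        exit_status = rec.get("exit_status", "0")
        # The failed field may carry a description after the numeric code.
        failed = rec.get("failed", "0").split()[0]
        if exit_status != "0" or failed != "0":
            bad.append(rec)
    return bad


# Invented sample, abbreviated for illustration only.
SAMPLE = """\
==============================================================
qname        all.q
hostname     node01
jobname      sweep
jobnumber    4242
taskid       1
failed       0
exit_status  0
==============================================================
qname        all.q
hostname     node02
jobname      sweep
jobnumber    4242
taskid       2
failed       0
exit_status  1
"""

for rec in failed_records(parse_qacct(SAMPLE)):
    print(rec["taskid"], rec["hostname"], rec["exit_status"])
```

In practice the text would come from running qacct on a finished job; 
for a job that is still pending, "qstat -j <jobid>" is the usual place 
to look instead.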

The biggest mistake companies and groups make is thinking that running 
and managing a cluster requires the same skills and background as 
"effectively using a cluster for a specific use case or workflow". They 
invest all their time and money in getting a cluster and cluster 
operator(s) while failing to do any sort of serious training for end 
users because "oh, the IT people can answer those questions..."

The best and most productive Grid Engine deployments I've seen in the 
biotech and pharmaceutical markets are in places where people are smart 
enough to realize that they need to find one or more "toolsmiths" and 
train them to use the cluster as efficiently as possible. These toolsmiths 
have the proper industry/scientific background to understand user 
requirements and translate them into job scripts or cluster workflows.

Many times these toolsmiths will be used in 2 main ways: (a) writing 
precanned job scripts for basic end users and (b) talking technical with 
the power users who plan on writing cluster-aware code. Many times the 
toolsmith is just a researcher or power user who needs to make serious 
use of the cluster for personal research while also being interested in 
helping others out. In other cases the toolsmith is a dedicated hire.

Anyway, the take-home message is that often the sysadmins running the 
cluster do not have the specific industry/scientific background (and 
time!) to sit down and seriously train end users. Often it is best to go 
the other way -- cherry pick the best people from the user community and 
invest real effort into giving them the training and knowledge they need 
to spread usage best practices around.

My .02


Marconnet, James E Mr /Computer Sciences Corporation wrote:

> Just curious if anyone has anything to share about how you educate your
> rocket scientist** users how to set up their software/data/run matrices for
> successful cluster runs, how to test a few cases first before submitting the
> entire fine-pitched run matrix, how to submit multiple jobs other than
> manually, how to monitor jobs, what the applicable queues are, etc. etc.  I've
> seen two school web pages on specific SGE commands that were somewhat
> helpful. But nothing so far on pre-run planning to run effectively on a
> cluster.
> **Note: Yes, sometimes it DOES take a Rocket Scientist, especially here in
> Rocket City, Huntsville, AL!
> Jim Marconnet

Chris Dagdigian, <dag at sonsorol.org>
BioTeam  - Independent life science IT & informatics consulting
Office: 617-665-6088, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E iChat/AIM: bioteamdag  Web: http://bioteam.net
