[GE users] GridEngine on Sierra systems

Fedele STABILE fedele at fis.unical.it
Wed May 28 12:33:50 BST 2008


In my installation GE does not replace the native resource manager.
Let me explain the mechanism:
The HP resource manager (which we call RMS) is part of the operating
system. So if I need to run jobs or allocate CPUs on the cluster, I have
to use RMS commands such as prun or allocate.
HP-RMS uses a database to track the state of the system, but it does
not manage any job queue.
When I submit a job via GridEngine, I must interact with HP-RMS to
execute the job on the cluster. If I have no particular requirement
(for example, the location of the reserved resources), I can use prun
instead of mpirun to launch the job.
Killing the job is easy because the kill signal also terminates all of
its processes, but suspending and resuming are problematic because
those signals are not propagated.
For this reason I developed scripts, which I am still testing, but so
far they work.
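
The scripts themselves are not included in this message. As a rough
illustration only, a suspend/resume helper of this kind might forward
SIGSTOP or SIGCONT to the job's process tree. The sketch below is an
assumption, not the actual scripts: it handles only processes visible in
a Linux-style /proc on the local node, the function names are
hypothetical, and it does not show how RMS reaches processes on remote
nodes.

```python
import os
import signal
import sys


def descendants(ppid_of, root):
    """Return root plus every pid that has root among its ancestors."""
    children = {}
    for pid, ppid in ppid_of.items():
        children.setdefault(ppid, []).append(pid)
    found, seen, stack = [], set(), [root]
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue
        seen.add(pid)
        found.append(pid)
        stack.extend(children.get(pid, []))
    return found


def read_ppid_table(proc="/proc"):
    """Build {pid: ppid} from /proc/<pid>/stat (assumes a Linux /proc)."""
    table = {}
    for name in os.listdir(proc):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc, name, "stat")) as fh:
                data = fh.read()
        except OSError:
            continue  # the process exited while we were scanning
        # After the ")" closing the command name come: state, ppid, ...
        fields = data.rsplit(")", 1)[1].split()
        table[int(name)] = int(fields[1])
    return table


def forward(sig, root):
    """Send sig to root and to all of its local descendants."""
    for pid in descendants(read_ppid_table(), root):
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # already gone


if __name__ == "__main__":
    # Hypothetical usage: forward_signal.py suspend|resume <job_pid>
    action, job_pid = sys.argv[1], int(sys.argv[2])
    forward(signal.SIGSTOP if action == "suspend" else signal.SIGCONT, job_pid)
```

Such a helper could be named in the queue's suspend_method and
resume_method entries and given the job's pid as an argument; the script
name and location would be whatever the site chooses.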


Now there is another question to solve: is it possible for GE to
reserve one resource and HP-RMS another?

Fedele 


On Mon, 26/05/2008 at 16:48 +0200, Reuti wrote:
> Hi,
> 
> On 26.05.2008 at 16:40, Fedele STABILE wrote:
> 
> > I've installed GridEngine on my Sierra System and it works!!
> >
> > Sierra System is an HP supercomputer project that uses a QSW
> > network to connect a cluster of HP servers. It uses a Resource
> > Manager that loads the parallel executable, as mpirun does, and
> > sends it to the nodes in the cluster.
> > So to submit a job GridEngine needs to communicate with this
> > Resource Manager.
> >
> > I have created scripts for the "execution methods" that can suspend
> > and resume parallel jobs; no modification is needed for starting
> > and termination.
> >
> > Is anyone interested in discussing this topic?
> 
> if you have it running already, it would be nice if you could prepare  
> a Howto for it (how it's working in Sierra Systems and what the  
> scripts do). Is it still a Tight Integration, although it sends the  
> jobs in the end to HP's resource manager?
> 
> -- Reuti
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 

