[GE users] suspend/resume rsh/qrsh parallel task with SGE

reuti reuti at staff.uni-marburg.de
Mon Mar 9 13:10:43 GMT 2009


Hi,

On 09.03.2009 at 10:46, fboucher wrote:

> I would like to be able to suspend parallel tasks that are not based on
> MPI communication.
> The main script, which runs on the master, starts child processes using
> rsh (or ssh) on different nodes. All those tasks are independent and can
> be run in parallel (no communication between them). However, all of
> them need to finish before the whole job can continue.
> I would like to be able to suspend the whole job (as one can do with
> mpitask). At the moment, a SIGTSTP or SIGSTOP signal is sent
> using qmod -sj. However, the child processes started by the master
> script completely ignore this signal (it is not trapped by rsh/qrsh
> or ssh).
> Is there a way to send this SIGTSTP signal directly to all the child
> processes created by the master script (or to trap it with the rsh/ssh
> command)?

A patch for this was posted to the list some time ago (of course, you
then need a tight integration of the parallel application):

http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=74965

http://gridengine.sunsource.net/issues/show_bug.cgi?id=2740

-- Reuti
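
For illustration only, here is a minimal Python sketch of the "trap it in the
master script" idea from the question. It is not the patch referenced above.
It assumes the suspend request reaches the master as a catchable signal such
as SIGTSTP (SIGSTOP itself cannot be trapped), and the names start_tasks,
forward, on_suspend, on_resume and run are made up for this example. Note that
it only signals the processes started locally on the master: stopping a local
rsh/ssh client does not by itself stop the process it started on the remote
node, which is the part the patch and tight integration are meant to cover.

```python
import os
import signal
import subprocess
import sys

children = []  # Popen handles of the locally started task launchers


def start_tasks(commands):
    """Start each command in its own session, and therefore its own process group."""
    for cmd in commands:
        children.append(subprocess.Popen(cmd, start_new_session=True))
    return children


def forward(signum):
    """Send signum to the process group of every child that is still running."""
    for child in children:
        if child.poll() is None:
            try:
                # With start_new_session=True the child's PID is also its group ID.
                os.killpg(child.pid, signum)
            except ProcessLookupError:
                pass  # the child exited between poll() and killpg()


def on_suspend(signum, frame):
    """Handler for a catchable suspend signal (assumed here to be SIGTSTP)."""
    forward(signal.SIGSTOP)
    os.kill(os.getpid(), signal.SIGSTOP)  # now stop the master itself


def on_resume(signum, frame):
    """Handler for SIGCONT: wake the children up again."""
    forward(signal.SIGCONT)


def run(commands):
    """Install the handlers, start all tasks and wait until every one has finished."""
    signal.signal(signal.SIGTSTP, on_suspend)
    signal.signal(signal.SIGCONT, on_resume)
    start_tasks(commands)
    return [child.wait() for child in children]
```

A master script would call something like run([["ssh", "node01", "./task"],
["ssh", "node02", "./task"]]), where the host names and the task command are
placeholders. Each child gets its own process group so that a launcher and
any helpers it spawns locally can be stopped and continued as one unit.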

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=125423
