[GE users] Qsub never returns; though the jobs complete successfully

gtatachar gopinath.tatachar at gs.com
Tue May 4 14:49:25 BST 2010

Our production qmaster process hung or died. Failover worked after about 10 to 15 minutes, and the qmaster process started on the shadow server.

As a result, the qsub commands from our production batch, which had kicked off a number of jobs, never returned. The jobs themselves ran successfully, but qsub never came back; it just sat there waiting. A lot of these qsub commands (launched via scripts in autosys) were left outstanding.

Is there any way to get qsub to return? Or do we need to kill them and let the autosys jobs go to failure?

I would appreciate any information.
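
For illustration, one possible stopgap is to bound each qsub call with a timeout so that the autosys job always receives a definite exit status. This is only a sketch under assumptions, not a Grid Engine feature: the helper name run_with_timeout, the exit code 124, and the script name in the usage line are all hypothetical. It also does not tell you whether the submitted job itself succeeded; that would have to be checked separately.

```python
import subprocess
import sys

# Assumption: 124 is chosen to mirror the GNU `timeout` convention.
TIMEOUT_EXIT_CODE = 124


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or TIMEOUT_EXIT_CODE if it runs too long."""
    try:
        # subprocess.run kills the child process when the timeout expires,
        # then raises TimeoutExpired.
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return TIMEOUT_EXIT_CODE


if __name__ == "__main__":
    # Usage: python run_with_timeout.py <seconds> <command> [args...]
    sys.exit(run_with_timeout(sys.argv[2:], float(sys.argv[1])))
```

Called as, say, `python run_with_timeout.py 3600 qsub myjob.sh` (myjob.sh being a placeholder), the wrapper would hand a nonzero status back to the scheduler instead of blocking indefinitely.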


