[GE issues] qsub never returns; though the jobs complete successfully

gtatachar gopinath.tatachar at gs.com
Tue May 4 14:50:16 BST 2010

Our production qmaster process hung or died. Failover kicked in after about 10 to 15 minutes, and the qmaster process started on the shadow server.

As a result, the qsub commands from our production batch, which had kicked off a number of jobs, never returned. The jobs ran successfully, but qsub never came back; it just sat there waiting. A lot of these qsub commands (launched via scripts in autosys) were left outstanding.

Is there any way to get qsub to return? Or do we need to kill the qsub processes and let the autosys jobs go to failure?

I would appreciate any information.


