[GE users] Jobs sticking in grid

reuti reuti at staff.uni-marburg.de
Tue Jan 26 23:18:55 GMT 2010


On 26.01.2010 at 21:15, gclark wrote:

> After a job completes we're seeing an export job still sitting on  
> the grid. A 'job' in our situation ends up being a large number of  
> scripts that get started by a single submission to the SGE. As part  
> of this process our developers have found it necessary to export  
> some environment variables. It seems that the export command stays  
> behind and continues to run on the grid. This happens even when the  
> jobs complete successfully.

Can you give us more details about your workflow? Do you submit one
job which in turn submits many others, and it is this first submitted
job that does not end?

And what do you mean by the export command: is it a command in one of
your scripts? While the process is still visible in the process list,
you can (as root or as the user running the job) use strace on the
execution node to peek at what the waiting process is doing:

$ strace -p <pid>

Maybe you can get an idea why it's hanging there.
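
To find the PID in the first place, something along these lines might
help. It is only a sketch: list_job_procs is a made-up helper name, and
it assumes the leftover process runs under the job owner's account on
the execution node.

```shell
# Hypothetical helper: list all processes owned by the given user, with
# parent PID, state and elapsed time, so the leftover one can be spotted.
list_job_procs() {
    ps -u "$1" -o pid,ppid,stat,etime,args
}

# Example on the execution node (replace <user> and <pid> yourself):
#   list_job_procs <user>
#   strace -f -p <pid>     # -f also follows child processes
```

The elapsed time and the parent PID usually make it easier to tell
which process belongs to the finished job.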

-- Reuti

> I'm not sure why they're sticking around; it's not like the job is
> hung or failed. If the export fails, none of the scripts will complete.
> Any ideas or suggestions on where to look to correct this one?
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=241155
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

