[GE users] sge_shepherd staying around

Margaret Doll Margaret_Doll at brown.edu
Thu Jun 21 21:59:38 BST 2007


The sge_shepherd processes stay around after a user has run qdel on their jobs.

The first five jobs (identified by the number after "_shepherd") are
shown as running in qmon (status is "r").
The first four jobs are actually taking up 100% of the CPU for their
slot.
The last two jobs were deleted with qdel by the user over a day ago.



sge       6281  4225  0 Jun19 ?        00:00:00 sge_shepherd-482 -bg
sge       6355  4225  0 Jun19 ?        00:00:00 sge_shepherd-484 -bg
sge       6429  4225  0 Jun19 ?        00:00:00 sge_shepherd-488 -bg
sge       6503  4225  0 Jun19 ?        00:00:00 sge_shepherd-490 -bg
sge      11285  4225  0 Jun20 ?        00:00:00 sge_shepherd-508 -bg
nanguyen 11286 11285  0 Jun20 ?        00:00:00 sge_shepherd-508 -bg
nanguyen 11292     1  0 Jun20 ?        00:00:00 sge_shepherd-511 -bg
nanguyen 11311     1  0 Jun20 ?        00:00:00 sge_shepherd-516 -bg
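
A listing like the one above can be filtered mechanically. The following is only a minimal sketch in Python, assuming `ps -ef` style columns (user, PID, PPID, ...) and treating a shepherd whose parent is PID 1 as a likely leftover. That rule is a heuristic, not a definitive test, and the names `parse_shepherds` and `orphaned` are hypothetical helpers, not part of Grid Engine.

```python
import re
from typing import List, NamedTuple


class Shepherd(NamedTuple):
    user: str
    pid: int
    ppid: int
    job_id: int


# Matches one `ps -ef` line whose command is sge_shepherd-<job id>.
# Assumed column order: user, PID, PPID, then the remaining fields.
PS_LINE = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s.*\bsge_shepherd-(\d+)\b")

# The listing from this post, pasted in as sample input.
SAMPLE = """
sge       6281  4225  0 Jun19 ?        00:00:00 sge_shepherd-482 -bg
sge       6355  4225  0 Jun19 ?        00:00:00 sge_shepherd-484 -bg
sge       6429  4225  0 Jun19 ?        00:00:00 sge_shepherd-488 -bg
sge       6503  4225  0 Jun19 ?        00:00:00 sge_shepherd-490 -bg
sge      11285  4225  0 Jun20 ?        00:00:00 sge_shepherd-508 -bg
nanguyen 11286 11285  0 Jun20 ?        00:00:00 sge_shepherd-508 -bg
nanguyen 11292     1  0 Jun20 ?        00:00:00 sge_shepherd-511 -bg
nanguyen 11311     1  0 Jun20 ?        00:00:00 sge_shepherd-516 -bg
"""


def parse_shepherds(ps_output: str) -> List[Shepherd]:
    """Extract every sge_shepherd process from `ps -ef` style text."""
    found = []
    for line in ps_output.splitlines():
        m = PS_LINE.match(line.strip())
        if m:
            user, pid, ppid, job_id = m.groups()
            found.append(Shepherd(user, int(pid), int(ppid), int(job_id)))
    return found


def orphaned(shepherds: List[Shepherd]) -> List[Shepherd]:
    """Heuristic (an assumption): a shepherd reparented to PID 1 is a leftover."""
    return [s for s in shepherds if s.ppid == 1]


if __name__ == "__main__":
    for s in orphaned(parse_shepherds(SAMPLE)):
        print(f"job {s.job_id}: pid {s.pid} owned by {s.user} has parent 1")
```

Run against the sample, this flags the shepherds for jobs 511 and 516, which matches the two jobs that were deleted with qdel. The same functions could be fed live `ps -ef` output on the compute node instead of the pasted text.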

I have only been able to get rid of the extraneous sge_shepherd jobs  
by rebooting the compute node.

Obviously I need to learn a lot about sge_shepherd.

Does anyone know what is causing the problem?  I believe that each
extraneous sge_shepherd process counts as a job filling a slot.




