[GE users] sge_shepherd not dying

Margaret Doll Margaret_Doll at brown.edu
Tue Jun 19 17:07:58 BST 2007


I have one user who submits jobs, sometimes deletes them, and leaves
the compute nodes full of sge_shepherd-nnn -bg processes.

[root at compute-0-1 ~]# ps -ef | grep sge*
sge       4207     1  0 May09 ?        03:07:50 /opt/gridengine/bin/lx26-amd64/sge_execd
sge      19994  4207  0 May23 ?        00:00:03 sge_shepherd-176 -bg
sge      20070  4207  0 May23 ?        00:00:03 sge_shepherd-181 -bg
sge      21361  4207  0 May24 ?        00:00:01 sge_shepherd-184 -bg
nanguyen 21362 21361  0 May24 ?        00:00:00 sge_shepherd-184 -bg
sge      28576  4207  0 Jun06 ?        00:00:00 sge_shepherd-286 -bg
nanguyen 28577 28576  0 Jun06 ?        00:00:00 sge_shepherd-286 -bg
sge      28584  4207  0 Jun06 ?        00:00:00 sge_shepherd-288 -bg
nanguyen 28585 28584  0 Jun06 ?        00:00:00 sge_shepherd-288 -bg
sge      28652  4207  0 Jun06 ?        00:00:00 sge_shepherd-297 -bg
nanguyen 28653 28652  0 Jun06 ?        00:00:00 sge_shepherd-297 -bg
sge      31052  4207  0 Jun18 ?        00:00:00 sge_shepherd-470 -bg
nanguyen 31053 31052  0 Jun18 ?        00:00:00 sge_shepherd-470 -bg
root      3220  3085  0 12:03 pts/1    00:00:00 grep sge*


Until these are cleared from the node, jobs won't run.  I know that I
can clear them by rebooting the compute node, but there must be a
cleaner way of clearing the sge_shepherd processes.
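
As a rough starting point short of a reboot, the leftover shepherds and the
job numbers embedded in their names could be listed with something like the
sketch below. The function name list_shepherds is made up for illustration,
and the script assumes the process titles always look like
"sge_shepherd-<jobid> -bg", as they do in the ps output above. It only lists
processes and does not signal anything.

```shell
#!/bin/sh
# Sketch only: list leftover sge_shepherd processes with their job IDs.
# Assumption: process titles have the form "sge_shepherd-<jobid> -bg".

# Reads "PID PPID USER ARGS..." lines on stdin and prints
# "jobid pid user" for every line whose command is a shepherd.
list_shepherds() {
    awk '$4 ~ /^sge_shepherd-[0-9]+$/ {
        split($4, parts, "-")      # parts[2] holds the job ID
        print parts[2], $1, $3
    }'
}

# On a live node the input would come from ps, for example:
#   ps -eo pid=,ppid=,user=,args= | list_shepherds
# The job IDs can then be compared with the output of: qstat -u '*'
# Any job ID missing from qstat would be a candidate for manual cleanup.
```

Whether killing such a process by hand is enough for sge_execd to release
the slot is not something the output above settles, so this is only a way
to see what is left over.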

Any idea how the user is doing this?  Other users do not leave
sge_shepherd processes around.

Thanks.
