[GE users] can't delete host from SGE

adary at marvell.com adary at marvell.com
Wed Dec 17 11:39:37 GMT 2008


Hi Andy,

I actually checked each and every hostgroup, and lnx400 is not referenced in any of them.

I also ran the three commands you listed just to be sure, and lnx400 is nowhere to be found.

The SGE version is 6.1u3

I'm pretty sure that this is a bug, since this is not the only host that behaves like this. I have at least 5 more (out of 300+ hosts in the grid).
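
For anyone who wants to repeat the check across all hostgroups and queues rather than just "bulk", here is a rough sketch. The helper names find_host_refs and scan_all are made up for this example and are not part of SGE. It assumes that qconf wraps long attribute lines with a trailing backslash, that a queue attribute may carry a per-host override written as [host=value], and that grep supports -w. It only looks for textual references; it says nothing about whether qmaster holds a stale reference internally.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of SGE.

# Read a qconf dump on stdin and print each logical line that mentions
# host "$1" as a whole word. Lines ending in a backslash are joined to the
# following line first, so a wrapped hostlist or attribute is not missed.
find_host_refs() {
    awk '{
        if (sub(/\\$/, "")) { buf = buf $0; next }
        print buf $0; buf = ""
    } END { if (buf != "") print buf }' | grep -w -- "$1"
}

# Scan every hostgroup and every cluster queue for references to host "$1",
# prefixing each hit with the object it was found in.
scan_all() {
    for hg in $(qconf -shgrpl); do
        qconf -shgrp "$hg" | find_host_refs "$1" | sed "s|^|hostgroup $hg: |"
    done
    for q in $(qconf -sql); do
        qconf -sq "$q" | find_host_refs "$1" | sed "s|^|queue $q: |"
    done
}
```

Running scan_all lnx400 on the qmaster host after sourcing the file should print nothing if the host is truly unreferenced in the visible configuration. Any line it does print shows where the name still appears, including inside a bracketed per-host override that grepping for "@" would not reveal.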

> Yuval,
> 
> that might indicate there's a bug if lnx400 is also not referenced directly
> in the "bulk" queue. Are you using any host aliasing via the "host_aliases" file?
> 
> Can you do a check as follows:
> 
>   qconf -sq bulk|grep "@"
>      -> should show all hostgroups used by "bulk" queue
> 
>   qconf -shgrp_tree <hostgroups_referenced_by_bulk_queue>
> 
>   qconf -shgrp_resolved <hostgroups_referenced_by_bulk_queue>
> 
> Does the problem also occur after a qmaster restart?
> 
> Which version/patch level are you using?
> 
> Andy
> 
> On Wed, 17 Dec 2008, Yuval Adar wrote:
> 
> > In certain rare cases I'm not able to remove a host completely from SGE
> >
> > [117] root at sge_master ==>qconf -de lnx400
> > Host object "lnx400" is still referenced in cluster queue "bulk".
> >
> > When I look at the bulk queue, it doesn't reference that host at all, and the host is not included in any hostgroup used by that queue. In fact, the host is not listed in any hostgroup at all:
> >
> > bash-3.00# for i in `qconf -shgrpl`; do qconf -shgrp $i | grep lnx400; done
> > bash-3.00#
> >
> > Has anyone ever experienced something similar?

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=92940
