[GE users] reschedule_unknown_list

Andy Schwierskott andy.schwierskott at sun.com
Thu Apr 1 17:26:47 BST 2004


Sean,

That sounds like a bug.

Could you please file an issue in Issuezilla with the description below?

Thanks,
Andy

> While running strace on sge_qmaster, I noticed that it was updating the
> exec_hosts/ files for some of the nodes rather often.  It was updating
> several of them every second, even though no new jobs or real status
> changes were going on during these periods.  It seems there's a group of
> 66 nodes whose status gets updated constantly (several times a second).
>
> Looking in the files, I notice they all have a reschedule_unknown_list
> line similar to this:
>
> reschedule_unknown_list    131 1=8,132 1=8,141 1=8,148 1=8,158 1=8
>
> The other nodes that don't get constantly updated have a value of NONE
> for this.
>
> Looking at the code that produces this file, it seems that the 131, 132,
> etc are supposed to be job numbers.  However, that doesn't make a whole
> lot of sense.  The lowest job number on the system right now is 2062.
>
> What are these numbers on the reschedule_unknown_list?  If they are job
> numbers, how do I make SGE forget about them?  Or at least, how do I get
> qmaster to stop updating these files several times a second?
>
> Thanks,
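
For anyone wanting to check which hosts are affected, here is a minimal Python sketch that scans a directory of exec_hosts files and reports those whose reschedule_unknown_list is not NONE. It assumes only the line layout quoted above (comma-separated entries, each a number followed by a space and a remainder such as "1=8"). The function names and the directory argument are hypothetical, and the script does not interpret the remainder, since its meaning is not established in this thread.

```python
import os

KEY = "reschedule_unknown_list"


def parse_entries(line):
    """Parse one line of an exec_hosts file.

    Returns None if the line is not a reschedule_unknown_list line,
    [] if its value is NONE, otherwise a list of (number, rest) pairs.
    The meaning of 'rest' (e.g. "1=8") is not interpreted here.
    """
    parts = line.split(None, 1)
    if len(parts) != 2 or parts[0] != KEY:
        return None
    value = parts[1].strip()
    if value.upper() == "NONE":
        return []
    entries = []
    for item in value.split(","):
        fields = item.split(None, 1)
        if not fields:
            continue
        rest = fields[1].strip() if len(fields) > 1 else ""
        entries.append((fields[0], rest))
    return entries


def hosts_with_entries(spool_dir):
    """Map each file name in spool_dir to its non-empty entry list."""
    result = {}
    for name in sorted(os.listdir(spool_dir)):
        path = os.path.join(spool_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, errors="replace") as fh:
            for line in fh:
                entries = parse_entries(line)
                if entries:
                    result[name] = entries
                    break
    return result
```

Calling hosts_with_entries() on the qmaster's exec_hosts directory (the path depends on the local spool configuration) should list the hosts that carry such entries, which can then be compared with the files strace shows being rewritten.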




