[GE users] Clearing errors: qmod -cq / qmod -cj issues

Sebastian Stark stark at tuebingen.mpg.de
Wed Jan 25 09:32:23 GMT 2006


On 24.01.2006 at 16:58, Reuti wrote:

>>
>> Yes, I get the same error for queue instances that are not in  
>> error state:
>>
>> neckar ~ % qmod -cq all.q@node143
>> invalid queue "all.q@node143"
>>
>> And now I find that checking whether this queue instance has an
>> error does not work at all:
>>
>> neckar ~ % qstat -f -q all.q@node143
>> neckar ~ %
>>
>> This should return the status of this queue instance, right? This
>> works, however:
>>
>> neckar ~ % qstat -f | grep node143
>> all.q@node143                     BP    2/2       14.97    lx24-amd64
>> parallel.q@node143                BIP   13/16     14.97    lx24-amd64
>>
>> It seems something seriously broke in my SGE installation when I
>> upgraded to 6.0u7. It might also be a problem that I switched
>> from /etc/hosts to DNS-based hostname resolution. Could it be that
>> SGE is confused by the fact that IP lookups now return the FQDN of
>> the hosts instead of just the host part?
>>
>
> Yes, there might be an issue, as the internal name known to SGE is
> only the short one. What do "qhost" and "qconf -sel" say? Maybe you
> have to add the machines again now (and you will then also see the
> FQDN in qstat for each queue instance).

qhost shows the hostnames only and qconf -sel shows the FQDN.

If re-adding all hosts is the way to go, I will try that as soon as
possible. Is there any way to have the domain name stripped from the
output of qstat et al.? It's kind of ugly :/
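
Failing a built-in option, a post-filter could hide the domain in the meantime. This is only a rough sketch: strip_domain is a hypothetical helper name, and example.com is a placeholder for the cluster's real DNS domain. It rewrites only the host part after the @ of a queue instance name, and it will shift qstat's column alignment:

```shell
# Hypothetical helper: remove a known domain suffix from queue instance
# names (queue@host.domain -> queue@host) in text read from stdin.
# The default domain "example.com" is a placeholder assumption.
strip_domain() {
    domain="${1:-example.com}"
    # Escape the dots so sed matches them literally.
    escaped=$(printf '%s' "$domain" | sed 's/\./\\./g')
    # Keep "@host", drop the ".domain" that follows it.
    sed "s/\(@[A-Za-z0-9_-]*\)\.${escaped}/\1/"
}
```

It would be used as `qstat -f | strip_domain example.com`. Lines without a matching FQDN pass through unchanged.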


-Sebastian
