[GE users] user configs disappear after qmaster restart (6.2u5).

ccaamad m.c.dixon at leeds.ac.uk
Sat May 15 20:14:37 BST 2010


On Sat, 15 May 2010, ccaamad wrote:

> Anyone seen this one before?
>
> I've just restarted a 6.2u5 qmaster after patching its host and noticed
> that most of my users have vanished from "qconf -suserl". There were 104
> and there are now 17.
>
> I'm using classic spooling to an NFS server and all 104 configuration
> files are still there under $SGE_ROOT/$SGE_CELL/spool/qmaster/users.
>
> However, the qmaster messages file contains lots of scary things like:

I went home to have a bit of a think, and I believe I've worked it out
now. Of course, I didn't actually post the log lines that let me suss it
out!

They were:

05/15/2010 19:58:21|  main|sched1|I|read job database with 262 entries in 0 seconds
05/15/2010 19:58:21|  main|sched1|E|error parsing double value from string "SccccSccccSccccScccc"
05/15/2010 19:58:21|  main|sched1|E|unrecognized characters after the attribute values in line 12: "0.000000"
05/15/2010 19:58:21|  main|sched1|E|unknown attribute name "binding_inuse"

The core binding feature was turned on. This episode shows a double fault
in the code:

1) Classic spooling is writing data that it cannot read.
    Tokens of the form "binding_inuse=SccccSccccSccccScccc=0.000000"
    appear in the first user file GE objected to.

2) Once GE had objected to a user, all subsequent user definitions it read
    were also marked as badly formatted, despite being ok. Some error flag
    is not being reset.
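
To make point 2 concrete, here is a toy sketch of the kind of "sticky
error flag" bug I suspect. It is purely illustrative and is NOT taken from
the Grid Engine source: the function names, the set of known attributes
and the sample file contents are all made up for the example.

```python
# Illustrative only: NOT Grid Engine source. All names here are hypothetical.

def parse_user(text):
    """Return True if every non-blank line starts with a known attribute."""
    known = {"name", "oticket", "fshare"}  # assumed attribute set for the demo
    for line in text.splitlines():
        if not line.strip():
            continue
        attr = line.split()[0]
        if attr not in known:
            return False
    return True


def load_users_buggy(files):
    """Sticky flag: once one file fails, every later file is rejected too."""
    loaded, failed = [], False
    for name, text in files:
        if not parse_user(text):
            failed = True          # never reset for the next file
        if not failed:
            loaded.append(name)
    return loaded


def load_users_fixed(files):
    """Per-file status: a bad file is skipped without affecting the rest."""
    return [name for name, text in files if parse_user(text)]


files = [
    ("alice", "name alice\nfshare 0\n"),
    ("bob", "name bob\nbinding_inuse x\n"),   # the file the parser rejects
    ("carol", "name carol\nfshare 0\n"),
]
print(load_users_buggy(files))  # ['alice']
print(load_users_fixed(files))  # ['alice', 'carol']
```

In the buggy variant "carol" is lost even though her file is fine, which
matches the symptom of good user definitions being rejected after the
first bad one.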

Once I had deleted the references to "binding_inuse" from the two user
accounts that had them, all 104 users appeared again.
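
For anyone hitting the same thing, a minimal sketch for locating the
affected files under classic spooling. It assumes the user files are plain
text in $SGE_ROOT/$SGE_CELL/spool/qmaster/users (as above) and that
SGE_ROOT and SGE_CELL are set in the environment; the function name
find_users_with_attr is made up for this example. It only lists files and
changes nothing.

```python
import os
from pathlib import Path


def find_users_with_attr(users_dir, needle="binding_inuse"):
    """Return sorted names of files in users_dir whose text contains needle."""
    hits = []
    for path in sorted(Path(users_dir).iterdir()):
        if not path.is_file():
            continue
        # errors="replace" so an oddly encoded file cannot abort the scan
        text = path.read_text(errors="replace")
        if needle in text:
            hits.append(path.name)
    return hits


if __name__ == "__main__":
    root = os.environ.get("SGE_ROOT")
    cell = os.environ.get("SGE_CELL")
    if root and cell:
        users_dir = Path(root, cell, "spool", "qmaster", "users")
        if users_dir.is_dir():
            for name in find_users_with_attr(users_dir):
                print(name)
```

Take a copy of the spool directory before editing any file it reports.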

I'll look through the bug database and log these if they're not there 
already.

I'll try and get time to look through the source and see if I can fix 
them...

Cheers,

Mark
-- 
-----------------------------------------------------------------
Mark Dixon                       Email    : m.c.dixon at leeds.ac.uk
HPC/Grid Systems Support         Tel (int): 35429
Information Systems Services     Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=257424
