[GE users] qmaster logging error every 10 seconds

Greg A clusterman at gmail.com
Fri Jun 30 00:00:38 BST 2006



I'm using

    truss -f -t read,write,open,close -wall -rall -p pid_of_qmaster

and have also run the same command against the PID of the scheduler.

Here is a snippet of the output.

16106/4:        read(9, 0x10419A450, 533)                       = 533
16106/4:          \0\0\0\01002\0\0\0\0\002\0\0\00310\01001\0\0\001\0\0\001 l
p\0\0
16106/4:          \0\001\0\0\0 #\0\001C2\0\0 '\f\0\001C3\0\0
\t\0\001C4\0\0  \t\0
16106/4:          \001C5\0\0  \t\0\001C6\0\0
\t\0\001C7\0\0\003\0\001C8\0\0\003\0
16106/4:          \001C9\0\0  03\0\001CA\0\0  \t\0\001CB\0\0
\t\0\001CC\0\0  \t\0
16106/4:          \001CD\0\0
\t\0\001CE\0\0\002\0\001CF\0\0\003\0\001D0\0\0\003\0
16106/4:
\001D1\0\0\003\0\001D2\0\0\003\0\001D3\0\0\0\b\0\001D4\0\0\003\0
16106/4:
\001D5\0\0\002\0\001D6\0\0\002\0\001D7\0\0\003\0\001D8\0\0\0\t\0
16106/4:
\001D9\0\0\0\t\0\001DA\0\0\003\0\001DB\0\0\003\0\001DC\0\0\0\t\0
16106/4:
\001DD\0\0\0\t\0\001DE\0\0\003\0\001DF\0\0\003\0\001E0\0\0\003\0
16106/4:          \001E1\0\0\0\t\0\001E2\0\0\003\0\001E3\0\0
\t\0\001E4\0\0\0\t\0
16106/4:          \0\002\0\0\0 #\0\0
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0
16106/4:
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0
16106/4:
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0
16106/4:
\0\0\0\0\0\0\0\0\001\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0
16106/4:
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0
16106/4:          \0\0\0\0\0\0\0\0\0\0 d * ~ * d * ~ * h ~ g , g n g p h e g
, ~ *
16106/4:           h ~ g p g p h e\0\0\0\001\0\0 EBA\0\0\0\0
16106/10:       write(2, " e r r o r :  ", 7)                   = 7
16106/5:        write(9, 0x103A4E520, 25)                       = 25
16106/5:           < g m s h > < d l > 1 0 1 < / d l > < / g m s h >
16106/10:       write(2, 0x1021E88C0, 52)                       = 52
16106/10:          d e n i e d :   " r e m o t e "   m u s t   b e   m a n a g e r
16106/10:            f o r   t h i s   o p e r a t i o n\n
16106/5:        write(9, 0x103A4E520, 101)                      = 101
16106/5:           < m i h   v e r s i o n = " 0 . 1 " > < m i d > 3 5 7 0 1 < / m
16106/5:           i d > < d l > 3 9 < / d l > < d f > a m < / d f > < m a t > n a
16106/5:           k < / m a t > < t a g > 0 < / t a g > < r i d > 0 < / r i d > <
16106/5:           / m i h >
16106/10:       open("messages", O_WRONLY|O_APPEND|O_CREAT, 0666) = 121
16106/5:        write(9, 0x1044E2990, 39)                       = 39
16106/5:           < a m   v e r s i o n = " 0 . 1 " > < m i d > 1 7 8 5 1 < / m i
16106/5:           d > < / a m >
16106/10:       write(121, 0xFFFFFFFF7B8F8F80, 91)              = 91
16106/10:          0 6 / 2 9 / 2 0 0 6   1 8 : 1 7 : 3 9 | q m a s t e r | m a s t e r
16106/10:         h o s t | E | d e n i e d :   " r e m o t e "   m u s t   b e
16106/10:          m a n a g e r   f o r   t h i s   o p e r a t i o n\n
16106/10:       close(121)
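
The -wall/-rall dumps above are hard to read because each byte seems to be
printed as a two-character cell: a space plus the character when it is
printable, a backslash escape such as \0 or \n, or otherwise two hex digits.
Assuming that reading of the format is right, a small Python sketch like the
one below could turn a dumped payload back into bytes. The function name
decode_truss_cells is made up for illustration, and a payload copied out of
this mail may need its leading space restored, since the mail wrapping merged
it into the indentation.

```python
# Hypothetical helper: decode one payload from a "truss -wall/-rall" dump.
# Assumption: every byte is rendered as exactly two characters.
ESCAPES = {"\\0": 0, "\\b": 8, "\\t": 9, "\\n": 10, "\\f": 12, "\\r": 13}


def decode_truss_cells(payload: str) -> bytes:
    """Convert a string of two-character cells back into raw bytes."""
    if len(payload) % 2:
        raise ValueError("payload must contain whole two-character cells")
    out = bytearray()
    for i in range(0, len(payload), 2):
        cell = payload[i:i + 2]
        if cell[0] == " ":
            # Printable byte: a space followed by the character itself.
            out.append(ord(cell[1]))
        elif cell in ESCAPES:
            # Backslash escape such as \0, \t or \n.
            out.append(ESCAPES[cell])
        else:
            # Anything else is assumed to be a two-digit hex value.
            out.append(int(cell, 16))
    return bytes(out)
```

For example, the argument of the first write(2, ...) call, " e r r o r :  ",
decodes to the 7 bytes b"error: ", which matches the length truss reports.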





On 6/29/06, Alan Barclay <barclay at rtda.com> wrote:
>
> Are you using the -f option of truss to follow child processes also?
>
>
>
> Greg A wrote:
> >
> > We are receiving the following error every 10 seconds in the messages
> > file on the qmaster.  I've tried snoop and truss to capture what or
> > who is causing the message, but I'm coming up empty.  I also tried
> > changing the loglevel to log_info, but that didn't shed any more
> > light.
> >
> > 06/29/2006 17:53:32|qmaster|masterhost|E|denied: "remote" must be
> > manager for this operation
> > 06/29/2006 17:53:42|qmaster|masterhost|E|denied: "remote" must be
> > manager for this operation
> > 06/29/2006 17:53:52|qmaster|masterhost|E|denied: "remote" must be
> > manager for this operation
> >
> > The scheduler runs every 10 seconds, but a truss of the scheduler
> > doesn't show anything relevant.  There are no jobs in error state, and
> > no user account named remote is submitting jobs with qsub.
> >
> > Does anyone have any ideas on how I can find the node/job/process
> > causing this error?
> >
> > Thanks in advance!
>
>
> --Alan Barclay--  barclay at rtda.com   www.rtda.com
>
>
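
To check whether the error really repeats on a fixed 10-second cadence, the
messages file can be parsed and the gaps between matching entries computed.
The Python sketch below assumes each entry is a single line with five
|-separated fields, as in the quoted log lines (timestamp, component, host,
level, text); the function names are made up for illustration.

```python
from datetime import datetime


def parse_entry(line):
    """Split one messages-file line into (timestamp, component, host, level, text)."""
    stamp, component, host, level, text = line.rstrip("\n").split("|", 4)
    when = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")
    return when, component, host, level, text


def repeat_intervals(lines, needle):
    """Return the seconds elapsed between consecutive entries containing needle."""
    times = [parse_entry(line)[0] for line in lines if needle in line]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
```

Fed the three entries quoted in the original report, repeat_intervals(lines,
"denied:") returns [10.0, 10.0]. A steady interval like that points at
something polling on a timer rather than at individual job submissions.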


