No subject


Wed Jan 12 20:38:46 GMT 2011


message receive timeout error).

And running the qping command returns this information:

qping -info sge_master_host 801 qmaster 1
11/07/2008 09:07:55:
SIRM version:             0.1
SIRM message id:          1
start time:               11/01/2008 11:22:19 (1225538539)
run time [s]:             510336
messages in read buffer:  0
messages in write buffer: 0
nr. of connected clients: 411
status:                   2
info:                     MAIN: E (510335.92) | signaler000: E
(510333.48) | event_master000: E (0.01) | timer000: E (3.00) |
worker000: E (57078.01) | worker001: E (56779.02) | listener000: E
(0.25) | listener001: E (0.10) | scheduler000: E (56751.01) | ERROR
malloc:                   arena(451719168) |ordblks(14) | smblks(52) |
hblksr(2) | hblhkd(2105344) usmblks(0) | fsmblks(1904) |
uordblks(451578592) | fordblks(140576) | keepcost(126736)
Monitor:                  disabled
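
To make the per-thread part of the info field easier to read, here is a minimal Python sketch that splits it into (thread, state, value) entries. It assumes each entry has the form "name: STATE (number)" as in the output above. Treating the number as seconds since the thread last reported is an assumption, not something the output confirms, and the 600-second threshold and the function names are purely illustrative.

```python
import re

# Entries look like "worker000: E (57078.01)". Reading the number as
# "seconds since the thread last reported" is an assumption.
THREAD_RE = re.compile(r"(\w+):\s*([A-Z])\s*\(\s*([\d.]+)\s*\)")

# The info field copied from the qping output above (mail-wrapped).
SAMPLE_INFO = """MAIN: E (510335.92) | signaler000: E
(510333.48) | event_master000: E (0.01) | timer000: E (3.00) |
worker000: E (57078.01) | worker001: E (56779.02) | listener000: E
(0.25) | listener001: E (0.10) | scheduler000: E (56751.01) | ERROR"""


def parse_thread_info(info_text):
    """Return a list of (thread_name, state, value) tuples."""
    joined = " ".join(info_text.split())  # undo the mail line wrapping
    return [(name, state, float(value))
            for name, state, value in THREAD_RE.findall(joined)]


def stale_threads(threads, threshold=600.0):
    """Return entries whose value exceeds a (hypothetical) threshold."""
    return [t for t in threads if t[2] > threshold]


for name, state, value in stale_threads(parse_thread_info(SAMPLE_INFO)):
    print(f"{name:16s} state={state} value={value:.2f}")
```

With the output above, this lists MAIN, signaler000, worker000, worker001 and scheduler000, while the listener, timer and event_master threads show small values.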


The sge_master process is still running on the master host, and has
about 12 child sge_master processes.

Would stopping and starting the sge_master service kill any running
jobs, or would they happily communicate with the new master process?

Any help would be much appreciated.

Cheers,

Mat Bradford

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88269
