Opened 4 years ago

Last modified 3 years ago

#1565 new defect

Modifying an exechost (qconf -aattr) covered by an Advance Reservation crashes qmaster

Reported by: wish Owned by:
Priority: normal Milestone:
Component: sge Version: 8.1.8
Severity: minor Keywords:
Cc:

Description

Was present in 6.2u3 too.
One probably shouldn't do this, but qmaster shouldn't crash if you try.

Change History (3)

comment:1 Changed 4 years ago by dlove

I can't reproduce this simply. Can you supply an explicit example?

comment:2 Changed 3 years ago by wish

Annoyingly, while this reliably breaks our production cluster, our dev/test cluster seems immune. Possibly load-related. The reservation was for a PE, and the only resource requested was an exclusive resource associated with each host.

comment:3 Changed 3 years ago by dlove

> Annoyingly while this reliably breaks our production cluster our dev/test
> cluster seems immune. Possibly load related. The reservation was for a pe
> and the only resource requested was an exclusive resource associated with
> each host.

Can you get a core dump (export SGE_ENABLE_COREDUMP=1 in
/etc/sysconfig/sgemaster if using the distributed init scripts on a
RHEL-ish system)? If you can do it reliably, presumably knocking
qmaster off for a minute or two won't cause too much grief.
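
For reference, a minimal sketch of setting that variable, assuming the file is a plain shell fragment sourced by the init script. The function name enable_sge_coredump is made up for illustration; the default path is the one given above, and a different file can be passed as the first argument to try it on a scratch copy first.

```shell
# Hypothetical helper, not part of SGE: appends the core dump setting to
# the sysconfig file unless an identical line is already there.
enable_sge_coredump() {
    conf="${1:-/etc/sysconfig/sgemaster}"
    line='export SGE_ENABLE_COREDUMP=1'
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}
```

Writing to the real file needs root, and qmaster has to be restarted afterwards for the setting to take effect.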
