Opened 17 years ago

Closed 9 years ago

#95 closed task (invalid)

IZ554: GE scheduler and ntpdate seems incompatible.

Reported by: yoshiki Owned by:
Priority: normal Milestone:
Component: sge Version: 5.3p2
Severity: minor Keywords: PC Linux qmaster
Cc:

Description

[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=554]

        Issue #:      554              Platform:     PC      Reporter: yoshiki (yoshiki)
       Component:     gridengine          OS:        Linux
     Subcomponent:    qmaster          Version:      5.3p2      CC:    None defined
        Status:       NEW              Priority:     P3
      Resolution:                     Issue type:    TASK
                                   Target milestone: ---
      Assigned to:    andreas (andreas)
      QA Contact:     ernst
          URL:
       * Summary:     GE scheduler and ntpdate seems incompatible.
   Status whiteboard:
      Attachments:

     Issue 554 blocks:
   Votes for issue 554:


   Opened: Thu May 15 18:12:00 -0700 2003 
------------------------


My cluster has one NTP server node.
The other nodes run a cron job for time
synchronization, as follows:
0 3 * * * /usr/sbin/ntpdate -s 192.168.1.1; /sbin/hwclock --systohc

When I run a job around the time ntpdate sets the clock
(e.g. 3 a.m.),
the job is sometimes re-run. Some of the qmaster
messages follow.
--------------------------------------------------------------------------
Wed May 14 02:59:51 2003|qmaster|node01|W|system time has been modified (-8 seconds)
Wed May 14 02:59:51 2003|qmaster|node01|E|enrolled, but leave_commd() call failed with status: NOT ENROLLED
Wed May 14 03:00:28 2003|qmaster|node01|W|job 3746.1 failed on host node07  rescheduling because: manual/auto rescheduling
Wed May 14 03:00:28 2003|qmaster|node01|W|rescheduling job 3746.1
Wed May 14 03:00:28 2003|qmaster|node01|W|job 3828.1 failed on host node10  rescheduling because: manual/auto rescheduling
Wed May 14 03:00:28 2003|qmaster|node01|W|rescheduling job 3828.1
--------------------------------------------------------------------------
I suspect the time change causes a long delay in the
execd on that node reporting to the qmaster.
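
To check whether rescheduling really clusters around clock steps, the qmaster messages file could be scanned with something like the sketch below. It is only an illustration: the `timestamp|daemon|host|level|message` layout is inferred from the excerpt above, the two-minute window is an arbitrary assumption, and the function names are hypothetical, not part of Grid Engine.

```python
import re
from datetime import datetime, timedelta

# Message patterns taken from the qmaster excerpt in this report.
TIME_STEP = re.compile(r"system time has been modified \((-?\d+) seconds\)")
RESCHEDULE = re.compile(r"rescheduling job (\S+)")


def parse_line(line):
    """Split 'timestamp|daemon|host|level|message' into (datetime, message)."""
    stamp, _daemon, _host, _level, message = line.rstrip("\n").split("|", 4)
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"), message


def reschedules_after_time_step(lines, window=timedelta(minutes=2)):
    """Return one (step_seconds, [job ids]) pair per logged clock step.

    A job id is attached to the most recent clock step if its
    'rescheduling job' message was logged within `window` of that step.
    The window length is an assumption, not a Grid Engine setting.
    """
    results = []
    last_step_time = None
    for line in lines:
        try:
            when, message = parse_line(line)
        except ValueError:
            continue  # skip lines that do not match the assumed layout
        step = TIME_STEP.search(message)
        if step:
            last_step_time = when
            results.append((int(step.group(1)), []))
            continue
        job = RESCHEDULE.search(message)
        if (job and last_step_time is not None
                and timedelta(0) <= when - last_step_time <= window):
            results[-1][1].append(job.group(1))
    return results


if __name__ == "__main__":
    sample = [
        "Wed May 14 02:59:51 2003|qmaster|node01|W|system time has been modified (-8 seconds)",
        "Wed May 14 03:00:28 2003|qmaster|node01|W|rescheduling job 3746.1",
        "Wed May 14 03:00:28 2003|qmaster|node01|W|rescheduling job 3828.1",
    ]
    print(reschedules_after_time_step(sample))
```

Timestamps are parsed with English day and month names, so the script assumes it runs under a C or English locale.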

   ------- Additional comments from sgrell Tue Dec 6 08:38:58 -0700 2005 -------
Changed subcomponent.

Stephan

   ------- Additional comments from mnikhil Wed Jan 16 07:44:27 -0700 2008 -------
May I know the latest status of this issue, please?

   ------- Additional comments from joga Wed May 28 03:40:25 -0700 2008 -------
The corresponding code exists only in the 5.3 source base,
so are you seeing this in a 5.3 cluster?
If so, you should consider upgrading to the current version (6.1u4).

A workaround might be to run ntpd instead of regular calls to ntpdate.

Change History (1)

comment:1 Changed 9 years ago by dlove

  • Resolution set to invalid
  • Severity set to minor
  • Status changed from new to closed

Seems to be only 5.3.
