[GE issues] [Issue 2801] New - Problem with broken dot-file writing for sequential jobs in classic spooling mode

andreas andreas.haas at sun.com
Fri Nov 21 11:25:36 GMT 2008


http://gridengine.sunsource.net/issues/show_bug.cgi?id=2801
                 Issue #|2801
                 Summary|Problem with broken dot-file writing for sequential jobs in classic spooling mode
               Component|gridengine
                 Version|6.2
                Platform|All
                     URL|
              OS/Version|All
                  Status|NEW
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|DEFECT
                Priority|P3
            Subcomponent|qmaster
             Assigned to|ernst
             Reported by|andreas






------- Additional comments from andreas at sunsource.net Fri Nov 21 03:25:32 -0800 2008 -------
Due to a bug in classic spooling, job files for sequential jobs are not written
atomically.

As a result, fail-safety is not as good as it could be.

The bug can cause problems only when qmaster restarts. If the non-atomic write
operation failed, a job can be lost at that time.

The truss output summarized below reveals the I/O operations in this case. Instead of

   open(<dot-file>)
   rename(<dot-file>, <file>)

it does

   open(<file>)
   rename(<file>, <file>)

which is fairly pointless: renaming a file onto itself changes nothing, so the
data is written directly into the final file.
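
For illustration, the intended dot-file pattern can be sketched as follows. This
is a minimal Python sketch of the general technique, not the qmaster spooling
code. The function name write_atomically and the "." + filename naming scheme are
assumptions made for the example.

```python
import os


def write_atomically(path, data):
    """Write data to a dot-file next to path, then rename it into place.

    Hypothetical helper: the name and the "." prefix scheme are assumptions
    for this sketch, not taken from the Grid Engine sources.
    """
    directory, name = os.path.split(path)
    # The dot-file lives in the same directory so the rename stays on one
    # filesystem.
    dot_path = os.path.join(directory, "." + name)

    with open(dot_path, "wb") as f:
        f.write(data)
        f.flush()
        # Push the data to disk before the new name becomes visible.
        os.fsync(f.fileno())

    # On POSIX this is rename(2): readers see either the old file or the
    # complete new one, never a partially written file.
    os.replace(dot_path, path)
```

If the process dies before the rename, the previous version of the file is still
intact and only the dot-file is incomplete. Writing directly to the final file, as
in the faulty sequence, gives no such guarantee.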

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=36&dsMessageId=89331
