[GE users] sge 6.2u4 issue on Windows execution host
kostikbel at ukr.net
Wed Dec 23 11:36:59 GMT 2009
[Sorry for the possibly wrong reply formatting;
I am answering via the web interface.]
> kostikbel wrote:
> > We have SGE 6.2u4 installation that includes a group of Windows XP
> > SP2/SFU 3.5 hosts. Quite often, queues get into the "E" state. sge_execd
> > log contains the following
> > 12/14/2009 16:33:09| main|avexec4|E|ERROR: unlinking "jobs/00/0002/9728.1": Device busy
> > 12/14/2009 16:33:09| main|avexec4|E|can not remove file job spool file: jobs/00/0002/9728.1
> > 12/14/2009 16:33:09| main|avexec4|E|can't remove directory "active_jobs/29728.1": opendir(active_jobs/29728.1) failed: No such file or directory
> > The admin gets notification by email (for another job id,
> > just for illustration):
> > failed assumedly before job:can't open jobs/00/0002/.2808.1 for writing of job: Device busy
> > Note the "Device busy" part. I searched for "Device busy" for SFU and
> > found http://www.suacommunity.com/forum/tm.aspx?m=5580&mpage=3#16800
> > but this does not seem to help.
> > Any idea what is going on there? What information should I gather to
> > diagnose the problem?
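As a first step in gathering information, the execd messages file can be scanned for these errors. Below is a minimal Python sketch, assuming the log lines keep the pipe-separated layout shown above (timestamp, thread, host, level, message). The function names are my own and are not part of SGE.

```python
from collections import Counter


def parse_line(line):
    """Split one execd log line into (timestamp, thread, host, level, message).

    Returns None when the line does not have the five pipe-separated fields
    seen in the excerpt above (this layout is an assumption).
    """
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    return tuple(p.strip() for p in parts)


def device_busy_errors(lines):
    """Return parsed entries of level E whose message mentions 'Device busy'."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry and entry[3] == "E" and "Device busy" in entry[4]:
            hits.append(entry)
    return hits


def count_by_host(entries):
    """Count matching entries per execution host."""
    return Counter(e[2] for e in entries)


# Two of the lines quoted above, used as sample input.
SAMPLE = [
    '12/14/2009 16:33:09|  main|avexec4|E|ERROR: unlinking "jobs/00/0002/9728.1": Device busy',
    '12/14/2009 16:33:09|  main|avexec4|E|can not remove file job spool file: jobs/00/0002/9728.1',
]

for ts, thread, host, level, msg in device_busy_errors(SAMPLE):
    print(ts, host, msg)
```

Feeding it the lines of the real messages file instead of SAMPLE and comparing the timestamps with the job start and end times should show whether the failures cluster around job delivery or job cleanup.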
> "Device busy" normally means that you want to remove a removable disc
> while some process still has files on this disc open.
> I guess here it means that a file in this directory is still in use
> while the directory is being deleted. Unlike UNIX, Windows doesn't allow
> this. However, I don't quite understand why writing to .../0002/.2808.1
> isn't possible.
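The UNIX/Windows difference described here can be checked with a small test. This is only a sketch in Python: it tries to delete a temporary file while the same process still holds it open. On a POSIX system I would expect the unlink to succeed, and on native Windows I would expect it to fail with a sharing violation. Whether SFU maps that failure to "Device busy" is an assumption I have not verified.

```python
import os
import tempfile


def try_unlink_while_open():
    """Try to delete a file that this process still holds open.

    Returns (deleted, error_message); error_message is None on success.
    """
    fd, path = tempfile.mkstemp()
    try:
        try:
            os.unlink(path)          # typically succeeds on POSIX
            return True, None
        except OSError as exc:       # typically raised on native Windows
            return False, str(exc)
    finally:
        os.close(fd)
        # Clean up if the unlink above was refused.
        if os.path.exists(path):
            os.unlink(path)


deleted, err = try_unlink_while_open()
print("deleted while open:", deleted, err)
```

Running it from inside the SFU environment on one of the affected hosts would show which behaviour the execd actually sees there.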
> I'm not sure what the error reason is, but perhaps answering these
> questions can point us in the right direction:
> * Do the exec daemons spool on a normal HD or on an SSD or something else
> which is somehow dynamically 'mounted'?
"Normal" HDD as in VMWare ESXi 4.0, I believe.
> * Which file system does the spooling directory use? NTFS, FAT32, ...?
> * Is there a Virus scanner that could still read the spooling files
> while the directory is to be deleted?
No. Also, please note that the error seems to happen while the job is being transferred from the master node to the execution host.
> * Is there an indexing service or something similar running that
> automatically accesses new directories?
I consulted our Windows admins: indexing
is turned off.
> You could also use the tool "Handle"
> (http://technet.microsoft.com/en-us/sysinternals/bb896655.aspx) to see
> if some other process has an open file in this directory, or "Process Monitor"
> (http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx) to log
> all file operations on the host during the job run.
> Sun Microsystems GmbH Harald Pollinger
> Dr.-Leo-Ritter-Str. 7 Sun Grid Engine Engineering
> D-93049 Regensburg Phone: +49 (0)941 3075-209 (x60209)
> Germany Fax: +49 (0)941 3075-222 (x60222)
> mailto:harald.pollinger at sun.com