[GE users] sge 6.2u4 issue on Windows execution host
harald.pollinger at sun.com
Tue Dec 22 11:11:58 GMT 2009
> We have SGE 6.2u4 installation that includes a group of Windows XP
> SP2/SFU 3.5 hosts. Quite often, queues get into the "E" state. sge_execd
> log contains the following
> 12/14/2009 16:33:09| main|avexec4|E|ERROR: unlinking "jobs/00/0002/9728.1": Device busy
> 12/14/2009 16:33:09| main|avexec4|E|can not remove file job spool file: jobs/00/0002/9728.1
> 12/14/2009 16:33:09| main|avexec4|E|can't remove directory "active_jobs/29728.1": opendir(active_jobs/29728.1) failed: No such file or directory
> The admin gets notification by email (for another job id,
> just for illustration):
> failed assumedly before job:can't open jobs/00/0002/.2808.1 for writing of job: Device busy
> Note the "Device busy" part. I did search for "Device busy" for SFU,
> found http://www.suacommunity.com/forum/tm.aspx?m=5580&mpage=3#16800
> but this seems not to help.
> Any idea what is going on there ? What information shall I gather to
> diagnose the problem ?
"Device busy" normally means that you are trying to remove a removable disk
while some process still has files on that disk open.
I guess here it means that a file in this directory is still in use
while the directory is being deleted. Unlike UNIX, Windows doesn't allow
this. However, I don't quite understand why writing to .../0002/.2808.1
would fail with the same error.
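The UNIX/Windows difference in removing files that are still open can be checked with a small sketch like the following. It assumes Python is available on the host, and the function name is purely illustrative (it is not part of SGE). It creates a temporary file, keeps the handle open, and tries to unlink it. On UNIX the unlink is expected to succeed. On native Windows it normally fails with a sharing violation, and my assumption is that SFU reports this kind of failure as "Device busy".

```python
import os
import tempfile


def unlink_while_open(directory):
    """Try to remove a file that is still open; return (removed, error)."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        try:
            os.unlink(path)       # expected to work on UNIX, usually refused on Windows
            return True, None
        except OSError as exc:    # e.g. PermissionError on Windows
            return False, exc
    finally:
        os.close(fd)
        if os.path.exists(path):  # clean up if the unlink was refused
            os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        removed, error = unlink_while_open(tmp)
        print("removed while open:", removed, error)
```

Running this inside the execd spool directory of an affected host (pass that directory instead of the temporary one) would show whether the file system there behaves as guessed above.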
I'm not sure what the error reason is, but perhaps answering these
questions can point us in the right direction:
* Do the exec daemons spool on a normal HD, on an SSD, or on something else
which is somehow dynamically 'mounted'?
* Which file system does the spooling directory use? NTFS, FAT32, ...?
* Is there a virus scanner that could still be reading the spooling files
while the directory is to be deleted?
* Is there an indexing service or something similar running that
automatically accesses new directories?
You could also use the tool "Handle"
(http://technet.microsoft.com/en-us/sysinternals/bb896655.aspx) to see
if some other process has an open file in this directory, or
"Process Monitor"
(http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx) to log
all file operations on the host during the job run.
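To see whether the errors cluster at particular times (for example during a scheduled virus scan or indexing run), the "Device busy" entries can be pulled out of the execd messages file and counted per hour. The following is only a sketch: the field layout is inferred from the three log lines quoted above, and the function names are hypothetical.

```python
import re
from collections import Counter

# Field layout inferred from the quoted log lines:
# date time|thread|host|level|message
LINE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2})\|"
    r"\s*(?P<thread>[^|]*)\|(?P<host>[^|]*)\|(?P<level>[^|]*)\|(?P<msg>.*)$"
)


def busy_events(lines):
    """Yield (date, time, host, message) for every "Device busy" entry."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and "Device busy" in m.group("msg"):
            yield m.group("date"), m.group("time"), m.group("host"), m.group("msg")


def busy_per_hour(lines):
    """Count "Device busy" entries per (date, hour) bucket."""
    return Counter((date, time[:2]) for date, time, _host, _msg in busy_events(lines))
```

Calling busy_per_hour() on the open messages file of an affected execution host gives a table that can be compared with the schedules of other services on that machine.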
Harald Pollinger
Sun Grid Engine Engineering
Sun Microsystems GmbH
Dr.-Leo-Ritter-Str. 7, D-93049 Regensburg, Germany
Phone: +49 (0)941 3075-209 (x60209)
Fax: +49 (0)941 3075-222 (x60222)
mailto:harald.pollinger at sun.com