[GE users] Removal of $TMPDIR on slave nodes of a parallel job

Reuti reuti at staff.uni-marburg.de
Mon Dec 10 11:52:14 GMT 2007


Hi,

I have a question: when is $TMPDIR removed on the slave nodes of a
tightly integrated parallel job? In the past I saw, e.g. for Gaussian
(which has many parallel steps with serial steps in between), a bunch
of pid.xy.nodexy and qrsh_exit_code entries. Now there is always only
one entry with the pid. Does this mean that the remote $TMPDIR is
removed after each qrsh call? While Gaussian can live with that, we
have another application, Molcas, which needs the files created by
the previous qrsh call on the slave nodes. Because of issue:

http://gridengine.sunsource.net/issues/show_bug.cgi?id=2358

KEEP_ACTIVE is not working for us. In reaper_execd.c I found some
unconditional rmdir calls, which may bypass a set KEEP_ACTIVE. It
would be best to remove the $TMPDIRs on the slave nodes at the end of
the job, not after each individual qrsh call. Any hint on how this
could be implemented, or how to debug it in case the $TMPDIR is
supposed to stay there until the job ends?
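
To narrow down when the directory disappears, a probe along the
following lines could be run from inside the job script on the master
node. It is only a rough sketch, not something taken from Grid Engine
itself: the marker file name and the helper names slave_hosts and
probe_commands are made up for illustration, it assumes the usual
$PE_HOSTFILE layout with the host name in the first column, and the
quoting of the remote "sh -c" command may need adjusting depending on
how qrsh passes arguments at a given site.

```python
import os
import socket
import subprocess

# Hypothetical marker file name, used only for this probe.
MARKER = "tmpdir_probe.marker"


def slave_hosts(pe_hostfile_text, master):
    """Return the unique host names listed in the PE hostfile, minus the master.

    Assumes one line per host/queue with the host name as first field.
    """
    hosts = []
    for line in pe_hostfile_text.splitlines():
        fields = line.split()
        if fields and fields[0] != master and fields[0] not in hosts:
            hosts.append(fields[0])
    return hosts


def probe_commands(host):
    """Build two separate qrsh -inherit calls for one host.

    The first call creates a marker in the remote $TMPDIR, the second
    one looks for it. $TMPDIR is left unexpanded here so that it is
    evaluated on the slave node, not on the master.
    """
    write = ["qrsh", "-inherit", host, "sh", "-c", 'touch "$TMPDIR"/' + MARKER]
    check = ["qrsh", "-inherit", host, "sh", "-c", 'ls -l "$TMPDIR"/' + MARKER]
    return write, check


# Only run the probe when started as a script inside a parallel job.
if __name__ == "__main__" and "PE_HOSTFILE" in os.environ:
    master = os.environ.get("HOSTNAME", socket.gethostname())
    with open(os.environ["PE_HOSTFILE"]) as fh:
        hosts = slave_hosts(fh.read(), master)
    for host in hosts:
        write, check = probe_commands(host)
        subprocess.run(write, check=False)
        rc = subprocess.run(check, check=False).returncode
        print(host, "marker survived" if rc == 0 else "marker gone")
```

If the second call reports the marker as gone, the remote $TMPDIR was
removed (or recreated) between two qrsh calls of the same job, which
is the behaviour that breaks Molcas for us.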

-- Reuti
