[GE users] Cleaning scratch files.

Reuti reuti at staff.uni-marburg.de
Thu Jun 29 06:59:55 BST 2006


On 29.06.2006 at 07:47, Baudilio Tejerina wrote:

> Hi,
>
> I am using SGE 6.0; however, my script does not contain any 'qrsh'.
> I just inserted
>
> ...
> #$ -V
> #
> setenv RCMD_PREFIX "TMPDIR=$TMPDIR"
> #
> /opt/local/progs/runscr  data.in

What is runscr doing? Is it MPICH, MPICH2, Linda, ...? - Reuti


> ....
>
> into the script, but TMPDIR is still unknown to the slave nodes.
>
>
> Baudilio
>
>
> On Jun 29, 2006, at 12:08 AM, Reuti wrote:
>
>> On 29.06.2006 at 04:59, Baudilio Tejerina wrote:
>>
>>> Hi,
>>>
>>> I had considered, and actually tried, using $TMPDIR, but as you
>>> mentioned, I have observed that the variable is only used by the
>>> 'master' node. I conclude that my job is not 'tightly integrated';
>>> is that right?
>>>
>>> Is there any procedure by which I can pass $TMPDIR to the slave
>>> nodes? I've tried via scripts, but as I suspected, it doesn't work.
>>
>> If you are using SGE 6.0, just attach the PE to only one queue, so
>> that all the $TMPDIRs will have the same name. Then add the -V
>> switch to the "qrsh -inherit" call in the rsh-wrapper, so that all
>> slave processes will also know about it.
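>>
>> For illustration, the qrsh line in the rsh-wrapper might then read
>> roughly as follows ($rhost and $cmd only stand for the remote host
>> and the command the wrapper was called with; the actual variable
>> names differ between SGE releases):
>>
>>    # pass the full environment, including TMPDIR, to the slave task
>>    exec qrsh -inherit -V $rhost $cmd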
>>
>> Another option is setting
>>
>> export RCMD_PREFIX="TMPDIR=$TMPDIR"
>>
>> in your job script before the first qrsh call.
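>>
>> In context, the relevant part of your job script would then read
>> (using the same command as in your script above; in csh, use the
>> setenv equivalent):
>>
>>    # prefix every remote command with the master's TMPDIR setting
>>    export RCMD_PREFIX="TMPDIR=$TMPDIR"
>>    /opt/local/progs/runscr data.in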
>>
>> HTH - Reuti
>>
>>
>>> Thanks so much for everything...
>>>
>>> Baudilio
>>>
>>>
>>> On Jun 28, 2006, at 6:30 PM, Reuti wrote:
>>>
>>>> Hi,
>>>>
>>>> On 29.06.2006 at 00:52, Baudilio Tejerina wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> If, in a parallel job where several scratch files are created on
>>>>> the local disks of the nodes, the job is deleted (qdel), the
>>>>> scratch files remain there. Is there any way to have SGE take
>>>>> care of them (remove them automatically)?
>>>>>
>>>>> I've tried inserting the appropriate 'rm' statements in the
>>>>> executing script, but these only work if the job finishes
>>>>> successfully, obviously.
>>>>
>>>> In a tightly integrated parallel job, the locally created $TMPDIR,
>>>> with everything inside it, should also be deleted. The $TMPDIR is
>>>> created by the first qrsh to each node. You just have to make sure
>>>> that $TMPDIR is used by all slave processes.
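>>>>
>>>> For example, if your application reads its scratch location from
>>>> an environment variable (SCRATCH here is only an illustration; use
>>>> whatever variable your program actually honors), point it at the
>>>> job's $TMPDIR in the job script:
>>>>
>>>>    # write all scratch files into the SGE-managed directory
>>>>    export SCRATCH=$TMPDIR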
>>>>
>>>> If you want to do it on your own for any reason, you could use
>>>> -notify in your qsub command and trap the warning signal which the
>>>> job will get in this case. In the remaining time (defaults to 60
>>>> sec.) you can then delete everything in a custom way. Note that a
>>>> new qrsh isn't allowed in this case, so you will end up with a
>>>> loose integration.
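>>>>
>>>> A minimal sketch of such a job script (the scratch path is only an
>>>> example; with -notify, SGE sends SIGUSR1 or SIGUSR2 as the warning
>>>> before suspending or killing the job):
>>>>
>>>>    #!/bin/sh
>>>>    #$ -notify
>>>>    # remove the node-local scratch files on the warning signal
>>>>    cleanup() {
>>>>        rm -rf /scratch/$JOB_ID
>>>>        exit 1
>>>>    }
>>>>    trap cleanup USR1 USR2
>>>>    # run the job in the background and wait for it, so the trap
>>>>    # is executed as soon as the signal arrives
>>>>    /opt/local/progs/runscr data.in &
>>>>    wait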
>>>>
>>>> HTH - Reuti