[GE users] File copy from local /scratch on job termination

reuti reuti at staff.uni-marburg.de
Mon Dec 15 14:27:57 GMT 2008


On 15.12.2008 at 15:22, Olesen, Mark wrote:

> Hi Reuti,
>
> Do you know if -notify now works with openmpi? There used to be a
> problem of USR1/USR2 killing the daemons.

Dunno, but I can test it. - Reuti


> /mark
>
>> -----Original Message-----
>> From: reuti [mailto:reuti at staff.uni-marburg.de]
>> Sent: Monday, December 15, 2008 3:20 PM
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] File copy from local /scratch on job termination
>>
>> Hi,
>>
>> On 15.12.2008 at 15:09, Bart Willems wrote:
>>
>>> I recently urged our cluster users to use local scratch space on the
>>> cluster nodes instead of the NFS-mounted RAID during their calculations.
>>> In the example submission file below, all files required for the job are
>>> copied over to the node's local hard disk ($TMPDIR is /scratch) and
>>> copied back when the job completes. However, the files only get copied
>>> back when the job exits normally. If SGE terminates the job because it
>>> exceeds the requested CPU time, or if a user manually terminates a job
>>> with qdel, the files are not copied back from the node's local hard disk
>>> to the RAID. Is there any way around this?
>>
>> yes.
>>
>>> Thanks,
>>> Bart
>>>
>>
>> You have to submit the job with -notify
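
As background to -notify: according to the qsub documentation, the flag makes SGE send a warning signal ahead of the real one, SIGUSR1 before a SIGSTOP and SIGUSR2 before a SIGKILL, with the delay taken from the queue's notify setting. A minimal sketch of receiving such a warning in bash follows. It runs without SGE by signalling itself, and the names on_warning and warned are made up for the illustration.

```shell
#!/bin/bash
# Sketch only: shows a bash script noticing a USR2 "warning" signal.
# Under SGE the signal would come from the execution daemon when the
# job was submitted with -notify; here the script signals itself.

warned=no

# Hypothetical handler: just record that the warning arrived.
on_warning() {
    warned=yes
}

trap on_warning USR2

# Stand-in for SGE's warning: send USR2 to this very shell.
kill -USR2 $$

echo "warning received: $warned"
```

In a real job the flag can be given on the command line (qsub -notify job.sh) or as a "#$ -notify" line in the script header.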
>>
>>
>>> #!/bin/bash
>>>
>>> #$ -S /bin/bash
>>> #$ -j y
>>> #$ -N helloworld_test
>>> #$ -l h_cpu=00:02:00
>>> #$ -cwd
>>
>> # Two single quotes (an empty string): the job script itself ignores both signals
>> trap '' usr1 usr2
>>
>>
>>> # Copy job files to local scratch space
>>> JOBFILE=jobfiles.job-id-$JOB_ID.tgz
>>> tar cfz $JOBFILE helloworld
>>> cp $JOBFILE $TMPDIR
>>> rm -rf $JOBFILE
>>> cd $TMPDIR
>>> tar xfz $JOBFILE
>>> rm -rf $JOBFILE
>>
>> Maybe you can avoid the local file:
>>
>> tar cj helloworld | tar xj -C $TMPDIR
>> cd $TMPDIR
>>
>>> # Computational command to run
>>> ./helloworld
>>
>> replace with:
>>
>> (trap - usr1 usr2; exec ./helloworld)
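
The idea behind the two trap lines can be tried outside SGE. The job script ignores USR1/USR2, while the subshell restores the default action before exec'ing the program, so a warning signal ends the program but not the script, which can then go on to copy files back. A minimal sketch, with these assumptions: sleep stands in for ./helloworld, the child is started in the background only so that the script can signal it itself, and the variable names are hypothetical.

```shell
#!/bin/bash
# Sketch only: parent ignores USR1/USR2, child gets the default action back.

trap '' USR1 USR2            # the job script itself ignores both signals

# Subshell: reset both signals to their default action, then replace the
# subshell with the real program (sleep is a stand-in for ./helloworld).
(trap - USR1 USR2; exec sleep 30) &
child=$!

sleep 1                      # give the child time to start
kill -USR2 "$child"          # stand-in for the warning signal sent by SGE

wait "$child"
status=$?                    # above 128 when the child was ended by a signal

echo "child exit status: $status"
echo "script still alive, copy-back could run here"
```

Under SGE the program would run in the foreground exactly as in the line above; the background job here exists only to let the demo deliver the signal.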
>>
>>
>>> # Copy all files back.
>>> OUTFILE=outfiles.job-id-$JOB_ID.tgz
>>> tar cfz $OUTFILE *
>>> cp $OUTFILE $SGE_CWD_PATH
>>> cd $SGE_CWD_PATH
>>> tar xfz $OUTFILE
>>> rm -rf $OUTFILE
>>
>> tar -cj * | tar xj -C $SGE_CWD_PATH
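
A tar-to-tar pipe of this kind can be checked with two throwaway directories before putting it into a job. The sketch below is only an illustration: src and dest are hypothetical stand-ins for $TMPDIR and $SGE_CWD_PATH, and compression is left out because both ends of the pipe run on the same host.

```shell
#!/bin/bash
# Sketch only: copy a directory tree through a pipe, no intermediate archive.

src=$(mktemp -d)             # stand-in for the local scratch directory
dest=$(mktemp -d)            # stand-in for the submit directory

# Some fake job output, including a subdirectory.
echo "result" > "$src/out.txt"
mkdir "$src/sub"
echo "nested" > "$src/sub/more.txt"

# Pack everything below $src to stdout and unpack it in $dest.
(cd "$src" && tar -cf - .) | tar -xf - -C "$dest"

ls -R "$dest"
```

Archiving "." instead of "*" also picks up dot files, which a shell glob would skip by default.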
>>
>>
>> HTH - Reuti
>>
>>
>>> ------------------------------------------------------
>>> http://gridengine.sunsource.net/ds/viewMessage.do?
>>> dsForumId=38&dsMessageId=92671
>>>
>>> To unsubscribe from this discussion, e-mail: [users-
>>> unsubscribe at gridengine.sunsource.net].
>>>
>>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=92678
