[GE users] intensive job

Reuti reuti at staff.uni-marburg.de
Sat Oct 25 18:30:13 BST 2008


Am 25.10.2008 um 16:20 schrieb Mag Gam:

> Reuti:
>
> As usual, thank you! This is very helpful, but perhaps I should back
> up a little.
>
> "qsub -l virtual_free=40g" does that reserve space or does it wait for
> that space?

As long as only SGE jobs are running: both.

> Also, what if a user (non GRID) is using the servers. I
> assume SGE will not account for that, or will it?

This is always unpredictable. Can you force your interactive users to
go through SGE by requesting an interactive job? Then you would need
h_vmem instead of virtual_free to enforce the limits for both types
of jobs.
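A sketch of what that setup could look like (the attribute line and host
capacity below are assumptions to adapt to your site; see complex(5) and
host_conf(5)):

```shell
# Make h_vmem a consumable, requestable resource (edit via "qconf -mc"):
#   h_vmem   h_vmem   MEMORY   <=   YES   YES   0   0
# Set the per-host capacity on each execution host (edit via "qconf -me node01"):
#   complex_values   h_vmem=128G
# Interactive users then also go through SGE:
qlogin -l h_vmem=4G
# and batch jobs request their limit the same way:
qsub -l h_vmem=40G wrapper.sh
```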

> My intention is this:
> I have a 1000000-record file
>
> I split it into 10 blocks
> 100000.a
> 100000.b
> 100000.c
> ....
> 100000.j

When you have already split them, you will need to rename them to
100000.1 ... 100000.10.
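A small loop can do the renaming; here is a self-contained sketch (the
dummy files are created only so the example runs on its own):

```shell
# Rename the letter-suffixed chunks to the numeric suffixes that
# $SGE_TASK_ID can address (a -> 1, ..., j -> 10).
cd "$(mktemp -d)"
for s in a b c d e f g h i j; do touch "100000.$s"; done   # dummy chunks

i=1
for s in a b c d e f g h i j; do
    mv "100000.$s" "100000.$i"
    i=$((i + 1))
done
ls 100000.*
```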

> I also have a wrapper script like this.
>
> #!/bin/ksh
> #wrapper script -- wrapper.sh <filename>
> #$ -cwd
> #$ -V
> #$ -N fluid
> #$ -S /bin/ksh
>
> file=$1
> cat $file | java -Xmx40000m fluid0 > out.$SGE_TASK_ID.dat
>
> I invoke the script like this:
> qsub -l virtual_free=40g ./wrapper.sh 100000.a
> qsub -l virtual_free=40g ./wrapper.sh 100000.b
> ...
> qsub -l virtual_free=40g ./wrapper.sh 100000.j

Please first try a simple job to see how array jobs are handled:

#!/bin/sh
echo "I'm task $SGE_TASK_ID and will read 100000.$SGE_TASK_ID to produce out.$SGE_TASK_ID"
sleep 60
exit 0

and start it with:

qsub -t 1-10 script.sh
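Once that works, the slicing and naming scheme from my earlier mail can
also be checked standalone. A self-contained sketch (a 30-line dummy
file and CHUNK=10 stand in for "series" and 1000000, and the Java step
is left out so it runs anywhere):

```shell
# Each task extracts its own slice of the input and writes a
# zero-padded output file, as the real array job would.
cd "$(mktemp -d)"
seq 1 30 > series
CHUNK=10
for SGE_TASK_ID in 1 2 3; do    # in a real array job, SGE sets this
    START=$(( (SGE_TASK_ID - 1) * CHUNK + 1 ))
    END=$(( SGE_TASK_ID * CHUNK ))
    sed -n "${START},${END}p" series > "out$(printf '%02d' "$SGE_TASK_ID").dat"
done
echo out*.dat   # zero-padding keeps the glob in numeric order: out01.dat out02.dat out03.dat
```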

-- Reuti


>
> I have tried to use the -t option for an array job, but it was not
> working for some reason.
>
> Any thoughts about this method?
>
> TIA
>
>
> On Sat, Oct 25, 2008 at 7:14 AM, Reuti <reuti at staff.uni-marburg.de>  
> wrote:
>> Hi Mag,
>>
>> Am 25.10.2008 um 02:40 schrieb Mag Gam:
>>
>>> Hello All.
>>>
>>> We have a professor who is notorious for bringing down our
>>> engineering GRID (64 servers) due to his direct numerical
>>> simulations. He basically runs a Java program with -Xmx40000m
>>> (40 gigs). This preallocates 40 gigs of memory and then crashes
>>> the box because there
>>
>> This looks like you have to set up SGE to manage the memory:
>> request the necessary amount of memory for the job and submit it
>> with "qsub -l virtual_free=40g ..."
>>
>> http://gridengine.sunsource.net/servlets/ReadMsg?listName=users&msgNo=15079
>>
>>> are other processes running on the box. Each box has 128G of  
>>> Physical
>>> memory. He runs the application like this:
>>> cat series | java -Xmx40000m fluid0 > out.dat
>>>
>>> the "series" file has over 10 million records.
>>>
>>> I was thinking of something like this: split the 10 million records
>>> into 10 files (each file has 1 million record), submit 10 array  
>>> jobs,
>>> and then output to out.dat. But the order for 'out.dat' matters! I
>>> would like to run these 10 jobs independently, but how can I  
>>> maintain
>>> order?  Or is there a better way to do this?
>>>
>>> Submitting his current job as-is would not be wise...
>>
>> You mean: one array job with 10 tasks - right? So "qsub -t 1-10  
>> my_job".
>>
>> In each jobscript you can use (adjust for the usual +/- 1 problem  
>> at the
>> beginning and end):
>>
>> sed -n -e "$[(SGE_TASK_ID-1)*1000000+1],$[SGE_TASK_ID*1000000]p" \
>>   series | java -Xmx40000m fluid0 > out${SGE_TASK_ID}.dat
>>
>> hence output only the necessary lines of the input file and create
>> a unique output file for each task of the array job. Also, it may
>> not be necessary to concatenate the output files into one, as you
>> can often use a construct like:
>>
>> cat out*.dat | my_pgm
>>
>> for further processing. With more than 9 tasks this would lead to
>> the wrong order 1, 10, 2, 3, ..., so you need a variant of the
>> above command:
>>
>> sed -n -e "$[(SGE_TASK_ID-1)*1000000+1],$[SGE_TASK_ID*1000000]p" \
>>   series | java -Xmx40000m fluid0 > out$(printf "%02d" $SGE_TASK_ID).dat
>>
>> to get leading zeros for the task index in the name of the output
>> file.
>>
>> -- Reuti
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>>
>
>





