[GE users] intensive job

Mag Gam magawake at gmail.com
Sat Oct 25 15:20:36 BST 2008



Reuti:

As usual, thank you! This is very helpful, but perhaps I should back up a little.

Does "qsub -l virtual_free=40g" reserve that memory, or does it wait until
that much memory is free? Also, what if a non-GRID user is using the
servers? I assume SGE will not account for that, or will it?


My intention is this:
I have a file with 1000000 records.

I split it into 10 blocks:
100000.a
100000.b
100000.c
....
100000.j


I also have a wrapper script like this.

#!/bin/ksh
#wrapper script -- wrapper.sh <filename>
#$ -cwd
#$ -V
#$ -N fluid
#$ -S /bin/ksh

file=$1
# Note: $SGE_TASK_ID is only a number for array jobs (qsub -t); otherwise it is "undefined"
java -Xmx40000m fluid0 < "$file" > "out.$SGE_TASK_ID.dat"

I invoke the script like this:
qsub -l virtual_free=40g ./wrapper.sh 100000.a
qsub -l virtual_free=40g ./wrapper.sh 100000.b
...
qsub -l virtual_free=40g ./wrapper.sh 100000.j


I have tried to use the -t option for an array job, but it was not
working for some reason.
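
A minimal sketch of how a single array job could cover the ten blocks, assuming the blocks keep the names 100000.a to 100000.j and that the job is submitted with something like "qsub -t 1-10 -l virtual_free=40g ./wrapper.sh". The helper name task_suffix is hypothetical, and the real java call is left as a comment so the sketch runs outside SGE:

```shell
#!/bin/sh
# Map a numeric array task ID (1..10) to the block suffix a..j.
task_suffix() {
    # cut -c N picks the N-th character of the suffix alphabet
    printf '%s\n' "abcdefghij" | cut -c "$1"
}

# SGE sets SGE_TASK_ID for array jobs; default to 1 so the sketch runs standalone.
task_id=${SGE_TASK_ID:-1}
block="100000.$(task_suffix "$task_id")"
out="out.$(printf '%02d' "$task_id").dat"

echo "task $task_id would read $block and write $out"
# Inside a real job script the echo would be replaced by something like:
#   java -Xmx40000m fluid0 < "$block" > "$out"
```

With this layout no filename argument is needed on the qsub command line, since each task derives its own block from SGE_TASK_ID.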

Any thoughts about this method?

TIA


On Sat, Oct 25, 2008 at 7:14 AM, Reuti <reuti at staff.uni-marburg.de> wrote:
> Hi Mag,
>
> On 25.10.2008 at 02:40, Mag Gam wrote:
>
>> Hello All.
>>
>> We have a professor who is notorious for bringing down our engineering
>> GRID (64 servers) with his direct numerical simulations. He
>> basically runs a Java program with -Xmx40000m (40 gigs). This
>> preallocates 40 gigs of memory and then crashes the box because there
>
> This looks more like you have to set up SGE to manage the memory, and then
> request the necessary amount of memory for the job by submitting it with
> "qsub -l virtual_free=40g ..."
>
> http://gridengine.sunsource.net/servlets/ReadMsg?listName=users&msgNo=15079
>
>> are other processes running on the box. Each box has 128G of physical
>> memory. He runs the application like this:
>> cat series | java -Xmx40000m fluid0 > out.dat
>>
>> the "series" file has over 10 million records.
>>
>> I was thinking of something like this: split the 10 million records
>> into 10 files (each file has 1 million records), submit 10 array jobs,
>> and then output to out.dat. But the order for 'out.dat' matters! I
>> would like to run these 10 jobs independently, but how can I maintain
>> order?  Or is there a better way to do this?
>>
>> Having him submit his job as it is now would not be wise...
>
> You mean: one array job with 10 tasks - right? So "qsub -t 1-10 my_job".
>
> In each jobscript you can use (note the +1 on the start line to avoid the
> usual off-by-one problem; "series" is the input file):
>
> sed -n -e "$(( (SGE_TASK_ID-1)*1000000+1 )),$(( SGE_TASK_ID*1000000 ))p" series \
>   | java -Xmx40000m fluid0 > out${SGE_TASK_ID}.dat
>
> hence output only the necessary lines of the input file and create a unique
> output file for each task of an array job. Also for the output file, maybe
> it's not necessary to concat them into one file, as you can sometimes use a
> construct like:
>
> cat out*.dat | my_pgm
>
> for further processing. With more than 9 tasks this would lead to the wrong
> order 1, 10, 2, 3, ... and you need a variant of the above command:
>
> sed -n -e "$(( (SGE_TASK_ID-1)*1000000+1 )),$(( SGE_TASK_ID*1000000 ))p" series \
>   | java -Xmx40000m fluid0 > out$(printf "%02d" $SGE_TASK_ID).dat
>
> for having leading zeros for the index in the name of the output file.
>
> -- Reuti
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
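
The line-range slicing and zero-padded output names described in the quoted reply can be tried out on a small scale. This is only a sketch under stated assumptions: the file name series.txt, the chunk size of 3, the function name run_task, and the use of cat in place of the real "java -Xmx40000m fluid0" filter are all illustrative, and the tasks are looped over locally instead of being scheduled by SGE.

```shell
#!/bin/sh
# Sketch: slice a record file into per-task line ranges and keep outputs in order.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

CHUNK=3        # lines per task (1000000 in the thread)
NTASKS=12      # more than 9 tasks, so zero padding matters

# Generate 36 numbered records as stand-in input.
awk 'BEGIN { for (i = 1; i <= 36; i++) print i }' > series.txt

run_task() {
    task=$1
    first=$(( (task - 1) * CHUNK + 1 ))   # +1 avoids overlapping the previous task
    last=$(( task * CHUNK ))
    # cat stands in for the real filter program
    sed -n -e "${first},${last}p" series.txt | cat > "out$(printf '%02d' "$task").dat"
}

# In an array job each task would call run_task "$SGE_TASK_ID" once;
# here the tasks are simply run one after another.
id=1
while [ "$id" -le "$NTASKS" ]; do
    run_task "$id"
    id=$(( id + 1 ))
done

# Zero-padded names sort correctly, so the glob restores the original order.
cat out*.dat > combined.dat
```

Because out01.dat through out12.dat sort in numeric order, the concatenated file matches the original input line for line, which is the property needed when the order of out.dat matters.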
