[GE users] intensive job

Mag Gam magawake at gmail.com
Sun Oct 26 13:10:50 GMT 2008



Hello Reuti:

Would it help if I started at 10 instead of 1?

#!/bin/sh
echo "I'm $SGE_TASK_ID and will read 10000.$SGE_TASK_ID to produce out.$SGE_TASK_ID"
sleep 60
exit 0

and start it with:
qsub -t 10 script.sh

Works.



On Sat, Oct 25, 2008 at 1:30 PM, Reuti <reuti at staff.uni-marburg.de> wrote:
> Am 25.10.2008 um 16:20 schrieb Mag Gam:
>
>> Reuti:
>>
>> As usual, thank you! This is very helpful, but perhaps I should back up
>> a little.
>>
>> "qsub -l virtual_free=40g" does that reserve space or does it wait for
>> that space?
>
> As long as there are only SGE's jobs: both.
>
>> Also, what if a user (non GRID) is using the servers. I
>> assume SGE will not account for that, or will it?
>
> This is always unpredictable. Can you force your interactive users to go
> through SGE by requesting an interactive job? Then you would need h_vmem
> instead of virtual_free to enforce the limits for both types of jobs.
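[Editor's note: Reuti's suggestion can be illustrated with a short sketch. The resource names (h_vmem, virtual_free) and the 40g figure are from the thread; the qlogin line and the 8g figure are illustrative assumptions for a stock SGE installation, not commands from the discussion. This is a command fragment that needs a running SGE cluster.]

```
# Batch job: request 40 GB so the scheduler only places the job on a node
# with that much memory free; h_vmem (unlike virtual_free) also enforces a
# hard limit that terminates the job if it allocates more.
qsub -l h_vmem=40g ./wrapper.sh

# Interactive users go through SGE too, so their memory is accounted for:
qlogin -l h_vmem=8g
```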
>
>> My intention is this:
>> I have a file with 1000000 records
>>
>> I split it into 10 blocks
>> 100000.a
>> 100000.b
>> 100000.c
>> ....
>> 100000.j
>
> When you have already split them, you will need to rename them to 100000.1
> ... 100000.10.
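[Editor's note: the split-and-rename step can be sketched at toy scale. The `series` input and the 100000.N target names are from the thread; the 10-line sample file and 2-lines-per-chunk size are shrunk for illustration, where the real case would use `-l 1000000`.]

```shell
# Small stand-in for the real input file.
seq 1 10 > series

# Split into fixed-size chunks; split names them chunk.aa, chunk.ab, ...
split -l 2 series chunk.

# Rename the alphabetic suffixes to the numeric names 100000.1 ... that
# an SGE array job can select via $SGE_TASK_ID.
i=1
for f in chunk.*; do
    mv "$f" "100000.$i"
    i=$((i + 1))
done

ls 100000.*    # 100000.1 100000.2 100000.3 100000.4 100000.5
```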
>
>> I also have a wrapper script like this.
>>
>> #!/bin/ksh
>> #wrapper script -- wrapper.sh <filename>
>> #$ -cwd
>> #$ -V
>> #$ -N fluid
>> #$ -S /bin/ksh
>>
>> file=$1
>> cat "$file" | java -Xmx40000m fluid0 > out.$SGE_TASK_ID.dat
>>
>> I invoke the script like this:
>> qsub -l virtual_free=40g ./wrapper.sh 10000.a
>> qsub -l virtual_free=40g ./wrapper.sh 10000.b
>> ...
>> qsub -l virtual_free=40g ./wrapper.sh 10000.j
>
> Please try first a simple job, to see how array jobs are handled:
>
> #!/bin/sh
> echo "I'm $SGE_TASK_ID and will read 10000.$SGE_TASK_ID to produce out.$SGE_TASK_ID"
> sleep 60
> exit 0
>
> and start it with:
>
> qsub -t 10 script.sh
>
> -- Reuti
>
>
>>
>> I have tried to use the -t option for an array job, but it was not
>> working for some reason.
>>
>> Any thoughts about this method?
>>
>> TIA
>>
>>
>> On Sat, Oct 25, 2008 at 7:14 AM, Reuti <reuti at staff.uni-marburg.de> wrote:
>>>
>>> Hi Mag,
>>>
>>> Am 25.10.2008 um 02:40 schrieb Mag Gam:
>>>
>>>> Hello All.
>>>>
>>>> We have a professor who is notorious for bringing down our engineering
>>>> GRID servers (64 of them) due to his direct numerical simulations. He
>>>> basically runs a Java program with -Xmx40000m (40 gigs). This
>>>> preallocates 40 gigs of memory and then crashes the box because there
>>>
>>> this looks like you need to set up SGE to manage the memory, request
>>> the necessary amount of memory for the job, and submit it with
>>> "qsub -l virtual_free=40g ..."
>>>
>>>
>>> http://gridengine.sunsource.net/servlets/ReadMsg?listName=users&msgNo=15079
>>>
>>>> are other processes running on the box. Each box has 128G of physical
>>>> memory. He runs the application like this:
>>>> cat series | java -Xmx40000m fluid0 > out.dat
>>>>
>>>> the "series" file has over 10 million records.
>>>>
>>>> I was thinking of something like this: split the 10 million records
>>>> into 10 files (each file has 1 million record), submit 10 array jobs,
>>>> and then output to out.dat. But the order for 'out.dat' matters! I
>>>> would like to run these 10 jobs independently, but how can I maintain
>>>> order?  Or is there a better way to do this?
>>>>
>>>> Letting him submit his current job as-is would not be wise...
>>>
>>> You mean: one array job with 10 tasks - right? So "qsub -t 1-10 my_job".
>>>
>>> In each jobscript you can use (adjust for the usual +/- 1 problem at the
>>> beginning and end):
>>>
>>> sed -n -e $[(SGE_TASK_ID-1)*1000000],$[SGE_TASK_ID*1000000]p series |
>>> java -Xmx40000m fluid0 > out${SGE_TASK_ID}.dat
>>>
>>> hence output only the necessary lines of the input file and create a
>>> unique output file for each task of the array job. As for the output
>>> files, maybe it's not necessary to concatenate them into one file, as
>>> you can sometimes use a construct like:
>>>
>>> cat out*.dat | my_pgm
>>>
>>> for further processing. With more than 9 tasks this would lead to the
>>> wrong order 1, 10, 2, 3, ... and you need a variant of the above command:
>>>
>>> sed -n -e $[(SGE_TASK_ID-1)*1000000],$[SGE_TASK_ID*1000000]p series |
>>> java -Xmx40000m fluid0 > out$(printf "%02d" $SGE_TASK_ID).dat
>>>
>>> for having leading zeros for the index in the name of the output file.
>>>
>>> -- Reuti
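[Editor's note: the slicing arithmetic and the zero-padded naming can be tried at toy scale without SGE. This is a sketch: the loop stands in for SGE setting $SGE_TASK_ID per array task, 2 lines per task replace the real 1000000, and a plain redirect replaces the java pipeline. The start line gets the +1 adjustment Reuti mentions, so task 1 reads lines 1-2 rather than 0-2.]

```shell
seq 1 10 > series            # stand-in for the real input file
LINES_PER_TASK=2             # the real case would use 1000000

for SGE_TASK_ID in 1 2 3 4 5; do      # SGE would set this per array task
    # +1 on the start line fixes the usual off-by-one at the beginning.
    FIRST=$(( (SGE_TASK_ID - 1) * LINES_PER_TASK + 1 ))
    LAST=$((  SGE_TASK_ID * LINES_PER_TASK ))
    sed -n "${FIRST},${LAST}p" series > "out$(printf '%02d' "$SGE_TASK_ID").dat"
done

cat out*.dat    # zero-padded names keep the glob in numeric order: 1 ... 10
```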
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>
>>>
>>
>>
>
>
>
>
