[GE users] intensive job

Reuti reuti at staff.uni-marburg.de
Sun Oct 26 13:57:00 GMT 2008


Hi,

Am 26.10.2008 um 14:10 schrieb Mag Gam:

> Hello Reuti:
>
> Would it help if I started at 10 instead of 1?

Sure, in this case you would just need the files *.10 to *.19, if you want
to avoid computing zero-padded names for *.01 to *.10 in the job script.

qsub -t 10-19 ...
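A minimal sketch of what the complete array job could then look like, assuming the input blocks are named 10000.10 to 10000.19 and the Java class is fluid0, as elsewhere in this thread:

#!/bin/sh
#$ -cwd
#$ -S /bin/sh
#$ -N fluid
# each task reads its own input block and writes its own output file
java -Xmx40000m fluid0 < 10000.$SGE_TASK_ID > out.$SGE_TASK_ID.dat
exit 0

submitted with:

qsub -t 10-19 -l virtual_free=40g script.sh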

-- Reuti


> #!/bin/sh
> echo "I'm $SGE_TASK_ID and will read 10000.$SGE_TASK_ID to produce
> out.$SGE_TASK_ID"
> sleep 60
> exit 0
>
> and start it with:
> qsub -t 10 script.sh
>
> Works.
>
>
>
> On Sat, Oct 25, 2008 at 1:30 PM, Reuti <reuti at staff.uni-marburg.de>  
> wrote:
>> Am 25.10.2008 um 16:20 schrieb Mag Gam:
>>
>>> Reuti:
>>>
>>> As usual, thank you! This is very helpful, but perhaps I should back
>>> up a little.
>>>
>>> "qsub -l virtual_free=40g" does that reserve space or does it  
>>> wait for
>>> that space?
>>
>> As long as there are only SGE's jobs: both.
>>
>>> Also, what if a user (non-GRID) is using the servers? I
>>> assume SGE will not account for that, or will it?
>>
>> This is always unpredictable. Can you force your interactive users to go
>> through SGE by requesting an interactive job? Then you would need h_vmem
>> instead of virtual_free to enforce the limits for both types of jobs.
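A minimal sketch of what such requests could look like (the 40g figure is just the value used in this thread):

# batch job with a hard per-job memory limit
qsub -l h_vmem=40g ./wrapper.sh 10000.a

# interactive session routed through SGE with the same limit
qlogin -l h_vmem=40g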
>>
>>> My intention is this:
>>> I have a file with 1000000 records
>>>
>>> I split it into 10 blocks
>>> 100000.a
>>> 100000.b
>>> 100000.c
>>> ....
>>> 100000.j
>>
>> When you have split them already, you will need to rename them to
>> 100000.1 ... 100000.10.
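A minimal sketch of one way to do the renaming, assuming the blocks are named 100000.a to 100000.j as listed above:

#!/bin/sh
# map the letter suffixes a..j onto the numeric task IDs 1..10
i=1
for s in a b c d e f g h i j; do
  mv 100000.$s 100000.$i
  i=$((i+1))
done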
>>
>>> I also have a wrapper script like this.
>>>
>>> #!/bin/ksh
>>> #wrapper script -- wrapper.sh <filename>
>>> #$ -cwd
>>> #$ -V
>>> #$ -N fluid
>>> #$ -S /bin/ksh
>>>
>>> file=$1
>>> cat $file | java -Xmx40000m fluid0 > out.$SGE_TASK_ID.dat
>>>
>>> I invoke the script like this:
>>> qsub -l virtual_free=40g ./wrapper.sh 10000.a
>>> qsub -l virtual_free=40g ./wrapper.sh 10000.b
>>> ...
>>> qsub -l virtual_free=40g ./wrapper.sh 10000.j
>>
>> Please try first a simple job, to see how array jobs are handled:
>>
>> #!/bin/sh
>> echo "I'm $SGE_TASK_ID and will read 10000.$SGE_TASK_ID to produce
>> out.$SGE_TASK_ID"
>> sleep 60
>> exit 0
>>
>> and start it with:
>>
>> qsub -t 10 script.sh
>>
>> -- Reuti
>>
>>
>>>
>>> I have tried to use the -t option for an array job, but it was not
>>> working for some reason.
>>>
>>> Any thoughts about this method?
>>>
>>> TIA
>>>
>>>
>>> On Sat, Oct 25, 2008 at 7:14 AM, Reuti <reuti at staff.uni-marburg.de>
>>> wrote:
>>>>
>>>> Hi Mag,
>>>>
>>>> Am 25.10.2008 um 02:40 schrieb Mag Gam:
>>>>
>>>>> Hello All.
>>>>>
>>>>> We have a professor who is notorious for bringing down our engineering
>>>>> GRID (64 servers) due to his direct numerical simulations. He
>>>>> basically runs a Java program with -Xmx40000m (40 gigs). This
>>>>> preallocates 40 gigs of memory and then crashes the box because there
>>>>
>>>> This looks more like you have to set up SGE to manage the memory,
>>>> request the necessary amount of memory for the job, and submit it with
>>>> "qsub -l virtual_free=40g ..."
>>>>
>>>>
>>>> http://gridengine.sunsource.net/servlets/ReadMsg?listName=users&msgNo=15079
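A rough sketch of the kind of setup the linked post describes: virtual_free is made a consumable resource, so SGE can subtract each job's request from a per-host memory pool. The host name "node01" and the 128G value below are placeholders, not taken from this thread:

# make virtual_free consumable: edit its line via "qconf -mc"
#   virtual_free   vf   MEMORY   <=   YES   YES   0   0
#
# declare how much memory an execution host offers via "qconf -me node01"
#   complex_values   virtual_free=128G
#
# jobs then reserve their share at submit time:
qsub -l virtual_free=40g ./wrapper.sh 10000.a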
>>>>
>>>>> are other processes running on the box. Each box has 128G of physical
>>>>> memory. He runs the application like this:
>>>>> cat series | java -Xmx40000m fluid0 > out.dat
>>>>>
>>>>> the "series" file has over 10 million records.
>>>>>
>>>>> I was thinking of something like this: split the 10 million records
>>>>> into 10 files (each file has 1 million records), submit 10 array jobs,
>>>>> and then output to out.dat. But the order for 'out.dat' matters! I
>>>>> would like to run these 10 jobs independently, but how can I maintain
>>>>> order? Or is there a better way to do this?
>>>>>
>>>>> Letting him submit his current job as it is would not be wise...
>>>>
>>>> You mean: one array job with 10 tasks - right? So "qsub -t 1-10  
>>>> my_job".
>>>>
>>>> In each jobscript you can use (adjust for the usual +/- 1 problem at
>>>> the beginning and end):
>>>>
>>>> sed -n -e $[(SGE_TASK_ID-1)*1000000],$[SGE_TASK_ID*1000000]p series |
>>>> java -Xmx40000m fluid0 > out${SGE_TASK_ID}.dat
>>>>
>>>> hence output only the necessary lines of the input file and create a
>>>> unique output file for each task of an array job. Also, for the output
>>>> files, maybe it's not necessary to concat them into one file, as you
>>>> can sometimes use a construct like:
>>>>
>>>> cat out*.dat | my_pgm
>>>>
>>>> for further processing. With more than 9 tasks this would lead to the
>>>> wrong order 1, 10, 2, 3, ..., so you need a variant of the above command:
>>>>
>>>> sed -n -e $[(SGE_TASK_ID-1)*1000000],$[SGE_TASK_ID*1000000]p series |
>>>> java -Xmx40000m fluid0 > out$(printf "%02d" $SGE_TASK_ID).dat
>>>>
>>>> for having leading zeros for the index in the name of the output  
>>>> file.
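A small sketch of why the padding matters: with zero-padded task indices the shell glob expands in the intended numeric order, so a plain concatenation stays in sequence (my_pgm is the placeholder name used above):

# out01.dat out02.dat ... out10.dat sort lexicographically = numerically
cat out*.dat | my_pgm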
>>>>
>>>> -- Reuti
>>>>


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



