[GE users] intensive job

Reuti reuti at staff.uni-marburg.de
Mon Oct 27 10:40:02 GMT 2008


Hi Mag,

On 26.10.2008 at 23:26, Mag Gam wrote:

>> -) If also other jobs should run there: make virtual_free or h_vmem
>> consumable and request the proper amount like I mentioned in my
>> first reply. When the memory is used up, no further jobs will be
>> scheduled there. All jobs must request either virtual_free or
>> h_vmem, so you will have to define a sensible default for it in the
>> complex configuration (qconf -mc)
>
> I am going with this option.
>
> I am submitting jobs like this:
> qsub -l h_vmem=40g script.sh

Did you read the first sentence carefully: "...make virtual_free or
h_vmem consumable..."? Otherwise the limit is just a per-job limit,
and too many jobs can still run on the same machine.
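
To sketch what that means (the 2g default below is just an example
value; the columns are the usual ones from "qconf -mc", i.e.
name/shortcut/type/relop/requestable/consumable/default/urgency):

   h_vmem   h_vmem   MEMORY   <=   YES   YES   2g   0

In addition, each execution host needs its installed memory attached
as a complex value, e.g. for a hypothetical 128G host node01:

   qconf -mattr exechost complex_values h_vmem=128g node01

With that in place, "qsub -l h_vmem=40g script.sh" really books 40g
on the chosen host, and the scheduler stops placing further jobs
there once the memory is used up.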

-- Reuti


> The problem is during the array job (-t 10-50): I can see (qstat -f) 5 to
> 10 tasks running on the same box. This is naturally going to cause the
> job to fail. It seems the memory limits and 1 job per host are not
> working. This job is multithreaded - it uses 8 CPUs :-)
>
> So, basically I want this: run only ONE (1) instance of this program
> on a server. Once that job is completed, do the next job. I don't want
> to run more than 1 instance of this job (if that's possible to do with
> array jobs).
>
> Any ideas?
>
> TIA
>
>
>
> On Sun, Oct 26, 2008 at 5:15 PM, Reuti <reuti at staff.uni-marburg.de> wrote:
>> On 26.10.2008 at 20:57, Mag Gam wrote:
>>
>>> Reuti:
>>>
>>> You are right! I did have a memory limit. I removed it and his
>>> application works! Thank you very much.
>>>
>>> Since these are intensive processes, we want to run only 1  
>>> process per
>>> host. To be safe, we can even wait for the process to complete and
>>> then submit a subtask. Is it possible to do that?
>>
>> There are two options:
>>
>> -) If you have just this type of job, you could define the queue as
>> having only one slot per machine (entry "slots" in the queue
>> definition). This way all jobs can be submitted, and they will start
>> one after another on each machine.
>>
>> -) If also other jobs should run there: make virtual_free or h_vmem
>> consumable and request the proper amount like I mentioned in my
>> first reply. When the memory is used up, no further jobs will be
>> scheduled there. All jobs must request either virtual_free or
>> h_vmem, so you will have to define a sensible default for it in the
>> complex configuration (qconf -mc).
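>>
>> As a sketch of the first option (the queue name all.q here is just
>> an example, use your own queue): in "qconf -mq all.q" set
>>
>>    slots                 1
>>
>> and every execution host in that queue will run at most one job at
>> a time; the remaining array tasks simply wait until a host becomes
>> free again.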
>>
>> -- Reuti
>>
>>
>>> I am asking this because I am getting random out of memory messages.
>>>
>>> On Sun, Oct 26, 2008 at 1:44 PM, Reuti <reuti at staff.uni-marburg.de> wrote:
>>>>
>>>> On 26.10.2008 at 18:08, Mag Gam wrote:
>>>>
>>>>> I am certain I don't have any quotas regarding this.
>>>>>
>>>>>
>>>>> qconf -srqs
>>>>> {
>>>>>  name         cpu_limit
>>>>>  description  NONE
>>>>>  enabled      TRUE
>>>>>  limit        users mathprof to slots=8
>>>>> }
>>>>
>>>> Not the resource quotas, the queue configuration (qconf -sq myq).
>>>> But it seems that there are some limits defined, as stack and
>>>> virtual memory are set to 15G.
>>>>
>>>> Only the soft limits are in effect. So: what does an interactive
>>>> "ulimit -aS" show in addition?
>>>>
>>>> The user is only allowed to change the limit in effect (i.e. the
>>>> soft limit) between zero and the hard limit. He can also lower the
>>>> hard limit. But once it's lowered, it can't be raised again
>>>> (unless root is executing these commands).
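>>>>
>>>> As a quick illustration of that rule in a shell (the numbers are
>>>> arbitrary example values):
>>>>
>>>> ulimit -Sv 1000000   # lower the soft limit on virtual memory: always allowed
>>>> ulimit -Sv 2000000   # raise it again: fine, while it stays <= the hard limit
>>>> ulimit -Hv 2000000   # lower the hard limit
>>>> ulimit -Hv 4000000   # fails for a normal user: a hard limit can't be raised again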
>>>>
>>>> -- Reuti
>>>>
>>>>
>>>>>
>>>>>
>>>>> Here is the output for the job (ulimit -aH first, then ulimit -aS):
>>>>>
>>>>> core file size          (blocks, -c) unlimited
>>>>> data seg size           (kbytes, -d) unlimited
>>>>> scheduling priority             (-e) 0
>>>>> file size               (blocks, -f) unlimited
>>>>> pending signals                 (-i) 530431
>>>>> max locked memory       (kbytes, -l) 32
>>>>> max memory size         (kbytes, -m) unlimited
>>>>> open files                      (-n) 1024
>>>>> pipe size            (512 bytes, -p) 8
>>>>> POSIX message queues     (bytes, -q) 819200
>>>>> real-time priority              (-r) 0
>>>>> stack size              (kbytes, -s) unlimited
>>>>> cpu time               (seconds, -t) unlimited
>>>>> max user processes              (-u) 530431
>>>>> virtual memory          (kbytes, -v) unlimited
>>>>> file locks                      (-x) unlimited
>>>>>
>>>>> core file size          (blocks, -c) 0
>>>>> data seg size           (kbytes, -d) 15625000
>>>>> scheduling priority             (-e) 0
>>>>> file size               (blocks, -f) unlimited
>>>>> pending signals                 (-i) 530431
>>>>> max locked memory       (kbytes, -l) 32
>>>>> max memory size         (kbytes, -m) unlimited
>>>>> open files                      (-n) 1024
>>>>> pipe size            (512 bytes, -p) 8
>>>>> POSIX message queues     (bytes, -q) 819200
>>>>> real-time priority              (-r) 0
>>>>> stack size              (kbytes, -s) 15625000
>>>>> cpu time               (seconds, -t) unlimited
>>>>> max user processes              (-u) 530431
>>>>> virtual memory          (kbytes, -v) 15625000
>>>>> file locks                      (-x) unlimited
>>>>>
>>>>>
>>>>> See anything else?
>>>>>
>>>>>
>>>>> On Sun, Oct 26, 2008 at 12:37 PM, Reuti <reuti at staff.uni-marburg.de> wrote:
>>>>>>
>>>>>> On 26.10.2008 at 16:16, Mag Gam wrote:
>>>>>>
>>>>>>> Thanks Reuti as usual!
>>>>>>>
>>>>>>> I have come to this problem now. My Java application is giving
>>>>>>> me this error:
>>>>>>>
>>>>>>> Error occurred during initialization of VM
>>>>>>> Could not reserve enough space for object heap
>>>>>>>
>>>>>>> All of the servers have plenty of free memory, so there is no
>>>>>>> memory contention.
>>>>>>>
>>>>>>> I am submitting the job as qsub script.sh (without any -l  
>>>>>>> options)
>>>>>>>
>>>>>>> However, if I run it via ssh I get the correct results. I am
>>>>>>> not sure why I am getting this error.
>>>>>>>
>>>>>>> I tried to look at this, and it seems you gave some replies
>>>>>>> here, but they are still not helping me :-(
>>>>>>>
>>>>>>> http://fossplanet.com/clustering.gridengine.users/message-1123088-strange-consequence-changing-n1ge/
>>>>>>
>>>>>> Mag,
>>>>>>
>>>>>> this can really be related. Can you please post your queue
>>>>>> configuration - did you define any limits there?
>>>>>>
>>>>>> Another hint would be to submit a job that lists the limits in
>>>>>> effect inside the job, i.e.:
>>>>>>
>>>>>> #!/bin/sh
>>>>>> ulimit -aH
>>>>>> echo
>>>>>> ulimit -aS
>>>>>>
>>>>>> -- Reuti
>>>>>>
>>>>>>>
>>>>>>> Any ideas?
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Oct 26, 2008 at 9:57 AM, Reuti <reuti at staff.uni-marburg.de> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 26.10.2008 at 14:10, Mag Gam wrote:
>>>>>>>>
>>>>>>>>> Hello Reuti:
>>>>>>>>>
>>>>>>>>> Would it help if I started at 10 instead of 1?
>>>>>>>>
>>>>>>>> sure, in this case you would just need the files *.10 to *.19
>>>>>>>> when you want to avoid the computation of canonical names for
>>>>>>>> *.01 to *.10.
>>>>>>>>
>>>>>>>> qsub -t 10-19 ...
>>>>>>>>
>>>>>>>> -- Reuti
>>>>>>>>
>>>>>>>>
>>>>>>>>> #!/bin/sh
>>>>>>>>> echo "I'm $SGE_TASK_ID and will read 10000.$SGE_TASK_ID to produce out.$SGE_TASK_ID"
>>>>>>>>> sleep 60
>>>>>>>>> exit 0
>>>>>>>>>
>>>>>>>>> and start it with:
>>>>>>>>> qsub -t 10 script.sh
>>>>>>>>>
>>>>>>>>> Works.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Oct 25, 2008 at 1:30 PM, Reuti <reuti at staff.uni-marburg.de> wrote:
>>>>>>>>>>
>>>>>>>>>> On 25.10.2008 at 16:20, Mag Gam wrote:
>>>>>>>>>>
>>>>>>>>>>> Reuti:
>>>>>>>>>>>
>>>>>>>>>>> As usual, thank you! This is very helpful, but perhaps I
>>>>>>>>>>> should back up a little.
>>>>>>>>>>>
>>>>>>>>>>> "qsub -l virtual_free=40g" does that reserve space or  
>>>>>>>>>>> does it wait
>>>>>>>>>>> for
>>>>>>>>>>> that space?
>>>>>>>>>>
>>>>>>>>>> As long as there are only SGE's jobs: both.
>>>>>>>>>>
>>>>>>>>>>> Also, what if a (non-GRID) user is using the servers? I
>>>>>>>>>>> assume SGE will not account for that, or will it?
>>>>>>>>>>
>>>>>>>>>> This is always unpredictable. Can you force your interactive
>>>>>>>>>> users to go through SGE by requesting an interactive job?
>>>>>>>>>> Then you would need h_vmem instead of virtual_free to enforce
>>>>>>>>>> the limits for both types of jobs.
>>>>>>>>>>
>>>>>>>>>>> My intention is this:
>>>>>>>>>>> I have a file with 1000000 records.
>>>>>>>>>>>
>>>>>>>>>>> I split it into 10 blocks:
>>>>>>>>>>> 100000.a
>>>>>>>>>>> 100000.b
>>>>>>>>>>> 100000.c
>>>>>>>>>>> ....
>>>>>>>>>>> 100000.j
>>>>>>>>>>
>>>>>>>>>> when you have split them already, you will need to rename
>>>>>>>>>> them to 100000.1 ... 100000.10
>>>>>>>>>>
>>>>>>>>>>> I also have a wrapper script like this.
>>>>>>>>>>>
>>>>>>>>>>> #!/bin/ksh
>>>>>>>>>>> #wrapper script -- wrapper.sh <filename>
>>>>>>>>>>> #$ -cwd
>>>>>>>>>>> #$ -V
>>>>>>>>>>> #$ -N fluid
>>>>>>>>>>> #$ -S /bin/ksh
>>>>>>>>>>>
>>>>>>>>>>> file=$1
>>>>>>>>>>> cat $file | java -Xmx40000m fluid0 > out.$SGE_TASK_ID.dat
>>>>>>>>>>>
>>>>>>>>>>> I invoke the script like this:
>>>>>>>>>>> qsub -l virtual_free=40g ./wrapper.sh 10000.a
>>>>>>>>>>> qsub -l virtual_free=40g ./wrapper.sh 10000.b
>>>>>>>>>>> ...
>>>>>>>>>>> qsub -l virtual_free=40g ./wrapper.sh 10000.j
>>>>>>>>>>
>>>>>>>>>> Please try first a simple job, to see how array jobs are  
>>>>>>>>>> handled:
>>>>>>>>>>
>>>>>>>>>> #!/bin/sh
>>>>>>>>>> echo "I'm $SGE_TASK_ID and will read 10000.$SGE_TASK_ID to produce out.$SGE_TASK_ID"
>>>>>>>>>> sleep 60
>>>>>>>>>> exit 0
>>>>>>>>>>
>>>>>>>>>> and start it with:
>>>>>>>>>>
>>>>>>>>>> qsub -t 10 script.sh
>>>>>>>>>>
>>>>>>>>>> -- Reuti
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I have tried to use the -t option for an array job, but it
>>>>>>>>>>> was not working for some reason.
>>>>>>>>>>>
>>>>>>>>>>> Any thoughts about this method?
>>>>>>>>>>>
>>>>>>>>>>> TIA
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Oct 25, 2008 at 7:14 AM, Reuti <reuti at staff.uni-marburg.de> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Mag,
>>>>>>>>>>>>
>>>>>>>>>>>> On 25.10.2008 at 02:40, Mag Gam wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello All.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We have a professor who is notorious for bringing down our
>>>>>>>>>>>>> engineering GRID (64 servers) due to his direct numerical
>>>>>>>>>>>>> simulations. He basically runs a Java program with
>>>>>>>>>>>>> -Xmx40000m (40 gigs). This preallocates 40 gigs of memory
>>>>>>>>>>>>> and then crashes the box because there
>>>>>>>>>>>>
>>>>>>>>>>>> this looks more like you have to set up SGE to manage the
>>>>>>>>>>>> memory, request the necessary amount of memory for the job,
>>>>>>>>>>>> and submit it with "qsub -l virtual_free=40g ..."
>>>>>>>>>>>>
>>>>>>>>>>>> http://gridengine.sunsource.net/servlets/ReadMsg?listName=users&msgNo=15079
>>>>>>>>>>>>
>>>>>>>>>>>>> are other processes running on the box. Each box has 128G
>>>>>>>>>>>>> of physical memory. He runs the application like this:
>>>>>>>>>>>>> cat series | java -Xmx40000m fluid0 > out.dat
>>>>>>>>>>>>>
>>>>>>>>>>>>> the "series" file has over 10 million records.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I was thinking of something like this: split the 10
>>>>>>>>>>>>> million records into 10 files (each file with 1 million
>>>>>>>>>>>>> records), submit 10 array jobs, and then output to
>>>>>>>>>>>>> out.dat. But the order of 'out.dat' matters! I would like
>>>>>>>>>>>>> to run these 10 jobs independently, but how can I maintain
>>>>>>>>>>>>> the order? Or is there a better way to do this?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Letting him keep submitting his current job as it is would
>>>>>>>>>>>>> not be wise...
>>>>>>>>>>>>
>>>>>>>>>>>> You mean: one array job with 10 tasks - right? So "qsub -t
>>>>>>>>>>>> 1-10 my_job".
>>>>>>>>>>>>
>>>>>>>>>>>> In each jobscript you can use something like the following
>>>>>>>>>>>> (note the +1 on the start address, to take care of the
>>>>>>>>>>>> usual off-by-one at the beginning of each block):
>>>>>>>>>>>>
>>>>>>>>>>>> sed -n -e "$[(SGE_TASK_ID-1)*1000000+1],$[SGE_TASK_ID*1000000]p" series | java -Xmx40000m fluid0 > out${SGE_TASK_ID}.dat
>>>>>>>>>>>>
>>>>>>>>>>>> hence output only the necessary lines of the input file and
>>>>>>>>>>>> create a unique output file for each task of the array job.
>>>>>>>>>>>> Also, maybe it's not even necessary to concatenate the
>>>>>>>>>>>> output files into one file, as you can sometimes use a
>>>>>>>>>>>> construct like:
>>>>>>>>>>>>
>>>>>>>>>>>> cat out*.dat | my_pgm
>>>>>>>>>>>>
>>>>>>>>>>>> for further processing. With more than 9 tasks this would
>>>>>>>>>>>> lead to the wrong order 1, 10, 2, 3, ... and you need a
>>>>>>>>>>>> variant of the above command:
>>>>>>>>>>>>
>>>>>>>>>>>> sed -n -e "$[(SGE_TASK_ID-1)*1000000+1],$[SGE_TASK_ID*1000000]p" series | java -Xmx40000m fluid0 > out$(printf "%02d" $SGE_TASK_ID).dat
>>>>>>>>>>>>
>>>>>>>>>>>> for having leading zeros for the index in the name of the
>>>>>>>>>>>> output file.
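>>>>>>>>>>>>
>>>>>>>>>>>> Putting the pieces together, a complete jobscript for this
>>>>>>>>>>>> approach could look like the following sketch (the input
>>>>>>>>>>>> file name series and the 1000000-line block size are from
>>>>>>>>>>>> this thread; the script name jobscript.sh is arbitrary, and
>>>>>>>>>>>> the memory request assumes that h_vmem or virtual_free was
>>>>>>>>>>>> made consumable as discussed in this thread):
>>>>>>>>>>>>
>>>>>>>>>>>> #!/bin/sh
>>>>>>>>>>>> #$ -cwd
>>>>>>>>>>>> #$ -N fluid
>>>>>>>>>>>> #$ -l h_vmem=40g
>>>>>>>>>>>>
>>>>>>>>>>>> # each task processes its own 1000000-line slice of the input
>>>>>>>>>>>> start=$(( (SGE_TASK_ID - 1) * 1000000 + 1 ))
>>>>>>>>>>>> end=$(( SGE_TASK_ID * 1000000 ))
>>>>>>>>>>>> sed -n -e "${start},${end}p" series | java -Xmx40000m fluid0 > out$(printf "%02d" $SGE_TASK_ID).dat
>>>>>>>>>>>>
>>>>>>>>>>>> submitted as one array job with ten tasks:
>>>>>>>>>>>>
>>>>>>>>>>>> qsub -t 1-10 jobscript.sh
>>>>>>>>>>>>
>>>>>>>>>>>> Note that in practice the memory request should be a bit
>>>>>>>>>>>> larger than the Java heap (-Xmx), since the JVM needs some
>>>>>>>>>>>> memory on top of the heap.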
>>>>>>>>>>>>
>>>>>>>>>>>> -- Reuti
>>>>>>>>>>>>
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net




More information about the gridengine-users mailing list