[GE users] multiplied resource limits with pe

Reuti reuti at staff.uni-marburg.de
Thu Aug 24 17:27:16 BST 2006


On 24.08.2006 at 17:15, Gerd Marquardt wrote:

> Hello,
> we have a cluster with 4-way-nodes.
> When running a simple job which uses a parallel environment and requests
> some resource limits, the limits are multiplied.
> For example, this job:
> #$ -pe mpi 4
> #$ -l h_vmem=128M
> #$ -l h_stack=32M
> echo ulimit -d
> ulimit -d
> echo ulimit -s
> ulimit -s
>
> produces this output:
> ulimit -d
> 524288
> ulimit -s
> 131072
>
> The limits are multiplied by 4.
> How can I avoid this behavior?
>
> We run Red Hat EL4 and have defined a PE for our shared-memory
> programs (OpenMP). Here a requested time limit (h_cpu) is also
> multiplied. The limit has no effect on the process as a whole, only on
> the individual threads, so the process usually runs 4 times too long.

AFAIK:

Using Forks:

You are right that each forked process gets the multiplied limit, but SGE  
keeps track of the accumulated consumption of all processes and will kill  
the job once the requested limit is exceeded.
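
As a minimal sketch (reusing the requests from the quoted job; the fork  
loop itself is only illustrative, not from the original script), a job  
like this shows each forked child inheriting the multiplied per-process  
limits, while SGE still accounts for the total consumption of all children:

#!/bin/bash
#$ -pe mpi 4
#$ -l h_vmem=128M
#$ -l h_stack=32M
# every forked child inherits the multiplied limits (4 x 128M data, 4 x 32M stack)
for i in 1 2 3 4; do
    ( echo "child $i: ulimit -d = $(ulimit -d), ulimit -s = $(ulimit -s)" ) &
done
wait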

Using Threads:

All CPU time is accounted against the main thread, so the SGE and kernel  
limits should behave the same way, and the job will likewise be killed.  
You can test this by running the program interactively and watching the  
consumed CPU time in top, which should advance faster than your watch.
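
As a quick interactive check (my_openmp_prog and the thread count are  
only placeholders, not from the original post):

export OMP_NUM_THREADS=4
time ./my_openmp_prog
# with 4 busy threads, the reported "user" CPU time should grow roughly
# 4 times as fast as the "real" (wall-clock) time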

-- Reuti 
