Opened 16 years ago
Last modified 10 years ago
#197 new enhancement
IZ1254: Entry in PE to change multiplication of resource limits
Reported by: | reuti | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | sge | Version: | 6.0 |
Severity: | | Keywords: | Linux scheduling |
Cc: | | | |
Description
[Imported from gridengine issuezilla http://gridengine.sunsource.net/issues/show_bug.cgi?id=1254]
Issue #: 1254
Platform: Other
Reporter: reuti (reuti)
Component: gridengine
OS: Linux
Subcomponent: scheduling
Version: 6.0
CC: uddeborg
Status: NEW
Priority: P3
Issue type: ENHANCEMENT
Target milestone: ---
Assigned to: andreas (andreas)
QA Contact: andreas
Summary: Entry in PE to change multiplication of resource limits
Opened: Fri Aug 27 02:10:00 -0700 2004

------------------------

In the current implementation, the RESOURCE LIMITS of a queue/job are multiplied by the number of slots taken on the master machine (and, it seems, not for qrsh processes). There should be two switches in the configuration of a PE:

    multiply_limits_for_master_process
    multiply_limits_for_slave_processes

On the one hand, you have to multiply the resource limits to get the correct limits for the master processes (I also found that there is a time delay until the resource consumption of all child processes is accounted to the mother process, in contrast to the immediate enforcement of the limits for a single process).

On the other hand, the multiplication may be wrong for processes that create child processes by (q)rsh, e.g. Gaussian03 with Linda. You can request 8 processes on 4 machines and decide in the Gaussian input file to make 7 (q)rsh calls, or to make only 3 (q)rsh calls and create the other tasks as threads. You have to decide this from job to job, because some calculation types are only Linda parallel, while others are only thread parallel.

With the switches available for the PEs, I would just create two PEs and would get the correct limits for each job type.

------- Additional comments from reuti Fri Aug 27 05:21:45 -0700 2004 -------

To limit the number of (q)rsh commands allowed by SGE, maybe it would be better to have an entry:

    limit_to_one_qrsh_per_host yes/no

instead of the suggested:

    multiply_limits_for_slave_processes yes/no

The latter should also be applied, but can be derived from the first one.

------- Additional comments from sgrell Mon Dec 12 02:44:19 -0700 2005 -------

Changed subcomponent.

Stephan

------- Additional comments from reuti Thu Aug 24 12:33:46 -0700 2006 -------

A similar feature would be to allow or disallow the multiplication of resource requests:

    multiply_resource_requests

------- Additional comments from uddeborg Thu May 31 09:17:55 -0700 2007 -------

I find it a bit silly to have to add a comment just to add yourself to the CC list. :-)

------- Additional comments from reuti Fri Apr 25 05:13:18 -0700 2008 -------

Over time I have come to think that it is better to specify this additionally at the complex level, with an additional column "multiply yes/no". The reason is that, e.g., a memory limit might be needed per slot while at the same time the license is per job.

OTOH, different parallel jobs in the same cluster may or may not need the multiplication of a memory limit (OpenMP jobs work on the same memory area, while MPI ones don't). Hence the entry in the PE would still be advantageous.

------- Additional comments from reuti Fri Apr 25 05:25:38 -0700 2008 -------

Or an entry in the PE listing the complexes that are not to be multiplied.

------- Additional comments from roland Fri Jan 30 06:02:22 -0700 2009 -------

Reuti's note from Apr 25 2008 will be implemented in 6.2u2 by the non-multiplied resource requests. For more information, please see:

http://gridengine.sunsource.net/nonav/source/browse/~checkout~/gridengine/doc/devel/rfe/non-multiplied-pe-requests.txt

------- Additional comments from reuti Fri Jan 30 06:46:04 -0700 2009 -------

It is nice of course to get this feature, as it will solve the odd handling of licenses where only one is needed per job, but it will not be enough to cover a mix of jobs in the cluster. That is why I wrote that an entry in the PE would still be advantageous: h_vmem can only be JOBS or YES.

See also: http://gridengine.sunsource.net/ds/viewMessage.do?dsMessageId=98580&dsForumId=38

Shall I enter a new issue for this?

------- Additional comments from roland Fri Jan 30 06:55:59 -0700 2009 -------

There is no need to create a new issue because I did not change the state of this one.
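
For readers following the thread, the two PE switches requested in the original report could behave roughly as sketched here. This is only an illustration under stated assumptions: `ParallelEnvironment` and `effective_limit` are hypothetical names, not part of Grid Engine, and the default values simply mirror the behaviour described in the report (limits multiplied on the master machine, apparently not for qrsh-started processes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelEnvironment:
    """Hypothetical PE settings. The two switches are the ones proposed in
    this ticket; they are NOT existing Grid Engine PE attributes."""
    name: str
    # Assumed default: matches the reported current behaviour on the master.
    multiply_limits_for_master_process: bool = True
    # Assumed default: the report says qrsh processes seem not to be multiplied.
    multiply_limits_for_slave_processes: bool = False


def effective_limit(per_slot_limit: int, slots_on_host: int,
                    pe: ParallelEnvironment, is_master: bool) -> int:
    """Return the limit applied to one process on a host.

    The per-slot limit is multiplied by the slots granted on that host only
    when the switch matching the process role is enabled.
    """
    if slots_on_host < 1:
        raise ValueError("slots_on_host must be at least 1")
    if is_master:
        multiply = pe.multiply_limits_for_master_process
    else:
        multiply = pe.multiply_limits_for_slave_processes
    return per_slot_limit * slots_on_host if multiply else per_slot_limit


# Two PEs for the Gaussian03/Linda case: 8 processes on 4 machines,
# i.e. 2 slots per host.
linda_pe = ParallelEnvironment("g03_linda")
threads_pe = ParallelEnvironment("g03_threads",
                                 multiply_limits_for_slave_processes=True)
```

With a per-slot limit of 1000 and 2 slots per host, a slave process under `g03_linda` keeps the limit of 1000 (one (q)rsh per slot), while under `g03_threads` it gets 2000 (one (q)rsh per host, the second slot used by a thread). Which settings are actually right for a given job type is the submitter's decision, as the report notes.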