Custom Query (431 matches)

Results (13 - 15 of 431)

Ticket Resolution Summary Owner Reporter
#1550 fixed Better job scheduling within a sharetree node Mark Dixon <m.c.dixon@…> markdixon
Description

When using the sharetree policy, jobs are assigned a priority based upon a hierarchical tree. Pending jobs located in the same sharetree node are currently sorted by a very simple algorithm - this enhancement is an attempt to help it take parallel jobs into consideration.

I'm hoping this will improve scheduling on a cluster whose job sizes vary by around three orders of magnitude. I'll be trying it out over the next few months. Presumably the functional policy might need a similar modification.

From the patch:

Enhmt #xxx sharetree node priority scaled by slot (not job) count

When a sharetree node has pending jobs, each job was assigned the number of sharetree tickets (stcks) due to the node and then scaled based on how many running and pending jobs that the node had ahead of it - sum(job_ahead).

This changes it to be related to the number of assigned slots that the node has ahead of the job - sum(job_ahead*slots).

e.g. If there are no jobs running and a single job pending, the pending job will still receive the full number of stcks due to the node. If there is one 8 slot job running and one pending, the pending job will receive 1/9 of the stcks due to the node, instead of 1/2.

No doubt the calculation could be based on more accurate maths, such as something derived from the usage_weight_list config option, and on a more accurate measure of slots (here we simply take the minimum of the first PE range in the job request). This is an attempt at a first-order correction, leaving room for more complicated calculations later if necessary.
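As a rough illustration of the change described above (the function names and the exact normalisation are my assumptions, not the patch's actual code):

```python
def old_node_share(node_stcks, jobs_ahead):
    # Old behaviour: a pending job's share of the node's sharetree
    # tickets (stcks) shrinks with the number of jobs ahead of it.
    return node_stcks / (jobs_ahead + 1)

def new_node_share(node_stcks, slots_ahead, job_slots):
    # Patched behaviour: the share shrinks with the number of
    # assigned *slots* ahead of the job instead of the job count.
    return node_stcks * job_slots / (slots_ahead + job_slots)

# One 8-slot job running and one 1-slot job pending, 90 tickets on
# the node: the pending job now gets 1/9 of the tickets, not 1/2.
print(old_node_share(90, 1))     # 45.0
print(new_node_share(90, 8, 1))  # 10.0
```

With no jobs running and a single pending 1-slot job, both formulas still give the full ticket count, matching the first case in the example above.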

It is hoped that this change will make the sharetree policy fairer for nodes with a job mix containing jobs with a variety of slot counts.

Feedback welcome!

Thanks,

Mark

#1526 invalid Bug writing pe_hostfile binding strategy ? Didier.Rebeix@…
Description

Hi there,

While trying to use the SGE core binding feature, I'm seeing a strange binding strategy in the generated pe_hostfile.

If I submit a dmp job with "-binding pe linear:slots", every node in the pe_hostfile seems to get the same binding strategy.

Below are two examples of strange pe_hostfiles and the corresponding qsub options:

#################### example 1 ####################
qsub -q batch -pe dmp* 64 -binding pe linear:slots simple.job

part061.u-bourgogne.fr 12 batch@… 0,0:0,1:0,2:0,3:0,4:0,5:1,0:1,1:1,2:1,3:1,4:1,5
part081.u-bourgogne.fr 12 batch@… 0,0:0,1:0,2:0,3:0,4:0,5:1,0:1,1:1,2:1,3:1,4:1,5
part065.u-bourgogne.fr 12 batch@… 0,0:0,1:0,2:0,3:0,4:0,5:1,0:1,1:1,2:1,3:1,4:1,5
part060.u-bourgogne.fr 11 batch@… 0,0:0,1:0,2:0,3:0,4:0,5:1,0:1,1:1,2:1,3:1,4:1,5
part083.u-bourgogne.fr 10 batch@… 0,0:0,1:0,2:0,3:0,4:0,5:1,0:1,1:1,2:1,3:1,4:1,5
part082.u-bourgogne.fr 7 batch@… 0,0:0,1:0,2:0,3:0,4:0,5:1,0:1,1:1,2:1,3:1,4:1,5

#################### example 2 ####################
qsub -q batch -pe dmp* 18 -binding pe linear:slots simple.job

part065.u-bourgogne.fr 6 batch@… 1,0:1,1:1,2:1,3:1,4:1,5
part061.u-bourgogne.fr 12 batch@… 1,0:1,1:1,2:1,3:1,4:1,5

It looks like the binding strategy for the first host is calculated correctly but is then wrongly applied to all the other nodes.
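The symptom is easy to check mechanically. A hypothetical parser sketch (the pe_hostfile columns are hostname, slot count, queue instance, and processor binding; hostnames and queue names below are shortened stand-ins for the truncated values above):

```python
def parse_pe_hostfile(text):
    # Each pe_hostfile line: hostname  slots  queue  binding
    rows = []
    for line in text.strip().splitlines():
        host, slots, queue, binding = line.split()
        rows.append((host, int(slots), queue, binding))
    return rows

# Shortened version of example 2 above:
sample = """\
part065.u-bourgogne.fr 6 batch@part065 1,0:1,1:1,2:1,3:1,4:1,5
part061.u-bourgogne.fr 12 batch@part061 1,0:1,1:1,2:1,3:1,4:1,5"""

rows = parse_pe_hostfile(sample)
distinct_bindings = {binding for _, _, _, binding in rows}
print(len(distinct_bindings))  # 1 -- every host carries the identical binding
```

With linear:slots one would instead expect each host's binding to reflect that host's own slot count, so a set size of 1 across hosts with different slot counts is the suspicious part.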

I'm using sge-8.1.8.

All my dmp PEs (1 per IB switch) are configured the same way :

# qconf -sp dmp_swib1
pe_name            dmp_swib1
slots              1000
user_lists         NONE
xuser_lists        NONE
start_proc_args    /usr/ccub/sge/pe/dmp/startdmp.sh -catch_rsh $pe_hostfile
stop_proc_args     /usr/ccub/sge/pe/dmp/stopdmp.sh
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE
qsort_args         NONE

Feature or bug?

Thanks!

--

Didier Rebeix

Centre de Calcul et Messageries
Université de Bourgogne
Maison de l'université
Esplanade Erasme - BP 27877
21078 Dijon Cedex

TEL : 03.80.39.52.05 / FAX : 03.80.39.52.69

#1484 fixed CSP initialisation broken Dave Love <d.love@…> markdixon
Description

Hi,

I think this commit has broken "sge_ca -init", called during installation to initialise an x509 CA when gridengine runs in CSP mode:

commit 3b435c132b22bc9499db7106074027a65aef6ecc
Author: Dave Love <d.love@…>
Date:   Wed Jun 19 18:37:59 2013 +0000

    Remove sge_ssl.cnf

Remove sge_ssl.cnf

It seems to come down to the fact that sge_ssl.cnf and sge_ssl_template.cnf differ in one respect: one sets "prompt=yes", the other "prompt=no".

The "prompt" directive alters how some other options are interpreted (see http://www.openssl.org/docs/apps/req.html for details), causing the problem.
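For context, a minimal sketch based on the OpenSSL req documentation linked above, not the actual contents of either SGE file: with prompt=no the distinguished_name section supplies literal field values, while with prompt=yes the same keys are treated as interactive prompt labels and the values have to move into *_default entries. Swapping the directive without restructuring the section therefore breaks non-interactive CA initialisation.

```
[ req ]
prompt             = no
distinguished_name = req_dn

[ req_dn ]
# With prompt = no, these lines are taken literally as DN values.
C  = UK
CN = Grid Engine CA

# With prompt = yes, the same keys would instead be read as prompt
# labels, and the values would have to be supplied as defaults:
#   C          = Country Name
#   C_default  = UK
```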

Mark
