[GE users] Share-based policy problem

jeromeverleyen jerome at ibt.unam.mx
Fri May 7 18:32:56 BST 2010


Dear all,

I have a cluster running SGE version "GE 6.2u2_1".
I have set up a share-based policy in which all users have the same rights.

But I have some doubts about how it is working on my system: one user's
usage does not seem to be updated in the share tree.

This user, "cmp_uaem", submitted two parallel jobs of 16 cores each about
9 days ago. The other users run single-core jobs, submitted recently. I
cleared the share-tree usage and, surprisingly, the user with the two
parallel jobs shows very little usage compared with the others:

List of the jobs:

$ qstat -u \*
job-ID  prior   name       user         state submit/start at     queue                           slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
   38865 0.90000 QRLOGIN    illumina     r     05/06/2010 18:17:27 secuencia.q@compute-1-12.local     1
   38617 0.84933 a573_15d   cmp_uaem     r     04/29/2010 15:20:41 uaem.q@compute-2-8.local          16
   38857 0.54000 npt_pf_wt  lucioric     r     05/06/2010 14:13:14 all.q@compute-1-14.local           4
   38825 0.50000 Blasts     merino       r     05/06/2010 15:40:28 all.q@compute-1-5.local            1 872
   38825 0.50000 Blasts     merino       r     05/06/2010 13:38:29 all.q@compute-1-11.local           1 659
   38825 0.50000 Blasts     merino       r     05/07/2010 00:31:06 all.q@compute-1-15.local           1 4144
   38825 0.50000 Blasts     merino       r     05/07/2010 00:45:21 all.q@compute-1-15.local           1 4302
   38851 0.54000 npt_pf_pho lucioric     r     05/06/2010 12:32:55 all.q@compute-0-7.local            4
   36088 0.84933 a298b_sal_ cmp_uaem     r     04/18/2010 16:43:03 uaem.q@compute-2-5.local          16
   38875 0.50000 QLOGIN     lucioric     r     05/07/2010 12:19:43 all.q@compute-1-10.local           1
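
As a cross-check of the listing above, the slots held by each user can be
tallied with a few lines of Python. This is only a sketch: the tuples are
transcribed by hand from the qstat output, and the names RUNNING and
slots_per_user are made up for the illustration, not part of Grid Engine.

```python
from collections import Counter

# (job-ID, user, slots) transcribed by hand from the qstat listing above.
# Job 38825 is an array job, so it appears once per running task.
RUNNING = [
    (38865, "illumina", 1),
    (38617, "cmp_uaem", 16),
    (38857, "lucioric", 4),
    (38825, "merino", 1),
    (38825, "merino", 1),
    (38825, "merino", 1),
    (38825, "merino", 1),
    (38851, "lucioric", 4),
    (36088, "cmp_uaem", 16),
    (38875, "lucioric", 1),
]


def slots_per_user(rows):
    """Sum the slots column per user."""
    totals = Counter()
    for _job_id, user, slots in rows:
        totals[user] += slots
    return totals


if __name__ == "__main__":
    # Print users from the largest slot count to the smallest.
    for user, slots in slots_per_user(RUNNING).most_common():
        print(f"{user:<10} {slots:>3}")
```

Counted this way, cmp_uaem holds 32 slots, lucioric 9 (two 4-slot jobs plus
one QLOGIN session), merino 4 and illumina 1.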


After resetting the share-tree usage, I got this information from
sge_share_mon:

/opt/gridengine/utilbin/lx26-amd64/sge_share_mon -h -c 1 |cut -f4,13,14
user_name       usage            cpu

                211483.092510    0.000000
                211483.092510    0.000000
merino          193477.559436    196064.949781
gcuellar_lcg    0.000000         0.000000
illumina        0.000000         0.000000
siguel          0.000000         0.000000
jerome          0.000000         0.000000
cmp_uaem        0.374213         0.379882
lucioric        18005.158862     18268.681271
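
To make the imbalance in these numbers easier to see, each user's fraction
of the total usage can be computed as below. This is a minimal sketch: the
values are copied from the output above, the names ROWS and usage_fractions
are invented for the example, and leaving out the two rows without a user
name assumes they are interior nodes of the share tree (their usage matches
the sum of the named rows).

```python
# (user_name, usage, cpu) copied from the sge_share_mon output above.
# The two rows without a user name are left out on the assumption that
# they are interior share-tree nodes repeating the total.
ROWS = [
    ("merino", 193477.559436, 196064.949781),
    ("gcuellar_lcg", 0.0, 0.0),
    ("illumina", 0.0, 0.0),
    ("siguel", 0.0, 0.0),
    ("jerome", 0.0, 0.0),
    ("cmp_uaem", 0.374213, 0.379882),
    ("lucioric", 18005.158862, 18268.681271),
]


def usage_fractions(rows):
    """Return each user's usage divided by the total usage of all users."""
    total = sum(usage for _name, usage, _cpu in rows)
    if total == 0:
        return {name: 0.0 for name, _usage, _cpu in rows}
    return {name: usage / total for name, usage, _cpu in rows}


if __name__ == "__main__":
    fractions = usage_fractions(ROWS)
    # Print users from the largest fraction to the smallest.
    for name, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
        print(f"{name:<14} {frac:.6%}")
```

With these figures merino accounts for about 91.5% of the recorded usage and
lucioric for about 8.5%, while cmp_uaem is below 0.001%.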


So the users "merino" and "lucioric", running on 4 cores and 8 cores in
total respectively, have a combined usage of about 211482, while at the
same time the user with 32 cores has a tiny usage of about 0.374213!

I checked what these jobs are, and the cmp_uaem user is running a
really heavy program (NAMD) that consumes a lot of CPU.

So the problem is that, right now, the user who uses the most CPUs gets
the highest priority! I chose the share-based policy precisely to avoid
this.

Where am I going wrong? Have I set a bad value in my configuration?

Regards


-- 
-- Jérôme
Better to fulfil your wish than to wish you had done so.
	(Woody Allen)

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=256556
