[GE users] user loads

Mag Gam magawake at gmail.com
Mon Sep 22 13:04:42 BST 2008


I think I have implemented the fair scheduling settings appropriately.
However, can someone recommend a test to see whether our fair scheduling
is working? It seems some students and professors are taking up too many
resources :-)
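
One rough check, sketched in Python: take the running slots per user (for
example from the `status -acl` table quoted further down) and compare each
user's share against an equal split. The names `slot_shares` and
`over_fair_share`, the 10% tolerance, and the equal-split target are all
assumptions made for illustration and are not part of SGE; a policy with
unequal shares would need different targets.

```python
def slot_shares(slots_by_user):
    """Return each user's fraction of all running slots."""
    total = sum(slots_by_user.values())
    if total == 0:
        return {user: 0.0 for user in slots_by_user}
    return {user: slots / total for user, slots in slots_by_user.items()}


def over_fair_share(slots_by_user, tolerance=0.10):
    """List users whose share exceeds an equal split by more than `tolerance`.

    Assumption: every user is entitled to the same share.
    """
    shares = slot_shares(slots_by_user)
    if not shares:
        return []
    fair = 1.0 / len(shares)
    return sorted(user for user, share in shares.items() if share > fair + tolerance)


if __name__ == "__main__":
    # Slot counts taken from the "total" column of the quoted table.
    running = {"user1": 24, "user2": 3, "user3": 1}
    print(slot_shares(running))
    print(over_fair_share(running))
```

A single snapshot probably says little on its own, so this would likely
need to be sampled repeatedly while jobs from several users are waiting.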

TIA


On Tue, Sep 16, 2008 at 9:33 PM, Mag Gam <magawake at gmail.com> wrote:
> Reuti:
>
> Thank you! You are very helpful.
>
>
> On Tue, Sep 16, 2008 at 8:54 AM, Reuti <reuti at staff.uni-marburg.de> wrote:
>> Hi Mag,
>>
>> On 16.09.2008 at 13:35, Mag Gam wrote:
>>
>>> Reuti:
>>>
>>> Yes, you are right. When I submit jobs I noticed they are being
>>> distributed :-) This is great!
>>>
>>> A couple of questions:
>>>
>>>  How do I know it used some algorithm to pick the server? Would qstat
>>> -f show it?
>>>  Is it possible for me to pick and choose the algorithm?
>>
>> There is an entry in the scheduler configuration for it, but the only
>> allowed value is "default", and AFAIK this entry will be removed in newer
>> versions of SGE anyway.
>>
>> $ qconf -ssconf
>> algorithm                         default
>> ...
>>
>> More details are explained here:
>>
>> http://docs.sun.com/app/docs/doc/817-5677/chp9-1?q=N1GE&a=view (or the PDF
>> version: http://docs.sun.com/app/docs/doc/817-5677?a=load)
>>
>>
>>>  How can I see where a user stands, such as whether he is a CPU hog,
>>> memory hog, etc.?
>>
>> For this you have to look into the command `qacct`; see `man accounting` or
>> the relevant chapters in the Administration Guide.
>>
>>>  Is there a way to account for what users are running? I would like to
>>> graph these results.
>>
>> You can use this script:
>> http://gridengine.sunsource.net/files/documents/7/8/status-1.2.tgz and
>> reparse the output or change it according to your needs. It's awk on the
>> inside, while the shell script handles only the parameters. You will get
>> the output you ask for with these options:
>>
>> $ status -acl
>>
>>                running #jobs/#slots
>> Owner        serial   parallel    total
>> ---------------------------------------
>> user1         0/  0    12/ 24    12/ 24
>> user2         3/  3     0/  0     3/  3
>> user3         1/  1     0/  0     1/  1
>> ---------------------------------------
>> Sum           4/  4    12/ 24    16/ 28
>>
>> If there were waiting jobs, they would be displayed separately.
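
The table above can be reparsed for graphing. A minimal sketch in Python,
assuming the column layout is exactly as shown in this mail (the function
name `parse_status` is made up for illustration, and other versions of the
script may print a different layout):

```python
import re

# Matches rows such as "user1         0/  0    12/ 24    12/ 24".
ROW = re.compile(
    r"^(\S+)\s+(\d+)/\s*(\d+)\s+(\d+)/\s*(\d+)\s+(\d+)/\s*(\d+)\s*$"
)


def parse_status(text):
    """Return {owner: {"serial": (jobs, slots), "parallel": ..., "total": ...}}."""
    result = {}
    for line in text.splitlines():
        # Drop mail quote markers, in case the table is pasted from a reply.
        line = line.lstrip("> ").rstrip()
        match = ROW.match(line)
        # Skip headers, rulers and the "Sum" row (assumes no user is named "Sum").
        if not match or match.group(1) == "Sum":
            continue
        numbers = [int(value) for value in match.groups()[1:]]
        result[match.group(1)] = {
            "serial": (numbers[0], numbers[1]),
            "parallel": (numbers[2], numbers[3]),
            "total": (numbers[4], numbers[5]),
        }
    return result
```

Each value is a (jobs, slots) pair, so `parse_status(text)["user1"]["total"]`
gives (12, 24) for the table above.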
>>
>>>  Is it possible to pick and choose what is high priority and low priority?
>>
>> You can find this here:
>>
>> http://www.sun.com/blueprints/1005/819-4325.html
>>
>>
>>> Also, where are the logs kept? I presume they are kept on the qmaster,
>>> but I can't seem to find a directory for them. I would like to see what's
>>> going on in real time, with tail -f somelog :-)
>>
>> There are indeed several files, but they are not of much relevance for
>> everyday operation; they matter more for debugging. See `man sge_conf`,
>> section reporting_params, or the relevant chapters in the Administration
>> Guide. You will find the files in subdirectories of
>> $SGE_ROOT/default/spool (unless you configured local spool directories
>> for all exec hosts or a different general location).
>>
>> -- Reuti
>>
>>
>>> TIA
>>>
>>>
>>>
>>> On Tue, Sep 16, 2008 at 6:40 AM, Reuti <reuti at staff.uni-marburg.de> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 16.09.2008 at 02:14, Mag Gam wrote:
>>>>
>>>>> Hello All,
>>>>>
>>>>> As many of you know, we are putting together a GRID at my university's
>>>>> engineering lab. I wanted to know if we can throttle a user's job
>>>>> depending on the load of the system. Let's say I have 16 servers and I
>>>>> would like to submit a job. Each of these servers is an exec host.
>>>>>
>>>>> node5 $ qsub very_large_job.bash
>>>>
>>>> Are you logged into node5, and not a login node or the master node? By
>>>> default it shouldn't matter where you submit the job from; it should
>>>> run on any of the exec nodes defined in SGE.
>>>>
>>>> Are you observing that jobs submitted on node5 run only on node5,
>>>> as if every node had its own qmaster installed and ran jobs only
>>>> locally?
>>>>
>>>> -- Reuti
>>>>
>>>>
>>>>> The job gets executed on node5, but what I would like it to do is take
>>>>> an inventory of the servers, find the server with the least load and
>>>>> memory consumption, and then execute the job, 'very_large_job.bash', on
>>>>> that node. In the future I would like to distribute the load of
>>>>> 'very_large_job.bash' across all 16 servers and then get a result. I
>>>>> suppose for the latter I would need to rewrite my application with MPI
>>>>> support. But I would like to get the first point settled before
>>>>> moving forward.
>>>>>
>>>>> I have been looking at the 'ticket weight' documentation, but it looks
>>>>> extremely arcane
>>>>> (http://wikis.sun.com/display/GridEngine/Submitting+Jobs). Does
>>>>> anyone have a much simpler way to do this with command lines (it's
>>>>> easier to see what is going on with command lines, and I get a better
>>>>> perspective)?
>>>>>
>>>>>
>>>>> Any thoughts on how to do this?  I am sorry if this is a newbie
>>>>> question.
>>>>>
>>>>> TIA
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
