[GE users] user loads

Mag Gam magawake at gmail.com
Wed Sep 17 02:33:56 BST 2008



Reuti:

THANK YOU! You are very helpful.


On Tue, Sep 16, 2008 at 8:54 AM, Reuti <reuti at staff.uni-marburg.de> wrote:
> Hi Mag,
>
> On 16.09.2008, at 13:35, Mag Gam wrote:
>
>> Reuti:
>>
>> Yes, you are right. When I submit jobs I noticed they are being
>> distributed :-) This is great!
>>
>> A couple of questions:
>>
>>  How do I know which algorithm was used to pick the server? Would qstat
>> -f show it?
>>  Is it possible for me to pick and choose the algorithm?
>
> There is an entry in the scheduler configuration for it, but the only
> allowed value is "default", and AFAIK this entry will be removed in newer
> versions of SGE anyway.
>
> $ qconf -ssconf
> algorithm                         default
> ...
>
> More details are explained here:
>
> http://docs.sun.com/app/docs/doc/817-5677/chp9-1?q=N1GE&a=view (or the PDF
> version: http://docs.sun.com/app/docs/doc/817-5677?a=load)
>
>
>>  How can I see where a user stands, e.g. whether he is a CPU hog, a
>> memory hog, etc.?
>
> For this you have to look at the command `qacct`; see `man accounting` or
> the relevant chapters in the Administration Guide.
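If you want to graph per-user consumption yourself, the accounting file that `qacct` reads can also be parsed directly. A minimal Python sketch, assuming the colon-delimited record layout described in sge_accounting(5) with the owner in field 4 and ru_wallclock in field 14 (verify the positions against `man accounting` on your installation; the sample records below are made up and truncated):

```python
# Sum per-user wallclock seconds from SGE accounting records.
# Field positions (owner = 4th field, ru_wallclock = 14th field) are an
# assumption based on sge_accounting(5); check `man accounting` locally.
from collections import defaultdict

# Two made-up records, shortened to the first 14 fields for illustration.
SAMPLE_RECORDS = [
    "all.q:node5:students:user1:big_job:101:sge:0:1221550000:1221550100:1221553700:0:0:3600",
    "all.q:node7:students:user2:small_job:102:sge:0:1221550000:1221550200:1221550260:0:0:60",
]

def wallclock_by_owner(records):
    """Return {owner: total ru_wallclock seconds} across all records."""
    totals = defaultdict(int)
    for rec in records:
        fields = rec.split(":")
        owner, ru_wallclock = fields[3], int(fields[13])
        totals[owner] += ru_wallclock
    return dict(totals)

print(wallclock_by_owner(SAMPLE_RECORDS))
```

In a real setup you would read the records from $SGE_ROOT/default/common/accounting instead of the sample list.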
>
>>  Is there a way to account for what users are running? I would like to
>> graph these results.
>
> You can use this script:
> http://gridengine.sunsource.net/files/documents/7/8/status-1.2.tgz and
> reparse the output, or change it according to your needs. It's awk on the
> inside; the shell script only handles the parameters. You will get the
> output you request with these options:
>
> $ status -acl
>
>                running #jobs/#slots
> Owner        serial   parallel    total
> ---------------------------------------
> user1         0/  0    12/ 24    12/ 24
> user2         3/  3     0/  0     3/  3
> user3         1/  1     0/  0     1/  1
> ---------------------------------------
> Sum           4/  4    12/ 24    16/ 28
>
> If there were waiting jobs, they would be displayed separately.
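For graphing, the table above can be turned into numbers with a few lines of script. A minimal Python sketch, assuming the column layout shown (serial, parallel, total, each as jobs/slots) stays fixed across runs; the sample text is the output quoted above:

```python
# Parse the per-user table printed by `status -acl` into a dict that a
# plotting tool can consume. The column layout is assumed from the
# sample output above.
import re

SAMPLE = """\
               running #jobs/#slots
Owner        serial   parallel    total
---------------------------------------
user1         0/  0    12/ 24    12/ 24
user2         3/  3     0/  0     3/  3
user3         1/  1     0/  0     1/  1
---------------------------------------
Sum           4/  4    12/ 24    16/ 28
"""

def parse_status(text):
    """Return {owner: (total_jobs, total_slots)} from the table body."""
    usage = {}
    for line in text.splitlines():
        m = re.match(r"(\w+)\s+\d+/\s*\d+\s+\d+/\s*\d+\s+(\d+)/\s*(\d+)", line)
        if m and m.group(1) != "Sum":  # skip the summary row
            usage[m.group(1)] = (int(m.group(2)), int(m.group(3)))
    return usage

print(parse_status(SAMPLE))
```

In practice you would feed it `subprocess.run(["status", "-acl"], ...)` output instead of the embedded sample.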
>
>>  Is it possible to pick and choose what is high priority and low priority?
>
> You can find this here:
>
> http://www.sun.com/blueprints/1005/819-4325.html
>
>
>> Also, where are the logs kept? I presume they are kept on the qmaster,
>> but I can't seem to find a directory for them. I would like to see what's
>> going on in real time, via tail -f somelog :-)
>
> There are indeed several files, but they are not of much relevance for
> day-to-day operation; they are more important for debugging. See `man
> sge_conf`, section reporting_params, or the relevant chapters in the
> Administration Guide. You will find the files in the $SGE_ROOT/default/spool
> subdirectories (unless you configured local spool directories for all exec
> hosts or a different general location).
>
> -- Reuti
>
>
>> TIA
>>
>>
>>
>> On Tue, Sep 16, 2008 at 6:40 AM, Reuti <reuti at staff.uni-marburg.de> wrote:
>>>
>>> Hi,
>>>
>>> On 16.09.2008, at 02:14, Mag Gam wrote:
>>>
>>>> Hello All,
>>>>
>>>> As many of you know, we are putting together a grid at my university's
>>>> engineering lab. I wanted to know if we can throttle a user's job
>>>> depending on the load of the system. Let's say I have 16 servers and I
>>>> would like to submit a job. Each of these servers is an exec host.
>>>>
>>>> node5 $ qsub very_large_job.bash
>>>
>>> you are logged into node5 and not a login node or the master node? By
>>> default it shouldn't matter from where you submit the job; it should
>>> run on any of the defined exec nodes in SGE.
>>>
>>> Are you observing that jobs submitted on node5 run only on node5, as
>>> if every node had its own qmaster installed and ran only locally?
>>>
>>> -- Reuti
>>>
>>>
>>>> The job gets executed on node5, but what I would like it to do is: take
>>>> an inventory of the servers, find the server with the least load and
>>>> memory consumption, and then execute the job, 'very_large_job.bash', on
>>>> that node. In the future I would like to distribute the load of
>>>> 'very_large_job.bash' across all 16 servers and then get a result. I
>>>> suppose for the latter I would need to rewrite my application with MPI
>>>> support. But I would like to get the first point working before
>>>> moving forward.
>>>>
>>>> I have been looking at the 'ticket weight' documentation, but it looks
>>>> extremely arcane
>>>> (http://wikis.sun.com/display/GridEngine/Submitting+Jobs). Does
>>>> anyone have a simpler way to do this with the command line? (It's
>>>> easier to see what is going on with command lines, and I get a better
>>>> perspective.)
>>>>
>>>>
>>>> Any thoughts on how to do this?  I am sorry if this is a newbie
>>>> question.
>>>>
>>>> TIA
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>
>>>
>>>
>>>
>>
>
>
>
>




