[GE users] license integration: basic question

sangamesh forum.san at gmail.com
Thu Mar 18 12:46:11 GMT 2010



Thanks Reuti for your help.

The problem still persists even after:

* Running the global load sensor only on the master node
* Using "global" instead of the machine name

But I found a basic issue with the value of a variable in the load sensor bash script.

If the script is executed on the command line, it prints:

global:acfd_par_proc:8

But this is not reflected in:

qhost -F acfd_par_proc
- No acfd_par_proc listed here

However, if I assign acfd_par_proc=8 directly in the load sensor shell script, it works,
i.e. qhost -F acfd_par_proc shows
gc:acfd_par_proc=8.0000 for each of the nodes.

So the problem seems to lie in how the variable gets its value.
I've attached the scripts.
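
For comparison, here is a minimal sketch of the loop a Grid Engine load sensor is generally expected to run: it waits for a line on stdin, answers each one with a report framed by "begin" and "end", and exits on "quit". The function names (free_licenses, report, run_sensor) and the hard-coded count of 8 are assumptions for illustration only; they are not taken from the attached scripts.

```shell
#!/bin/sh
# Sketch of a load sensor loop; the license count below is a stand-in value.

# Hypothetical helper: a real script would query the license server here.
free_licenses() {
    echo 8
}

# Print one complete load report, framed by "begin" and "end".
report() {
    echo "begin"
    echo "global:acfd_par_proc:$(free_licenses)"
    echo "end"
}

# Answer each line read from stdin with a report; stop when "quit" arrives.
run_sensor() {
    while read -r request; do
        if [ "$request" = "quit" ]; then
            break
        fi
        report
    done
}

# Demo: two report requests followed by "quit".
printf '\n\nquit\n' | run_sensor
```

In a real sensor the last line would be a plain call to run_sensor. Since the hard-coded value works, one thing that may be worth checking (this is only a guess) is whether the command that fills the variable behaves the same when started by sge_execd, whose environment can differ from an interactive shell.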


On Tue, Mar 9, 2010 at 9:21 PM, reuti <reuti at staff.uni-marburg.de> wrote:
Hi,

On 09.03.2010 at 16:31, sangamesh wrote:

> Hello all,
>
> I have a basic question about SGE (version 6.2u2_1) license
> integration using the pure load-sensor approach.
> The problem I'm facing is that SGE is not taking the values from
> the load sensor at all. It's taking the license values defined
> with the "qconf -mattr ..." command, i.e. the total number of
> licenses.
>
> Here are my settings
>
> # qconf -se global
> hostname              global
> load_scaling          NONE
> complex_values        fluentall=10,fluent=10,fluent-nox=10,geom-trans=10, \
>                       fluent-par=20,fluent-v2f=12,acfx_solver=10, \
>                       acfx_nolimit=10,acfx_parallel=10,acfx_mfr=10, \
>                       acfx_multiphase=10,acfx_combustion=10,acfx_radiation=10, \
>                       acfx_advanced_turbulence=10, \
>                       acfx_turbulence_transition=10,acfx_bldmdlr=2, \
>                       acfx_par_proc=15,acfd_par_proc=15,acfd_cfx_solver=10
> load_values           NONE
> processors            0
> user_lists            NONE
> xuser_lists           NONE
> projects              NONE
> xprojects             NONE
> usage_scaling         NONE
> report_variables      cpu,mem_free,np_load_avg,virtual_free
>
>
> # qconf -sconf
> #global:
> execd_spool_dir              /opt/gridengine/sge6_2u2_1/default/spool
> mailer                       /bin/mail
> xterm                        /usr/bin/X11/xterm
> load_sensor                  /opt/gridengine/sge6_2u2_1/flexlm_integrate/load-sensor.sh

It's sufficient for the global load sensor to run on only one machine
of your choice, not on every one.


> prolog                       none
> epilog                       none
> shell_start_mode             unix_behavior
> login_shells                 sh,ksh,csh,tcsh
>
> ...
> load_report_time             00:00:05
> max_unheard                  00:05:00
> reschedule_unknown           00:00:00
> ....
> reporting_params             accounting=true reporting=true \
>                              flush_time=00:00:15 joblog=true sharelog=00:00:00
> finished_jobs                100
> gid_range                    20000-20100
>
> Why is SGE not able to get the values from the load sensor script?
> (The output of the script shows the remaining licenses.)
>
> The Perl script prints the following values:
>
> # perl /opt/gridengine/sge6_2u2_1/flexlm_integrate/loadsense_helper.pl
> suncluster:acfx_pre:9
> suncluster:acfd_cfx_solver:9
> suncluster:acfx_post:5
> suncluster:acfx_bldmdlr:2
> suncluster:acfd_par_proc:10

I think it's necessary to have a line like:

global:acfx_post:5

as output for a global load sensor, not the name of a machine.
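
As a rough sketch of that change, the helper's "host:name:value" lines could be passed through a small filter that replaces the host field with "global". The function name to_global is hypothetical, and the sample lines are the ones quoted above.

```shell
#!/bin/sh
# Rewrite "<host>:<name>:<value>" lines so the host field reads "global".
# Lines that do not have exactly three colon-separated fields are dropped.
to_global() {
    awk -F: 'NF == 3 { print "global:" $2 ":" $3 }'
}

# Demo with two of the lines quoted above.
printf 'suncluster:acfx_post:5\nsuncluster:acfd_par_proc:10\n' | to_global
```

Changing the prefix inside the Perl helper itself would work just as well; the filter only keeps the sketch independent of that script.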

-- Reuti


> But the following command still shows it as 15:
>
> # qstat -F acfd_par_proc
> queuename                      qtype resv/used/tot. load_avg arch          states
> ---------------------------------------------------------------------------------
>
> ---------------------------------------------------------------------------------
> all.q@sun01                    BIP   0/0/16         7.00     lx24-amd64
>         gc:acfd_par_proc=15
> ---------------------------------------------------------------------------------
> all.q@sun02                    BIP   0/0/16         2.13     lx24-amd64
>         gc:acfd_par_proc=15
> ---------------------------------------------------------------------------------
>
> Going forward, I've decided to use Olesen's method of integration.
> But I would like to know whether this is a bug in SGE or whether
> I've misconfigured something.

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=247709








    [ Part 2, "sge_lic.zip"  Application/ZIP (Name: "sge_lic.zip") 1.8 KB. ]


