[GE users] Jobs waiting due to loss of resources

rayson rayrayson at gmail.com
Tue Dec 29 22:52:26 GMT 2009


I just checked the code in libs/sgeobj/sge_conf.c; the default setting is:

static bool inherit_env = true;

And then it parses the configuration settings:

 parse_bool_param(s, "INHERIT_ENV", &inherit_env)

So eventually the setting reaches the shepherd config file, and the
shepherd picks it up... and the man page is correct that the default
is to inherit the environment variables.
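
To make the two steps concrete, here is a minimal Python sketch of the
same pattern: a default of true that a later parse step may override.
It is not the Grid Engine code. The helper names
parse_bool_param_sketch and inherit_env_from, the accepted spellings
(true/false/1/0), and the comma- or space-separated list format are
all assumptions made for illustration.

```python
def parse_bool_param_sketch(entry, name, current):
    """Return the boolean carried by 'NAME=value', or `current` if it does not apply.

    Hypothetical analog of a bool-parameter parser; not SGE's implementation.
    """
    key, sep, value = entry.partition("=")
    if not sep or key.strip().upper() != name.upper():
        return current  # entry is some other parameter: keep what we had
    value = value.strip().lower()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    return current  # unrecognised value: keep what we had


def inherit_env_from(params):
    """Start from the default (True) and let any INHERIT_ENV entry override it."""
    inherit_env = True  # mirrors: static bool inherit_env = true;
    for entry in params.replace(",", " ").split():
        inherit_env = parse_bool_param_sketch(entry, "INHERIT_ENV", inherit_env)
    return inherit_env
```

With no INHERIT_ENV entry in the parameter string the function returns
True, which is the "default is true" behaviour the man page describes.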

BUT, a load sensor is not a Grid Engine job, so the path taken is
different. In daemons/execd/sge_load_sensor.c, the function that
starts a load sensor is sge_ls_start_ls().

It calls sge_peopen() to start the load sensor. Note that sge_peopen()
calls fork() and execlp()... long story short, the environment is still
inherited from execd no matter what INHERIT_ENV is set to.
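
The fork() and execlp() behaviour is easy to see in isolation. The
following Python sketch is not sge_peopen(); it only uses os.fork()
and os.execlp() in the same way, and the function name
run_child_and_capture is made up for the demo. Because execlp() takes
no environment argument, the child sees whatever the parent had in its
environment when it forked.

```python
import os


def run_child_and_capture(var_name):
    """Fork, exec `sh` via execlp(), and return the child's view of var_name."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: send stdout into the pipe, then replace the process image.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)
            os.close(write_fd)
            # No environment is passed here, so the parent's one is kept.
            os.execlp("sh", "sh", "-c", 'printf %s "$' + var_name + '"')
        finally:
            # Only reached if something above failed (exec does not return).
            os._exit(127)
    # Parent: read everything the child printed, then reap it.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        output = pipe.read()
    os.waitpid(pid, 0)
    return output
```

Setting a variable in the parent before the call (for example
os.environ["DEMO_LS_VAR"] = "x", a name chosen only for this demo)
makes the function return that value; if the variable is unset in the
parent, the child prints an empty string. That is the situation of a
load sensor started by an execd that was itself started without the
variables.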

Rayson



On 12/29/09, templedf <dan.templeton at sun.com> wrote:
> I know that the man page still says that inheriting the env is the
> default, but I could have sworn I checked into that a couple of months
> ago and found that it was wrong.  If inheriting the env is indeed still
> the default, that's an issue that we need to resolve.
>
> Daniel
>
> reuti wrote:
> > Am 29.12.2009 um 22:39 schrieb templedf:
> >
> >
> >> The environment is only inherited by default in older versions.  I
> >> forget exactly when that changed, but I think it was with 6.1.
> >>
> >
> > I don't experience that. The man page in sge6.2u4 still states (`man
> > sge_conf`, section INHERIT_ENV): "The default value is true."
> >
> > And this is what I see in a quick test.
> >
> > -- Reuti
> >
> >
> >
> >> Daniel
> >>
> >> reuti wrote:
> >>
> >>> Am 29.12.2009 um 19:20 schrieb fanou:
> >>>
> >>>
> >>>
> >>>>> "Load value" just means it's a value that was reported from a load
> >>>>> sensor.  I agree that it sounds like you have a load sensor that's
> >>>>> misbehaving.  The load sensor for a global value would not be set
> >>>>> in the
> >>>>> global host config (qconf -sconf).  It would instead be set for one
> >>>>> specific host.  (Yes, it sounds illogical, but it actually makes
> >>>>> sense
> >>>>> if you think it through.)  Check the "qconf -sconf <host>"
> >>>>> output for
> >>>>> all your machines, e.g.
> >>>>>
> >>>>> for host in `qconf -sel`; do
> >>>>>   echo $host
> >>>>>   qconf -sconf $host | grep load_sensor
> >>>>> done
> >>>>>
> >>>>> And when you find it, write yourself a note so that you don't have
> >>>>> to go
> >>>>> looking for it again in the future. :)
> >>>>>
> >>>>>
> >>>>>
> >>>> True!
> >>>> qconf -sconf master.cluster reports a load sensor.
> >>>> This script uses some environment variables that have disappeared...
> >>>> Now I have to find out how to set them without restarting the system
> >>>> and pass them to my 'sgeadmin' account, which runs sge_qmaster and
> >>>> sge_schedd.
> >>>>
> >>>>
> >>> When sgeexecd is started during boot, these environment variables
> >>> may not be set. But when you stop and start the execd (just on the
> >>> particular node with the load sensor script defined) from the command
> >>> line, they may be defined and therefore available to the script, since
> >>> the environment is inherited by default.
> >>>
> >>> - Reuti
> >>>
> >>>
> >>>
> >>>
> >>>> Thanks for your help !
> >>>>
> >>>> --
> >>>> Fanou
> >>>>
> >>>> ------------------------------------------------------
> >>>> http://gridengine.sunsource.net/ds/viewMessage.do?
> >>>> dsForumId=38&dsMessageId=235442
> >>>>
> >>>> To unsubscribe from this discussion, e-mail: [users-
> >>>> unsubscribe at gridengine.sunsource.net].
> >>>>
> >>>>
> >
>
