[GE users] checking mount points or any other user defined attributes

reuti reuti at staff.uni-marburg.de
Wed Nov 24 08:36:20 GMT 2010


On 24.11.2010 at 06:31, llikethat wrote:

> --- On Tue, 23/11/10, craffi <dag at sonsorol.org> wrote:
> 
> From: craffi <dag at sonsorol.org>
> Subject: Re: [GE users] checking mount points or any other user defined attributes
> To: users at gridengine.sunsource.net
> Date: Tuesday, 23 November, 2010, 5:30 PM
> 
> Missing mount points representing OS and cluster problems are usually 
> checked by non-SGE cluster tools although you could presumably write a 
> JSV or Prolog script that could check for these things.
> 
> Best implementation I saw was at a site where the admins had a script 
> that probed for every OS issue they had ever encountered in the past. 
> The script ran at node boot time and periodically afterwards. As soon as 
> any problem was detected, the node was put into disabled state 'd' and 
> the admins were notified. The same script also put the node into 'd' 
> state for the first 5 minutes after boot to make sure that there was time 
> for problems to show up and be detected before jobs started landing on it.
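
A minimal sketch of such a health check, in Python, might look like the following. It is not the script from the site described above. The mount list and the function names are hypothetical, and it assumes the caller has SGE manager or operator rights and that the local hostname matches the name SGE uses for the exec host. Notification and the 5-minute grace period after boot are left out.

```python
import os
import socket
import subprocess

# Hypothetical list of mount points every node is expected to have.
REQUIRED_MOUNTS = ["/home", "/scratch"]


def missing_mounts(required, is_mounted=os.path.ismount):
    """Return the entries of `required` that are not mounted."""
    return [path for path in required if not is_mounted(path)]


def disable_command(host):
    """Build the qmod call that disables all queue instances on `host`."""
    return ["qmod", "-d", "*@" + host]


def check_node(required=REQUIRED_MOUNTS, run=subprocess.call):
    """Disable this node's queue instances if a required mount is missing."""
    missing = missing_mounts(required)
    if missing:
        # Assumption: the script runs with SGE manager/operator rights
        # and qmod is on PATH.
        run(disable_command(socket.gethostname()))
    return missing
```

Calling check_node() from a boot script and from cron would approximate the "at boot and periodically" behaviour. It returns the list of missing mounts, so the caller can decide how to notify the admins.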
> 
> If the mounts are supposed to be missing (perhaps because different 
> servers have different mounts configured by design) then you can attach 
> a Boolean true/false attribute to the exec hosts and users could submit 
> jobs like:  "qsub -hard -l fastScratch=true ./myJob.sh" or whatever.
> 
> For serious and transparent use a JSV might work. The JSV can examine 
> the user job script and make changes on the fly such as redirecting to a 
> different queue or queue instance.
> 
> License-aware scheduling is another matter. Google "Olesen FlexLM" to 
> see how it's done with SGE. Basically the modern method involves 
> declaring requestable/consumable resources for each license entitlement 
> and making it dynamic via a script that polls the license server and 
> constantly adjusts the value of the resource. This method has superseded 
> the load-sensor method.
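
To illustrate the polling idea only, here is a naive Python sketch. It is not Olesen's implementation. It assumes the usual "Users of ..." summary lines printed by FlexLM's "lmutil lmstat -a", and it assumes that a consumable with the same name as each license feature already exists. The function names are hypothetical. It simply publishes issued minus in-use on the global host. A production tool also has to avoid double-counting licenses already held by SGE jobs, which this sketch does not attempt.

```python
import re
import subprocess

# Assumed line format from "lmutil lmstat -a", for example:
# Users of simA:  (Total of 10 licenses issued;  Total of 3 licenses in use)
USAGE_RE = re.compile(
    r"Users of (\S+):\s+\(Total of (\d+) licenses? issued;\s+"
    r"Total of (\d+) licenses? in use\)"
)


def free_licenses(lmstat_output):
    """Map each feature name to the number of licenses not in use."""
    free = {}
    for name, issued, used in USAGE_RE.findall(lmstat_output):
        free[name] = max(int(issued) - int(used), 0)
    return free


def qconf_command(resource, value):
    """Build the qconf call that sets a complex value on the global host."""
    return ["qconf", "-mattr", "exechost", "complex_values",
            "%s=%d" % (resource, value), "global"]


def update_resources(lmstat_output, run=subprocess.call):
    """Push the current free count of every feature into SGE."""
    for feature, count in sorted(free_licenses(lmstat_output).items()):
        run(qconf_command(feature, count))
```

A cron job or a small loop could capture the lmstat output and hand it to update_resources() every minute or so.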
> 
> Hi Craffi,
> 
> That's a lot of information, but I'm really not sure I'll be able to set it up like this, because we currently use DRMAA to submit array jobs. Our DRMAA code is in Python, and it does not use any -l flag at the moment.

You can set the job template's native specification (drmaa_native_specification) in DRMAA to pass an -l request.
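
With the Python DRMAA binding this could look roughly like the sketch below. It assumes the drmaa Python package with SGE's libdrmaa available at run time. The helper names and the fastScratch resource are only examples, not something your setup already defines.

```python
def native_spec(resources):
    """Turn {'fastScratch': 'true'} into '-l fastScratch=true'."""
    requests = ",".join(
        "%s=%s" % (name, value) for name, value in sorted(resources.items())
    )
    return "-l " + requests


def submit_array(command, first, last, resources):
    """Submit an array job whose tasks all carry the resource request."""
    import drmaa  # imported here: needs the drmaa package and libdrmaa

    session = drmaa.Session()
    session.initialize()
    try:
        template = session.createJobTemplate()
        template.remoteCommand = command
        # Handed to SGE as if the options were given on the qsub command line.
        template.nativeSpecification = native_spec(resources)
        job_ids = session.runBulkJobs(template, first, last, 1)
        session.deleteJobTemplate(template)
        return job_ids
    finally:
        session.exit()
```

For example, submit_array("./myJob.sh", 1, 10, {"fastScratch": "true"}) would submit tasks 1 to 10 with the same request as "qsub -l fastScratch=true".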

-- Reuti


> llikethat wrote:
> > Hi,
> >
> > Is there an option by which SGE can check for the mount points, licenses
> > etc before starting a job on a node?
> >
> > By doing this I want to restrict SGE not to submit jobs on the nodes
> > which do not satisfy this.
> >
> > Thanks,
> >
> >
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=297928
> 
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=298257
