[GE users] Error "EH_xacl not found in element"

reuti reuti at staff.uni-marburg.de
Wed Dec 2 15:36:57 GMT 2009



Hi,

On 02.12.2009 at 14:50, pablorey wrote:

>     Hi reuti,
>
>     We have a very special user: the regional weather forecast
> service. Their jobs cannot wait for free slots like normal jobs
> because the weather forecast has to be published on time.
> So we have to move their pending jobs into execution as soon as
> possible. These are small jobs that prepare all the files
> used by the big job that computes the forecast on
> another machine.
>
>     To do this we follow these steps:
>     * Check whether any jobs of this user are in error state and
> clear that state if necessary.
>     * Hold all pending jobs except this user's.
>     * Change the priority for this user.
>     * Restrict access to the nodes while we increase
> complex_values such as num_proc or memory, so that jobs of other
> users are not executed on the selected nodes.

I always treat num_proc as a fixed property of the host, so it
shouldn't be touched. The same can be done with "slots".


>     * After a short period of time the complex_values of the
> selected nodes are restored.
>     * The last step is to remove the hold state of the pending jobs.
>
>     How could we replace this with an RQS? It sounds very good, but we
> don't know how an RQS could help us solve this problem.

It looks to me like you allow oversubscription by this user for a  
short time. What about a special queue to which only this user has  
access with one slot? You can also define a nice value of "0" in this
queue (the "priority" entry), while in the default queue it's "19" (or
the jobs in the default queue could even be suspended by subordination).
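
A minimal sketch of such a queue, showing only the relevant attributes as "qconf -sq" would list them. The queue name forecast.q, the host group @allhosts and the access list forecast_acl are made-up names for illustration; the access list would have to exist and contain the user first (e.g. via "qconf -au forecast forecast_acl"):

```text
qname                 forecast.q
hostlist              @allhosts
priority              0
slots                 1
user_lists            forecast_acl
```

The default queue would get "priority 19" accordingly.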

==

You can also have different limits for different users in an RQS:

limit name total hosts {*} to slots=9

and a second RQS with:

limit name default users !forecast hosts {*} to slots=8

and the limit in the queue definition is arbitrary (can be 9 or 42).
So user forecast always has one slot more. You could use a third RQS
if you want to prevent user forecast from filling a host with 9 slots
alone.
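
Written out as complete rule sets, as "qconf -srqs" would show them, this could look roughly as follows. The rule set names and descriptions are made up; the limit lines are the two from above. Two separate sets are used because only the first matching rule inside one set is applied, while every set is checked:

```text
{
   name         total_per_host
   description  "all users together: at most 9 slots per host"
   enabled      TRUE
   limit        name total hosts {*} to slots=9
}
{
   name         others_per_host
   description  "all users except forecast together: at most 8 slots per host"
   enabled      TRUE
   limit        name default users !forecast hosts {*} to slots=8
}
```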

==

Another option could be to submit a bunch of Advance Reservations.
Once they are granted you can submit a job into them and it will run
for sure. Making this a recurring (cron-like) feature has already been
filed as an RFE:

http://gridengine.sunsource.net/issues/show_bug.cgi?id=2935

-- Reuti

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=230980
