[GE users] Error "EH_xacl not found in element"

reuti reuti at staff.uni-marburg.de
Wed Dec 2 13:26:44 GMT 2009


Hi,

can you explain the purpose of the script a little bit? Why are you  
first lowering and then increasing s_vmem after 2 minutes?

It looks like it could be replaced by an RQS. You could also use an
urgency policy to prioritize some jobs.

-- Reuti


On 02.12.2009 at 14:02, pablorey wrote:

>     Hi,
>
>     Yes, we are using classic spooling, but I do not think we
> removed any file from the spooling directory.
>
>     We have investigated the problem and traced it to a cron job
> that we use to prioritize some jobs. The script runs at minutes 10,
> 30 and 50 of each hour. Basically, it follows this scheme:
>
>     for group in GROUP_1 GROUP_2 ... GROUP_N; do
>         for node in group; do
>             restrict_node_access $node >> $LOGFILE 2>&1
>             qconf -mattr exechost complex_values s_vmem=9.7G $node >> $LOGFILE 2>&1
>         done
>
>         sleep 120
>
>         for node in group; do
>             qconf -mattr exechost complex_values s_vmem=8.6G $node >> $LOGFILE 2>&1
>             restore_node_access $node >> $LOGFILE 2>&1
>         done
>     done
>
> restrict_node_access(){
>   node=$1
>   qconf -se $node | grep ^user_lists | awk '{print $2}' > $STATUSDIR/$node
>   qconf -rattr exechost user_lists prey $node
> }
>
> restore_node_access(){
>   node=$1
>   if [ -r $STATUSDIR/$node ]; then
>     qconf -rattr exechost user_lists `cat $STATUSDIR/$node` $node
>   else
>     qconf -rattr exechost user_lists NONE $node
>   fi
> }
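
To see exactly which qconf calls the job issues for one node, the two functions can be dry-run against a stub. This is only a sketch under assumptions: the `qconf` stub, its fake `-se` output, the temporary `STATUSDIR` and the helper `dry_run_node` are all made up for illustration and never touch a real qmaster. The restore step here also falls back to NONE when the status file is empty, which the original does not do.

```shell
#!/bin/sh
# Dry run: qconf is replaced by a stub that only prints its arguments,
# so no Grid Engine configuration is touched.
qconf() {
    if [ "$1" = "-se" ]; then
        # Hypothetical exechost output; only the user_lists line matters here.
        printf 'hostname              %s\n' "$2"
        printf 'user_lists            NONE\n'
    else
        echo "qconf $*"
    fi
}

STATUSDIR=$(mktemp -d)   # stand-in for the real status directory

restrict_node_access() {
    node=$1
    # Save the current user_lists value, then restrict the host to "prey".
    qconf -se "$node" | awk '/^user_lists/ {print $2}' > "$STATUSDIR/$node"
    qconf -rattr exechost user_lists prey "$node"
}

restore_node_access() {
    node=$1
    saved=NONE
    # -s: keep the NONE fallback when the status file is missing or empty
    if [ -s "$STATUSDIR/$node" ]; then
        saved=$(cat "$STATUSDIR/$node")
    fi
    qconf -rattr exechost user_lists "$saved" "$node"
}

# Hypothetical helper: the per-node command sequence, without the sleep.
dry_run_node() {
    restrict_node_access "$1"
    qconf -mattr exechost complex_values s_vmem=9.7G "$1"
    qconf -mattr exechost complex_values s_vmem=8.6G "$1"
    restore_node_access "$1"
}

dry_run_node node1
```

The printed lines are the modifying calls the qmaster would receive for that node, in order, which may help when comparing against $LOGFILE.
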
>
>     This script had been running for several months without
> problems until last week. We checked the log file and found that
> the error appears only for GROUP_2 (GROUP_1 works properly) and
> only when there is exactly 1 pending job (with more than 1 pending
> job the script works properly). The problematic command is
> "qconf -rattr exechost user_lists prey $node" and these are the
> errors detected:
>     * In the log file: error: commlib error: got read error (closing "svgd.local/qmaster/1")
>     * In the qmaster messages file: 11/30/2009 06:14:03|worker|svgd|C|!!!!!!!!!! EH_xacl not found in element !!!!!!!!!!
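
To line the critical entries up with the cron schedule, something like the following could list when they occur. It is a sketch that assumes the '|'-separated layout visible in the quoted lines (timestamp, thread, host, level, message); the function name `eh_xacl_times` is made up, and the path of the messages file is site-specific.

```shell
#!/bin/sh
# Print date, time and minute-of-hour for every critical EH_xacl entry.
# Assumed field layout, taken from the quoted log lines:
#   timestamp|thread|host|level|message
eh_xacl_times() {
    awk -F'|' '$4 == "C" && /EH_xacl not found in element/ {
        split($1, ts, " ")        # ts[1] = date, ts[2] = HH:MM:SS
        split(ts[2], hms, ":")    # hms[2] = minute
        print ts[1], ts[2], "minute=" hms[2]
    }' "$@"
}
```

It reads the files given as arguments, or standard input if there are none, for example `eh_xacl_times /path/to/qmaster/messages`. The entries quoted in this thread fall at minutes 14, 34 and 54, about four minutes after each cron start.
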
>
>     We tried to reproduce the problem with another user but were
> not able to, so we are puzzled.
>
>     Regards,
>     Pablo
>
>
>
> On 01/12/2009 18:02, aja wrote:
>>
>> Hi,
>>
>> this seems to be a broken configuration of some userset. Do you use
>> classic spooling? If so, did you accidentally remove any file from
>> the spooling directory?
>>
>> Regards,
>> aja
>>
>> pablorey wrote:
>>>
>>> Dear colleagues,
>>>
>>> In the last 24 hours we have seen very odd behaviour of the GE
>>> master. It stopped several times and we found the following errors
>>> in the qmaster messages file:
>>>
>>> 11/26/2009 06:14:06|worker|svgd|C|!!!!!!!!!! EH_xacl not found in element !!!!!!!!!!
>>> 11/26/2009 18:14:03|worker|svgd|C|!!!!!!!!!! EH_xacl not found in element !!!!!!!!!!
>>> 11/27/2009 06:14:04|worker|svgd|C|!!!!!!!!!! EH_xacl not found in element !!!!!!!!!!
>>> 11/27/2009 06:34:03|worker|svgd|C|!!!!!!!!!! EH_xacl not found in element !!!!!!!!!!
>>> 11/27/2009 06:54:02|worker|svgd|C|!!!!!!!!!! EH_xacl not found in element !!!!!!!!!!
>>>
>>> I searched for information about this error but did not find
>>> anything. Any idea? Could it be related to some kind of job?
>>>
>>> Regards.
>>> --
>>> Pablo Rey Mayo
>>> Tecnico de Sistemas
>>> Centro de Supercomputacion de Galicia (CESGA)
>>> Avda. de Vigo s/n (Campus Sur)
>>> 15705 Santiago de Compostela (Spain)
>>> Tel: +34 981 56 98 10 ext. 233; Fax: +34 981 59 46 16
>>> email: prey at cesga.es; http://www.cesga.es/
>>> ------------------------------------------------
>>> NOTE: This message was intentionally written without accents or
>>> special characters so that it displays correctly in any mail client
>>> and system.
>>> ------------------------------------------------
>
> -- 
> Pablo Rey Mayo
> Tecnico de Sistemas
> Centro de Supercomputacion de Galicia (CESGA)
> Avda. de Vigo s/n (Campus Sur)
> 15705 Santiago de Compostela (Spain)
> Tel: +34 981 56 98 10 ext. 233; Fax: +34 981 59 46 16
> email: prey at cesga.es; http://www.cesga.es/
> ------------------------------------------------
> NOTE: This message was intentionally written without accents or
> special characters so that it displays correctly in any mail client
> and system.
> ------------------------------------------------

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=230957
