[GE users] Segmentation fault when running qconf -su group

Esteban Freire Garcia esfreire at cesga.es
Mon Aug 14 12:27:18 BST 2006


Hi,

Sorry for the late reply; I was on holiday. I have now run qconf -su and
qconf -mu under dbx, and this is the result:
--------------------------------------------------------------------------
sc>dbx qconf.out
dbx version 5.1
Type 'help' for help.

(dbx) run
SGE 6.0
usage: qconf [options]
   [-aattr obj_nm attr_nm val obj_id_lst]   add to a list attribute of an object
   [-Aattr obj_nm fname obj_id_lst]         add to a list attribute of an object
   [-acal calendar_name]                    add a new calendar
   [-Acal fname]                            add a new calendar from file
   [-ackpt ckpt_name]                       add a ckpt interface definition
   [-Ackpt fname]                           add a ckpt interface definition from file
   [-aconf host_list]                       add configurations
   [-Aconf file_list]                       add configurations from file_list
   [-ae [exec_server_template]]             add an exec host using a template
   [-Ae fname]                              add an exec host from file
   [-ah hostname]                           add an administrative host
   [-ahgrp group]                           add new host group entry
   [-Ahgrp file]                            add new host group entry from file
   [-am user_list]                          add user to manager list
   [-ao user_list]                          add user to operator list
   [-ap pe-name]                            add a new parallel environment
   [-Ap fname]                              add a new parallel environment from file
   [-aprj]                                  add project
   [-Aprj fname]                            add project from file
   [-aq ]                                   add a new cluster queue
   [-Aq fname]                              add a queue from file
   [-as hostname]                           add a submit host
   [-astnode node_shares_list]              add sharetree node(s)
   [-astree]                                create/modify the sharetree
   [-Astree fname]                          create/modify the sharetree from file
   [-au user_list listname_list]            add user(s) to userset list(s)
   [-Au fname]                              add userset from file
   [-auser]                                 add user
   [-Auser fname]                           add user from file
   [-clearusage]                            clear all user/project sharetree usage
   [-cq destin_id_list]                     clean queue
   [-dattr obj_nm attr_nm val obj_id_lst]   delete from a list attribute of an object
   [-Dattr obj_nm fname obj_id_lst]         delete from a list attribute of an object
   [-dcal calendar_name]                    remove a calendar
   [-dckpt ckpt_name]                       remove a ckpt interface definition
   [-dconf host_list]                       delete local configurations
   [-de host_list]                          remove an exec server
   [-dh host_list]                          remove an administrative host
   [-dhgrp group]                           delete host group entry
   [-dm user_list]                          remove user from manager list
   [-do user_list]                          remove user from operator list
   [-dp pe-name]                            remove a parallel environment
   [-dprj project_list]                     delete project
   [-dq destin_id_list]                     remove a queue
   [-ds host_list]                          remove submit host
   [-dstnode node_list]                     remove sharetree node(s)
   [-dstree]                                delete the sharetree
   [-du user_list listname_list]            remove user(s) from userset list(s)
   [-dul listname_list]                     remove userset list(s) completely
   [-duser user_list]                       delete user
   [-help]                                  print this help
   [-ke[j] host_list                        shutdown execution daemon(s)
   [-k{m|s}]                                shutdown master|scheduling daemon
   [-kec evid_list]                         kill event client
   [-mattr obj_nm attr_nm val obj_id_lst]   modify an attribute (or element in a sublist) of an object
   [-Mattr obj_nm fname obj_id_lst]         modify an attribute (or element in a sublist) of an object
   [-mc ]                                   modify complex attributes
   [-mckpt ckpt_name]                       modify a ckpt interface definition
   [-Mc fname]                              modify complex attributes from file
   [-mcal calendar_name]                    modify calendar
   [-Mcal fname]                            modify calendar from file
   [-Mckpt fname]                           modify a ckpt interface definition from file
   [-mconf [host_list|global]]              modify configurations
   [-msconf]                                modify scheduler configuration
   [-Msconf fname]                          modify scheduler configuration from file
   [-me server]                             modify exec server
   [-Me fname]                              modify exec server from file
   [-mhgrp group]                           modify host group entry
   [-Mhgrp file]                            modify host group entry from file
   [-mp pe-name]                            modify a parallel environment
   [-Mp fname]                              modify a parallel environment from file
   [-mprj project]                          modify a project
   [-Mprj fname]                            modify project from file
   [-mq queue]                              modify a queue
   [-Mq fname]                              modify a queue from file
   [-mstnode node_shares_list]              modify sharetree node(s)
   [-Mstree fname]                          modify/create the sharetree from file
   [-mstree]                                modify/create the sharetree
   [-mu listname_list]                      modify the given userset list
   [-Mu fname]                              modify userset from file
   [-muser user]                            modify a user
   [-Muser fname]                           modify a user from file
   [-rattr obj_nm attr_nm val obj_id_lst]   replace a list attribute of an object
   [-Rattr obj_nm fname obj_id_lst]         replace a list attribute of an object
   [-sc ]                                   show complex attributes
   [-scal calendar_name]                    show given calendar
   [-scall]                                 show a list of all calendar names
   [-sckpt ckpt_name]                       show ckpt interface definition
   [-sckptl]                                show all ckpt interface definitions
   [-sconf [host_list|global]]              show configurations
   [-sconfl]                                show a list of all local configurations
   [-se server]                             show given exec server
   [-secl]                                  show event client list
   [-sel]                                   show a list of all exec servers
   [-sep]                                   show a list of all licensed processors
   [-sh]                                    show a list of all administrative hosts
   [-shgrp group]                           show host group
   [-shgrp_tree group]                      show host group and used hostgroups as tree
   [-shgrp_resolved group]                  show host group with resolved hostlist
   [-shgrpl]                                show host group list
   [-sds]                                   show detached settings
   [-sm]                                    show a list of all managers
   [-so]                                    show a list of all operators
   [-sobjl obj_nm2 attr_nm val]             show objects which match the given value
   [-sp pe-name]                            show a parallel environment
   [-spl]                                   show all parallel environments
   [-sprj project]                          show a project
   [-sprjl]                                 show a list of all projects
   [-sq [destin_id_list]]                   show the given queue
   [-sql]                                   show a list of all queues
   [-ss]                                    show a list of all submit hosts
   [-sss]                                   show scheduler state
   [-ssconf]                                show scheduler configuration
   [-sstnode node_list]                     show sharetree node(s)
   [-rsstnode node_list]                    show sharetree node(s) and its children
   [-sstree]                                show the sharetree
   [-su listname_list]                      show the given userset list
   [-suser user_list]                       show user(s)
   [-sul]                                   show a list of all userset lists
   [-suserl]                                show a list of all users
   [-tsm]                                   trigger scheduler monitoring
complex_list            complex[,complex,...]
destin_id_list          queue[ queue ...]
listname_list           listname[,listname,...]
node_list               node_path[,node_path,...]
node_path               [/]node_name[[/.]node_name...]
node_shares_list        node_path=shares[,node_path=shares,...]
user_list               user|pattern[,user|pattern,...]
obj_nm                  "queue"|"exechost"|"pe"|"ckpt"|"hostgroup"
attr_nm                 (see man pages)
obj_id_lst              objectname [ objectname ...]
project_list            project[,project,...]
evid_list               all | evid[,evid,...]
host_list               all | hostname[,hostname,...]
obj_nm2                 "queue"|"queue_domain"|"queue_instance"|"exechost"

Program terminated normally

(dbx) run -su cesga
thread 0xb signal Segmentation fault at >*[strlen, 0x3ff800d2590]
ldq_u   t0, 0(a0)
(dbx) run -mu cesga
thread 0x8 signal Segmentation fault at >*[strlen, 0x3ff800d2590]
ldq_u   t0, 0(a0)
--------------------------------------------------------------------------
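
A quick cross-check of the two traces: the address dbx prints for the fault in strlen and the %pc value that truss reports next to "Incurred fault" (quoted further down in this thread) appear to be the same location. The Python sketch below only compares the two hex strings copied from those outputs; the helper name `same_address` is made up for illustration.

```python
def same_address(a: str, b: str) -> bool:
    """Compare two hex address strings, ignoring case and leading zeros."""
    return int(a, 16) == int(b, 16)


# Address dbx prints for the fault inside strlen (from the session above).
dbx_pc = "0x3ff800d2590"
# %pc value truss reports next to "Incurred fault #32, FLTBOUNDS".
truss_pc = "0x000003FF800D2590"

print(same_address(dbx_pc, truss_pc))
```

If the two match, the -su and -mu runs and the earlier truss run all seem to die at the same instruction in strlen, which would suggest that qconf hands strlen an invalid string pointer. Which caller does so is not visible from this output.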

Thanks,
Esteban


> Hi Esteban,
>
> the attachment did not come through, and unfortunately the truss
> output does not help in this case either.
>
> Can you run qconf under the dbx debugger? That would tell you
> directly in which C function qconf crashes.
>
> Regards,
> Andreas
>
>
> On Fri, 28 Jul 2006, Esteban Freire Garcia wrote:
>
>>
>> Hi, I ran the command 'truss -o qconf_strace -f qconf -su cesga',
>> and this is the end of the trace output, around the point where it
>> shows "Incurred fault ...":
>> --------------------------------------------------------------------------
>> 659636: select(5, 0x000000011FFFAAD8, 0x00000000, 0x00000000,
>> 0x000000011FFFAAB0) = 1
>> 659636: read(4, " h", 1)                                = 1
>> 659636: gettimeofday(0x000000011FFFAAB8, 0x00000000)    = 0
>> 659636: select(5, 0x000000011FFFAAD8, 0x00000000, 0x00000000,
>> 0x000000011FFFAAB0) = 1
>> 659636: read(4, " >", 1)                                = 1
>> 659636: gettimeofday(0x000000011FFFAAC8, 0x00000000)    = 0
>> 659636: select(5, 0x000000011FFFAAE8, 0x00000000, 0x00000000,
>> 0x000000011FFFAAC0) = 1
>> 659636: read(4, " < m i h   v e r s i o n".., 97)       = 97
>> 659636: gettimeofday(0x000000011FFFAAC8, 0x00000000)    = 0
>> 659636: select(5, 0x000000011FFFAAE8, 0x00000000, 0x00000000,
>> 0x000000011FFFAAC0) = 1
>> 659636: read(4, " < a m   v e r s i o n =".., 35)       = 35
>> 659636: gettimeofday(0x00000001400C2A48, 0x00000000)    = 0
>> 659636: gettimeofday(0x000000011FFFAD20, 0x00000000)    = 0
>> 659636: gettimeofday(0x000000011FFFAEC8, 0x00000000)    = 0
>> 659636: gettimeofday(0x000000011FFFAED0, 0x00000000)    = 0
>> 659636: gettimeofday(0x000000011FFFAEC8, 0x00000000)    = 0
>> 659636: gettimeofday(0x000000011FFFAF38, 0x00000000)    = 0
>> 659636: select(5, 0x000000011FFFACD8, 0x000000011FFFAAD8, 0x00000000,
>> 0x000000011FFFAA90) = 1
>> 659636: gettimeofday(0x000000011FFFAEE0, 0x00000000)    = 0
>> 659636: gettimeofday(0x000000011FFFAEC0, 0x00000000)    = 0
>> 659636: gettimeofday(0x000000011FFFABC8, 0x00000000)    = 0
>> 659636: select(5, 0x000000011FFFABE8, 0x00000000, 0x00000000,
>> 0x000000011FFFABC0) = 1
>> 659636: read(4, " < g m s h > < d l > 9 9".., 22)       = 22
>> 659636: gettimeofday(0x000000011FFFABC8, 0x00000000)    = 0
>> 659636: select(5, 0x000000011FFFABE8, 0x00000000, 0x00000000,
>> 0x000000011FFFABC0) = 1
>> 659636: read(4, " h", 1)                                = 1
>> 659636: gettimeofday(0x000000011FFFABC8, 0x00000000)    = 0
>> 659636: select(5, 0x000000011FFFABE8, 0x00000000, 0x00000000,
>> 0x000000011FFFABC0) = 1
>> 659636: read(4, " >", 1)                                = 1
>> 659636: gettimeofday(0x000000011FFFABD8, 0x00000000)    = 0
>> 659636: select(5, 0x000000011FFFABF8, 0x00000000, 0x00000000,
>> 0x000000011FFFABD0) = 1
>> 659636: read(4, " < m i h   v e r s i o n".., 99)       = 99
>> 659636: gettimeofday(0x000000011FFFABD8, 0x00000000)    = 0
>> 659636: select(5, 0x000000011FFFABF8, 0x00000000, 0x00000000,
>> 0x000000011FFFABD0) = 1
>> 659636: read(4, "\0\0\0\01002\0\0\0\0\001".., 373)      = 373
>> 659636: gettimeofday(0x00000001400C29C8, 0x00000000)    = 0
>> 659636: gettimeofday(0x000000011FFFAE30, 0x00000000)    = 0
>> 659636: gettimeofday(0x000000011FFFAEC0, 0x00000000)    = 0
>> 659636:     Incurred fault #32, FLTBOUNDS  %pc = 0x000003FF800D2590 addr = 0x000000011FFF9420
>> 659636:     Received signal #11, SIGSEGV [caught]
>> 659636:       siginfo: SIGSEGV SEGV_MAPERR addr=0x0000000020746365
>> 659636: sigaltstack(0x00000000, 0x000000011FFF8860)     = 0
>> 659636: sigprocmask(SIG_BLOCK, 0x00000000, 0x00000000)  = -108655535
>> 659636: sigstack(0x00000000, 0x000000011FFF87D8)        = 0
>> 659636: sigprocmask(SIG_UNBLOCK, 0x00000400, 0x00000000) = -108655535
>> 659636: sigaction(SIGSEGV, 0x000000011FFF8648, 0x00000000) = 0
>> 659636:     Received signal #11, SIGSEGV [default]
>> 659636:       siginfo: SIGSEGV
>>                                                Err#139 Error 139
>>                                                occurred.
>> 659636:         *** process killed ***
>>
>> --------------------------------------------------------------------------
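
One observation on the trace above, offered as a guess rather than a diagnosis: the address in the "siginfo: SIGSEGV SEGV_MAPERR" line, 0x0000000020746365, is made up entirely of printable ASCII byte values. The Python sketch below decodes it, assuming the little-endian byte order that Tru64 on Alpha uses; `address_as_ascii` is a hypothetical helper name, not part of any tool mentioned here.

```python
import struct


def address_as_ascii(addr: int) -> str:
    """Decode a 64-bit address as ASCII text, trailing zero bytes stripped.

    Bytes are taken in little-endian order (an assumption about the
    machine's memory layout). Raises UnicodeDecodeError if a remaining
    byte is not ASCII.
    """
    raw = struct.pack("<Q", addr)
    return raw.rstrip(b"\x00").decode("ascii")


# addr value from the "siginfo: SIGSEGV SEGV_MAPERR" line of the truss output.
fault_addr = 0x0000000020746365
print(repr(address_as_ascii(fault_addr)))  # 'ect '
```

An address that reads as text is often a sign that string data was written over a pointer, which would fit the suspicion that some userset holds bad data. This output alone does not prove it.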
>> I am sending you the complete trace output as an attachment.
>> The date, however, seems correct.
>> ----------------------------------
>> sc1/esfreire> date
>> Fri Jul 28 09:47:57 CEST 2006
>> ----------------------------------
>>
>> Thanks for answering!
>>
>>
>>> Hi Esteban,
>>>
>>> could you provide a stacktrace for the seg fault?
>>> That would help to understand the problem.
>>>
>>> Regards,
>>> Andreas
>>>
>>> On Thu, 27 Jul 2006, Esteban Freire Garcia wrote:
>>>
>>>>
>>>> Thanks for answering. I am using SGE version 6.0:
>>>> ----------------------------------------------
>>>> sc1/esfreire> qconf -help | grep 6.0
>>>> SGE 6.0
>>>> ---------------------------------------------
>>>> The 'Segmentation fault' message is shown only when I run 'qconf
>>>> -mu esfreire' or 'qconf -su esfreire'; none of the other commands
>>>> that I use to administer SGE produce it.
>>>>
>>>> It never used to show this message. It started about a month ago,
>>>> and ever since it appears every time I run qconf -su or -mu. With
>>>> qmon, however, I can see and edit the list. I believe it may be
>>>> due to some list that I created incorrectly or that holds
>>>> incorrect data.
>>>>
>>>> Thanks,
>>>> Esteban
>>>>
>>>>> Reuti wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 27.07.2006 at 09:14, Esteban Freire Garcia wrote:
>>>>>>
>>>>>>>
>>>>>>> Hi everybody,
>>>>>>>
>>>>>>> I have a Compaq HPC320 machine running UNIX (Tru64 V5.1A)
>>>>>>> and SGE (sge 6.0 - tru64). Now, when I run:
>>>>>>> --------------------------------------------------------------------
>>>>>>> sc1/esfreire> qconf -su esfreire
>>>>>>> Segmentation fault
>>>>>>>
>>>>>>> sc1/root> qconf -su esfreire
>>>>>>> Memory fault
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> Could any of you tell me why the "Segmentation fault" message
>>>>>>> is shown and what the solution is? With qmon, however, I can
>>>>>>> view the list and edit it.
>>>>>>
>>>>>>
>>>>>> this error message usually indicates a programming error, as the
>>>>>> software tries to access an illegal address. So it shouldn't
>>>>>> happen at all. Do you get this error only with "-su", or also
>>>>>> with other options? Did it ever work before, and only now
>>>>>> refuses to operate?
>>>>>
>>>>> I've tried to reproduce the segfault in our lab but for me it works
>>>>> fine. We have Tru64 V5.0.
>>>>>
>>>>> Can you please tell us which update version you use? You can
>>>>> figure this out with 'qconf -help | grep 6.0'.
>>>>>
>>>>> Roland
>>>>>
>>>>>>
>>>>>> -- Reuti
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>>> For additional commands, e-mail:
>>>>>> users-help at gridengine.sunsource.net
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>>>>> Roland Dittel               Tel: +49 (0)941 3075-275 (x60275)
>>>>> Software Engineering        Fax: +49 (0)941 3075-222 (x60222)
>>>>> Sun Microsystems GmbH
>>>>> Dr.-Leo-Ritter-Str. 7       mailto:roland.dittel at sun.com
>>>>> D-93049 Regensburg          http://www.sun.com/gridware
>>>>>



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



