[GE users] maya signal 11

bwillems b-willems at northwestern.edu
Mon Mar 9 14:04:24 GMT 2009


Hi Reuti,

sorry for the misunderstanding. Including it in the queue definition fixed
the problem!

Thanks a bunch!

Cheers,
Bart
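
One way to confirm that the queue-level h_stack setting actually reaches the job is to print the limits from inside it. The following is only a minimal sketch: it assumes Python is available on the compute node, and it assumes Grid Engine enforces h_stack through the process stack limit (RLIMIT_STACK) and h_vmem through the address-space limit (RLIMIT_AS). The function names are hypothetical and not part of SGE.

```python
import resource


def format_limit(value):
    """Render an rlimit value in MiB, or 'unlimited' if no limit is set."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / (1024 * 1024):.1f}M"


def describe_limits():
    """Return the (soft, hard) stack and address-space limits of this process."""
    report = {}
    for name, which in (("stack", resource.RLIMIT_STACK),
                        ("vmem", resource.RLIMIT_AS)):
        soft, hard = resource.getrlimit(which)
        report[name] = (format_limit(soft), format_limit(hard))
    return report


if __name__ == "__main__":
    # Run once from an interactive shell on the node and once from the
    # qsub script, then compare the two outputs.
    for name, (soft, hard) in describe_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

If the assumptions hold, the batch run should report a 32M stack limit once h_stack=32M is set in the queue definition or requested with "qsub -l h_stack=32M ...", while the interactive run shows whatever the login shell allows.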

> Hi,
>
> On 09.03.2009 at 14:06, bwillems wrote:
>
>> thanks! Setting h_stack does not solve the problem though. This is
>> the output from "qconf -sc":
>
> I didn't mean to make it consumable, but to include it with a fixed
> limit of 32M either in the queue definition or "qsub -l
> h_stack=32M ...".
>
> If it's still not working, it might of course be just an application
> error and unrelated to SGE.
>
> -- Reuti
>
>
>> #name               shortcut   type        relop requestable consumable default  urgency
>> #-----------------------------------------------------------------------------------------
>> arch                a          RESTRING    ==    YES         NO         NONE     0
>> calendar            c          RESTRING    ==    YES         NO         NONE     0
>> cpu                 cpu        DOUBLE      >=    YES         NO         0        0
>> display_win_gui     dwg        BOOL        ==    YES         NO         0        0
>> h_core              h_core     MEMORY      <=    YES         NO         0        0
>> h_cpu               h_cpu      TIME        <=    YES         NO         0:0:0    -1000
>> h_data              h_data     MEMORY      <=    YES         NO         0        0
>> h_fsize             h_fsize    MEMORY      <=    YES         NO         0        0
>> h_rss               h_rss      MEMORY      <=    YES         NO         0        0
>> h_rt                h_rt       TIME        <=    YES         NO         0:0:0    0
>> h_stack             h_stack    MEMORY      <=    YES         YES        32M      0
>> h_vmem              h_vmem     MEMORY      <=    YES         YES        1G       0
>> hostname            h          HOST        ==    YES         NO         NONE     0
>> load_avg            la         DOUBLE      >=    NO          NO         0        0
>> load_long           ll         DOUBLE      >=    NO          NO         0        0
>> load_medium         lm         DOUBLE      >=    NO          NO         0        0
>> load_short          ls         DOUBLE      >=    NO          NO         0        0
>> mem_free            mf         MEMORY      <=    YES         NO         0        0
>> mem_total           mt         MEMORY      <=    YES         NO         0        0
>> mem_used            mu         MEMORY      >=    YES         NO         0        0
>> min_cpu_interval    mci        TIME        <=    NO          NO         0:0:0    0
>> np_load_avg         nla        DOUBLE      >=    NO          NO         0        0
>> np_load_long        nll        DOUBLE      >=    NO          NO         0        0
>> np_load_medium      nlm        DOUBLE      >=    NO          NO         0        0
>> np_load_short       nls        DOUBLE      >=    NO          NO         0        0
>> num_proc            p          INT         ==    YES         NO         0        0
>> qname               q          RESTRING    ==    YES         NO         NONE     0
>> rerun               re         BOOL        ==    NO          NO         0        0
>> s_core              s_core     MEMORY      <=    YES         NO         0        0
>> s_cpu               s_cpu      TIME        <=    YES         NO         0:0:0    0
>> s_data              s_data     MEMORY      <=    YES         NO         0        0
>> s_fsize             s_fsize    MEMORY      <=    YES         NO         0        0
>> s_rss               s_rss      MEMORY      <=    YES         NO         0        0
>> s_rt                s_rt       TIME        <=    YES         NO         0:0:0    0
>> s_stack             s_stack    MEMORY      <=    YES         NO         0        0
>> s_vmem              s_vmem     MEMORY      <=    YES         NO         0        0
>> seq_no              seq        INT         ==    NO          NO         0        0
>> slots               s          INT         <=    YES         YES        1        1000
>> swap_free           sf         MEMORY      <=    YES         NO         0        0
>> swap_rate           sr         MEMORY      >=    YES         NO         0        0
>> swap_rsvd           srsv       MEMORY      >=    YES         NO         0        0
>> swap_total          st         MEMORY      <=    YES         NO         0        0
>> swap_used           su         MEMORY      >=    YES         NO         0        0
>> tmpdir              tmp        RESTRING    ==    NO          NO         NONE     0
>> virtual_free        vf         MEMORY      <=    YES         YES        1G       0
>> virtual_total       vt         MEMORY      <=    YES         NO         0        0
>> virtual_used        vu         MEMORY      >=    YES         NO         0        0
>> # >#< starts a comment but comments are not saved across edits --------
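
To see at a glance which complexes in a "qconf -sc" listing are consumable, the rows can be filtered on the consumable column. This is a rough sketch that assumes each complex occupies a single line with eight whitespace-separated fields, as qconf prints it; the sample rows are copied from the listing above, and the helper names are made up for illustration.

```python
SAMPLE = """\
#name      shortcut  type    relop requestable consumable default urgency
h_stack    h_stack   MEMORY  <=    YES         YES        32M     0
h_vmem     h_vmem    MEMORY  <=    YES         YES        1G      0
s_stack    s_stack   MEMORY  <=    YES         NO         0       0
slots      s         INT     <=    YES         YES        1       1000
"""

FIELDS = ("name", "shortcut", "type", "relop",
          "requestable", "consumable", "default", "urgency")


def parse_complexes(text):
    """Turn 'qconf -sc' style rows into dicts, skipping comments and blanks."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # ignore wrapped or malformed rows
        rows.append(dict(zip(FIELDS, parts)))
    return rows


def consumables(rows):
    """Return (name, default) for every complex marked consumable."""
    return [(r["name"], r["default"]) for r in rows if r["consumable"] == "YES"]


print(consumables(parse_complexes(SAMPLE)))
# [('h_stack', '32M'), ('h_vmem', '1G'), ('slots', '1')]
```

In this listing h_stack shows up as a consumable with a 32M default, which is the setup Reuti says he did not mean to suggest; his advice was a fixed limit in the queue definition or on the qsub command line.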
>>
>> And the qacct output for the failed job is
>>
>> ==============================================================
>> qname        gpu.q
>> hostname     compute-0-86.local
>> group        bart
>> owner        bart
>> project      GPU
>> department   defaultdepartment
>> jobname      bart.sh
>> jobnumber    74549
>> taskid       undefined
>> account      sge
>> priority     0
>> qsub_time    Mon Mar  9 07:53:51 2009
>> start_time   Mon Mar  9 07:53:53 2009
>> end_time     Mon Mar  9 07:54:03 2009
>> granted_pe   NONE
>> slots        1
>> failed       0
>> exit_status  1
>> ru_wallclock 10
>> ru_utime     8.912
>> ru_stime     0.680
>> ru_maxrss    0
>> ru_ixrss     0
>> ru_ismrss    0
>> ru_idrss     0
>> ru_isrss     0
>> ru_minflt    59969
>> ru_majflt    0
>> ru_nswap     0
>> ru_inblock   16
>> ru_oublock   7968
>> ru_msgsnd    0
>> ru_msgrcv    0
>> ru_nsignals  0
>> ru_nvcsw     1641
>> ru_nivcsw    297
>> cpu          9.592
>> mem          3.980
>> io           0.000
>> iow          0.000
>> maxvmem      861.945M
>> arid         undefined
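
The qacct record above can be summarized programmatically when several failed jobs need to be compared. The sketch below is illustrative only: the sample fields are copied from the record, the helper names are hypothetical, and comparing maxvmem against the 1G h_vmem default is an assumption, since the thread does not say which limit applied to this job. Memory suffixes follow the Grid Engine convention that upper-case K/M/G are multiples of 1024 and lower-case k/m/g are multiples of 1000.

```python
SAMPLE = """\
qname        gpu.q
jobnumber    74549
failed       0
exit_status  1
ru_wallclock 10
maxvmem      861.945M
"""

UNITS = {"k": 1000, "m": 1000**2, "g": 1000**3,
         "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_record(text):
    """Parse one qacct record into a dict of field name -> string value."""
    record = {}
    for line in text.splitlines():
        key, _, value = line.strip().partition(" ")
        if key and not key.startswith("="):  # skip the ===== separator
            record[key] = value.strip()
    return record


def to_bytes(size):
    """Convert a value such as '861.945M' or '1G' to a number of bytes."""
    if size[-1] in UNITS:
        return float(size[:-1]) * UNITS[size[-1]]
    return float(size)


def summarize(record, vmem_limit="1G"):
    """Report the SGE failure flag, exit status, and share of the limit used."""
    used = to_bytes(record["maxvmem"])
    return {
        "failed_in_sge": record["failed"] != "0",
        "exit_status": int(record["exit_status"]),
        "vmem_fraction": round(used / to_bytes(vmem_limit), 3),
    }


print(summarize(parse_record(SAMPLE)))
```

For this job the summary shows that SGE itself recorded no failure (failed 0) while the job script returned exit status 1, and that peak virtual memory was roughly 84% of 1G.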
>>
>> Is there anything else you can suggest to debug/fix the problem
>> besides compiling a custom shepherd binary?
>>
>> Thanks,
>> Bart
>>
>>> Hi,
>>>
>>> On 07.03.2009 at 02:50, bwillems wrote:
>>>
>>>> I'm experiencing strange behavior running maya as a batch job with
>>>> SGE.
>>>> When maya is launched from the command line on a node, it runs fine
>>>> without any problems. However, when it is launched with a qsub
>>>> script, maya exits with a Signal 11 (segmentation fault).
>>>>
>>>> The problem started to occur when I made h_vmem into a consumable
>>>> attribute with a default value of 1GB. Undoing this change did not
>>>> solve the problem though, so I'm not sure it's related.
>>>>
>>>> Any suggestions would be most appreciated.
>>>
>>> maybe you must also set h_stack to 32M or so:
>>>
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=119559
>>>
>>> -- Reuti
>>>
>>>
>>>> Thanks,
>>>> Bart
>>>>
>>>> ------------------------------------------------------
>>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=122681
>>>>
>>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>>>
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=125419
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=125421
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=125475



