[GE users] maya signal 11

reuti reuti at staff.uni-marburg.de
Mon Mar 9 13:09:14 GMT 2009


Hi,

On 09.03.2009 at 14:06, bwillems wrote:

> thanks! Setting h_stack does not solve the problem though. This is  
> the output from "qconf -sc":

I didn't mean that you should make it consumable, but that you should
include it with a fixed limit of 32M, either in the queue definition or
per job with "qsub -l h_stack=32M ...".
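
For illustration, the per-job form could be sketched like this. The script
name bart.sh is taken from your qacct output; H_STACK, JOB_SCRIPT, SUBMIT
and build_qsub_cmd are just names I made up for the sketch, and the
queue-wide line in the comment assumes your queue is gpu.q.

```shell
#!/bin/sh
# Sketch: request a fixed 32M hard stack limit for a single job.
# H_STACK, JOB_SCRIPT, SUBMIT and build_qsub_cmd are illustrative names only.
H_STACK="32M"
JOB_SCRIPT="bart.sh"

build_qsub_cmd() {
    # Print the command line instead of running it, so it can be checked first.
    printf 'qsub -l h_stack=%s %s\n' "$1" "$2"
}

cmd=$(build_qsub_cmd "$H_STACK" "$JOB_SCRIPT")
echo "$cmd"

# Queue-wide alternative (not run here, assumes the queue is gpu.q):
#   qconf -mattr queue h_stack 32M gpu.q

# Submit only when explicitly asked to, e.g. SUBMIT=yes sh this_script.sh
if [ "${SUBMIT:-no}" = "yes" ]; then
    $cmd
fi
```

Without SUBMIT=yes this only prints "qsub -l h_stack=32M bart.sh".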

If it's still not working, it might of course be just an application  
error and unrelated to SGE.
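
To narrow that down, it may help to compare the limits the job actually
sees with those of an interactive shell on the same node. A rough sketch
(show_limits is only an illustrative name): run it once from a login shell
and once at the top of the job script, then compare the two outputs.

```shell
#!/bin/sh
# Sketch: print the stack and virtual-memory limits of the current shell.
# show_limits is an illustrative helper name, not part of SGE.
show_limits() {
    echo "stack (kB): $(ulimit -s)"
    echo "vmem  (kB): $(ulimit -v)"
}

show_limits
```

If the h_stack=32M request is in effect, I would expect the stack line
inside the job to show 32768, but treat that as an assumption to verify.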

-- Reuti


> #name               shortcut   type        relop requestable consumable default  urgency
> #---------------------------------------------------------------------------------------
> arch                a          RESTRING    ==    YES         NO         NONE     0
> calendar            c          RESTRING    ==    YES         NO         NONE     0
> cpu                 cpu        DOUBLE      >=    YES         NO         0        0
> display_win_gui     dwg        BOOL        ==    YES         NO         0        0
> h_core              h_core     MEMORY      <=    YES         NO         0        0
> h_cpu               h_cpu      TIME        <=    YES         NO         0:0:0    -1000
> h_data              h_data     MEMORY      <=    YES         NO         0        0
> h_fsize             h_fsize    MEMORY      <=    YES         NO         0        0
> h_rss               h_rss      MEMORY      <=    YES         NO         0        0
> h_rt                h_rt       TIME        <=    YES         NO         0:0:0    0
> h_stack             h_stack    MEMORY      <=    YES         YES        32M      0
> h_vmem              h_vmem     MEMORY      <=    YES         YES        1G       0
> hostname            h          HOST        ==    YES         NO         NONE     0
> load_avg            la         DOUBLE      >=    NO          NO         0        0
> load_long           ll         DOUBLE      >=    NO          NO         0        0
> load_medium         lm         DOUBLE      >=    NO          NO         0        0
> load_short          ls         DOUBLE      >=    NO          NO         0        0
> mem_free            mf         MEMORY      <=    YES         NO         0        0
> mem_total           mt         MEMORY      <=    YES         NO         0        0
> mem_used            mu         MEMORY      >=    YES         NO         0        0
> min_cpu_interval    mci        TIME        <=    NO          NO         0:0:0    0
> np_load_avg         nla        DOUBLE      >=    NO          NO         0        0
> np_load_long        nll        DOUBLE      >=    NO          NO         0        0
> np_load_medium      nlm        DOUBLE      >=    NO          NO         0        0
> np_load_short       nls        DOUBLE      >=    NO          NO         0        0
> num_proc            p          INT         ==    YES         NO         0        0
> qname               q          RESTRING    ==    YES         NO         NONE     0
> rerun               re         BOOL        ==    NO          NO         0        0
> s_core              s_core     MEMORY      <=    YES         NO         0        0
> s_cpu               s_cpu      TIME        <=    YES         NO         0:0:0    0
> s_data              s_data     MEMORY      <=    YES         NO         0        0
> s_fsize             s_fsize    MEMORY      <=    YES         NO         0        0
> s_rss               s_rss      MEMORY      <=    YES         NO         0        0
> s_rt                s_rt       TIME        <=    YES         NO         0:0:0    0
> s_stack             s_stack    MEMORY      <=    YES         NO         0        0
> s_vmem              s_vmem     MEMORY      <=    YES         NO         0        0
> seq_no              seq        INT         ==    NO          NO         0        0
> slots               s          INT         <=    YES         YES        1        1000
> swap_free           sf         MEMORY      <=    YES         NO         0        0
> swap_rate           sr         MEMORY      >=    YES         NO         0        0
> swap_rsvd           srsv       MEMORY      >=    YES         NO         0        0
> swap_total          st         MEMORY      <=    YES         NO         0        0
> swap_used           su         MEMORY      >=    YES         NO         0        0
> tmpdir              tmp        RESTRING    ==    NO          NO         NONE     0
> virtual_free        vf         MEMORY      <=    YES         YES        1G       0
> virtual_total       vt         MEMORY      <=    YES         NO         0        0
> virtual_used        vu         MEMORY      >=    YES         NO         0        0
> # >#< starts a comment but comments are not saved across edits --------
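
The listing above shows h_stack as consumable with a default of 32M, which
is the setup I did not mean to suggest. A quick way to pick out the
consumable entries could look like the sketch below. sample_complexes is a
hypothetical stand-in holding a few rows copied from your listing; on the
cluster you would pipe the real "qconf -sc" output into list_consumables
(also just an illustrative name).

```shell
#!/bin/sh
# Sketch: list the complexes marked consumable, with their default values.
# sample_complexes stands in for "qconf -sc"; rows are copied from the listing.
sample_complexes() {
    cat <<'EOF'
#name               shortcut   type        relop requestable consumable default  urgency
h_rt                h_rt       TIME        <=    YES         NO         0:0:0    0
h_stack             h_stack    MEMORY      <=    YES         YES        32M      0
h_vmem              h_vmem     MEMORY      <=    YES         YES        1G       0
slots               s          INT         <=    YES         YES        1        1000
virtual_free        vf         MEMORY      <=    YES         YES        1G       0
EOF
}

list_consumables() {
    # Field 6 is "consumable", field 7 is "default"; skip comment lines.
    awk '!/^#/ && $6 == "YES" { print $1, $7 }'
}

sample_complexes | list_consumables
```

For the sample rows this prints h_stack 32M, h_vmem 1G, slots 1 and
virtual_free 1G, one per line.
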
>
> And the qacct output for the failed job is
>
> ==============================================================
> qname        gpu.q
> hostname     compute-0-86.local
> group        bart
> owner        bart
> project      GPU
> department   defaultdepartment
> jobname      bart.sh
> jobnumber    74549
> taskid       undefined
> account      sge
> priority     0
> qsub_time    Mon Mar  9 07:53:51 2009
> start_time   Mon Mar  9 07:53:53 2009
> end_time     Mon Mar  9 07:54:03 2009
> granted_pe   NONE
> slots        1
> failed       0
> exit_status  1
> ru_wallclock 10
> ru_utime     8.912
> ru_stime     0.680
> ru_maxrss    0
> ru_ixrss     0
> ru_ismrss    0
> ru_idrss     0
> ru_isrss     0
> ru_minflt    59969
> ru_majflt    0
> ru_nswap     0
> ru_inblock   16
> ru_oublock   7968
> ru_msgsnd    0
> ru_msgrcv    0
> ru_nsignals  0
> ru_nvcsw     1641
> ru_nivcsw    297
> cpu          9.592
> mem          3.980
> io           0.000
> iow          0.000
> maxvmem      861.945M
> arid         undefined
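
From this record the interesting fields are failed, exit_status and
maxvmem; note that maxvmem (861.945M) is below the 1G h_vmem default. A
small sketch for pulling just those fields out follows. sample_qacct is a
hypothetical stand-in with values copied from the record above; on the
cluster you would pipe "qacct -j 74549" into summarize_job (an
illustrative name) instead.

```shell
#!/bin/sh
# Sketch: reduce a qacct record to the fields most useful for debugging.
# sample_qacct stands in for "qacct -j 74549"; values are from the record above.
sample_qacct() {
    cat <<'EOF'
jobnumber    74549
failed       0
exit_status  1
ru_wallclock 10
maxvmem      861.945M
EOF
}

summarize_job() {
    # Keep only the three fields of interest and print them as name=value.
    awk '$1 == "failed" || $1 == "exit_status" || $1 == "maxvmem" { print $1 "=" $2 }'
}

sample_qacct | summarize_job
```

For the sample record this prints failed=0, exit_status=1 and
maxvmem=861.945M, one per line.
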
>
> Is there anything else you can suggest to debug/fix the problem  
> besides compiling a custom shepherd binary?
>
> Thanks,
> Bart
>
>> Hi,
>>
>> On 07.03.2009 at 02:50, bwillems wrote:
>>
>>> I'm experiencing strange behavior running maya as a batch job with
>>> SGE. When maya is launched from the command line on a node, it runs
>>> fine without any problems. However, when it is launched with a qsub
>>> script, maya exits with a Signal 11 (segmentation fault).
>>>
>>> The problem started to occur when I made h_vmem into a consumable
>>> attribute with a default value of 1GB. Undoing this change did not
>>> solve the problem though, so I'm not sure it's related.
>>>
>>> Any suggestions would be most appreciated.
>>
>> Maybe you also need to set h_stack to 32M or so:
>>
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=119559
>>
>> -- Reuti
>>
>>
>>> Thanks,
>>> Bart
>>>
>>> ------------------------------------------------------
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=122681
>>>
>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=125421
