[GE users] exit status = 10 pe_start = 134

reuti reuti at staff.uni-marburg.de
Fri May 7 17:17:32 BST 2010


On 07.05.2010 at 18:12, henk wrote:

> Hi Reuti
> 
>> When you use Open MPI, there is no need for any start procedure of the
>> PE. Did you define one anyway as you use the same PE also for other
>> types of jobs?
> 
> No, there is no start procedure. The queue is basically the default
> all.q with adjustment for the pe and the number of slots:
> 
> shell_start_mode      posix_compliant

Often unix_behavior is more appropriate, as it honors the first line (the #! line) of the script.


> starter_method        NONE
> suspend_method        NONE
> resume_method         NONE
> terminate_method      NONE

No, I mean the start procedure defined in the PE, not these queue settings:

$ qconf -sp orte

(or whatever you call it)
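
To look only at the start/stop procedures in that output, a small filter along these lines may help. This is a minimal sketch: `pe_procs` is a hypothetical helper name, and "orte" is only the assumed PE name from above. The field names `start_proc_args` and `stop_proc_args` are the ones used in the PE configuration.

```shell
# Hypothetical helper: print only the start/stop procedure lines
# from a PE configuration read on stdin.
pe_procs() {
    grep -E '^(start|stop)_proc_args'
}

# On the cluster (assuming the PE is called "orte"):
#   qconf -sp orte | pe_procs
```

If a start procedure other than a no-op shows up there, that script is the first candidate for the source of the failure.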

-- Reuti


> I also tried the debug options with the dl utility but I don't think
> this gives more information for this kind of problem?
> 
> Thanks
> 
>> -----Original Message-----
>> From: reuti [mailto:reuti at staff.uni-marburg.de]
>> Sent: 07 May 2010 17:07
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] exit status = 10 pe_start = 134
>> 
>> On 07.05.2010 at 17:00, henk wrote:
>> 
>>> I installed gridengine 6.2u5 and almost all nodes work fine except a
>>> few where a job generates the following error message:
>>> 
>>> failed in pestart:05/07/2010 15:14:29 [43532:15077]: exit_status of
>>> pe_start = 134
>>> 
>>> (It's also in the qmaster message file)
>>> 
>>> and the node message file has this entry
>>> 
>>> 05/07/2010 15:14:30|  main|cn031|E|shepherd of job 556.1 exited with
>>> exit status = 10
>> 
>> This just says that the PE start procedure failed.
>> 
>> 
>>> indicating the problem.
>>> 
>>> I use openmpi-1.4.1. The job is put in the queue again and the queue is
>>> in the error state. Clearing the error repeats the problem.
>>> 
>>> Does anyone know what the code 134 means?
>> 
>> Codes greater than 128 are the sum of 128 plus the number of the
>> received signal. 134 - 128 = 6, which means SIGABRT in your case. Now
>> the question is: where does this signal come from?
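
The "128 plus signal number" rule can be checked directly in a shell. The following is a minimal sketch; `decode_status` is a hypothetical helper name, and the signal numbering is the usual one on Linux.

```shell
# Hypothetical helper: decode a shell-style exit status.
# Values above 128 mean "terminated by signal (status - 128)".
decode_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l <number> prints the name of that signal
        echo "signal $sig ($(kill -l "$sig"))"
    else
        echo "exit code $status"
    fi
}

decode_status 134   # on Linux: signal 6 (ABRT)
```

The same function reports the shepherd's status of 10 as a plain exit code, since it is not above 128.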
>> 
>> 
>> 
>> When you use Open MPI, there is no need for any start procedure of the
>> PE. Did you define one anyway as you use the same PE also for other
>> types of jobs?
>> 
>> -- Reuti
>> 
>> 
>>> 
>>> Thanks
>>> 
>>> Henk
>>> 

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=256547

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


