[GE users] exit status = 10 pe_start = 134

reuti reuti at staff.uni-marburg.de
Fri May 7 17:06:34 BST 2010


On 07.05.2010 at 17:00, henk wrote:

> I installed gridengine 6.2u5 and almost all nodes work fine except a
> few where a job generates the following error message:
> 
> failed in pestart:05/07/2010 15:14:29 [43532:15077]: exit_status of
> pe_start = 134
> 
> (It's also in the qmaster message file)
> 
> and the node message file has this entry
> 
> 05/07/2010 15:14:30|  main|cn031|E|shepherd of job 556.1 exited with
> exit status = 10

This just says that the PE start procedure failed.


> indicating the problem.
> 
> I use openmpi-1.4.1. The job is put in the queue again and the queue is
> in the error state. Clearing the error repeats the problem.
> 
> Does anyone know what the code 134 means?

Codes greater than 128 are the sum of 128 plus the number of the received signal. 134 - 128 = 6, which means SIGABRT in your case. Now the question is where this signal comes from.
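
To make the arithmetic concrete, here is a minimal Python sketch of that decoding. The function names are made up for illustration, and the signal names come from whatever platform the script runs on (on Linux, signal 6 is SIGABRT).

```python
import signal


def signal_from_exit_status(status):
    """Return the signal number encoded in an exit status, or None.

    Assumes the "128 + signal number" convention described above.
    """
    if status > 128:
        return status - 128
    return None


def describe(status):
    """Build a one-line, human-readable description of an exit status."""
    signum = signal_from_exit_status(status)
    if signum is None:
        return f"exit status {status}: ordinary exit code, no signal"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # The number does not map to a signal known on this platform.
        name = "unknown signal"
    return f"exit status {status}: 128 + {signum} ({name})"


if __name__ == "__main__":
    print(describe(134))  # the pe_start exit status from the error message
    print(describe(10))   # the shepherd exit status, below 128
```

Note that the shepherd's exit status 10 is below 128, so it is a plain exit code and not a signal.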



When you use Open MPI, there is no need for any start procedure in the PE. Did you define one anyway because you use the same PE for other types of jobs as well?
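
You can look at the PE definition with `qconf -sp <pe_name>` and check the `start_proc_args` line. As a rough sketch, the following Python snippet does that check on the text of such a listing. The PE name, the script paths and the sample listing are invented for illustration, not taken from your cluster, and treating `NONE` and `/bin/true` as "no start procedure" is my assumption.

```python
def pe_setting(pe_config_text, key):
    """Return the value of one attribute from a PE listing, or None."""
    for line in pe_config_text.splitlines():
        parts = line.split(None, 1)  # attribute name, then the rest of the line
        if len(parts) == 2 and parts[0] == key:
            return parts[1].strip()
    return None


def has_start_procedure(pe_config_text):
    """True if start_proc_args names something other than a no-op.

    Assumption: NONE and /bin/true both mean "nothing is run".
    """
    value = pe_setting(pe_config_text, "start_proc_args")
    return value is not None and value.upper() != "NONE" and value != "/bin/true"


# Hypothetical listing, shaped like the output of `qconf -sp`.
SAMPLE = """\
pe_name            orte
slots              999
start_proc_args    /usr/sge/mpi/startmpi.sh $pe_hostfile
stop_proc_args     /usr/sge/mpi/stopmpi.sh
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
"""

if __name__ == "__main__":
    print(has_start_procedure(SAMPLE))
```

If this reports a start procedure for the PE your Open MPI jobs use, that script is the first place to look for the source of the SIGABRT.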

-- Reuti


> 
> Thanks
> 
> Henk
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=256539
> 
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=256545
