[GE issues] [Issue 3194] sge_shepherd segfault on OpenSuSE 11.2 (x86_64)

megware stephan.ebelt at megware.com
Mon Jan 25 12:35:41 GMT 2010


http://gridengine.sunsource.net/issues/show_bug.cgi?id=3194

------- Additional comments from megware at sunsource.net Mon Jan 25 04:35:40 -0800 2010 -------
good idea. Here we go:

$ echo "sleep 10;" | qsub
Your job 15 ("STDIN") has been submitted

node02:/var/spool/sge/node02 # less /tmp/shepherd.strace
01/25/2010 13:29:22 [0:9087]: shepherd called with uid = 0, euid = 0
01/25/2010 13:29:22 [104:9087]: starting up 6.2u5
01/25/2010 13:29:22 [104:9087]: setpgid(9087, 9087) returned 0
01/25/2010 13:29:22 [104:9087]: do_core_binding: "binding" parameter not found in config file
01/25/2010 13:29:22 [104:9087]: no prolog script to start
01/25/2010 13:29:22 [104:9088]: child: starting son(job, /var/spool/sge/node02/job_scripts/15, 0);
01/25/2010 13:29:22 [104:9087]: parent: forked "job" with pid 9088
01/25/2010 13:29:22 [104:9087]: parent: job-pid: 9088
01/25/2010 13:29:22 [104:9087]: wait3 returned 9088 (status: 6; WIFSIGNALED: 1,  WIFEXITED: 0, WEXITSTATUS: 0)
01/25/2010 13:29:22 [104:9087]: job exited with exit status 0
01/25/2010 13:29:22 [104:9087]: reaped "job" with pid 9088
01/25/2010 13:29:22 [104:9087]: job exited due to signal
01/25/2010 13:29:22 [104:9087]: job signaled: 6
01/25/2010 13:29:22 [104:9087]: now sending signal KILL to pid -9088
01/25/2010 13:29:22 [104:9087]: writing usage file to "usage"
01/25/2010 13:29:22 [104:9087]: no tasker to notify
01/25/2010 13:29:22 [104:9087]: no epilog script to start
--

Hmm, this does not look like strace output. I will check whether I failed to capture stderr and will try again with -f. But maybe the above already gives some hint?
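For reading the wait3 line in the trace above: a raw status of 6 with WIFSIGNALED set means the job process was terminated by signal 6, which on Linux is SIGABRT rather than SIGSEGV (11). The following is only a minimal sketch of how such a status can be decoded, assuming a Linux host and Python's standard `os` and `signal` modules; the helper name `describe_wait_status` is made up for illustration and is not part of Grid Engine.

```python
import os
import signal


def describe_wait_status(status: int) -> str:
    """Decode a raw wait()/wait3() status word (hypothetical helper)."""
    if os.WIFSIGNALED(status):
        # The child was terminated by a signal; extract its number.
        sig = os.WTERMSIG(status)
        return f"killed by signal {sig} ({signal.Signals(sig).name})"
    if os.WIFEXITED(status):
        # The child exited normally; extract its exit code.
        return f"exited with code {os.WEXITSTATUS(status)}"
    return f"other status {status}"


# Status value taken from the wait3 line in the shepherd trace.
print(describe_wait_status(6))
```

If the job really dies from an abort(), that would usually point at a failed assertion or a libc consistency check inside the child rather than an invalid memory access, but the trace alone does not show which.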

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=36&dsMessageId=240872
