Custom Query (431 matches)

Results (73 - 75 of 431)

Ticket Resolution Summary Owner Reporter
#1488 fixed SoGE 8.1.6 GUI Crash Dave Love <d.love@…> tobeychris@…
Description

Hi,

I ran into this error on a new CentOS 6.4 x64 installation with SoGE 8.1.6.

    # ./start_gui_installer
    Starting Installer ...
    java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at com.izforge.izpack.installer.InstallerFrame.loadPanels(Unknown Source)
        at com.izforge.izpack.installer.InstallerFrame.<init>(Unknown Source)
        at com.izforge.izpack.installer.GUIInstaller.loadGUI(Unknown Source)
        at com.izforge.izpack.installer.GUIInstaller.access$100(Unknown Source)
        at com.izforge.izpack.installer.GUIInstaller$2.run(Unknown Source)
        at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:251)
        at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:733)
        at java.awt.EventQueue.access$200(EventQueue.java:103)
        at java.awt.EventQueue$3.run(EventQueue.java:694)
        at java.awt.EventQueue$3.run(EventQueue.java:692)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
        at java.awt.EventQueue.dispatchEvent(EventQueue.java:703)
        at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:242)
        at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:161)
        at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:146)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:138)
        at java.awt.EventDispatchThread.run(EventDispatchThread.java:91)
    Caused by: java.lang.NullPointerException
        at java.text.MessageFormat.applyPattern(MessageFormat.java:436)
        at java.text.MessageFormat.<init>(MessageFormat.java:363)
        at java.text.MessageFormat.format(MessageFormat.java:835)
        at com.sun.grid.installer.gui.HostTable.createPopupMenu(HostTable.java:220)
        at com.sun.grid.installer.gui.HostTable.initPopupMenu(HostTable.java:99)
        at com.sun.grid.installer.gui.HostTable.<init>(HostTable.java:66)
        at com.sun.grid.installer.gui.HostPanel.init(HostPanel.java:276)
        at com.sun.grid.installer.gui.HostPanel.<init>(HostPanel.java:151)
        ... 23 more

Any thoughts on the cause / what I can do to fix this?

If there is something I can run to produce more useful logs let me know.

-Chris
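
For readers following the trace: the "Caused by" frames show a NullPointerException raised inside MessageFormat.applyPattern, which is what happens when MessageFormat.format is handed a null pattern string. Why the pattern is null in HostTable.createPopupMenu is not stated in the ticket. The sketch below only reproduces the failure mode in isolation and shows one possible guard; the class name, the safeFormat helper, and the placeholder text are hypothetical and are not taken from the installer source.

```java
import java.text.MessageFormat;

public class NullPatternDemo {
    // Hypothetical guard: return a placeholder instead of passing a
    // null pattern on to MessageFormat.
    static String safeFormat(String pattern, Object... args) {
        if (pattern == null) {
            return "<missing pattern>";
        }
        return MessageFormat.format(pattern, args);
    }

    // Reports whether MessageFormat.format rejects a null pattern
    // with a NullPointerException, as the trace above suggests.
    static boolean nullPatternThrows() {
        String pattern = null;
        try {
            MessageFormat.format(pattern, "host1");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nullPatternThrows());
        System.out.println(safeFormat("Resolve {0}", "host1"));
        System.out.println(safeFormat(null, "host1"));
    }
}
```

A guard like this would only hide the symptom; whatever string lookup is supposed to supply the pattern would still need to be checked.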

#1484 fixed CSP initialisation broken Dave Love <d.love@…> markdixon
Description

Hi,

I think this commit has broken "sge_ca -init", called during installation to initialise an x509 CA when gridengine runs in CSP mode:

    commit 3b435c132b22bc9499db7106074027a65aef6ecc
    Author: Dave Love <d.love@…>
    Date:   Wed Jun 19 18:37:59 2013 +0000

        Remove sge_ssl.cnf

It seems to be down to the fact that sge_ssl.cnf and sge_ssl_template.cnf differ in this respect: one has "prompt=yes" set, the other "prompt=no".

The "prompt" directive alters how some other options are interpreted (see http://www.openssl.org/docs/apps/req.html for details), causing the problem.

Mark
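
As background on why the "prompt" setting matters, here is a rough sketch of how the distinguished_name section of an OpenSSL req configuration is read under each setting, following the req documentation linked above. The section name req_dn and all field values are made up for illustration; they are not the contents of sge_ssl.cnf or sge_ssl_template.cnf.

```ini
# With prompt = yes (the default), each entry is a prompt string, and
# defaults are supplied through the *_default variants.
[ req ]
distinguished_name = req_dn
prompt             = yes

[ req_dn ]
countryName         = Country Name (2 letter code)
countryName_default = GB
commonName          = Common Name

# With prompt = no, the same section must hold the field values
# themselves, and nothing is asked interactively.
#
# [ req ]
# distinguished_name = req_dn
# prompt             = no
#
# [ req_dn ]
# countryName = GB
# commonName  = example-host
```

A file written for one mode is therefore misread if it is used with the other setting, which would be consistent with the breakage described here.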

#1483 fixed Prevent cgroup/cpuset code from killing shepherd at job en Mark Dixon <m.c.dixon@…> markdixon
Description

Prevent cgroup/cpuset code from killing shepherd at job end

When the execd_params option USE_CGROUPS is enabled, the cgroup/cpuset cleanup code checks for and kills processes related to the job. This includes the shepherd, which triggers the job cleanup signal handler. However, because the execd also kills the shepherd elsewhere, the job cleanup code can end up being run twice.

This has been seen to be a problem when the node running the job master qrsh's back into itself. In that case, the most obvious symptoms are:

  • Messages of the following form in the execd logs:

    10/14/2013 12:15:23| main|comp1|W|rogue process(es) found for task 1353.1
    10/14/2013 12:15:23| main|comp1|E|shepherd of job 1353.1 died through signal = 9
    10/14/2013 12:15:23| main|comp1|E|abnormal termination of shepherd for job 1353.1: "exit_status" file is empty
    10/14/2013 12:15:23| main|comp1|E|can't open usage file "active_jobs/1353.1/usage" for job 1353.1: No such file or directory
    10/14/2013 12:15:23| main|comp1|E|shepherd exited with exit status 19: before writing exit_status

  • A job failure email sent to adminmail
  • The job start_time / end_time entries in the accounting file are 0 (interpreted as -/- in qacct)

Suggested patch to skip the shepherd is attached.

All the best,

Mark

--
Mark Dixon                      Email    : m.c.dixon@…
HPC/Grid Systems Support        Tel (int): 35429
Information Systems Services    Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK


0001-Prevent-cgroup-cpuset-code-from-killing-shepherd-at-.patch
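
The idea described in this ticket, leaving the shepherd out of the set of processes the cpuset cleanup kills, can be sketched roughly as follows. This is not the attached patch and not gridengine code (which is written in C); the class name, the pidsToKill helper, and the PID values are hypothetical, and the sketch assumes the shepherd's PID is already known to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CpusetCleanupSketch {
    // Hypothetical helper: pick the PIDs in a job's cpuset that should be
    // signalled, skipping the shepherd so that the execd's own shutdown
    // path remains the only thing that terminates it.
    static List<Long> pidsToKill(List<Long> cpusetPids, long shepherdPid) {
        List<Long> targets = new ArrayList<>();
        for (long pid : cpusetPids) {
            if (pid != shepherdPid) {
                targets.add(pid);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<Long> pids = Arrays.asList(4001L, 4002L, 4003L);
        // 4001 plays the role of the shepherd here.
        System.out.println(pidsToKill(pids, 4001L)); // prints [4002, 4003]
    }
}
```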
