[GE users] "can't open usage file" Spool Error

John Coldrick jc at axyzfx.com
Mon Feb 28 19:37:04 GMT 2005

	Running SGE 6.0u1...

	I've recently upgraded (clean installs) the OS on some of our SGE exec Linux 
machines, going from Red Hat 7.3 to SUSE 9.1.  I've hit a sporadic problem 
with certain users submitting jobs from certain machines: the jobs go into an 
error state, and SGE_ROOT/default/spool/machine/messages shows errors such as:

02/28/2005 13:45:45|execd|frodo|E|shepherd of job 123512.1 exited with exit status = 26
02/28/2005 13:45:45|execd|frodo|E|can't open usage file "active_jobs/123512.1/usage" for job 123512.1: No such file or directory
02/28/2005 13:45:45|execd|frodo|E|"can't read usage file for job 123512.1
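
	For anyone wanting to pull these entries out of a larger messages file, 
here is a rough Python sketch.  It assumes each entry is one line of five 
pipe-separated fields (timestamp, daemon, host, level, text), which is what 
the lines above look like once unwrapped.  The function names are my own, 
not anything shipped with SGE.

```python
# Sketch only: assumes one entry per line in the form
#   timestamp|daemon|host|level|text
# as seen in the excerpt above. Function names are illustrative.

SAMPLE = [
    "02/28/2005 13:45:45|execd|frodo|E|shepherd of job 123512.1 exited with exit status = 26",
    '02/28/2005 13:45:45|execd|frodo|E|can\'t open usage file "active_jobs/123512.1/usage" '
    "for job 123512.1: No such file or directory",
    '02/28/2005 13:45:45|execd|frodo|E|"can\'t read usage file for job 123512.1',
]


def parse_message_line(line):
    """Split one messages line into a dict, or return None if it does not fit."""
    # maxsplit=4 keeps any '|' inside the message text intact.
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    timestamp, daemon, host, level, text = parts
    return {
        "timestamp": timestamp,
        "daemon": daemon,
        "host": host,
        "level": level,
        "text": text,
    }


def error_lines(lines):
    """Return the parsed entries whose level field is 'E'."""
    parsed = (parse_message_line(line) for line in lines)
    return [rec for rec in parsed if rec is not None and rec["level"] == "E"]


if __name__ == "__main__":
    for rec in error_lines(SAMPLE):
        print(rec["host"], rec["text"])
```

	To run it against a real file, pass the open file object to 
error_lines() in place of SAMPLE.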

	Having done some research I found this thread:


	which matches our symptoms exactly and would seem to point me in the right 
direction: user account IDs not being kept consistent across the systems.  
That is the case here, since we still have some older RH7.3 systems around 
(most notably the SGE qmaster), and SUSE assigns higher ID numbers to 
standard users.
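
	A quick way to see which accounts disagree is to compare the passwd 
files from two hosts.  The Python sketch below does that on text in 
/etc/passwd format.  The two sample files and the account names in them are 
made up for illustration, and the function names are hypothetical.

```python
# Sketch only: compares numeric UIDs for accounts that exist on both hosts.
# Input is text in /etc/passwd format (name:passwd:uid:gid:gecos:home:shell).
# The sample data below is invented for illustration.

OLD_HOST_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
jc:x:500:100:Example User:/home/jc:/bin/bash
render:x:501:100:Example Account:/home/render:/bin/bash
"""

NEW_HOST_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
jc:x:1000:100:Example User:/home/jc:/bin/bash
render:x:501:100:Example Account:/home/render:/bin/bash
"""


def parse_passwd(text):
    """Return a {username: uid} dict from passwd-format text."""
    uids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # Skip anything without a numeric third field (e.g. NIS '+' entries).
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        uids[fields[0]] = int(fields[2])
    return uids


def uid_mismatches(passwd_a, passwd_b):
    """Return {username: (uid_a, uid_b)} for accounts whose UIDs differ."""
    a = parse_passwd(passwd_a)
    b = parse_passwd(passwd_b)
    return {
        name: (a[name], b[name])
        for name in sorted(a.keys() & b.keys())
        if a[name] != b[name]
    }


if __name__ == "__main__":
    for name, (uid_a, uid_b) in uid_mismatches(OLD_HOST_PASSWD, NEW_HOST_PASSWD).items():
        print(f"{name}: {uid_a} on old host, {uid_b} on new host")
```

	Accounts that exist on only one host are ignored here; only accounts 
present on both with different numeric IDs are reported.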

	What's throwing me is that I'm running SGE as root everywhere, and NFS is 
mounted in such a way that root has complete read/write access across 
SGE_ROOT.  Also, the particular machines that have exhibited this problem are 
original machines that I haven't changed at all (the submission hosts, the 
execution hosts and the qmaster alike).

	Any thoughts?  Is having different user IDs for the same accounts across a 
grid just plain bad, and should I address this before going any further, 
even though machines I haven't altered are acting up?

	Also, is running without an sgeadmin account OK?

	Many thanks,


John Coldrick                  www.axyzfx.com        Axyz Animation
Houdini/Renderman/Discreet                           425 Adelaide St W
416-504-0425                                         Toronto, ON Canada
jc at axyzfx.com                                        M5V 1S4
A triangle which has an angle of 135 degrees is called an obscene
