[GE users] SGE noobie

pollinger harald.pollinger at sun.com
Tue Jun 2 16:21:42 BST 2009


biostat wrote:
> Hmmm. When I echoed $TMPDIR, I got this:
> /var/folders/WF/WFbNEBgSHa8jWCIrXBrvnE+++yU/-Tmp-/

Looks a bit odd, but it should point to a local directory. You could set 
it simply to "/tmp" to be sure it's a valid and working local directory.
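
A minimal sketch of that check in Bourne shell syntax, assuming nothing beyond standard shell tests: it keeps $TMPDIR if it already names an existing, writable directory and falls back to "/tmp" otherwise. In csh the equivalent assignment would be "setenv TMPDIR /tmp".

```shell
# Keep TMPDIR only if it is set and names an existing, writable directory;
# otherwise fall back to /tmp as suggested above.
if [ -z "${TMPDIR:-}" ] || [ ! -d "$TMPDIR" ] || [ ! -w "$TMPDIR" ]; then
    TMPDIR=/tmp
fi
export TMPDIR
echo "TMPDIR=$TMPDIR"

# Afterwards, retry the command that failed, e.g.:
#   qconf -mhgrp @allhosts
```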


> Then, after the echo I tried qconf -mhgrp @allhosts, and it worked
> fine. No more error. Is it possible the echo somehow initialized the
> variable?

Hmm.. no. At least it should not.


> As for the other commands, apparently my $ARCH variable is not
> defined.

Did you source $SGE_ROOT/$SGE_CELL/common/settings.sh (or .csh)
before running any SGE command, even just for the "-help" output?

If the installation succeeded, the file should be there. If not, please 
re-run the QMaster installation.
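
As a rough sketch in Bourne shell syntax, the check could look like the following. The defaults "/SGE6" and "default" are taken from the path in your transcript and are only used if the variables are not already set. The last step is an assumption on my side: it relies on the installation shipping the helper script $SGE_ROOT/util/arch, which prints the architecture string, because the settings file itself may not leave $ARCH defined in your shell.

```shell
# Defaults are assumptions taken from the quoted transcript.
SGE_ROOT=${SGE_ROOT:-/SGE6}
SGE_CELL=${SGE_CELL:-default}
settings="$SGE_ROOT/$SGE_CELL/common/settings.sh"

if [ -r "$settings" ]; then
    # Load the SGE environment into the current shell.
    . "$settings"
    echo "sourced $settings"
else
    echo "missing $settings; re-run the qmaster installation" >&2
fi

# Assumption: util/arch exists and prints the architecture string.
if [ -x "$SGE_ROOT/util/arch" ]; then
    ARCH=$("$SGE_ROOT/util/arch")
    export ARCH
    echo "ARCH=$ARCH"
fi
```

With $ARCH set this way, a path such as $SGE_ROOT/bin/$ARCH/sge_qmaster can be expanded by the shell.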


> qmaster: sge% $SGE_ROOT/bin/$ARCH/sge_qmaster -help | head -1
> ARCH: Undefined variable.
> qmaster: sge% echo $ARCH
> ARCH: Undefined variable.
> qmaster: sge% source /SGE6/default/common/settings.csh
> qmaster: sge% echo $ARCH
> ARCH: Undefined variable.
> 
> To make things even stranger, exec3, the computer that was originally
> throwing the null object error, just started working again -- exec4
> is still throwing that error though. I've been messing around with
> things a lot recently, trying to get everything working, but I haven't
> touched exec3 in days. I don't understand why it would start working
> again all of a sudden.
> 
> Essentially, here is the somewhat convoluted order of events so far:
>  * Exec3 started throwing a "NULL object pointer passed to function
>    "spool_flatfile_read_object"" error.
>  * Exec4, in the middle of a run, started throwing the same error.
>  * My qmaster threw an "error getting temporary file name: File name
>    too long" error when trying to run 'qconf -mhgrp @allhosts'.
>  * Exec3 spontaneously started working again (by which I mean the
>    qmaster stopped seeing its state as "currently unavailable", and I
>    was able to send it jobs directly via 'qsub -l hostname=exec3')
>    even though I haven't touched that computer in days.
>  * I echoed $TMPDIR and the temp file name error on the qmaster
>    disappeared.
>  * Exec4 is still listed as "currently unavailable", and the $ARCH
>    variable is undefined.

This looks like something in the environment changed, be it NFS, a
hardware issue, or something else.

Regards,
Harald

> 
> Thanks again for the help
> 
> ------------------------------------------------------ 
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=200389
> 
> 
> To unsubscribe from this discussion, e-mail:
> [users-unsubscribe at gridengine.sunsource.net].


-- 
Sun Microsystems GmbH         Harald Pollinger
Dr.-Leo-Ritter-Str. 7         Sun Grid Engine Engineering
D-93049 Regensburg            Phone: +49 (0)941 3075-209  (x60209)
Germany                       Fax: +49 (0)941 3075-222  (x60222)
http://www.sun.com/gridware
mailto:harald.pollinger at sun.com
Sitz der Gesellschaft:
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel
Vorsitzender des Aufsichtsrates: Martin Haering

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=200392
