[GE users] qmaster dies

Kirk Patton kpatton at transmeta.com
Thu Jun 3 00:14:27 BST 2004


Disregard...

My NFS server was out of inodes.
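
For anyone hitting the same error: `df -h` reports block usage only, while "No space left on device" is also returned when a filesystem has no free inodes. Below is a minimal sketch for spotting that case. It assumes GNU `df` (whose `-i` option lists inode usage), and `inode_hogs` is a hypothetical helper name, not part of Grid Engine. It reads `df -Pi`-style output on stdin and prints mount points at or above a threshold.

```shell
# Hypothetical helper: print mount points whose inode usage is at or
# above a threshold (default 90%). Expects `df -Pi`-style input on stdin:
#   Filesystem Inodes IUsed IFree IUse% Mounted on
inode_hogs() {
    threshold="${1:-90}"
    awk -v t="$threshold" '
        # Skip the header, and skip rows where IUse% is "-" (no inode data).
        NR > 1 && $5 ~ /^[0-9]+%$/ {
            pct = $5
            sub(/%/, "", pct)
            if (pct + 0 >= t) print $6, $5
        }'
}

# Usage on a live host (GNU df assumed):
#   df -Pi | inode_hogs 90
```

On the master host, `df -Pi /transmeta/sge | inode_hogs 90` would be one way to check the filesystem holding the spool files, provided the NFS server reports inode counts to its clients.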

Kirk
On Wed, Jun 02, 2004 at 03:40:52PM -0700, Kirk Patton wrote:
> Hello,
> 
> I am having a strange issue.  The qmaster daemon on my master host died.
> The log file says:
> Wed Jun  2 15:23:28 2004|qmaster|lsf-k8|E|cant open file users/.teesea3d: No space left on device
> 
> But I cannot find any full partitions on that host:
> [root@lsf-k8 users]# df -h
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/hda2             6.0G  4.1G  1.6G  73% /
> /dev/hda1             122M   25M   91M  22% /boot
> /dev/hda5             176G   34M  167G   1% /export/home
> none                  2.0G     0  2.0G   0% /dev/shm
> none                  2.0G   72K  2.0G   1% /tmp
> usrlocal-fs:/fs/usrlocal/i386-linux-libc6
>                        22G   17G  5.4G  76% /transmeta/i386-linux-libc6
> mis-fs:/fs/mis/project-mis
>                        17G   16G  874M  95% /home/mis
> lsf-fs:/fs/lsf/transmeta-lsf4.0.1
>                        11G  2.8G  7.6G  27% /transmeta/lsf4.0.1
> cerise:/vol/vol1/sge/sge_5.3p5
>                        10G  141M  9.9G   2% /transmeta/sge
> eng3-fs:/fs/eng3/home/kpatton
>                       376G  287G   86G  78% /home/kpatton
> cad-fs:/fs/cad/transmeta-cad
>                       108G  105G  2.9G  98% /transmeta/cad
> 
> When I start the daemons, I get:
> Reading in users:
>         User "chris".
>         User "gsmith".
>         User "jamesd".
>         User "kpatton".
>         User "teesec60d".
>         User "teeser10".
>         User "teeser9".
>         User "teeser8".
>         User "teeser11".
>         User "teesea3d".
>         User "teeseastro".
>         User "teesef4ad".
>         User "teeseaat".
>         User "teesef4in".
> removing reference to no longer existing job 182722 of user "teesea3d"
> error: cant open file users/.teesea3d: No space left on device
>    starting sge_schedd
> error: getting configuration: unable to contact qmaster via "" commd - qmaster not enrolled at commd
> error: can't get configuration from qmaster -- backgrounding
>    starting sge_shadowd
> 
> The files it is complaining about reside on an NFS-shared directory that has plenty of space:
> cerise:/vol/vol1/sge/sge_5.3p5
>                        10G  141M  9.9G   2% /transmeta/sge
> 
> Anyone have an idea how I can track down what is wrong?
> 
> Kirk
> -- 
> Kirk Patton
> Unix Administrator
> Transmeta Inc.
> Tel. 408 919-3055
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 

-- 
Kirk Patton
Unix Administrator
Transmeta Inc.
Tel. 408 919-3055