[GE users] NFS errors

Schmitz Dale M Contr 20 IS/INPTG Dale.Schmitz at offutt.af.mil
Fri Apr 15 14:22:42 BST 2005


I'm getting the first type you listed (the NFS write errors in /var/adm/messages).  The file handles point to files under default/spool/<hostname>/active_jobs/<pid>/*
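
Since the question is about ownership and permissions on those files, a quick listing of what is actually there can help. The following is only a minimal sketch: the helper names (`describe_tree`, `_name`) are hypothetical and not part of Grid Engine, and it assumes a Unix host where the `pwd` and `grp` modules are available.

```python
import os
import stat

try:
    import grp
    import pwd
except ImportError:  # non-Unix platform: fall back to numeric ids
    grp = pwd = None


def _name(lookup, numeric_id):
    """Resolve a numeric id to a name, falling back to the number itself."""
    if lookup is None:
        return str(numeric_id)
    try:
        return lookup(numeric_id)[0]
    except KeyError:
        return str(numeric_id)


def describe_tree(root):
    """Yield (path, owner, group, mode) for every entry below root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(dirnames + filenames):
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: report symlinks themselves
            owner = _name(pwd.getpwuid if pwd else None, st.st_uid)
            group = _name(grp.getgrgid if grp else None, st.st_gid)
            yield path, owner, group, stat.filemode(st.st_mode)
```

Run on the execution host, one would pass the active_jobs directory of that host's spool area (for example the /var/sge/default/spool/<hostname>/active_jobs path quoted below) to `describe_tree` and print each tuple. It only reports what is on disk; it does not say which setting is correct.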

-----Original Message-----
From: Timo Viitanen [mailto:Timo.Viitanen at csc.fi] 
Sent: Friday, April 15, 2005 4:15 AM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] NFS errors

Schmitz Dale M Contr 20 IS/INPTG wrote:
> My grid is operating fine with the exception of the intolerable number 
> of NFS errors generated by files in 
> /var/sge/default/spool/<hostname>/active_jobs/<pid>/*
> 
> Looking at the permissions on those files, I see that they belong to the 
> user who started the grid job, and the group is the user's group, which 
> is also sgeadmin's group.  I recently changed sgeadmin's group to 
> match the user's group when I noticed that these files took the user's 
> ownership but sgeadmin's group.
> 
> The errors persist, however.  The volume of NFS errors is clogging up 
> my system logs.  Does anyone have a suggestion on how permissions and 
> ownerships should be set to eliminate these errors?
> 
> Thanks,
> Dale
> 

Do you mean errors in /var/adm/messages, something like this:

Apr 12 01:40:00 corona1 nfs: [ID 897781 kern.notice] NFS write error on host corona-sge-g: Permission denied.
Apr 12 01:40:00 corona1 nfs: [ID 702911 kern.notice] (file handle: 3799a28 3e7 a0000 5a1 71f57435 a0000 2 7744)
Apr 12 01:40:00 corona1 nfs: [ID 897781 kern.notice] NFS write error on host corona-sge-g: Permission denied.
Apr 12 01:40:00 corona1 nfs: [ID 702911 kern.notice] (file handle: 3799a28 3e7 a0000 5a1 71f57435 a0000 2 7744)
Apr 12 01:40:00 corona1 nfs: [ID 897781 kern.notice] NFS write error on host corona-sge-g: Permission denied.
Apr 12 01:40:00 corona1 nfs: [ID 702911 kern.notice] (file handle: 3799a28 3e7 a0000 5a1 71f57435 a0000 2 7744)

Or errors in default/spool/<hostname>/messages, something like this:

04/14/2005 12:36:14|execd|corona1|W|reaping job "6269" ptf complains: Job does not exist
04/14/2005 12:36:14|execd|corona1|E|can't open usage file "active_jobs/6269.1/usage" for job 6269.1: No such file or directory
04/14/2005 12:36:14|execd|corona1|E|"can't read usage file for job 6269.1
"
04/14/2005 12:38:00|execd|corona1|W|reaping job "6270" ptf complains: Job does not exist
04/14/2005 12:38:00|execd|corona1|E|can't open usage file "active_jobs/6270.1/usage" for job 6270.1: No such file or directory
04/14/2005 12:38:00|execd|corona1|E|"can't read usage file for job 6270.1

I'm running N1GE6U3, and Sun support doesn't have a clue where they are 
coming from... (The service ticket has been open for the last 4-5 months.)
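
To see how many of the first kind of error there are and whether they all involve the same file handle, the messages log can be tallied. This is a rough sketch, not a supported tool: `tally_nfs_errors` is a hypothetical helper, and it assumes that in the real log each "NFS ... error" line and the "(file handle: ...)" line that follows it are each a single unwrapped line, as in the excerpt above.

```python
import re
from collections import Counter

# Patterns modelled on the syslog excerpt quoted above (an assumption about
# the general format, based only on those sample lines).
ERROR_RE = re.compile(r"NFS (\w+) error on host (\S+): (.+?)\.?$")
HANDLE_RE = re.compile(r"\(file handle: ([0-9a-fA-F ]+)\)")


def tally_nfs_errors(lines):
    """Count errors keyed by (host, operation, reason, file handle)."""
    counts = Counter()
    pending = None  # last error line still waiting for its file handle
    for raw in lines:
        line = raw.rstrip()
        err = ERROR_RE.search(line)
        if err:
            pending = err.groups()  # (operation, host, reason)
            continue
        handle = HANDLE_RE.search(line)
        if handle and pending:
            op, host, reason = pending
            counts[(host, op, reason, handle.group(1))] += 1
            pending = None
    return counts
```

Feeding it the open /var/adm/messages file and printing `counts.most_common()` would show which handles dominate. It does not translate a file handle back to a path; that still has to be done separately on the NFS server.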

/Timo
-- 
Timo Viitanen, Systems Specialist, Computing Servers
CSC, Keilaranta 14, P.O. Box 405, 02101 Espoo
tel. (09) 457 2284, fax (09) 457 2302
CSC is the IT center for science
http://www.csc.fi/
Timo.Viitanen at csc.fi


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net

