[GE users] couldn't create pipe: too many open files

reuti reuti at staff.uni-marburg.de
Wed Nov 18 00:12:43 GMT 2009

On 17.11.2009 at 23:32, richmaes wrote:

> Yes, you're correct.  We just got it worked out.  We are using tcsh, so
> the command we use is limit.  We raised our hard and soft
> descriptor limits to 4096 and this resolved the issue.
> Here is the part I am not absolutely clear about.  I ran lsof to
> show our open file handles, which I think is system-wide.  The
> number before starting our SGE job was about 4000.  After kicking  
> the job off, the number grew to about 14000 before it started  
> dumping jobs because of too many open files.
> So that is an increase of 10000 open files.  Our
> original setting in our /etc/security/limits.conf file was 1024.
> I don't see how 1024 relates to the 10000 open files.  It seems
> like it should have died sooner.  Nor does 4096 seem like enough to  
> solve the issue.

Are you on Linux or another platform? IIRC, the limits in Linux are
per process, not per job. So when your software forks 10 times, each
process gets its own limit, and the complete job can use up to 10
times the limit. This is why SGE uses the additional_group_id to
compute the complete h_cpu, h_vmem, ... consumption of all forks of a
job.
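
To make the per-process point concrete, here is a minimal sketch in Python (not something from this thread, just an illustration). It forks 10 children, the same illustrative number as above, and has each one report the open-file limit it inherited. The helper names nofile_limits and child_limits are made up for this example, and it assumes a Unix-like system where os.fork is available.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the calling process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def child_limits(n_children):
    """Fork n_children processes and collect the soft limit each one sees."""
    results = []
    for _ in range(n_children):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: report the limit inherited from the parent, then exit.
            os.close(read_fd)
            child_soft, _child_hard = nofile_limits()
            os.write(write_fd, str(child_soft).encode())
            os._exit(0)
        # Parent: read the child's report and reap the child.
        os.close(write_fd)
        with os.fdopen(read_fd) as reader:
            results.append(int(reader.read()))
        os.waitpid(pid, 0)
    return results


soft, hard = nofile_limits()
per_child = child_limits(10)  # 10 is illustrative, not the real job size
print("soft limit of this process:", soft)
print("descriptors 10 processes could hold in total:", sum(per_child))
```

Every child reports the same limit as its parent, and each may use that many descriptors on its own. With a limit of 1024, ten processes could together hold up to 10240 descriptors, so a system-wide increase of about 10000 does not contradict a limit of 1024. Note also that lsof lists entries that are not file descriptors at all (cwd, txt, mem), so its line count is only a rough measure.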

-- Reuti

> For future reference, the edits we made were as follows in the  
> limits.conf file.  Notice it uses "nofile" as the limit name and  
> not "descriptors" as you might expect.
> [waxgridqm.ciena.com(ltconeng)]-> builds 115> cat /etc/security/limits.conf
> *                hard   nofile      4096
> *                soft   nofile      4096
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=227520
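
For anyone who wants to check such entries from a script, here is a small sketch in Python (an illustration only, not part of the original exchange). It reads text in the limits.conf layout of domain, type, item, value and looks up the nofile entries. The sample text repeats the two lines quoted above; the function names parse_limits and nofile_for are hypothetical, and the parser deliberately ignores anything that is not a plain four-field line.

```python
SAMPLE = """\
# <domain>       <type> <item>      <value>
*                hard   nofile      4096
*                soft   nofile      4096
"""


def parse_limits(text):
    """Return (domain, type, item, value) tuples from limits.conf-style text."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) != 4:
            continue  # skip lines this simple sketch does not understand
        entries.append(tuple(fields))
    return entries


def nofile_for(entries, limit_type, domain="*"):
    """Return the nofile value for the given type ("soft" or "hard"), or None."""
    for entry_domain, entry_type, item, value in entries:
        if (entry_domain, entry_type, item) == (domain, limit_type, "nofile"):
            # Values such as "unlimited" are returned as strings.
            return int(value) if value.isdigit() else value
    return None


entries = parse_limits(SAMPLE)
print("hard nofile:", nofile_for(entries, "hard"))
print("soft nofile:", nofile_for(entries, "soft"))
```

To read the real file instead of the sample, pass the contents of /etc/security/limits.conf to parse_limits. The value found there is what tcsh reports under the name "descriptors" in its limit command, and it still applies to each process separately.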


To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].