[GE users] Experiences with pvfs2

Craig Tierney ctierney at hypermall.net
Thu Oct 26 13:39:17 BST 2006



gthomas at ForteDS.com wrote:
> Our jobs are basically test cases in a large regression suite.  They
> involve a lot of compilation with shared sources so there is a lot of
> network traffic.  Too many files are involved to copy sources to the
> execution hosts, so they have to be accessed over the network.
> 

You do not want PVFS2 for this.  PVFS2 was designed for large,
sequential IO.  Its meta-data performance is very poor.  Even one
client compiling on a node is going to be very slow, and it will
overwhelm the servers and slow down any other processes (even if they
are doing large, sequential IO).
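
A rough way to see whether a source tree falls into this category is to
compare the number of files with their mean size: many small files means
the work is dominated by meta-data operations (lookups, stats, opens)
rather than by bulk reads.  The sketch below is only an illustration;
the name tree_profile is made up for this example and there is no
particular threshold implied.

```python
import os


def tree_profile(root):
    """Walk a directory tree and summarise file count and sizes."""
    files = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)  # one meta-data operation per file
            except OSError:
                continue  # skip broken symlinks and unreadable entries
            files += 1
            total_bytes += st.st_size
    mean_size = total_bytes / files if files else 0.0
    return {"files": files, "bytes": total_bytes, "mean_size": mean_size}


if __name__ == "__main__":
    import sys

    print(tree_profile(sys.argv[1] if len(sys.argv) > 1 else "."))
```

A tree with tens of thousands of files of a few kilobytes each is the
kind of workload described above.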

There are options out there, but they are not free.  Products
like Ibrix, Panasas, Terascale, and Isilon are distributed filesystems.
You could configure them to spread your files across all servers
and lessen the load on any one server.  They may also allow for file
replication so that different clients could access different copies,
allowing faster compiles.

You could also ask the Beowulf.org list for other suggestions.

What if you tarred up your source, set up a process to use multicast
to spread the source to all nodes, then compiled?  This would reduce the
network traffic if most of the client nodes need access to the same
files.
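
A minimal sketch of that idea in Python, to make the moving parts
concrete.  Everything here is an assumption for illustration: the
function names, the 8-byte header (sequence number, total count), the
chunk size, and the group address 239.1.1.1:5007 are all made up.  Plain
UDP multicast is unreliable, so a real setup would need retransmission
of lost chunks or an existing tool that handles it; this sketch only
detects that chunks are missing.

```python
import io
import socket
import struct
import tarfile

HEADER = struct.Struct("!II")  # (sequence number, total chunk count)
CHUNK = 1024                   # payload bytes per datagram (assumed)


def pack_tree(src_dir):
    """Tar and gzip a source tree into memory under the name 'src'."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        tar.add(src_dir, arcname="src")
    return buf.getvalue()


def to_datagrams(blob, chunk=CHUNK):
    """Split the archive into datagrams, each tagged with its position."""
    total = max(1, -(-len(blob) // chunk))  # ceiling division
    return [
        HEADER.pack(i, total) + blob[i * chunk:(i + 1) * chunk]
        for i in range(total)
    ]


def from_datagrams(datagrams):
    """Reassemble in any arrival order; return None if chunks are missing."""
    parts = {}
    total = None
    for d in datagrams:
        seq, total = HEADER.unpack_from(d)
        parts[seq] = d[HEADER.size:]
    if total is None or len(parts) != total:
        return None
    return b"".join(parts[i] for i in range(total))


def unpack_tree(blob, dest_dir):
    """Extract the archive on the receiving node's local disk."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        tar.extractall(dest_dir)


def send_multicast(datagrams, group="239.1.1.1", port=5007, ttl=1):
    """Send every datagram once to the (assumed) multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    try:
        for d in datagrams:
            sock.sendto(d, (group, port))
    finally:
        sock.close()
```

The head node would call pack_tree, to_datagrams and send_multicast
once; each compute node would collect datagrams from the group, call
from_datagrams, and unpack to local disk before compiling.  The source
then crosses the network once instead of once per node.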

Craig


> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de] 
> Sent: Wednesday, October 25, 2006 2:49 AM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Experiences with pvfs2
> 
> Hi,
> 
> On 25.10.2006 at 00:59, <gthomas at ForteDS.com> wrote:
> 
>> I hope someone out there has experience with this or can help me with 
>> other recommendations.
>>
>> We have a cluster of 30+ SGE servers that all run jobs using a file 
>> system mounted on a single linux file server.  We are seeing some 
>> performance issues that I believe are due to the limited capability of
>> the file server.  We have some other machines available that we are 
>> considering using with pvfs2 to create a distributed file server.  
>> Does anyone have experience with this or have other recommendations 
>> for how to increase our file server capacity.  We may want to increase
>> the number of SGE nodes to around 100, but are reluctant to make the 
>> investment if we have a performance bottleneck with file serving.
> 
> Are your jobs writing a huge amount of data into the home directory of
> the users? We copy the (input) files first to the nodes, where the
> scratch files will also be located, and copy the results back to the
> home directory later. So there isn't much network traffic. An exception
> is, e.g., the parallel version of Turbomole, which needs a common
> directory for a parallel run. In such cases, pvfs2 might significantly
> increase the performance by using a special location and file server(s)
> for this "global common tmp".
> 
> -- Reuti
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 





