[GE users] Best cluster management software and OS

Brady Catherman bradyc at uidaho.edu
Mon Nov 21 09:22:44 GMT 2005


None of that advice was directed at anybody specifically =) I know of
far too many clusters that go south real fast because a non-cluster
person who needs a cluster runs off and listens to a single person
about how clusters are supposed to be done. Be it a vendor, a
college, or a friend, they are almost guaranteed to get something
wrong for what you need. So talk to as many people as you can. There
is always more information to get.

The guys in this shop had never used PDSH before I came on the scene.
They didn't know anything about it. They used ssh in for loops and it
worked fine. When I showed up and showed them a true parallel shell
it was like a totally new toy that was cool. It's not that the first
way didn't work, it's just that combining as much experience in a
topic as you can always leads to a better solution.
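
To make the difference concrete, here is a rough sketch in Python of the two approaches: ssh in a for loop versus fanning the same command out to many nodes at once, which is the general idea behind a parallel shell. This is not how PDSH itself is implemented. The function names, the `runner` hook, and the fanout of 32 are all made up for illustration, and it assumes passwordless ssh to every node.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def run_ssh(host, command):
    """Run `command` on `host` over ssh; return (exit code, stdout)."""
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout


def run_serial(hosts, command, runner=run_ssh):
    """The 'ssh in a for loop' approach: one node at a time."""
    return {host: runner(host, command) for host in hosts}


def run_parallel(hosts, command, runner=run_ssh, fanout=32):
    """Parallel-shell style: up to `fanout` nodes are contacted at once."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        # pool.map yields results in the same order as `hosts`.
        results = pool.map(lambda host: runner(host, command), hosts)
        return dict(zip(hosts, results))
```

Both functions return the same dictionary, so the loop was never wrong. The parallel one just finishes in roughly the time of the slowest node instead of the sum of all of them. The `runner` argument is only there so the logic can be tried without a real cluster.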

Asking "What is the best generic solution to x?" tends to
miss the point that most clusters are not completely generic. Even if
they are running the exact same programs, two clusters might differ
because of the skill level of their users and admins.

I guess that was all I was trying to get at with my earlier posting.


On Nov 21, 2005, at 1:09 AM, Sebastian Stark wrote:

> On Sunday 20 November 2005 00:14, Brady Catherman wrote:
>> Very well said =) I'm sorry for dragging the thread down there.
>>
>> Remember though. When building a cluster. Do not, I repeat, do NOT
>> rely on a single person for all of your answers. I have seen a good
>> half a dozen clusters built completely wrong because the person
>> offering advice was either used to a single type of cluster, or in
>> the "throw money at the problem" mind set.
>
> Right. I really did not intend to advise people this was the best solution
> for all imaginable cases.
>
>> If your programs require heavy communication/file access, avoid an NFS
>> mounted root. You only burden your network with more load. If you are
>> running jobs like Paup, that run for weeks and don't do communication
>> or heavy file access, then NFS root is an awesome solution.
>
> I forgot to mention we use striped swap and /tmp filesystems on fast local
> disks. Users write their programs such that huge datasets they need to work
> with are copied to /tmp first and copied back when the job has finished. So
> most of the time the NFS mounted root does not produce too much network
> traffic. Home directories are located on a different server anyway. (Now if
> there was a good self-organising peer-to-peer filesystem available...)
>
> This went completely off-topic and I'm sorry for this...
>
>
> -Sebastian
>
> -- 
> Sebastian Stark -- http://www.kyb.tuebingen.mpg.de/~stark
> Max Planck Institute for Biological Cybernetics
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
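
The staging pattern Sebastian describes in the quoted message (copy the dataset to /tmp, work on it locally, copy the results back) could be sketched roughly like this. It is only an illustration, not his site's actual setup: `run_staged`, the `job` callback, and the "out" directory layout are hypothetical, and it assumes /tmp sits on a local disk of the compute node.

```python
import shutil
import tempfile
from pathlib import Path


def run_staged(dataset, results_dir, job, scratch_root="/tmp"):
    """Copy `dataset` to local scratch, run `job` there, copy outputs back."""
    dataset = Path(dataset)
    results_dir = Path(results_dir)
    # Private scratch directory on the node's local disk, removed on exit.
    with tempfile.TemporaryDirectory(dir=scratch_root) as scratch:
        scratch = Path(scratch)
        local_input = scratch / dataset.name
        shutil.copy2(dataset, local_input)  # stage in: one read from the network
        out_dir = scratch / "out"
        out_dir.mkdir()
        job(local_input, out_dir)  # heavy I/O happens on local disk only
        results_dir.mkdir(parents=True, exist_ok=True)
        for item in out_dir.iterdir():  # stage out: one write per result file
            if item.is_file():
                shutil.copy2(item, results_dir / item.name)
    return results_dir
```

Here `job` is any callable taking the local input path and an output directory. The network only sees the dataset once on the way in and the results once on the way out, which is why an NFS root can stay quiet even with large datasets.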





