[GE users] Startup times and other issues with 6.0u3

Brian R Smith brian at cypher.acomp.usf.edu
Sun Mar 20 19:59:26 GMT 2005



Reuti:

In SGE 6.0u1, I experience the same delay with starting jobs.  Does u1 
have the same bug as u3, or is it unique to u3?

Brian

Reuti wrote:

>Hi Brian,
>
>I heard something about April for the u4, but you can also download the 
>maintrunk and compile it on your own. The fix is already in.
>
>In most setups I have seen, the NFS/NIS interface is the one the hostname
>refers to. So you mount the NFS server from an address in 192.168.0.* -
>whatever your server is. I wonder whether NFS might also be answering over
>the other interface. What does "/usr/sbin/showmount -a" give you on the
>server?
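>
>Hypothetical output, just to show what to look for (client names and paths
>depend on your exports):
>
>    # /usr/sbin/showmount -a
>    All mount points on nfsserver:
>    192.168.0.2:/home
>    192.168.0.3:/home
>
>If any clients show up with 192.168.1.* GigE addresses instead, NFS traffic
>is also crossing the message-passing network.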
>
>Thanks for the info about PVFS2, as I'm thinking about using it in the next
>cluster we set up.
>
>Cheers - Reuti
>
>Quoting Brian R Smith <brian at cypher.acomp.usf.edu>:
>
>>Reuti,
>>
>>MM5 usually dumps its data in the same directory you start the binary 
>>from.  You can move the folder to a shared file system like PVFS2, but 
>>the writes are so small and infrequent that we just let it run on NFS. 
>>
>>With the separate networks, each node's /etc/hosts file has definitions for
>>
>>192.168.0.1 n001        # 100Mb administrative network: NFS/NIS, user logins
>>192.168.1.1 gbn001      # GigE network: message passing, SGE, PVFS2
>>
>>The machines.LINUX file in mpich specifies only the gbn0** hosts, as does
>>SGE.  I have verified that MPICH and SGE run only on the GigE net and make
>>no references to the 100Mb net.  All host names on the nodes are given as
>>the GigE names (gbn0**) so that SGE plays nicely with them.
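>>
>>For reference, machines.LINUX boils down to a plain list of the GigE names,
>>one per line (the node names here are only illustrative):
>>
>>    gbn001
>>    gbn002
>>    gbn003
>>    gbn004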
>>
>>As for PVFS2, we run 1 metadata host (I use one of my development 
>>(building software, testing builds, etc) nodes for this, since Metadata 
>>server loads are relatively low) and 4 dedicated I/O nodes, each with a 
>>300GB RAID1 volume.  We get about 1.1 - 1.2 TB of space from this.  I 
>>ran some performance benchmarks with some software I coded to find out 
>>the optimal number of nodes for a cluster of this size.  Based on our 
>>average usage, 4 I/O nodes is quite adequate.  On the GigE net, I can 
>>have 8-10 simultaneous large writes and many more smaller ones  from 
>>different hosts before I start suffering any performance hits.  On 
>>100Mb, this number is reduced to 5 or 6.
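>>
>>Something like the following reproduces the kind of load involved; the mount
>>point, file names and sizes are only illustrative:
>>
>>    # run on each participating client at roughly the same time
>>    for i in 1 2 3 4; do
>>        dd if=/dev/zero of=/mnt/pvfs2/bench.`hostname`.$i bs=1M count=1024 &
>>    done
>>    wait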
>>
>>So, should I just wait for u4?  Any clue as to when it will be released?
>>
>>Brian
>>
>>Reuti wrote:
>>
>>>Hi,
>>>
>>>Quoting Brian R Smith <brian at cypher.acomp.usf.edu>:
>>>
>>>>Reuti:
>>>>
>>>>shell                 /bin/csh
>>>>shell_start_mode      posix_compliant
>>>>
>>>>That's how we run it on our other machines, and it seems to work just fine.
>>>>
>>>>Figured I didn't have to change anything.
>>>>
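>>>>Those two lines come straight from the cluster queue configuration; assuming
>>>>the queue is called all.q (the name is only a guess), they can be checked
>>>>with
>>>>
>>>>    qconf -sq all.q | egrep '^shell'
>>>>
>>>>and changed with "qconf -mq all.q".
>>>>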
>>>Thanks for all the info. If all of your jobs are using csh, it's perfect to
>>>let it stay at this setting.
>>
>>okay
>>
>>>>The PE mpich-vasp has all traces of tight integration removed.  I was using
>>>>it for testing Vasp.  However, I got Vasp to work with tight integration, so
>>>>that queue is deprecated.
>>>>
>>>Okay, I saw it only in your definition and wondered about the reason for
>>>having a special mpich PE for Vasp.
>>>
>>>>As for MM5, it's compiled with the PGI compilers.  That said, it runs
>>>>beautifully outside of SGE on the same system (via mpirun).  It is only
>>>>under SGE that the processes slow to a crawl.  If you want more info on it,
>>>>let me know; I just don't see how that would help.
>>>>
>>>>My mpirun command for that is simply:
>>>>
>>>>$MPIR_PATH/mpirun -np $NSLOTS -machinefile $TMPDIR/machines mm5.mpp
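>>>>
>>>>For completeness, the surrounding csh submit script is minimal - something
>>>>like the following, where the PE name and slot count are only placeholders:
>>>>
>>>>    #!/bin/csh
>>>>    #$ -pe mpich 8
>>>>    #$ -cwd
>>>>    $MPIR_PATH/mpirun -np $NSLOTS -machinefile $TMPDIR/machines mm5.mpp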
>>>>
>>>>mm5.mpp does some small writes to the NFS mount that it runs out of.  During
>>>>the span of about 3 minutes, it will dump approximately 10MB of data into
>>>>its current working directory.  I have load tested this outside of SGE and
>>>>found that the NFS writes are not the cause of any slowdown.
>>>>
>>>That's not much. Does MM5 require a shared scratch space, or could it also
>>>run using the created $TMPDIR on the nodes?
>>>
>>>>MPICH runs over a dedicated GigE connection that's devoted to 1) SGE, 2)
>>>>message passing, and 3) PVFS2.  The other network (100Mb) handles NFS, NIS,
>>>>etc.
>>>>
>>>>The GigE network uses a Cisco Catalyst series GigE switch with Intel GigE
>>>>controllers on the nodes.  We are not using jumbo frames on that network
>>>>either, as we've yet to test the benefits of doing so.
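>>>>
>>>>If we ever do test them, it should only amount to raising the MTU on each
>>>>node and matching it on the switch ports; the interface name here is just a
>>>>guess:
>>>>
>>>>    /sbin/ifconfig eth1 mtu 9000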
>>>>
>>>So you set up a host_aliases file to separate the traffic - does `hostname`
>>>give the "external" or "internal" name of the machine? Did you replace
>>>"MPI_HOST=`hostname`" in your mpirun with a mapping to your GigE name, if
>>>necessary?
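>>>
>>>For reference, the host_aliases file (in $SGE_ROOT/<cell>/common/) is just
>>>one mapping per line, with the name SGE should use first, e.g.
>>>
>>>    gbn001 n001
>>>    gbn002 n002
>>>
>>>and the MPI_HOST line in mpirun could become something like
>>>"MPI_HOST=`hostname | sed 's/^n/gbn/'`" - that sed pattern is only a guess
>>>at your naming scheme.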
>>>
>>>Just out of interest: how many nodes are you using in your cluster as PVFS2
>>>servers?
>>>
>>>Cheers - Reuti
>>>


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



