[GE users] Startup times and other issues with 6.0u3

Reuti reuti at staff.uni-marburg.de
Sun Mar 20 20:19:04 GMT 2005


Brian:

According to the original post:

http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=9420

it was also in u1. But the delay wasn't in the range of minutes, just seconds,
so I'm not really sure whether you aren't facing a different problem. The
problem there was also a delay when the job changed from qw to t, but not later
when it changed to r.
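Just to check which case you have (the user name below is of course only an
example), you can follow the state column of the job with e.g.:

   watch -n 1 'qstat -u brian'

If the time is lost while the job sits in "t" before the processes show up on
the nodes, it's likely that known issue; if it's only slow once it is in "r",
it's something else.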

CU - Reuti

Quoting Brian R Smith <brian at cypher.acomp.usf.edu>:

> Reuti:
> 
> In SGE 6.0u1, I experience the same delay when starting jobs.  Does u1 
> have the same bug as u3, or is it unique to u3?
> 
> Brian
> 
> Reuti wrote:
> 
> >Hi Brian,
> >
> >I heard something about April for u4, but you can also download the 
> >maintrunk and compile it on your own - the fix is already in there.
> >
> >In most setups I saw, the NFS/NIS interface was the one referred to by the 
> >hostname. So you mount the NFS server from an address 192.168.0.* - whatever 
> >your server is. I wonder whether NFS might be answering over the other 
> >interface. What does "/usr/sbin/showmount -a" give you on the server?
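> >Something like this is what I'd expect (names and addresses below are only 
> >examples):
> >
> >   # on the NFS server
> >   /usr/sbin/showmount -a
> >   All mount points on n000:
> >   192.168.0.1:/home
> >   192.168.0.2:/home
> >
> >If GigE addresses (192.168.1.*) show up in that list instead, the NFS 
> >traffic isn't staying on the 100Mb network.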
> >
> >Thx for the info about PVFS2, as I'm about to use it in the next cluster we 
> >set up.
> >
> >Cheers - Reuti
> >
> >Quoting Brian R Smith <brian at cypher.acomp.usf.edu>:
> >
> >>Reuti,
> >>
> >>MM5 usually dumps its data in the same directory you start the binary 
> >>from.  You can move the folder to a shared file system like PVFS2, but 
> >>the writes are so small and infrequent that we just let it run on NFS. 
> >>
> >>With the separate networks, each node's /etc/hosts file has definitions for
> >>
> >>192.168.0.1 n001        # 100Mb administrative network, NFS/NIS, user logins
> >>192.168.1.1 gbn001      # GigE network: message passing, SGE, PVFS2
> >>
> >>The machines.LINUX file in mpich specifies only the gbn0** hosts as does 
> >>SGE.  I have verified that MPICH and SGE run only on the GigE net and 
> >>make no references to the 100Mb net.  All host names on the nodes are 
> >>denoted by the GigE names (gbn0**) so that SGE plays nicely with it.
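> >>(For completeness, that machines file is then nothing more than the GigE 
> >>names, one per line - the exact path depends on the mpich install, 
> >>typically <mpich-prefix>/share/machines.LINUX:
> >>
> >>   gbn001
> >>   gbn002
> >>   gbn003
> >>   gbn004
> >>)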
> >>
> >>As for PVFS2, we run 1 metadata host (I use one of my development 
> >>(building software, testing builds, etc) nodes for this, since Metadata 
> >>server loads are relatively low) and 4 dedicated I/O nodes, each with a 
> >>300GB RAID1 volume.  We get about 1.1 - 1.2 TB of space from this.  I 
> >>ran some performance benchmarks with some software I coded to find out 
> >>the optimal number of nodes for a cluster of this size.  Based on our 
> >>average usage, 4 I/O nodes is quite adequate.  On the GigE net, I can 
> >>have 8-10 simultaneous large writes and many more smaller ones from 
> >>different hosts before I start suffering any performance hits.  On 
> >>100Mb, this number is reduced to 5 or 6.
> >>
> >>So, should I just wait for u4?  Any clue as to when it will be released?
> >>
> >>Brian
> >>
> >>Reuti wrote:
> >>
> >>>Hi,
> >>>
> >>>Quoting Brian R Smith <brian at cypher.acomp.usf.edu>:
> >>>
> >>>>Reuti:
> >>>>
> >>>>shell                 /bin/csh
> >>>>shell_start_mode      posix_compliant
> >>>>
> >>>>That's how we run it on our other machines and it seems to work just 
> >>>>fine.  Figured I didn't have to change anything.
> >>>>
> >>>Thx for all the infos. If all of your jobs are using csh, it's perfectly 
> >>>okay to let it stay at this setting.
> >>>
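> >>>(If you ever want to check or change it: "qconf -sq <queue>" shows the 
> >>>current setting and "qconf -mq <queue>" lets you edit it, e.g.:
> >>>
> >>>   qconf -sq all.q | grep shell
> >>>   qconf -mq all.q
> >>>
> >>>where all.q is just an example queue name.)
> >>>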
> >>>>The PE mpich-vasp has all traces of tight integration removed.  I was 
> >>>>using it for testing Vasp.  However, I got vasp to work with 
> >>>>tight-integration, so that queue is deprecated.
> >>>>
> >>>Okay, I only saw it in your definition and wondered about the reason for 
> >>>having a special mpich for vasp.
> >>>
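> >>>(Just for the archives, the relevant lines of a tightly integrated mpich 
> >>>PE - assuming the standard scripts shipped in $SGE_ROOT/mpi, here 
> >>>installed under /usr/sge - are:
> >>>
> >>>   start_proc_args   /usr/sge/mpi/startmpi.sh -catch_rsh $pe_hostfile
> >>>   stop_proc_args    /usr/sge/mpi/stopmpi.sh
> >>>   control_slaves    TRUE
> >>>
> >>>i.e. -catch_rsh plus control_slaves TRUE is what puts the slave tasks 
> >>>under SGE's control via qrsh -inherit.)
> >>>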
> >>>>As for MM5, it's compiled with the PGI compilers.  That said, it runs 
> >>>>beautifully outside of SGE on the same system (mpirun).  It is only under 
> >>>>SGE that the processes come to a crawl.  If you want more info on it, let 
> >>>>me know.  I just don't see how that would help.
> >>>>
> >>>>my mpirun command for that is simply
> >>>>
> >>>>$MPIR_PATH/mpirun -np $NSLOTS -machinefile $TMPDIR/machines mm5.mpp
> >>>>
> >>>>mm5.mpp does some small writes to the NFS mount that it runs out of. 
> >>>>During the span of about 3 minutes, it will dump approximately 10MB of 
> >>>>data into its current working directory.  I have load tested this outside 
> >>>>of SGE and found that the NFS writes are not the cause of any slowdown.
> >>>>
> >>>That's not much. Does MM5 require a shared scratch space, or could it also 
> >>>run using the created $TMPDIR on the nodes?
> >>>
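> >>>(What I have in mind is roughly this in the job script - only a sketch, 
> >>>the PE name, slot count, paths and output file names are just examples:
> >>>
> >>>   #!/bin/csh
> >>>   #$ -pe mpich 8
> >>>   # stage the input into the node-local scratch created by SGE
> >>>   cp $HOME/mm5run/* $TMPDIR
> >>>   cd $TMPDIR
> >>>   $MPIR_PATH/mpirun -np $NSLOTS -machinefile $TMPDIR/machines mm5.mpp
> >>>   # copy the results back afterwards
> >>>   cp $TMPDIR/MMOUT* $HOME/mm5run/
> >>>
> >>>so the small writes would go to the local disk instead of the NFS mount.)
> >>>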
> >>>>MPICH runs over a dedicated GigE connection that's devoted to 1) SGE, 
> >>>>2) message passing and 3) PVFS2.  The other network (100Mb) handles NFS, 
> >>>>NIS, etc.
> >>>>
> >>>>The GigE network is a Cisco Catalyst series GigE switch with Intel GigE 
> >>>>controllers on the nodes.  We are not using jumbo frames on that network 
> >>>>either, as we've yet to get any testing done on the benefits of doing so.
> >>>>
> >>>So you set up a host_aliases file to separate the traffic - is `hostname` 
> >>>giving the "external" or "internal" name of the machine? Did you replace 
> >>>"MPI_HOST=`hostname`" in your mpirun with a mapping to your GigE name, if 
> >>>necessary?
> >>>
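> >>>(The file I mean is $SGE_ROOT/<cell>/common/host_aliases, one line per 
> >>>host with the name SGE should use first and its aliases after it, e.g.:
> >>>
> >>>   gbn001 n001
> >>>   gbn002 n002
> >>>
> >>>so that both names resolve to the GigE one inside SGE.)
> >>>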
> >>>Just for interest: how many nodes are you using in your cluster for PVFS2 
> >>>servers?
> >>>
> >>>Cheers - Reuti
> >>>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



