[GE users] GE in heterogenic environment

Daniel Templeton Dan.Templeton at Sun.COM
Wed May 28 19:31:50 BST 2008



Could we please have that in HOWTO form? ;)

Actually, while the HOWTOs on the open source site are useful, they're
rather static.  I would personally rather see this information be the 
inaugural entry in a HOWTO/Blueprints/patterns section on 
wiki.gridengine.info.  That way, the community can maintain it without 
needing to find someone with Grid Engine commit privileges.  If you'll 
start it up, I'll cross-link it from the open source howto page.

Daniel

Chris Dagdigian wrote:
> Nice!
>
> If the SGE folks don't ask for this in HOWTO form, please consider 
> dropping it into the community wiki located at 
> http://wiki.gridengine.info/wiki/index.php/Main_Page
>
> Regards,
> Chris
>
>
> On May 28, 2008, at 2:05 PM, Jacek Strzelczyk wrote:
>
>> Hello everyone,
>>
>> I'm new here. I decided to join this group because I had some
>> adventures with GE while working on my thesis. I successfully
>> installed SGE 6.1 with MPICH2 in a heterogeneous environment: the
>> network consisted of Linux and Windows machines, and I made everything
>> start at boot, so no manual startup is needed. My test network
>> consisted of three Fedora Core 3 machines and one Windows 2000
>> machine. I haven't seen any document describing this kind of
>> installation, so I decided to write one. There are some tricks along
>> the way, and I hope my experience can be helpful.
>> You can find my document in the attachment. Tell me if this can be
>> helpful to anyone (maybe I can publish it somewhere on the web?). All
>> comments are welcome.
>>
>> Regards,
>> Jacek Strzelczyk
>> Installing SGE 6.1 in a heterogeneous (Linux Fedora Core release 3 and
>> Windows 2000) environment with MPICH2.
>>
>> Author: Jacek Strzelczyk <jacek.strzelczyk at gmail.com>
>>
>> 1. Pre-install requirements
>>     1.1 NIS
>>     1.2 NFS
>>         1.2.1 on Linux (NFS server and clients)
>>         1.2.2 on Windows (NFS clients)
>> 2. SGE installation
>>     2.1 SGE Linux master host
>>     2.2 SGE Linux exec hosts
>>     2.3 SGE Windows exec hosts
>> 3. MPICH2
>>     3.1 on Linux
>>     3.2 on Windows
>>     3.3 Add MPICH2 as parallel environment to SGE
>> 4. Configure Interix
>> 5. Post-install check
>> 6. Test
>>
>>
>> 1. Pre-install requirements
>>
>>     1.1 NIS
>>
>>     NIS is a service that distributes information that has to be known
>> throughout the network to all machines on the network. It can be
>> very helpful in maintaining a coherent user structure on all the nodes
>> in the grid. The full NIS HOWTO can be found here:
>> http://www.linux-nis.org/nis-howto/HOWTO/. For the purposes of this
>> HOWTO one user account is needed. Name it 'sgeadmin' and add it to the
>> NIS database. Set its $HOME to "/usr/SGE".
>>     /etc/hosts can also be added to NIS.
>>
>>     1.2 NFS
>>
>>     Having a common filesystem from which to install and run SGE is a
>> simple and flexible solution. It can be achieved in many ways; I'll
>> focus on NFS. The full NFS HOWTO can be found here:
>> http://nfs.sourceforge.net/nfs-howto/. The easiest way is to install
>> the NFS server on the machine intended to be the SGE master host.
>> The rest of the hosts will be NFS clients.
>>
>>     1.2.1 NFS on Linux
>>
>>     NFS server:
>>     To prepare and share the SGE directory, do the following (M1, M2
>> and M3 are the names of the client hosts, which need to be in
>> /etc/hosts):
>>     $ mkdir /usr/SGE
>>     $ echo "/usr/SGE  M1(rw,no_root_squash,async) M2(rw,no_root_squash,async) M3(rw,no_root_squash,async)" >> /etc/exports
>>     Restart NFS.
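
The exports entry above can also be generated for any number of client hosts. The following is only a minimal sketch: make_exports_line is a hypothetical helper name (not part of NFS or SGE), and the export options are simply the ones used in the HOWTO.

```shell
#!/bin/sh
# Sketch only: make_exports_line is a hypothetical helper, not an NFS tool.
# Usage: make_exports_line DIR HOST [HOST...]
# Prints one /etc/exports line that shares DIR with every HOST, using the
# same options as the HOWTO (rw,no_root_squash,async).
make_exports_line() {
    dir=$1
    shift
    line=$dir
    for host in "$@"; do
        line="$line $host(rw,no_root_squash,async)"
    done
    printf '%s\n' "$line"
}

# Example (run as root on the NFS server, then restart or re-export NFS):
#   make_exports_line /usr/SGE M1 M2 M3 >> /etc/exports
```

Printing the line rather than writing it directly lets you inspect the result before appending it to /etc/exports.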
>>
>>     NFS client:
>>     $ mkdir /usr/SGE
>>     $ chown sgeadmin /usr/SGE
>>     $ mount -t nfs masternode:/usr/SGE /usr/SGE
>>     This mount should also be added to /etc/fstab with the suid option.
>>
>>     1.2.2 NFS on Windows
>>
>>     To mount the network drive in Windows, log in as Administrator and
>> type:
>>     >net use X: \\masternode\usr\SGE
>>     To do this automatically at each system boot, use AutoExNT
>> (http://support.microsoft.com/kb/243486):
>>     a) Using a text editor (such as Notepad), create a batch file
>> named Autoexnt.bat and include the commands you want to run at
>> startup in this file; here that would be
>>     "@net use X: \\masternode\usr\SGE"
>>     b) Copy the Autoexnt.bat file you just created, in addition to 
>> the Autoexnt.exe, Servmess.dll, and Instexnt.exe files located in the 
>> Resource Kit CD-ROM (or here: 
>> http://www.dynawell.com/reskit/microsoft/win2000/autoexnt.zip) to the 
>> C:\WINNT\System32 folder on your computer.
>>     c) At a command prompt, type instexnt install, and then press ENTER.
>>     You should then receive the following message:
>>     CreateService AutoExNT SUCCESS with InterActive Flag turned OFF
>>
>>     This creates the AutoExNT service in Windows, which will
>> automatically mount /usr/SGE as the X: drive at boot time. To make
>> sure this happens after all network connections are up, add a
>> dependency in the Windows registry. Open the registry editor regedt32
>> (not regedit!). Go to
>> HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AutoExNT and add a
>> string value named "DependOnService" with the value "LanmanWorkstation".
>>
>>
>> 2. SGE installation
>>
>>     2.1 SGE Linux master host
>>
>>     First, create a /usr/SGE/.rhosts file listing all hosts in your
>> SGE installation and set its mode to 600. Check that rsh works on the
>> Linux machines by executing:
>>     sgeadmin $ rsh otherlinuxhost date
>>
>>     Then, add three lines to /etc/services:
>>     sge_execd      535/tcp
>>     sge_commd      536/tcp
>>     sge_qmaster    537/tcp
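
Since the same services entries are needed on every host, it can help to add them in a way that is safe to repeat. This is a rough sketch under stated assumptions: add_service is a hypothetical helper name, and the service names and port numbers are the ones listed in the HOWTO.

```shell
#!/bin/sh
# Sketch only: add_service is a hypothetical helper.
# Usage: add_service FILE NAME PORT
# Appends "NAME<TAB>PORT/tcp" to FILE unless a line for NAME already exists,
# so running it twice does not create duplicate entries.
add_service() {
    file=$1 name=$2 port=$3
    if grep -q "^$name[[:space:]]" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\t%s/tcp\n' "$name" "$port" >> "$file"
}

# Example (run as root on each host):
#   add_service /etc/services sge_execd   535
#   add_service /etc/services sge_commd   536
#   add_service /etc/services sge_qmaster 537
```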
>>
>>     Download SGE
>>     Common Files:
>>     http://gridengine.sunsource.net/download/SGE61/sge-6.1-common.tar.gz
>>     Linux files:
>>     http://gridengine.sunsource.net/download/SGE61/sge-6.1-bin-lx24-x86.tar.gz 
>>
>>     Unpack them:
>>     # su - sgeadmin
>>     $ mv sge-6.1-common.tar.gz /usr/SGE/
>>     $ mv sge-6.1-bin-lx24-x86.tar.gz /usr/SGE/
>>     $ cd /usr/SGE/
>>     $ for f in sge-6.1-*.tar.gz; do tar -xzvf "$f"; done
>>     Before starting the installation procedure, the file util/arch
>> needs to be edited. Change line 248 to:
>>     3*|4*|5*
>>     and then:
>>     $ su -
>>     # cd /usr/SGE
>>     # ./install_qmaster
>>     The full installation procedure is described in the SGE docs:
>> http://docs.sun.com/app/docs/doc/817-6118/emrar?q=N1GE&a=view.
>>
>>     2.2 SGE Linux exec hosts
>>
>>     This is described in the SGE docs:
>> http://docs.sun.com/app/docs/doc/817-6118/emrar?q=N1GE&a=view.
>>
>>     2.3 SGE Windows exec hosts
>>
>>     * Create user 'sgeadmin' locally.
>>     * Download Services For Unix (SFU) from 
>> http://www.microsoft.com/downloads/details.aspx?FamilyID=896c9688-601b-44f1-81a4-02878ff11778&displaylang=en 
>>
>>     * Turn off DEP by adding "/noexecute=alwaysoff" to C:\boot.ini
>>     * Run SFU installation procedure and add Interix SDK and Interix 
>> GNU SDK to default installation.
>>     * Check that the User Name Mapping daemon is running after the
>> installation is complete
>>     * Go to Start -> Programs -> Windows Services for Unix ->
>> Configuration -> User Name Mapping, choose NIS and Show User Maps.
>> Then map the Unix user sgeadmin to the Windows user of the same name.
>>     * Mount X: drive as /usr/SGE in Interix:
>>     %ls -l /dev/fs    # should show also X
>>     %ln -s /dev/fs/X /usr/SGE
>>     * Run telnet and rsh from Interix: log in to Windows as
>> Administrator, permanently disable the Windows telnet and rsh daemons,
>> uncomment the rsh and telnet lines in /etc/inetd.conf in Interix, and
>> restart inetd:
>>     % ps -ef | grep inetd
>>     % kill -1 <PIDofINETD>
>>     Check the Windows firewall and open ports 23 (telnet) and 514
>> (shell). Use nmap to check that everything is OK.
>>     * Add all grid machines to /etc/hosts in Interix
>>     * Install bootstrap installer
>>     http://www.interopsystems.com/tools/pkg_install.htm
>>     * Add the line "64.235.106.194 ftp.interopsystems.com" to
>> C:\Windows\system32\drivers\etc\hosts so that ftp can reach this
>> site (otherwise there are problems with address resolution)
>>     * Install bash:
>>     %pkg_update -L bash
>>     * Create a $HOME/.rhosts file in Interix listing all hosts in
>> your SGE installation
>>     * Download the Windows-specific SGE files
>> (sge61u4_addarchs_targz.zip) from Sun's web page. They are available
>> after logging in.
>>     * Unpack them and copy them to the proper directories
>>     * Run the installation:
>>     %/usr/SGE/install_execd
>>
>> 3. MPICH2
>>
>>     3.1 MPICH2 on Linux
>>
>>     * Download the mpich2-1.0.7.tar.gz archive
>>     * Configure and install:
>>     $ mkdir /usr/SGE/mpich2
>>     $ ./configure --prefix=/usr/SGE/mpich2 --with-pm=smpd --with-pmi=smpd
>>     $ make
>>     $ make install
>>     $ cd $HOME
>>     $ echo "phrase=behappy" > .smpd
>>     * Add /usr/SGE/mpich2/bin to $PATH
>>     * Check whether the smpd daemon is running; if not, start it with
>> smpd -s
>>     * To compile MPI programs:
>>     $ gcc mpi-test.c -o mpi-test -I/usr/SGE/mpich2/include
>> -L/usr/SGE/mpich2/lib -lmpich
>>     * Create the credentials file (user name on the first line,
>> password on the second):
>>     $ printf 'sgeadmin\nsgeadmin\n' > /usr/SGE/credentials
>>     $ chmod 600 /usr/SGE/credentials
>>
>>     3.2 MPICH2 on Windows
>>
>>     * Download and install Visual C++ 2005 SP1
>>     * Download and install MPICH2 for Windows
>>     * Check that the MPICH2 Process Manager daemon is running
>>     * To compile programs on Windows, install Dev-Cpp (or another
>> programming environment)
>>     * Compile the source code in Dev-Cpp with the MPICH2 libraries and
>> headers: -I"C:\Program Files\MPICH2\include" -L"C:\Program
>> Files\MPICH2\lib" -lmpi
>>     * Copy the compiled program into C:\WINNT\system32 (or another
>> directory in the Windows PATH)
>>
>>     3.3 Add MPICH2 as parallel environment to SGE
>>
>>     Use qmon from SGE to manually add MPICH2 as a parallel environment
>> (PE) in SGE. A description of the PE can be found here:
>> http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html
>>
>>
>> 4. Configure Interix
>>
>>     A little configuration is needed so that Interix starts only
>> after the network drive with SGE (drive X:) is mounted on the Windows
>> machine. SGE can then be started automatically by an Interix startup
>> script. To do that, add a dependency to the Windows registry:
>>     * Open regedt32
>>     * Go to
>> HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Interix and add a
>> string value named "DependOnService" with the value "AutoExNT".
>>     * Copy and adjust one of the Interix startup scripts from
>> /etc/init.d to start SGE.
>>     * Add symbolic links to the SGE start script in /etc/rc2.d
>>     After the Windows machine restarts, all network connections
>> should then be up, the network drive X: (with SGE) should be mounted,
>> and the Interix startup script should start the SGE exec daemon. All
>> of this happens automatically, without any user logging in.
>>
>>
>> 5. Post-install check
>>
>>     OK, so now you should have:
>>     * Linux master host with: NIS and NFS servers, SGE master and SGE
>> exec daemons running. Check with ps aux | grep sge; there should be
>> three processes: sge_qmaster, sge_commd and sge_execd. The MPICH2
>> daemon, smpd, should also be running.
>>     * Linux execution hosts with: /usr/SGE mounted from the master
>> host, SGE exec and smpd daemons running.
>>     * Windows execution hosts with: /usr/SGE mounted as network drive
>> X:, the smpd daemon running (from the Windows version of MPICH2) and
>> Interix with the SGE exec daemon.
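
The daemon checks in this list can be scripted. The following is a minimal sketch, not an SGE tool: missing_daemons is a hypothetical helper that reads a process list on standard input and prints the names that are absent. The daemon names are passed as arguments, so the list can match whatever your SGE version actually runs.

```shell
#!/bin/sh
# Sketch only: missing_daemons is a hypothetical helper.
# Usage: ps -e -o comm= | missing_daemons NAME [NAME...]
# Prints each NAME that does not appear as a whole word in the process
# list read from standard input; prints nothing if all are present.
missing_daemons() {
    procs=$(cat)
    for d in "$@"; do
        printf '%s\n' "$procs" | grep -qw -- "$d" || printf '%s\n' "$d"
    done
}

# Example on the Linux master host (names taken from the list above):
#   ps -e -o comm= | missing_daemons sge_qmaster sge_commd sge_execd smpd
```

Reading the process list from standard input keeps the helper independent of the ps flavour on each platform; only the command that feeds it needs to change.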
>>
>> 6. Test
>>
>>     To test the installation, create a simple MPI program (or use an
>> example from mpich2/examples) and compile it on both platforms:
>>     * on Linux:  gcc mpi-test.c -o mpi-test -I/usr/SGE/mpich2/include
>> -L/usr/SGE/mpich2/lib -lmpich
>>     * on Windows: use Dev-Cpp with the arguments -I"C:\Program
>> Files\MPICH2\include" -L"C:\Program Files\MPICH2\lib" -lmpi
>>     Then copy the resulting binaries to a directory in $PATH, both
>> on Windows (e.g. C:\WINNT\system32) and on Linux (e.g. /usr/bin/).
>>     Create a script (examples are in /usr/SGE/examples) that executes
>> mpirun:
>>     mpirun -n $NSLOTS -machinefile $TMPDIR/machines -pwdfile
>> /usr/SGE/credentials mpi-test
>>     It should give you proper results (I hope...)!
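
For illustration, a job script wrapping the mpirun line could be generated as sketched below. This is an assumption-laden example, not the author's script: write_job_script is a hypothetical helper, and the PE name (mpich2_smpd here) must be whatever name you gave the parallel environment in qmon. The "#$" lines use the standard qsub options -N, -cwd and -pe.

```shell
#!/bin/sh
# Sketch only: write_job_script is a hypothetical helper.
# Usage: write_job_script FILE PE_NAME SLOTS
# Writes an SGE job script to FILE that requests SLOTS slots in the
# parallel environment PE_NAME and runs the mpirun line from the HOWTO.
write_job_script() {
    cat > "$1" <<EOF
#!/bin/sh
#\$ -N mpi-test
#\$ -cwd
#\$ -pe $2 $3
mpirun -n \$NSLOTS -machinefile \$TMPDIR/machines -pwdfile /usr/SGE/credentials mpi-test
EOF
}

# Example (the PE name is an assumption; use the one configured in qmon):
#   write_job_script mpi-test.sh mpich2_smpd 4
#   qsub mpi-test.sh
```

$NSLOTS and $TMPDIR are left unexpanded in the generated file so that SGE fills them in when the job runs.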
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net



