[GE users] SunGrid Engine & BLAST

Chris Dagdigian dag at sonsorol.org
Sat Feb 5 17:20:32 GMT 2005


(1) It looks like you already have a PE named "mpich" installed, which is 
why "qconf -ap mpich" reports that it already exists. Use 
"qconf -mp mpich" to modify the existing PE configuration instead.
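
As a rough sketch of that check: "qconf -spl" lists the configured PEs and 
"qconf -sp <name>" prints one of them. The helper name pe_flag is made up 
for illustration and is not part of Grid Engine.

```shell
#!/bin/sh
# pe_flag is a hypothetical helper. Given a PE name and the output of
# "qconf -spl" (one PE name per line), it prints the qconf flag to use.
pe_flag() {
    name=$1
    existing=$2
    if printf '%s\n' "$existing" | grep -qxF "$name"; then
        echo "-mp"   # PE already exists: modify it
    else
        echo "-ap"   # PE does not exist yet: add it
    fi
}

# Only runs where qconf is on the PATH (an SGE admin or submit host).
if command -v qconf >/dev/null 2>&1; then
    qconf -sp mpich || true      # show the current definition, if any
    echo "next: qconf $(pe_flag mpich "$(qconf -spl)") mpich"
fi
```

The script only prints the suggested command rather than running it, since 
both "qconf -ap" and "qconf -mp" open an editor.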


(2) Integrating an MPICH PE into Grid Engine should be the last step. To 
properly debug the various ways that things can go wrong, you first 
need to:


1. Install MPICH cluster-wide

2. Compile an example program such as the simple "cpi" binary

3. Using mpirun, verify that you can run the example binary in parallel 
mode across at least a few CPUs. Keep running these tests until you are 
confident that MPI is working on all your Apple nodes
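
The steps above can be sketched roughly as follows. This assumes mpicc and 
mpirun are on the PATH, that cpi.c has been copied into the current 
directory, and that "machines" is a hypothetical file listing one node per 
line; count_hosts is an illustrative helper, not an MPICH tool.

```shell
#!/bin/sh
# Standalone MPICH check, run outside Grid Engine.

# Count usable entries in a machine file (skips blank lines and # comments).
# Assumes one plain hostname per line, so one process is started per entry.
count_hosts() {
    grep -v '^[[:space:]]*#' "$1" | grep -c '[^[:space:]]'
}

# Only runs when the compiler wrapper and the input files are present.
if command -v mpicc >/dev/null 2>&1 && [ -f cpi.c ] && [ -f machines ]; then
    mpicc -o cpi cpi.c                             # step 2: build the example
    np=$(count_hosts machines)
    mpirun -np "$np" -machinefile machines ./cpi   # step 3: parallel run
fi
```

If the run stalls or fails here, the problem is in the MPICH setup itself 
and not in Grid Engine.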

I've covered the basics of installing and testing MPICH on OS X, in a 
lightweight way, here:

http://bioteam.net/faq/index.php?action=artikel&cat=4&id=33&artlang=en


Basically you need to prove to yourself (and us!) that your installation 
of MPICH *outside of Grid Engine* is correctly configured and working.

Only then should you begin the SGE integration part.

For future reference, saying "it just hangs" is not enough either :) We 
need to see output, error messages, SGE logfiles, etc. before we have 
even a remote chance of understanding what is happening.
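
A minimal sketch of gathering that information for a stuck job. The job 
name and ID below are placeholders, job_files is a hypothetical helper, and 
the messages path assumes the default cell name "default" under 
/usr/local/sge; adjust both for your site.

```shell
#!/bin/sh
# job_files is a hypothetical helper. It prints the names of the output
# files SGE writes for a parallel job: stdout (.o), stderr (.e), and the
# PE start/stop script output (.po, .pe).
job_files() {
    name=$1
    id=$2
    for kind in o e po pe; do
        echo "$name.$kind$id"
    done
}

job_name=mpi_test   # placeholder job name
job_id=42           # placeholder job ID as reported by qsub

job_files "$job_name" "$job_id"

# Only runs where the SGE client tools are installed.
if command -v qstat >/dev/null 2>&1; then
    qstat -f                 # queue and job states
    qstat -j "$job_id"       # details for the one job
    tail -n 50 "${SGE_ROOT:-/usr/local/sge}/${SGE_CELL:-default}/spool/qmaster/messages"
fi
```

Pasting the contents of those files along with the qstat output gives the 
list something concrete to work from.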

You may also want to change the subject line of your messages. This 
topic is no longer related to BLAST.

-Chris



Hrishikesh Deshmukh wrote:
> Hi All,
> 
> I am trying to integrate mpich PE on Sun Grid 5.3 (Mac OS X).
> I am trying to follow up on the way to do this:
> http://bioteam.net/faq/index.php?action=artikel&cat=5&id=39&artlang=en
> 
> I used qconf -ap mpich and changed the variables:
> emac21:/usr/local/sge/bin/darwin sgeadmin$ ./qconf -ap mpich      
> pe_name           mpich
> queue_list        all
> slots             5
> user_lists        NONE
> xuser_lists       NONE
> start_proc_args   /usr/local/sge/mpi/startmpi.sh $pe_hostfile
> stop_proc_args    /usr/local/sge/mpi/stopmpi.sh
> allocation_rule   $fill_up
> control_slaves    FALSE
> job_is_first_task TRUE
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> ~
> parallel environment "mpich" already exists
> 
> So when I submit the test script given in the link, the job just "hangs".
> Do I need to change anything else?
> 
> Thanks,
> Hrishi
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net

-- 
Chris Dagdigian, <dag at sonsorol.org>
BioTeam  - Independent life science IT & informatics consulting
Office: 617-665-6088, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E iChat/AIM: bioteamdag  Web: http://bioteam.net
