[GE users] SGE/MPI question ?

Barry J Mcinnes Barry.J.Mcinnes at noaa.gov
Fri Mar 25 21:24:56 GMT 2005


Hi Chris,
Upon further review with the user, the model fails when run locally as 
well as through SGE, so it's not an SGE problem. I will join the lam-mpi 
group.
Sorry,
Barry


Chris Dagdigian wrote:

>
> Random question -- You have not mentioned if the code ran successfully 
> to completion when you use lam-mpi on Mac OS X without Grid Engine.
>
> Usually it's not worth debugging SGE parallel environment issues 
> unless the code reliably and reproducibly runs fine in the MPI 
> environment outside of Grid Engine. All around easier to fix the app 
> first (if needed) then deal with the specifics of the PE/SGE integration.
>
> -Chris
>
>
>
>
> Barry J Mcinnes wrote:
>
>> We are having problems running a model that works on Lintel/SGE 5.3. 
>> We are trying it on MacOS X/SGE 6u3 using lam-mpi 7.1.2b18.
>> Any advice appreciated. Is there an MPI/SGE group?
>> Here is the end of the model output and the errors.
>>
>> thanks
>>
>> fortran code compiled with xlf
>>
>> + mpiexec -boot -np 2 /Volumes/Disk/jsw/gfsdist/cfs6264/cfs.25797/cfs6228
>>
>>
>> * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * . * 
>> . * . * .
>>     PROGRAM gsm      HAS BEGUN. COMPILED       0.00     ORG: np23
>>     STARTING DATE-TIME  MAR 25,2005  12:50:49.379   84  FRI   2453455
>>
>>
>> &NAM_MRF
>> FHMAX=6.00000000000000000, FHOUT=6.00000000000000000, 
>> FHRES=6.00000000000000000, FHZER=6.00000000000000000, 
>> FHSEG=0.000000000000000000E+00, FHROT=0.000000000000000000E+00, 
>> DELTIM=1200.00000000000000, IGEN=82, FHDFI=3.00000000000000000, 
>> FHSWR=1.00000000000000000, FHLWR=3.00000000000000000, 
>> FHCYC=0.000000000000000000E+00, RAS=F, LDIAG3D=F
>> /
>>  From compns : iret= 0  nsout= 18  nsswr= 3  nslwr= 9  nszer= 18 
>> nsres= 18  nsdfi= 9  nscyc= 0  ras= F
>> Reduced grid, nb points= 6536 full= 9024
>> nfile,fhour,idate= 11 0.0000000000E+00 0 10 9 2003  ntozi= 1  ntcwi= 
>> 2 ncldi= 1  ntraci= 2  tracers= 3.000000000  vtid= 21.00000000 
>> 1.000000000  xgf= 0.0000000000E+00
>> in fixio nread= 14  HOUR=  0.00   IDATE=    0   10    9 2003 
>> lonsfc,latsfc,ivssfc=     192      94  200004
>> MPI_Recv: process in local group is dead (rank 1, comm 4)
>> Rank (1, MPI_COMM_WORLD): Call stack within LAM:
>> Rank (1, MPI_COMM_WORLD):  - MPI_Recv()
>> Rank (1, MPI_COMM_WORLD):  - MPI_Scatter()
>> Rank (1, MPI_COMM_WORLD):  - main()
>>
>
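
The check Chris describes (run the job under LAM/MPI by hand before 
involving Grid Engine) could be sketched roughly as below. This is only 
an illustration: check_run is a hypothetical helper name that does not 
appear in the thread, and the mpiexec line is copied from the job output 
quoted above and left commented out, since it only makes sense on the 
host where the model lives.

```shell
#!/bin/sh
# check_run is a hypothetical helper (not part of SGE or LAM/MPI).
# It runs the given command unchanged and reports its exit status.
check_run() {
    label=$1
    shift
    if "$@"; then
        echo "$label: OK"
    else
        status=$?   # exit status of the command that just failed
        echo "$label: FAILED (exit status $status)"
        return "$status"
    fi
}

# From a login shell on the execution host, with no qsub or parallel
# environment involved (command taken from the quoted job output):
# check_run "outside SGE" mpiexec -boot -np 2 /Volumes/Disk/jsw/gfsdist/cfs6264/cfs.25797/cfs6228
```

If the direct run fails the same way, as it did here, the problem is more 
likely in the application or the MPI setup than in the PE/SGE integration.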
