[GE users] Re: LAM SGE Integration issues with rocks 4.1

Srividya Valivarthi srividya.v at gmail.com
Fri Jan 20 17:58:27 GMT 2006



Hi Reuti,

    You were right about the paths not getting sourced, so I exported
the path in the script itself and got the following output from the test
script:
:/opt/lam/gnu/lib
        liblamf77mpi.so.0 => /opt/lam/gnu/lib/liblamf77mpi.so.0 (0x40000000)
        libmpi.so.0 => /opt/lam/gnu/lib/libmpi.so.0 (0x4000e000)
        liblam.so.0 => /opt/lam/gnu/lib/liblam.so.0 (0x40070000)
        libutil.so.1 => /lib/libutil.so.1 (0x0083e000)
        libdl.so.2 => /lib/libdl.so.2 (0x005ca000)
        libpthread.so.0 => /lib/tls/libpthread.so.0 (0x00669000)
        libc.so.6 => /lib/tls/libc.so.6 (0x0049e000)
        /lib/ld-linux.so.2 (0x00485000)

However, when I now use the following script to run the mpihello
program, I get the error below:
Script
------------
#!/bin/sh
#$ -cwd
#$ -j y
#$ -S /bin/bash
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/lam/gnu/lib
echo $LD_LIBRARY_PATH
ldd mpihello
/opt/lam/gnu/bin/mpirun C mpihello

Error
--------
:/opt/lam/gnu/lib
        liblamf77mpi.so.0 => /opt/lam/gnu/lib/liblamf77mpi.so.0 (0x40000000)
        libmpi.so.0 => /opt/lam/gnu/lib/libmpi.so.0 (0x4000e000)
        liblam.so.0 => /opt/lam/gnu/lib/liblam.so.0 (0x40070000)
        libutil.so.1 => /lib/libutil.so.1 (0x00de9000)
        libdl.so.2 => /lib/libdl.so.2 (0x00c60000)
        libpthread.so.0 => /lib/tls/libpthread.so.0 (0x00cff000)
        libc.so.6 => /lib/tls/libc.so.6 (0x00b34000)
        /lib/ld-linux.so.2 (0x00b1b000)
mpihello: error while loading shared libraries: liblamf77mpi.so.0:
cannot open shared object file: No such file or directory
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
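
A note on what I suspect is happening (this is just my guess, nothing
confirmed on the list yet): ldd resolves the libraries fine on the node
where the job script itself runs, but the processes that mpirun starts on
the other nodes via the rsh-started daemons apparently never see the
exported LD_LIBRARY_PATH, so they still cannot find liblamf77mpi.so.0.
One thing I plan to try is a small wrapper that sets the path itself
before exec'ing the binary, so it should not matter which shell init
files get sourced on the nodes (a rough, untested sketch; the paths are
just the ones from my setup, and the wrapper needs to be chmod +x):

#!/bin/sh
# mpihello.wrapper - sets the library path in the process that actually
# runs the MPI binary, then replaces itself with mpihello via exec
export LD_LIBRARY_PATH=/opt/lam/gnu/lib:$LD_LIBRARY_PATH
exec /home/srividya/mpihello "$@"

and then call it in the job script as
/opt/lam/gnu/bin/mpirun C /home/srividya/mpihello.wrapper
Another option that should avoid the runtime path problem entirely would
be to relink with the LAM library directory baked in, e.g.
/opt/lam/gnu/bin/mpicc -o mpihello mpihello.c -Wl,-rpath,/opt/lam/gnu/lib
(again just an idea on my side, not something from the howto).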

Thanks again,
Srividya

On 1/19/06, Reuti <reuti at staff.uni-marburg.de> wrote:
> Hi again,
>
> Am 19.01.2006 um 17:52 schrieb Srividya Valivarthi:
>
> <snip>
>
> > #ldd mpihello  results in
> >  	liblamf77mpi.so.0 => /opt/lam/gnu/lib/liblamf77mpi.so.0 (0x0043d000)
> >         libmpi.so.0 => /opt/lam/gnu/lib/libmpi.so.0 (0x00702000)
> >         liblam.so.0 => /opt/lam/gnu/lib/liblam.so.0 (0x00b7f000)
> >         libutil.so.1 => /lib/libutil.so.1 (0x00dee000)
> >         libdl.so.2 => /lib/libdl.so.2 (0x0099f000)
> >         libpthread.so.0 => /lib/tls/libpthread.so.0 (0x00a6e000)
> >         libc.so.6 => /lib/tls/libc.so.6 (0x0084e000)
> >         /lib/ld-linux.so.2 (0x00835000)
> > I do not know why that happens!! I compile the binary as follows:
> > #/opt/lam/gnu/bin/mpicc -o mpihello -x c mpihello.c
> > I have tried to look over the LAM users mailing list but couldn't come
> > up with anything so far... will update as soon as I find out why!!
> > And ldd of mpirun on both the frontend and compute nodes gives the
> > following results:
> > 	liblam.so.0 => /opt/lam/gnu/lib/liblam.so.0 (0x004c5000)
> >         libdl.so.2 => /lib/libdl.so.2 (0x0099f000)
> >         libutil.so.1 => /lib/libutil.so.1 (0x00dee000)
> >         libpthread.so.0 => /lib/tls/libpthread.so.0 (0x00a6e000)
> >         libc.so.6 => /lib/tls/libc.so.6 (0x0084e000)
> >         /lib/ld-linux.so.2 (0x00835000)
> >
>
> this is quite interesting, as it's working from the command line as
> we saw. But it might be the case that during an interactive login
> other files (.profile, .bashrc, ...) are sourced than with a non-
> interactive login in your setup. So I'd suggest testing with this job:
>
> #!/bin/sh
> echo $LD_LIBRARY_PATH
> ldd mpihello
> exit 0
>
> to see what is going on on the head node of the parallel job (again
> requesting the PE for LAM). - Reuti
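
A side note from me on this suggestion (my own idea, not something Reuti
proposed): since the loose integration starts the daemons over rsh, I
also want to see what a plain non-interactive rsh to a compute node gets,
as that should be closer to the environment of the rsh-started LAM
daemons. A rough sketch of the extended test job (compute-0-0.local is
just an example node name from my hostfile, and I assume the home
directory is shared across the nodes):

#!/bin/sh
echo "master: $LD_LIBRARY_PATH"
ldd mpihello
# check one compute node through a non-interactive rsh
rsh compute-0-0.local 'echo "slave: $LD_LIBRARY_PATH"; ldd $HOME/mpihello'
exit 0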
>
>
> > Thanks again,
> > Srividya
> >
> >
> >> -- Reuti
> >>
> >>
> >>>>
> >>>> - Reuti
> >>>>
> >>>>
> >>>>>>>>> 7) On running the script file as follows
> >>>>>>>>>        [srividya at medusa ~]$ qsub -pe lam_loose_rsh 2 tester1.sh
> >>>>>>>>>        Your job 79 ("tester1.sh") has been submitted.
> >>>>>>>>>        [srividya at medusa ~]$ qstat
> >>>>>>>>>        job-ID  prior    name        user      state  submit/start at      queue  slots  ja-task-ID
> >>>>>>>>>        ---------------------------------------------------------------------------------------------
> >>>>>>>>>            79  0.00000  tester1.sh  srividya  qw     01/18/2006 09:37:12          2
> >>>>>>>>>
> >>>>>>>>> 8) And obtain the following results in tester1.sh.e79:
> >>>>>>>>>
> >>>>>>>>>      [srividya at medusa ~]$ cat tester1.sh.e79
> >>>>>>>>>     /home/srividya/mpihello: error while loading shared libraries:
> >>>>>>>>>     liblamf77mpi.so.0: cannot open shared object file: No such file or directory
> >>>>>>>>> -----------------------------------------------------------------------------
> >>>>>>>>> It seems that [at least] one of the processes that was started with
> >>>>>>>>> mpirun did not invoke MPI_INIT before quitting (it is possible that
> >>>>>>>>> more than one process did not invoke MPI_INIT -- mpirun was only
> >>>>>>>>> notified of the first one, which was on node n0).
> >>>>>>>>>
> >>>>>>>>> mpirun can *only* be used with MPI programs (i.e., programs that
> >>>>>>>>> invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
> >>>>>>>>> to run non-MPI programs over the lambooted nodes.
> >>>>>>>>> -----------------------------------------------------------------------------
> >>>>>>>>> /home/srividya/mpihello: error while loading shared libraries:
> >>>>>>>>> liblamf77mpi.so.0: cannot open shared object file: No such file or directory
> >>>>>>>>>
> >>>>>>>>> I am not sure why the path information is not being read by
> >>>>>>>>> SGE... The LD_LIBRARY_PATH env variable has the required path...
> >>>>>>>>> Is there something else that I am missing?
> >>>>>>>>> 9) On changing the script to sge.lam.script as follows (the only
> >>>>>>>>> difference being the LAM_MPI_SOCKET_SUFFIX):
> >>>>>>>>>    #cat sge.lam.script
> >>>>>>>>>    #!/bin/sh
> >>>>>>>>>    #$ -N mpihello
> >>>>>>>>>    #$ -cwd
> >>>>>>>>>    #$ -j y
> >>>>>>>>>    #
> >>>>>>>>>    # pe request for LAM. Set your number of processors here.
> >>>>>>>>>    #$ -pe lam_loose_rsh 2
> >>>>>>>>>    #
> >>>>>>>>>    # Run job through bash shell
> >>>>>>>>>    #$ -S /bin/bash
> >>>>>>>>>    # This MUST be in your LAM run script, otherwise
> >>>>>>>>>    # multiple LAM jobs will NOT RUN
> >>>>>>>>>    export LAM_MPI_SOCKET_SUFFIX=$JOB_ID.$JOB_NAME
> >>>>>>>>>    #
> >>>>>>>>>    # Use full pathname to make sure we are using the right mpirun
> >>>>>>>>>    /opt/lam/gnu/bin/mpirun -np $NSLOTS /home/srividya/mpihello
> >>>>>>>>>
> >>>>>>>>> 10) and submitting to the queue
> >>>>>>>>>         #qsub sge.lam.script
> >>>>>>>>>
> >>>>>>>>> 11) Obtain the following error message
> >>>>>>>>>         [srividya at medusa ~]$ cat mpihello.o80
> >>>>>>>>> -----------------------------------------------------------------------------
> >>>>>>>>> It seems that there is no lamd running on the host compute-0-6.local.
> >>>>>>>>>
> >>>>>>>>> This indicates that the LAM/MPI runtime environment is not operating.
> >>>>>>>>> The LAM/MPI runtime environment is necessary for the "mpirun" command.
> >>>>>>>>>
> >>>>>>>>> Please run the "lamboot" command to start the LAM/MPI runtime
> >>>>>>>>> environment.  See the LAM/MPI documentation for how to invoke
> >>>>>>>>> "lamboot" across multiple machines.
> >>>>>>>>> -----------------------------------------------------------------------------
> >>>>>>>>>
> >>>>>>>>> And this is the message that I was sending out earlier. I am new
> >>>>>>>>> to the SGE-LAM environment, and thanks so much for your patience.
> >>>>>>>>> Any help will be greatly appreciated.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Srividya
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 1/11/06, Reuti <reuti at staff.uni-marburg.de> wrote:
> >>>>>>>>>> Am 11.01.2006 um 20:45 schrieb Srividya Valivarthi:
> >>>>>>>>>>
> >>>>>>>>>>> The change in the startlam.sh from
> >>>>>>>>>>> echo host
> >>>>>>>>>>> to
> >>>>>>>>>>> echo host.local
> >>>>>>>>>>>
> >>>>>>>>>>> after stopping and booting the lamuniverse does not seem to solve the
> >>>>>>>>>>
> >>>>>>>>>> No - stop the lamuniverse. Don't boot it by hand! Just start a
> >>>>>>>>>> parallel job like the mpihello.c I mentioned, and post the
> >>>>>>>>>> error/log files of this job. Is your rsh connection also working
> >>>>>>>>>> between the nodes for a passwordless invocation? - Reuti
> >>>>>>>>>>
> >>>>>>>>>>> problem either..
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks again,
> >>>>>>>>>>> Srividya
> >>>>>>>>>>>
> >>>>>>>>>>> On 1/11/06, Reuti <reuti at staff.uni-marburg.de> wrote:
> >>>>>>>>>>>> Am 11.01.2006 um 19:53 schrieb Srividya Valivarthi:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> The pe is defined as follows:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> #qconf -sp lam_loose_rsh
> >>>>>>>>>>>>> pe_name           lam_loose_rsh
> >>>>>>>>>>>>> slots             4
> >>>>>>>>>>>>> user_lists        NONE
> >>>>>>>>>>>>> xuser_lists       NONE
> >>>>>>>>>>>>> start_proc_args   /home/srividya/scripts/lam_loose_rsh/
> >>>>>>>>>>>>> startlam.sh \
> >>>>>>>>>>>>>                   $pe_hostfile
> >>>>>>>>>>>>> stop_proc_args    /home/srividya/scripts/lam_loose_rsh/
> >>>>>>>>>>>>> stoplam.sh
> >>>>>>>>>>>>> allocation_rule   $round_robin
> >>>>>>>>>>>>> control_slaves    FALSE
> >>>>>>>>>>>>> job_is_first_task TRUE
> >>>>>>>>>>>>> urgency_slots     min
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Okay, fine. As you use ROCKS, please change in the startlam.sh
> >>>>>>>>>>>> in PeHostfile2MachineFile():
> >>>>>>>>>>>>
> >>>>>>>>>>>>           echo $host
> >>>>>>>>>>>>
> >>>>>>>>>>>> to
> >>>>>>>>>>>>
> >>>>>>>>>>>>           echo $host.local
> >>>>>>>>>>>>
> >>>>>>>>>>>> As we have no ROCKS, I don't know whether this is necessary.
> >>>>>>>>>>>> Then just try as outlined in the Howto with the included
> >>>>>>>>>>>> mpihello.c, just to test the distribution to the nodes (after
> >>>>>>>>>>>> shutting down the started LAM universe). - Reuti
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks so much,
> >>>>>>>>>>>>> Srividya
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 1/11/06, Srividya Valivarthi <srividya.v at gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>    I did define the PE for loose rsh using qmon, and also added
> >>>>>>>>>>>>>> this PE to the queue list using the queue manager provided by qmon.
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> Srividya
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 1/11/06, Reuti <reuti at staff.uni-marburg.de> wrote:
> >>>>>>>>>>>>>>> Hi again.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Am 11.01.2006 um 19:34 schrieb Srividya Valivarthi:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>    Thanks for your prompt response. I am sorry if I was not
> >>>>>>>>>>>>>>>> clear in the earlier mail. I did not start the lamd daemons by
> >>>>>>>>>>>>>>>> hand prior to submitting the job. What I was trying to convey
> >>>>>>>>>>>>>>>> was that the lamd daemons are running on the compute nodes,
> >>>>>>>>>>>>>>>> possibly started by SGE itself, but somehow are not registered
> >>>>>>>>>>>>>>>> with LAM/MPI??!!
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>     And also the hostfile that is used during lamboot
> >>>>>>>>>>>>>>>> #lamboot -v -ssi boot rsh hostfile
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> lamboot will start the daemons, which isn't necessary. Also
> >>>>>>>>>>>>>>> with a loose integration, SGE will start the daemons on its
> >>>>>>>>>>>>>>> own (just by rsh, in contrast to qrsh with a Tight Integration).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> LAM/MPI is in some way SGE aware, and will look for some
> >>>>>>>>>>>>>>> special information in the SGE-created directories on all the
> >>>>>>>>>>>>>>> slave nodes.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> But anyway: how did you define the PE - loose with rsh or
> >>>>>>>>>>>>>>> qrsh? - Reuti
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> is as follows, which already had the .local suffix as
> >>>>>>>>>>>>>>>> medusa.lab.ac.uab.edu cpu=4
> >>>>>>>>>>>>>>>> compute-0-0.local cpu=4
> >>>>>>>>>>>>>>>> compute-0-1.local cpu=4
> >>>>>>>>>>>>>>>> compute-0-2.local cpu=4
> >>>>>>>>>>>>>>>> compute-0-3.local cpu=4
> >>>>>>>>>>>>>>>> compute-0-4.local cpu=4
> >>>>>>>>>>>>>>>> compute-0-5.local cpu=4
> >>>>>>>>>>>>>>>> compute-0-6.local cpu=4
> >>>>>>>>>>>>>>>> compute-0-7.local cpu=4
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Any further ideas to solve this issue will be very
> >>>>>>>>>>>>>>>> helpful.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>> Srividya
> >>>>>>>>>>>>>>>> On 1/11/06, Reuti <reuti at staff.uni-marburg.de> wrote:
> >>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Am 11.01.2006 um 18:55 schrieb Srividya Valivarthi:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>     I am working with a Pentium III ROCKS cluster which
> >>>>>>>>>>>>>>>>>> has LAM/MPI version 7.1.1 and SGE version 6.0. I am trying
> >>>>>>>>>>>>>>>>>> to get the loose integration mechanism with rsh working
> >>>>>>>>>>>>>>>>>> with SGE and LAM, as suggested by the following post on
> >>>>>>>>>>>>>>>>>> this mailing list:
> >>>>>>>>>>>>>>>>>> http://gridengine.sunsource.net/howto/lam-integration/lam-integration.html
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> However, on submitting the jobs to the queue, I get the
> >>>>>>>>>>>>>>>>>> following error message:
> >>>>>>>>>>>>>>>>>> -----------------------------------------------------------------------------
> >>>>>>>>>>>>>>>>>> It seems that there is no lamd running on the host
> >>>>>>>>>>>>>>>>>> compute-0-5.local.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> This indicates that the LAM/MPI runtime environment is not
> >>>>>>>>>>>>>>>>>> operating.  The LAM/MPI runtime environment is necessary
> >>>>>>>>>>>>>>>>>> for the "mpirun" command.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Please run the "lamboot" command to start the LAM/MPI
> >>>>>>>>>>>>>>>>>> runtime environment.  See the LAM/MPI documentation for how
> >>>>>>>>>>>>>>>>>> to invoke "lamboot" across multiple machines.
> >>>>>>>>>>>>>>>>>> -----------------------------------------------------------------------------
> >>>>>>>>>>>>>>>>>> But the lamnodes command shows all the nodes in the
> >>>>>>>>>>>>>>>>>> system, and I can also see the lamd daemon running on the
> >>>>>>>>>>>>>>>>>> local compute nodes.  Any ideas on what the issue could be
> >>>>>>>>>>>>>>>>>> are greatly appreciated.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> there is no need to start up any daemon on your own by hand
> >>>>>>>>>>>>>>>>> before. In fact, it will not work. SGE takes care of starting
> >>>>>>>>>>>>>>>>> a private daemon for each job on all the selected nodes for
> >>>>>>>>>>>>>>>>> this particular job.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> One issue with ROCKS might be similar to this (change the
> >>>>>>>>>>>>>>>>> start script to include .local for the nodes in the
> >>>>>>>>>>>>>>>>> "machines" file):
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> http://gridengine.sunsource.net/servlets/ReadMsg?listName=users&msgNo=14170
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Just let me know whether it worked after adjusting the start
> >>>>>>>>>>>>>>>>> script.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> -- Reuti
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>> Srividya
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> <logsge-lam.txt>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>
