[GE users] File Staging (local)

Reuti reuti at staff.uni-marburg.de
Wed Mar 16 21:13:10 GMT 2005



Hi,

one question: can you do a passwordless rcp from an exec host to the head
node, where the directory with your files is located? The exec nodes are the
ones that execute the prolog/epilog. The file song_of_wreck.txt must reside
in your working directory, i.e. the directory from which you submitted the
job.
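
A quick way to test the copy outside of SGE might look like the sketch below.
The function name check_copy and the RCP variable are only made up for this
example; adjust the copy command if your site uses something other than rcp.

```shell
#!/bin/sh
# Sketch: check whether a file can be copied without manual intervention.
# RCP defaults to rcp; check_copy is a hypothetical helper, not part of SGE.
RCP="${RCP:-rcp}"

# check_copy SOURCE DEST_DIR
# Copies SOURCE into DEST_DIR and reports whether the copy succeeded.
check_copy() {
    src="$1"
    dest_dir="$2"
    if "$RCP" "$src" "$dest_dir/"; then
        echo "OK: copied $src to $dest_dir"
    else
        echo "FAILED: could not copy $src to $dest_dir" >&2
        return 1
    fi
}
```

Run it on an exec node as the user who submits the job, e.g.
check_copy headnode:/working/sample/song_of_wreck.txt /tmp
where "headnode" is a placeholder for the name of your head node. If you are
asked for a password or the copy fails, the prolog will most likely fail in
the same way.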

Don't change $TMPDIR. The entry is named tmpdir in the queue configuration,
but the environment variable that SGE sets for the job is TMPDIR.
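
To see which value a job really gets, a small test job along these lines could
be submitted. This is only a sketch; the function name report_tmpdir and the
script name used below are made up.

```shell
#!/bin/sh
# Sketch of a test job: report the TMPDIR that the job sees.
# report_tmpdir is a hypothetical helper name.
report_tmpdir() {
    if [ -n "${TMPDIR:-}" ]; then
        echo "TMPDIR=$TMPDIR"
    else
        echo "TMPDIR is not set"
    fi
}

report_tmpdir
```

Saved as e.g. show_tmpdir.sh, it can be submitted with "qsub -cwd
show_tmpdir.sh" from your working directory; the result ends up in the job's
output file. Inside a job the value should be a job-specific directory below
the tmpdir from the queue configuration, not the tmpdir entry itself.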

CU - Reuti

Quoting Vijay Avarachen <vavarachen at gmail.com>:

> As per Ron Chen's suggestion, I read and tried out the File-Staging
> how-to at
> http://gridengine.sunsource.net/project/gridengine/howto/filestaging/
> Here is what I did:
> [1] moved the prolog and epilog scripts to /usr/share/sge/bin/lx24-x86/
> 
> [2] qconf -mq all.q
> <snip>
> prolog                /usr/share/sge/bin/lx24-x86/file_trans_prlg.sh
> epilog                /usr/share/sge/bin/lx24-x86/file_trans_eplg.sh
> <snip>
> 
> [3] On the master, downloaded changecase.sh and song_of_wreck.txt
> into /working/sample
> 
> [4] from the above directory executed
> qsub -v SGE_IN=song_of_wreck.txt,SGE_OUT=output.txt changecase.sh
> 
> It submitted the job on the nodes but errored out.  From the exec node
> message file:
> 03/16/2005 13:59:11|execd|node1|E|shepherd of job 98.1 exited with
> exit status = 28
> 03/16/2005 13:59:11|execd|node1|W|reaping job "98" ptf complains: Job
> does not exist
> 03/16/2005 13:59:11|execd|node1|E|can't open usage file
> "active_jobs/98.1/usage" for job 98.1: No such file or directory
> 03/16/2005 13:59:11|execd|node1|E|"03/16/2005 13:59:11 [0:8239]:
> error: can't chdir to /working/sample: No such file or directory"
> 
> Since it did not create the /working/sample directory, I moved the
> files from sample to /working (all exec nodes have /working) and ran
> step [4] again. I got the error message:
> 03/16/2005 14:09:59|execd|node4|E|shepherd of job 100.1 exited with
> exit status = 7
> 03/16/2005 14:09:59|execd|node4|W|reaping job "100" ptf complains: Job
> does not exist
> 
> I was not sure if the exec nodes need the prolog and epilog scripts,
> so to play it safe I placed them at the same location on all exec
> nodes. Still no luck.
> 
> I noticed that the all.q queue definition uses tmpdir and not TMPDIR.  I
> was not sure if this mattered, so I added export TMPDIR="/tmp" to
> /etc/profile on all the exec nodes... still no go :-(
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 


