[GE users] Wrong job executing

Jeffrey Montesano jmontesano at aetheranetworks.com
Tue Mar 27 14:24:41 BST 2007


What is the expected behavior for jobs that define their own environment
variables in the absence of the -V switch?  Does each job get its own
shell in which its variables are shielded from other jobs?  Or do the
jobs share the same shell, in which case there is the potential for
one's environment variables to conflict with another's?
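
For reference, a minimal sketch of the isolation I would expect if every
job is started as its own process. It shows ordinary Unix process
behavior only and assumes nothing about how SGE itself starts jobs; the
helper name run_job and the variable X are made up for the illustration.

```shell
#!/bin/sh
# Two concurrent "jobs", each started as its own child process.
# run_job is a hypothetical helper, not part of SGE.
tmpdir=$(mktemp -d)

run_job() {
    # $1 = job name, $2 = value of X for this job.
    # The X=... prefix sets X in the environment of the child sh only.
    X="$2" sh -c 'sleep 1; printf "%s\n" "$X"' > "$tmpdir/$1.out"
}

# Start both jobs at (almost) the same time, then wait for both to finish.
run_job A foo &
run_job B bar &
wait

result_a=$(cat "$tmpdir/A.out")
result_b=$(cat "$tmpdir/B.out")
echo "job A saw X=$result_a"
echo "job B saw X=$result_b"
rm -rf "$tmpdir"
```

If jobs were sharing one shell instead, the second assignment could
overwrite the first, which is the kind of conflict I am asking about.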

-----Original Message-----
From: Jeffrey Montesano [mailto:jmontesano at aetheranetworks.com] 
Sent: Friday, March 23, 2007 9:16 AM
To: users at gridengine.sunsource.net
Subject: RE: [GE users] Wrong job executing

Answer to 1: logfile means the output created by the application, not by
SGE.

Answer to 2: I'm not using a unique directory every time I submit a job;
I just want all of the jobs to run in the same directory as they are
launched in - so maybe the -cwd is not necessary?

I'm running on Linux RHEL4.3, SGE version 6.0u9.  

After some debugging of my own, I have come to the realization that my
problem is related to environment variables.  What seems to be
happening is that when two jobs are dispatched within a very short
interval, one of the jobs ends up using the environment variables from
the other job.  For example, job A defines some environment variable
X=foo, and job B defines the same environment variable X=bar.  When
these two jobs are scheduled within a short interval of one another,
there is the possibility that job B will use X=foo instead of X=bar.

Has anyone seen anything like this before?  
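
One way to confirm this would be to have every job record the
environment it actually sees when it starts. The following is only a
sketch: snapshot_env is a hypothetical helper, and the fallbacks after
":-" exist so it can also be tried outside of SGE. JOB_ID and
SGE_TASK_ID are the variables SGE sets inside a running job.

```shell
#!/bin/sh
# Hypothetical helper: write the job's environment to a per-job file.
snapshot_env() {
    # $1 = directory to write the snapshot into (must already exist)
    snap="$1/env.${JOB_ID:-nojob}.${SGE_TASK_ID:-0}.txt"
    # Sorted output makes two snapshots easy to compare with diff.
    env | sort > "$snap"
    # Print the file name so the caller can log or inspect it.
    printf '%s\n' "$snap"
}

# Intended use at the top of a job script (the directory is an example):
#   snapshot_env regression_output
```

Comparing the X= line in the snapshots of job A and job B would show
whether job B really started with X=foo.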

-----Original Message-----
From: Rayson Ho [mailto:rayrayson at gmail.com] 
Sent: Wednesday, March 21, 2007 12:30 PM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] Wrong job executing

1) By "logfile", do you mean the job output file created by SGE or by
the application?

2) Since you use "-cwd", did you go to a unique directory every time
you submit a job?

BTW, what OS and SGE version are you running?

Rayson



On 3/21/07, Jeffrey Montesano <jmontesano at aetheranetworks.com> wrote:
> No, I didn't use the -b switch.  Here is the qsub command I used:
>
> qsub -p -500 -q $queue -r yes -o regression_output -e regression_output \
>      -t 1 -l qls=1 -V -cwd $testcase.$seed
>
>
> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de]
> Sent: Wednesday, March 21, 2007 11:45 AM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Wrong job executing
>
> Hi,
>
> On 21.03.2007 at 15:32, Jeffrey Montesano wrote:
>
> > To launch a regression, we submit several jobs (more than 10)
> > during the day to a queue which is open from 8pm until 7am.  These
> > jobs remain in the "qw" state until 8pm, at which time they all
> > compete for the 4 available CPU slots.
> >
> >
> >
> > When the regression results are verified the next day we notice
> > that some jobs have executed twice, while others have not executed
> > at all.  For example, if the jobs launched were A, B, C, D, E,  we
> > notice that there are logfiles created for A, B, C, D, E, but the
> > contents of logfiles A and C both correspond to job A.  It's as if
> > job C was executed as job A.
>
> Did you submit the job with the option "-b y" by accident and edit
> the same script to submit it five times (hence only the last version
> of the script would be executed five times)? What were your exact
> qsub options and output redirections, e.g. via a -o/-e option?
>
> -- Reuti
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net




