[GE users] Wrong job executing

Jeffrey Montesano jmontesano at aetheranetworks.com
Tue Mar 27 14:35:15 BST 2007


I'm launching the jobs from a Perl script as follows:

while (<tclist>) {
  system("qsub -p -500 -q $queue -r yes -o regression_output -e regression_output -t 1 -l qls=1 -cwd $testcase.$seed");
} # while

The file "$testcase.$seed" defines some environment variables. Perhaps the
Perl "system" function is at fault?
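
One way to rule out the submitting side is to print each command line
before it is handed to qsub and check that every submission names a
different script file. The following is only a sketch in plain shell: the
qsub options are copied from the call above, while the function name
build_cmd, the test case names tc_a/tc_b/tc_c, and the default values for
queue and seed are made up for illustration.

```shell
#!/bin/sh
# Hypothetical defaults; in the real script these come from Perl variables.
queue="${queue:-regress.q}"
seed="${seed:-1}"

# Build the qsub command line for one test case, using the same
# options as the Perl system() call. Nothing is submitted here.
build_cmd() {
    printf 'qsub -p -500 -q %s -r yes -o regression_output -e regression_output -t 1 -l qls=1 -cwd %s.%s\n' \
        "$queue" "$1" "$seed"
}

# Dry run: print one command per (hypothetical) test case.
for testcase in tc_a tc_b tc_c; do
    build_cmd "$testcase"
done
```

If the printed lines all end in the same file name, the problem is in the
loop rather than in Grid Engine.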


-----Original Message-----
From: Andreas.Haas at Sun.COM [mailto:Andreas.Haas at Sun.COM] 
Sent: Tuesday, March 27, 2007 9:27 AM
To: users at gridengine.sunsource.net
Subject: RE: [GE users] Wrong job executing

As a matter of course, each job gets its own environment.
Could it be that the mechanism used by your jobs is causing 
such environment sharing?
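
The same process-level isolation can be seen outside of Grid Engine with
a small shell sketch. It is not SGE code: run_job is a hypothetical
stand-in for a job script, and X is the example variable from the
original report.

```shell
#!/bin/sh
# Make sure the parent shell starts without X.
unset X

# Hypothetical stand-in for a job script: sets X, then reports it.
run_job() {
    # The subshell works on a copy of the parent's environment, so
    # changes made inside it are invisible to the parent and to siblings.
    (
        X="$1"
        export X
        sh -c 'printf "%s\n" "$X"'
    )
}

a=$(run_job foo)
b=$(run_job bar)

echo "job A saw X=$a"
echo "job B saw X=$b"
echo "parent sees X=${X:-unset}"
```

Each call sees only the value it set itself, and the parent shell never
sees X at all. If two jobs still pick up each other's values, the sharing
most likely happens through something outside the environment, such as a
file both jobs read or write.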

Andreas


On Tue, 27 Mar 2007, Jeffrey Montesano wrote:

> What is the expected behavior for jobs that define their own environment
> variables in the absence of the -V switch?  Does each job get its own
> shell in which its variables are shielded from other jobs?  Or do the
> jobs share the same shell, in which case there is the potential for
> one's environment variables to conflict with another's?
>
> -----Original Message-----
> From: Jeffrey Montesano [mailto:jmontesano at aetheranetworks.com]
> Sent: Friday, March 23, 2007 9:16 AM
> To: users at gridengine.sunsource.net
> Subject: RE: [GE users] Wrong job executing
>
> Answer to 1: logfile means the output created by the application, not by
> SGE.
>
> Answer to 2: I'm not using a unique directory every time I submit a job;
> I just want all of the jobs to run in the same directory as they are
> launched in - so maybe the -cwd is not necessary?
>
> I'm running on Linux RHEL4.3, SGE version 6.0u9.
>
> After doing some debugging of my own I have come to the realization that
> my problem is related to environment variables.  What seems to be
> happening is that when two jobs are dispatched to be executed within a
> very short interval, one of the jobs ends up using the environment
> variables from the other job.  For example, job A defines some
> environment variable X=foo, and job B defines the same environment
> variable X=bar.  When these two jobs are scheduled within a short
> interval of one another, there is the possibility that job B will use
> X=foo instead of X=bar.
>
> Has anyone seen anything like this before?
>
> -----Original Message-----
> From: Rayson Ho [mailto:rayrayson at gmail.com]
> Sent: Wednesday, March 21, 2007 12:30 PM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Wrong job executing
>
> 1) By "logfile", you mean the job output file created by SGE or the
> application??
>
> 2) Since you use "-cwd", did you go to a unique directory every time
> you submit a job??
>
> BTW, what OS and SGE version are you running??
>
> Rayson
>
>
>
> On 3/21/07, Jeffrey Montesano <jmontesano at aetheranetworks.com> wrote:
>> No, I didn't use the -b switch.  Here is the qsub command I used:
>>
>> qsub -p -500 -q $queue -r yes -o regression_output -e regression_output
>> -t 1 -l qls=1 -V -cwd  $testcase.$seed
>>
>>
>> -----Original Message-----
>> From: Reuti [mailto:reuti at staff.uni-marburg.de]
>> Sent: Wednesday, March 21, 2007 11:45 AM
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] Wrong job executing
>>
>> Hi,
>>
>> On 21.03.2007 at 15:32, Jeffrey Montesano wrote:
>>
>>> To launch a regression, we submit several jobs (more than 10)
>>> during the day to a queue which is open from 8pm until 7am.  These
>>> jobs remain in the "qw" state until 8pm, at which time they all
>>> compete for the 4 available CPU slots.
>>>
>>>
>>>
>>> When the regression results are verified the next day we notice
>>> that some jobs have executed twice, while others have not executed
>>> at all.  For example, if the jobs launched were A, B, C, D, E,  we
>>> notice that there are logfiles created for A, B, C, D, E, but the
>>> contents of logfiles A and C both correspond to job A.  It's as if
>>> job C was executed as job A.
>> Did you submit the job with the option "-b y" by accident and edit
>> the same script to submit it five times (hence only the last version
>> of the script would be executed five times)? What were your exact
>> qsub options and output redirections, e.g. a -o/-e option?
>>
>> -- Reuti
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>

http://gridengine.info/

Registered office: Sun Microsystems GmbH, Sonnenallee 1, D-85551
Kirchheim-Heimstetten
Munich District Court: HRB 161028
Managing Directors: Marcel Schneider, Wolfgang Engels, Dr. Roland Boemer
Chairman of the Supervisory Board: Martin Haering




