[GE users] Wrong job executing

Joe Landman landman at scalableinformatics.com
Tue Mar 27 15:07:45 BST 2007



Greetings Jeffrey

Jeffrey Montesano wrote:
> I'm launching the jobs from a PERL script as follows:
> 
> while (<tclist>) {
>   system("qsub -p -500 -q $queue -r yes -o regression_output -e regression_output -t 1 -l qls=1 -cwd $testcase.$seed");
> } # while
> 
> File "testcase.$seed" defines some environment variables. Perhaps the
> Perl "system" function is at fault?

Worst case, you can change %ENV to reflect your needed environment.  If 
$testcase.$seed sets up your environment, and is a bash script, you 
could do something like this:

# in the beginning of the program
my ($fh,$env_line);
...

# in the loop
# "$testcase.$seed" is interpolated so the literal '.' is kept and the
# name matches the file handed to qsub
open($fh,"<","$testcase.$seed")
  or die "FATAL ERROR: unable to open $testcase.$seed ($!)\n";
while($env_line=<$fh>)
  {
   # parse 'export VARIABLE=...' lines and stuff them into our
   # environment; skip lines that do not match
   next unless $env_line=~ /^\s*(?:export\s+)?(\w+)\s*=\s*(\S+)/;
   $ENV{$1}=$2;
  }
close($fh);

Note that this is generally a bad idea from a security perspective.  The 
regex captures count as un-tainted under perl -T, but the pattern accepts 
nearly any value, so only use it on files you trust.
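
A variation on the loop above, as a rough sketch only: entries written into
%ENV stay there for every later iteration, and qsub only forwards the
submitter's environment when -V is given.  Collecting the variables in a
per-job hash and handing them over with qsub's -v option keeps each job's
settings separate.  The helper names parse_env_lines and qsub_args are made
up for illustration, and the sketch assumes simple unquoted NAME=value lines
whose values contain no commas.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse 'export NAME=value' or 'NAME=value' lines into a hash.
# Lines that do not match are skipped, so stale captures are never reused.
sub parse_env_lines {
    my (@lines) = @_;
    my %vars;
    for my $line (@lines) {
        # NAME must look like a shell identifier; value is one unquoted token
        if ($line =~ /^\s*(?:export\s+)?([A-Za-z_]\w*)=(\S*)/) {
            $vars{$1} = $2;
        }
    }
    return %vars;
}

# Build the qsub argument list for one job.  The variables travel with
# this job only (via -v) instead of through the submitter's %ENV.
sub qsub_args {
    my ($script, %vars) = @_;
    my @args = ('qsub', '-cwd');
    if (%vars) {
        push @args, '-v', join(',', map { "$_=$vars{$_}" } sort keys %vars);
    }
    return (@args, $script);
}
```

Inside the submission loop this could be used as
system(qsub_args("$testcase.$seed", parse_env_lines(<$fh>))); the list form
of system bypasses the shell, so nothing in the values gets re-interpreted.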

Joe
> 
> 
> -----Original Message-----
> From: Andreas.Haas at Sun.COM [mailto:Andreas.Haas at Sun.COM] 
> Sent: Tuesday, March 27, 2007 9:27 AM
> To: users at gridengine.sunsource.net
> Subject: RE: [GE users] Wrong job executing
> 
> As a matter of course each job gets its own environment.
> Could it be that the mechanism used by your jobs is causing 
> such an environment sharing?
> 
> Andreas
> 
> 
> On Tue, 27 Mar 2007, Jeffrey Montesano wrote:
> 
>> What is the expected behavior for jobs that define their own environment
>> variables in the absence of the -V switch?  Does each job get its own
>> shell in which its variables are shielded from other jobs?  Or do the
>> jobs share the same shell, in which case there is the potential for
>> one's environment variables to conflict with another's?
>>
>> -----Original Message-----
>> From: Jeffrey Montesano [mailto:jmontesano at aetheranetworks.com]
>> Sent: Friday, March 23, 2007 9:16 AM
>> To: users at gridengine.sunsource.net
>> Subject: RE: [GE users] Wrong job executing
>>
>> Answer to 1: logfile means the output created by the application, not by
>> SGE.
>>
>> Answer to 2: I'm not using a unique directory every time I submit a job;
>> I just want all of the jobs to run in the same directory as they are
>> launched in - so maybe the -cwd is not necessary?
>>
>> I'm running on Linux RHEL4.3, SGE version 6.0u9.
>>
>> After doing some debugging of my own I have come to the realization that
>> my problem is related to environment variables.  What seems to be
>> happening is that when two jobs are dispatched to be executed within a
>> very short interval, one of the jobs ends up using the environment
>> variables from the other job.  For example, job A defines some
>> environment variable X=foo, and job B defines the same environment
>> variable X=bar, when these two jobs are scheduled within a short
>> interval of one another there is the possibility that job B will use
>> X=foo instead of X=bar.
>>
>> Has anyone seen anything like this before?
>>
>> -----Original Message-----
>> From: Rayson Ho [mailto:rayrayson at gmail.com]
>> Sent: Wednesday, March 21, 2007 12:30 PM
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] Wrong job executing
>>
>> 1) By "logfile", you mean the job output file created by SGE or the
>> application??
>>
>> 2) Since you use "-cwd", did you go to a unique directory every time
>> you submit a job??
>>
>> BTW, what OS and SGE version are you running??
>>
>> Rayson
>>
>>
>>
>> On 3/21/07, Jeffrey Montesano <jmontesano at aetheranetworks.com> wrote:
>>> No, I didn't use the -b switch.  Here is the qsub command I used:
>>>
>>> qsub -p -500 -q $queue -r yes -o regression_output -e regression_output
>>> -t 1 -l qls=1 -V -cwd  $testcase.$seed
>>>
>>>
>>> -----Original Message-----
>>> From: Reuti [mailto:reuti at staff.uni-marburg.de]
>>> Sent: Wednesday, March 21, 2007 11:45 AM
>>> To: users at gridengine.sunsource.net
>>> Subject: Re: [GE users] Wrong job executing
>>>
>>> Hi,
>>>
>>> On 21.03.2007 at 15:32, Jeffrey Montesano wrote:
>>>
>>>> To launch a regression, we submit several jobs (more than 10)
>>>> during the day to a queue which is open from 8pm until 7am.  These
>>>> jobs remain in the "qw" state until 8pm, at which time they all
>>>> compete for the 4 available CPU slots.
>>>>
>>>>
>>>>
>>>> When the regression results are verified the next day we notice
>>>> that some jobs have executed twice, while others have not executed
>>>> at all.  For example, if the jobs launched were A, B, C, D, E,  we
>>>> notice that there are logfiles created for A, B, C, D, E, but the
>>>> contents of logfiles A and C both correspond to job A.  It's as if
>>>> job C was executed as job A.
>>> did you submit the job with the option "-b y" by accident and edit
>>> the same script to submit it five times (hence only the last version
>>> of the script would be executed five times)? What were your exact
>>> qsub options and output redirections e.g. by a -o/-e option.
>>>
>>> -- Reuti
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 
> http://gridengine.info/
> 
> 

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615




