[GE globus] Grid Engine and GT4

Diego Bello dbello at gmail.com
Sat Jun 17 02:09:48 BST 2006



On 6/16/06, David McBride <dwm at doc.ic.ac.uk> wrote:
>
> On Thu, 2006-06-15 at 17:53 -0400, Diego Bello wrote:
> > With the gt4 integration package, I get this when starting container:
>
> > java.io.IOException: java.io.IOException:
> > /opt/globus4/libexec/globus-scheduler-provider-sge: not found
>
> Hi,
>
> The LeSC GT4-SGE adaptor never shipped with an MDS information provider,
> which results in the above error.
>
> This will only affect you if you're trying to run an MDS service; it
> doesn't affect GRAM or WS-GRAM.
>
> I'm surprised that the GT4 JobManager doesn't work when invoked by
> the GRAM backend; I didn't think the changes between the two releases
> were that significant.
>
> The error you're getting corresponds to
> Globus::GRAM::Error::TEMP_SCRIPT_FILE_FAILED -- which is returned if the
> JobManager is unable to create the temporary script file for some
> reason.
>
> Unfortunately, by default Globus doesn't seem to write out the logging
> output generated by the JobManager anywhere.  Rewriting the new()
> function in the file JobManager.pm to read as follows will force log
> files to be created in /tmp:
>
> $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager.pm:
>
> --------8<---------------------8<---------------------8<-------------
>
> sub new
> {
>     my $class = shift;
>     my $self = {};
>     my $description = shift;
>
>     $self->{JobDescription} = $description;
>
> # [dwm] Begin changes
> #
> #    if(defined($description->logfile()))
> #    {
>
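>     # Log unconditionally to a per-user file in /tmp.
>     my $WHOAMI = `whoami`;
>     chomp $WHOAMI;
>
>     local(*FH);
>     open(FH, '>>', "/tmp/$WHOAMI-jobmanager.log");
>     # Turn on autoflush for FH, then restore the previous default handle.
>     select((select(FH), $| = 1)[0]);
>     $self->{log} = *FH;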
> #    }
> # [dwm] End changes
>
>     bless $self, $class;
>
>     $self->log("New Perl JobManager created.");
>
>     return $self;
> }
>
> --------8<---------------------8<---------------------8<-------------
>
> If you still can't work out what's going wrong, the JobManager Perl
> script itself will have been installed in
> $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/sge.pm -- if you know
> any Perl at all, try adding additional logging statements to the
> submit() function to try to narrow down the problem.
>
> I'm afraid I haven't done any work with SGE and GT4 since the adaptor
> was released, so I probably can't be of much more help;  however, it's
> possible that you might find my latest internal revision (which I've
> been using with my GT2-based LCG grid work) useful -- see:
>
> http://www.doc.ic.ac.uk/~dwm/Code/gnu-arch/2006/sge-jobmanager--main/
> http://www.doc.ic.ac.uk/~dwm/arch/2006/ (GNU Arch Repository)
>
> Best of luck.
>
> Cheers,
> David
> --
> David McBride <dwm at doc.ic.ac.uk>
> Department of Computing, Imperial College, London
>

I found something when trying to submit a test job with an XML descriptor,
using this command:
globusrun-ws -submit -factory localhost -Ft SGE -f test_super_simple.xml

<!-- test_super_simple.xml -->
<job>
    <executable>/bin/echo</executable>
    <directory>/tmp</directory>
    <argument>12</argument>
    <argument>abc</argument>
    <argument>34</argument>
    <argument>this is an example_string </argument>
    <argument>Globus was here</argument>
    <environment>
       <name>PI</name>
       <value>3.141</value>
    </environment>
    <stdin>/dev/null</stdin>
    <stdout>stdout</stdout>
    <stderr>stderr</stderr>
    <count>2</count>
</job>


The container's error is this:

2006-06-16 20:46:41,135 ERROR exec.JobManagerScript [Thread-38,run:307]
Script stderr:
Failed to read /etc/sge-jobmanager/jobmanager.conf for reading: No such file
or directory at /opt/globus4/lib/perl/Globus/GRAM/JobManager/sge.pm line
137.Failed to read  for reading: No such file or directory at
/opt/globus4/lib/perl/Globus/GRAM/JobManager/sge.pm line 137.Failed to read
for reading: No such file or directory at
/opt/globus4/lib/perl/Globus/GRAM/JobManager/sge.pm line 137.Unable to read
configuration file '/etc/sge-jobmanager/vqueues.conf': No such file or
directoryWill continue without any virtual queue definitions. at
/opt/globus4/lib/perl/Globus/GRAM/JobManager/sge.pm line 186.Syntax:
globus-gass-cache [-help] -op [-r resource][-t tag]...[URL]Use -help to
display full usageERROR: operation requires an URLSyntax: globus-gass-cache
[-help] -op [-r resource][-t tag]...[URL]Use -help to display full usage
2006-06-16 20:46:41,136 WARN
exec.StateMachine[Thread-28,createFaultFromErrorCode:2936] Unhandled
fault code 28
2006-06-16 20:49:27,776 WARN  usefulrp.GLUEResourceProperty [GLUE refresher
0,runScript:314] Script Execution error when executing shell
/opt/globus4/libexec/globus-scheduler-provider-sge
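
The script stderr above runs several Perl messages together with no line
breaks. As a rough sketch, the helper below puts each "... line N." message on
its own line so they are easier to read. The function name is my own invention,
and it should be fed the container log itself rather than this mail, since the
mail wraps the lines.

```shell
# Hypothetical helper: break the output after every " line N." so that each
# Perl warning ends up on its own line. Reads stdin, writes stdout.
# usage: split_perl_messages < container.log   (container.log is a placeholder)
split_perl_messages() {
    awk '{ gsub(/ line [0-9]+\./, "&\n"); print }'
}
```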



It seems it is looking for an /etc/sge-jobmanager directory, which is
specified in the sge.pm file, but I have neither that directory nor the
jobmanager.conf and vqueues.conf files.
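
A quick way to confirm which of those files are missing, as a minimal sketch:
the function name is made up for illustration, and the directory and file
names are simply the ones that appear in the error output above.

```shell
# Hypothetical helper: report whether the two configuration files named in the
# container log are readable. The directory defaults to the path from the log.
check_sge_jobmanager_conf() {
    conf_dir="${1:-/etc/sge-jobmanager}"
    missing=0
    for f in jobmanager.conf vqueues.conf; do
        if [ -r "$conf_dir/$f" ]; then
            echo "found: $conf_dir/$f"
        else
            echo "missing: $conf_dir/$f"
            missing=1
        fi
    done
    # Exit status is 0 only when both files are readable.
    return "$missing"
}

check_sge_jobmanager_conf /etc/sge-jobmanager || true
```

On my machine I would expect both files to be reported as missing, since the
directory itself does not exist.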

I think it is looking for queue information, which is not found, and that
causes the error. Am I right? What do you think?

Regards.
-- 
Diego Bello Carreño
Thesis student, Ingeniería Civil Informática (Computer Science Engineering)
UTFSM, Valparaíso, Chile
User #294897 counter.li.org


