[GE users] eqw problem.

jiangfan shi jiangfan.shi at gmail.com
Fri Sep 7 05:47:14 BST 2007



Again, my Eqw problem was not solved, even after I tried choosing a specific
queue to run in. Running qstat -j jobid gave me the following:


script_file:                /home/.../my.sh
error reason    1:          09/06/2007 23:41:25 [7026:7382]: error: can't chdir to /home/.../my-script-folder

What does this error mean? Can anyone help me?

Thanks.

Jiangfan
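
The error reason above says the execution host could not change into the
job's working directory. That usually points to a directory that does not
exist or is not accessible on that host (for example, a home directory that
is not mounted there), though other causes are possible. As a rough sketch,
the directory can be pulled out of the qstat -j output so it can be checked
on the suspect node. The helper name eqw_chdir_dirs is made up for this
example, and it assumes the error line has the format shown above, on one
line, optionally followed by ": <reason>".

```shell
# Hypothetical helper: read `qstat -j <jobid>` output on stdin and print the
# directory named in each "can't chdir to" error line.
# Assumption: the path itself contains no colon; anything after a colon is
# treated as a trailing reason and dropped.
eqw_chdir_dirs() {
    sed -n "s/.*can't chdir to \([^:]*\).*/\1/p"
}
```

Typical use would be something like `qstat -j jobid | eqw_chdir_dirs`, then
running `ls -ld` on the printed directory on the execution host. Once the
directory is reachable, `qmod -cj jobid` clears the job's error state so it
can be scheduled again.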


On 9/6/07, Nicholas Senedzuk <nicholas.senedzuk at gmail.com> wrote:
>
> What Eqw means is that the job is in an error state (that's the E) and is
> queued and waiting (that's the qw). The jobs will retry themselves after a
> certain amount of time if you have them configured to. What is most likely
> happening is that you have one system with a problem, so when a job attempts
> to run on that system it errors out into the Eqw state, and another job is
> then dispatched to the same system. When you rerun these jobs, they end up
> running on another host that does not have the problem and go into the r
> state.
>
>
> So what I would recommend is finding the system or systems that are having
> the problem, disabling those nodes, and then running all the jobs to see
> what happens. If no jobs go into the Eqw state, you have found the problem,
> and you just need to find out why jobs are not running correctly on that
> node.
>
>
> On 9/6/07, jiangfan shi <jiangfan.shi at gmail.com> wrote:
> >
> > Hi,
> >
> > I get an "Eqw" state when I use qstat to check the status of my jobs. Some
> > jobs successfully go into the "r" state, but some go into the "Eqw" state.
> > When I run those jobs again, sometimes all of them go into the "r" state,
> > but most of the time 3 or 8 of them end up in the "Eqw" state.
> >
> > The ex.out log contains the following:
> >
> > /bin/bash: /root/.bashrc: Permission denied
> > /home/grad/jfshi/sandbox/threshold/mini-threshold/maetg: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory
> >
> >
> > Originally I used the -V flag with qsub to resolve this problem, and it
> > worked at the time. But now I get the "Eqw" problem.
> >
> > The following is the job information:
> >
> > 201036 0.00000 reuse-mini jfshi        Eqw   09/03/2007 21:28:30                                    1
> > 201044 0.00000 reuse-mini jfshi        Eqw   09/03/2007 21:28:31
> >
> > Can anyone tell me the solution?
> >
> > Thanks.
> >
> > Jiangfan
> >
>
>


