[GE users] Rq state, but it never cleans up

King, Stefan sking at sepaton.com
Fri Jan 20 19:07:18 GMT 2006


Hi Rayson, 

Thanks for the reply.

I am using 6.02.

I checked the "messages" files at your suggestion and learned
some useful things, but nothing in them looks related to the
clean up problem.

I found some other problems with the submission of my jobs
(for example, the path of the shell was sometimes not resolvable!),
which I have been able to explain. After restarting the daemons a
couple of times, I am now able to run jobs successfully.

So it appears the "clean up the old job first" admonition may be
a transient thing.  I do wish I knew what specifically it means,
and how to comply with it.
I guess if it doesn't happen again I could live with that.
Disconcerting though...

Stefan

-----Original Message-----
From: Rayson Ho [mailto:rayrayson at gmail.com] 
Sent: Friday, January 20, 2006 12:19 PM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] Rq state, but it never cleans up

What update level are you using?

And is there any useful information in the execd log file?

Rayson



On 1/20/06, King, Stefan <sking at sepaton.com> wrote:
>
>
> I cannot run a certain job on a two-node cluster.
>
>
>
> cannot run on host "node0X" until clean up of an previous run has finished
>
> cannot run on host "node1" until clean up of an previous run has finished
>
>
>
> Restarting all daemons has no effect.
>
> qdel of the job removes it, but resubmissions incur the same problem.
>
>
>
> The two measures in combination have likewise no effect.
>
>
>
> The job is submitted via DRMAA.
>
>
>
> Submitting a trivial job (simple.sh) via qsub works.
>
>
>
> Submitting a different job via DRMAA works.
>
>
>
> How can SGE decide that cleanup needs to be done for this job?
>
>
>
> Is there a manual way to accomplish this "clean up" that SGE seems to want?
>
>
>
> Is it likely my (flat file) database is corrupt?
>
>
>
> I kind of need to understand what happened, more than I need to get the job
> running.
>
>
>
>
>
> Any suggestions appreciated, I've been stuck for days.
>
>
>
> Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net




