[GE users] ptf complains: Job does not exist

Reuti reuti at staff.uni-marburg.de
Fri Aug 11 22:47:27 BST 2006


Hi,

Am 11.08.2006 um 04:28 schrieb Thiep Duong:

> I am getting the above messages, and the last discussion on this
> issue was sometime in Aug/2005 without any conclusion/resolution.
>
> Let me try to describe it again:
> I have more than one user who uses a VNC server window to submit
> an interactive job:
> 	/opt/app/SGE/6.0/bin/sol-sparc64/qsh -q solaris16G.q@scblad02
>
> The message the user gets back is either:
>
> waiting for interactive job to be scheduled ...
> Your interactive job 27807 has been successfully scheduled.
> Or
> waiting for interactive job to be scheduled ...
> Could not start interactive job.
>
>
> No xterm window comes up. It looks as if no resource is found.
> I added the -now no switch so that we can see what's going on:
> qstat shows a job in qw state (job-id 27807), and I can run
> qstat -j 27807 for 3-5 seconds, then the job is just gone.
>
> The user actually got an email telling him that his job has completed.
>
> It's not a DISPLAY issue -- the user can open qsh using another queue

Is it working in one queue, but not in the other on the same machine?

I don't get your configuration: your user is sitting at his PC,
making a VNC connection to the master node of your cluster, and
issuing a qsh command there?

-- Reuti

>
> Nothing found in spool/qmaster/messages
>
> Looking at the common/accounting file, the job seems to have finished:
> solaris16G.q:scblad02:ccds:zeke:INTERACTIVE:27807:sge:0
> :1155234047:1155234047:1155234047:0:1:0:0:0:0.000000:0
> :0:0:0:0:0:0:0.000000:0:0:0:0:0:0:eagle:zeke:NONE
> :1:0:0.000000:0.000000:0.000000:-U zeke -q solaris16G.q@scblad02
> -l num_proc=1 -soft -l group=scdc -I y -P eagle:0.000000:NONE:0.000000
>
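
For reference, the leading fields of the accounting record quoted above can be decoded with a few lines of Python. This is only a minimal sketch: the field names and their order are taken from my reading of the accounting(5) layout for 6.0 and should be treated as an assumption, and the helper name parse_accounting_head is made up for this example.

```python
from datetime import datetime, timezone

# Names of the first 14 colon-separated fields. The order follows my
# reading of accounting(5) for SGE 6.0; treat it as an assumption.
FIELDS = [
    "qname", "hostname", "group", "owner", "job_name", "job_number",
    "account", "priority", "submission_time", "start_time", "end_time",
    "failed", "exit_status", "ru_wallclock",
]


def parse_accounting_head(record):
    """Return a dict holding the leading fields of one accounting record."""
    values = record.strip().split(":")
    # zip() stops at the shorter sequence, so later fields are ignored.
    return dict(zip(FIELDS, values))


# Leading part of the quoted record, joined back onto one line.
record = ("solaris16G.q:scblad02:ccds:zeke:INTERACTIVE:27807:sge:0"
          ":1155234047:1155234047:1155234047:0:1:0")

job = parse_accounting_head(record)
for key in ("submission_time", "start_time", "end_time"):
    # The timestamps are seconds since the epoch; print them in UTC.
    stamp = datetime.fromtimestamp(int(job[key]), tz=timezone.utc)
    print(key, stamp.isoformat())
print("failed:", job["failed"], "exit_status:", job["exit_status"])
```

If the assumed field order is right, the job was submitted, started and ended within the same second, with failed 0 and exit_status 1.
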
>
> Looking in spool/scblad02 (execution host)
> 08/10/2006 11:12:59|execd|scblad02|W|reaping job "27801" ptf complains: Job does not exist
> 08/10/2006 11:19:30|execd|scblad02|W|reaping job "27805" ptf complains: Job does not exist
> 08/10/2006 11:20:47|execd|scblad02|W|reaping job "27807" ptf complains: Job does not exist
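
To match these warnings against accounting records, the job IDs can be pulled out of the execd messages file with a short script. The sketch below assumes the pipe-separated layout seen in the quoted lines (timestamp|component|host|level|message). The function name ptf_complaints is hypothetical, and the sample lines are the three quoted above with the mail wrapping undone.

```python
import re

# Matches the warning text seen in the quoted execd messages.
PTF_WARNING = re.compile(r'reaping job "(\d+)" ptf complains: (.+)')


def ptf_complaints(lines):
    """Return (timestamp, host, job_id, reason) for each ptf warning."""
    found = []
    for line in lines:
        # Split into at most 5 parts so a "|" inside the message survives.
        parts = line.rstrip("\n").split("|", 4)
        if len(parts) != 5:
            continue  # not in the assumed layout, skip it
        timestamp, _component, host, _level, message = parts
        match = PTF_WARNING.search(message)
        if match:
            found.append((timestamp, host, match.group(1), match.group(2)))
    return found


sample = [
    '08/10/2006 11:12:59|execd|scblad02|W|reaping job "27801" ptf complains: Job does not exist',
    '08/10/2006 11:19:30|execd|scblad02|W|reaping job "27805" ptf complains: Job does not exist',
    '08/10/2006 11:20:47|execd|scblad02|W|reaping job "27807" ptf complains: Job does not exist',
]

for timestamp, host, job_id, reason in ptf_complaints(sample):
    print(timestamp, host, job_id, reason)
```

In practice the lines would be read from the messages file in the execution host's spool directory instead of the inline sample.
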
>
> It's not the execution host, because other users can open/run jobs
> on the same queue.
>
> What can we do to debug further?  I am using the 6.0u7 release.
>
> Thanks in advance.
>
> Thiep
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net




