[GE users] qdel

John_Tai John_Tai at smics.com
Mon Nov 12 01:53:04 GMT 2007



NFS is fine; that error appears only when I interrupt the process.

The "clsbd" processes are left over from a previous job, but that's actually normal. They just linger around.

And yes, the problem is persistent.




-----Original Message-----
From: Reuti [mailto:reuti at staff.uni-marburg.de]
Sent: Saturday, November 10, 2007 9:43 PM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] qdel


Hi,

On 06.11.2007 at 02:22, John_Tai wrote:

> Here are the messages regarding an interrupted job:
>
> 11/06/2007 09:19:00|qmaster|dsls11|W|job 220729.1 failed on host dsl13 general before job because: 11/06/2007 09:19:00 [999:15426]: can't open file /tmp/220729.1.pc.q/pid: Permission denied

NFS problem?

> 11/06/2007 09:19:00|execd|dsl13|E|shepherd of job 220729.1 exited with exit status = 11

11 means: Resource temporarily unavailable
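
To make that reading easy to check, here is a minimal sketch that splits a messages-file line into its fields and looks the exit status up as an errno value. It assumes the pipe-separated layout seen in the two log lines above (timestamp|daemon|host|level|text); the function name parse_sge_message is hypothetical and not part of Grid Engine. Treating the shepherd's exit status as an errno is the interpretation given above, and errno numbering is platform-specific, so the name and text printed on the last line depend on the system.

```python
import errno
import os


def parse_sge_message(line):
    """Split one messages-file line into its five pipe-separated fields."""
    # maxsplit=4 keeps any further '|' characters inside the message text.
    timestamp, daemon, host, level, text = line.split("|", 4)
    return {
        "timestamp": timestamp,
        "daemon": daemon,
        "host": host,
        "level": level,
        "text": text,
    }


line = ("11/06/2007 09:19:00|execd|dsl13|E|"
        "shepherd of job 220729.1 exited with exit status = 11")
record = parse_sge_message(line)
print(record["daemon"], record["host"], record["level"])

# Take the number after the last '=' and look it up as an errno value.
code = int(record["text"].rsplit("=", 1)[1])
print(code, errno.errorcode.get(code), os.strerror(code))
```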

> Here is the output for a running job. I can see the sge_shepherd now.
>
> Thanks again for the help.
>
>   PID  PPID  PGRP COMMAND
>  5316     1  5316 /home/sge/sge6.1/bin/lx24-x86/sge_execd
>  9543  5316  9543  \_ sge_shepherd-191394 -bg
>  9544  9543  9544      \_ /home/sge/sge6.1/utilbin/lx24-x86/rshd -l
>  9545  9544  9545          \_ /home/sge/sge6.1/utilbin/lx24-x86/qrsh_starter /data/sge/spool/dsl51/active_jobs/191394.1
>  9578  9545  9578              \_ csh -c eldo S013PLLFNB_NC9.sp -compat
>  9612  9578  9578                  \_ /bin/sh /home/edamgr/linux/mentor/ams_2007.1/bin/eldo S013PLLFNB_NC9.sp -compat
>  9820  9612  9578                      \_ /bin/sh /home/edamgr/linux/mentor/ams_2007.1/com/eldo S013PLLFNB_NC9.sp -compat
>  9831  9820  9578                          \_ /home/edamgr/linux/mentor/ams_2007.1/ixl/bin/eldo.exe -i S013PLLFNB_NC9.sp -compat
>  9839  9831  9578                              \_ /home/edamgr/linux/mentor/ams_2007.1/ixl/lib/mgls_asynch  -f7,10
>  8350     1  8350 /home/cadence/linux/IC500/tools/bin/clsbd
>  8351  8350  8350  \_ /home/cadence/linux/IC500/tools/bin/clsbd
>  8352  8351  8350      \_ /home/cadence/linux/IC500/tools/bin/clsbd
> 22685     1 22685 cupsd

This looks perfect. One question: where are the three "clsbd" processes
coming from? Maybe the shepherd of the last job had already quit before
you issued the qdel, so it was not able to kill the job. Does the
problem still persist?
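
One way to spot such leftovers is to check which processes in a listing like the one quoted above no longer hang below sge_execd. The following is only a sketch: it assumes the four-column "PID PPID PGRP COMMAND" layout shown above, the helper names parse_ps and under_execd are hypothetical, and the sample is a shortened copy of the quoted output. It lists everything outside the execd tree, which also includes unrelated daemons such as cupsd, so the result still needs a human look.

```python
import os

# Shortened copy of the listing quoted above (raw string keeps the "\_").
SAMPLE = r"""
  PID  PPID  PGRP COMMAND
 5316     1  5316 /home/sge/sge6.1/bin/lx24-x86/sge_execd
 9543  5316  9543  \_ sge_shepherd-191394 -bg
 9544  9543  9544      \_ /home/sge/sge6.1/utilbin/lx24-x86/rshd -l
 8350     1  8350 /home/cadence/linux/IC500/tools/bin/clsbd
 8351  8350  8350  \_ /home/cadence/linux/IC500/tools/bin/clsbd
22685     1 22685 cupsd
"""


def parse_ps(text):
    """Parse 'PID PPID PGRP COMMAND' rows into a dict keyed by PID."""
    procs = {}
    for row in text.strip().splitlines()[1:]:  # skip the header row
        pid, ppid, pgrp, command = row.split(None, 3)
        # Drop the tree-drawing prefix that ps adds in forest mode.
        command = command.replace("\\_", "").strip()
        procs[int(pid)] = {
            "ppid": int(ppid),
            "pgrp": int(pgrp),
            "command": command,
        }
    return procs


def under_execd(pid, procs):
    """Follow PPID links upward; True if an sge_execd ancestor is reached."""
    seen = set()
    while pid in procs and pid not in seen:
        seen.add(pid)
        program = procs[pid]["command"].split()[0]
        if os.path.basename(program) == "sge_execd":
            return True
        pid = procs[pid]["ppid"]
    return False


procs = parse_ps(SAMPLE)
strays = [pid for pid in procs if not under_execd(pid, procs)]
for pid in strays:
    print(pid, procs[pid]["pgrp"], procs[pid]["command"])
```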

-- Reuti

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net




