[GE users] qdel not remove all instances of jobs

Chris Dagdigian dag at sonsorol.org
Tue Mar 27 15:08:57 BST 2007


This brings back memories. Is this software from Schrodinger? I
remember jmonitor.pl from previous projects integrating Schrodinger
Glide and Impact with SGE systems.

Schrodinger software is pretty easy to interface with SGE. They
maintain their own "queue" definition/tree that is aware of other
batch scheduling systems, so all you need to do is write a couple of
fairly trivial wrapper scripts that hook into the SGE binaries.

From memory, the proper way to kill those jobs was to use the
Schrodinger "jobcontrol" CLI, which through its SGE integration
already knows about SGE qdel and how to call it properly.
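
As background for why a plain qdel leaves processes behind, the ps
listing quoted below can be checked mechanically. This is only a
minimal Python sketch, not anything shipped with SGE or Schrodinger:
the helper names are made up, the command strings are abbreviated, and
it assumes (following Reuti's reading below) that only process group
12003 gets signalled.

```python
from collections import defaultdict

# (pid, ppid, pgrp, command) rows taken from the ps listing quoted below;
# the command strings are shortened for readability.
PS_ROWS = [
    (24140, 1, 24140, "sge_execd"),
    (12002, 24140, 12002, "sge_shepherd-17663"),
    (12003, 12002, 12003, "jmonitor"),
    (12192, 12003, 12192, "perl jmonitor.pl"),
    (12222, 12192, 12222, "sh -c main1m"),
    (12223, 12222, 12222, "main1m"),
]


def descendants(rows, root_pid):
    """Return the set of all pids below root_pid in the process tree."""
    children = defaultdict(list)
    for pid, ppid, _pgrp, _cmd in rows:
        children[ppid].append(pid)
    found, stack = set(), [root_pid]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def survivors_of_group_kill(rows, root_pid, killed_pgrp):
    """Pids below root_pid that are NOT members of killed_pgrp.

    Hypothetical helper: it models a signal sent to one process group
    only, so anything that moved to another group is left running.
    """
    below = descendants(rows, root_pid)
    return sorted(
        pid for pid, _ppid, pgrp, _cmd in rows
        if pid in below and pgrp != killed_pgrp
    )


if __name__ == "__main__":
    # Everything under the shepherd (12002) that outlives a kill of 12003.
    print(survivors_of_group_kill(PS_ROWS, 12002, 12003))
```

With the rows above, the perl wrapper (12192), the "sh -c" (12222) and
main1m (12223) all sit outside group 12003, which matches the leftover
processes reported in this thread.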

-Chris



On Mar 22, 2007, at 6:18 PM, Reuti wrote:

> Hi Simon,
>
> Am 22.03.2007 um 19:51 schrieb Simon Gao:
>
>>
>>> ps -e f -o pid,ppid,pgrp,command
>>>
>>> as it will provide more details. Thx - Reuti
>>>
>> Here is an example:
>>
>> Before running qdel, the job runs on a compute node as:
>>
>> 24140     1 24140   /opt/gridengine/bin/lx26-x86/sge_execd
>> 12002 24140 12002     sge_shepherd-17663 -bg
>> 12003 12002 12003       /bin/sh /scr/apps/mms/bin/Linux-x86/jmonitor
>> 12192 12003 12192         /scr/apps/mms/bin/Linux-x86/perl /scr/apps/mms/bin/Linux-x86/jmonitor.pl
>> 12222 12192 12222           sh -c /scr/apps/ppro/bin/Linux-x86/main1m marci-0-4602b39d.TIN.inp > 1ett_dock.log 2>&1
>> 12223 12222 12222             /scr/apps/ppro/bin/Linux-x86/main1m marci-0-4602b39d.TIN.inp
>
> This is clear now: SGE will kill process group 12003, but nothing
> more. Unfortunately "jmonitor" has already created a new process
> group, 12192, and the "sh -c /scr/..." has created another, 12222.
>
> How is "jmonitor.pl" started? If I just call perl from bash, and
> from there a shell script that runs a command, everything stays in
> one and the same process group:
>
>  4256  1804  4256  \_ sge_shepherd-45618 -bg
>  4257  4256  4257      \_ /bin/sh /var/spool/sge/node41/job_scripts/45618
>  4258  4257  4257          \_ /usr/bin/perl ./run.pl
>  4259  4258  4257              \_ /bin/sh ./run.sh
>  4260  4259  4257                  \_ ps -e f -o pid,ppid,pgrp,command
>
> We used to have a Jaguar license, but I can't remember ever using
> jmonitor to start it. To me it appeared to be a queuing system of
> its own, so we just used self-made scripts to start Jaguar. But my
> memory may be wrong, as that was some years ago now.
>
> -- Reuti
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
