[GE users] qdel

Lönroth Erik erik.lonroth at scania.com
Tue Sep 18 15:03:35 BST 2007



Hello Erik!

Any chance I could have a look at your scripts?

Your solution appeals to me more than having to embed a lot of logic into the job scripts themselves.

I'd like it even better if this were tied into the "PE" logic of SGE, where one could specify an "override" kill signal for jobs running under a specific PE. That makes a lot more sense to me...


/Erik


-----Original Message-----
From: Erik Soyez [mailto:E.Soyez at science-computing.de] 
Sent: 18 September 2007 15:42
To: users at gridengine.sunsource.net
Subject: Re: [GE users] qdel


Erik,

we use something like
------------------------------------------------------------------------
suspend_method    /usr/local/gridengine/util/abaqus_signal.sh suspend $job_pid
resume_method     /usr/local/gridengine/util/abaqus_signal.sh resume $job_pid
terminate_method  /usr/local/gridengine/util/abaqus_signal.sh kill $job_pid
------------------------------------------------------------------------
to achieve similar things - if you do not have application-specific queues, your script needs to find out somehow whether to kill $job_pid or to touch your .isstopping files.  See the queue_conf man page for more command line parameters.
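
For illustration, a wrapper of this kind could look roughly like the sketch below. It is not the actual abaqus_signal.sh. The optional third argument (the directory in which to create .isstopping), the GRACE_SECONDS variable and its 60-second default are all assumptions made for the example, and the script only signals the single PID it is given, not the job's child processes.

```shell
#!/bin/sh
# Hypothetical sketch of a signal wrapper in the spirit of abaqus_signal.sh.
# Assumed usage: wrapper.sh {suspend|resume|kill} <job_pid> [stop_dir]

# Assumed grace period in seconds; not taken from the thread.
GRACE_SECONDS="${GRACE_SECONDS:-60}"

# Return 0 while the given PID still exists.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

# Create the stop file the application polls for, wait up to
# GRACE_SECONDS for the process to exit, then fall back to SIGKILL.
graceful_kill() {
    pid="$1"
    stop_dir="$2"
    : > "$stop_dir/.isstopping"
    waited=0
    while is_alive "$pid" && [ "$waited" -lt "$GRACE_SECONDS" ]; do
        sleep 1
        waited=$((waited + 1))
    done
    if is_alive "$pid"; then
        kill -KILL "$pid"
    fi
}

main() {
    action="$1"
    pid="$2"
    stop_dir="${3:-.}"   # assumption: defaults to the current directory
    case "$action" in
        suspend) kill -STOP "$pid" ;;
        resume)  kill -CONT "$pid" ;;
        kill)    graceful_kill "$pid" "$stop_dir" ;;
        *)
            echo "usage: $0 {suspend|resume|kill} job_pid [stop_dir]" >&2
            return 1
            ;;
    esac
}

# Only act when called with arguments, so the functions can also be sourced.
if [ "$#" -gt 0 ]; then
    main "$@"
fi
```

With a script like this, the queue's terminate_method would need to pass the directory of the stop file as an extra argument, or the script would have to derive it some other way.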

Erik.


On Tue, 18 Sep 2007, Reuti wrote:

> Hi,
>
> On 18.09.2007 at 14:36, Lönroth Erik wrote:
>
>> I have an application that detects the presence of a file ".isstopping" to
>> kill its parallel child processes on different hosts. I want this file to
>> be created upon a "qdel" invocation for this specific application.
>
> so your parallel application isn't tightly integrated with SGE, as otherwise
> the child processes would be killed by SGE automatically. Which parallel
> library are you using?
>
>
>> My initial focus was the "PE" stop_procedure, but it's only executed AFTER
>> the completion of the job-script, so that won't help me much.
>>
>> I need to catch the "qdel" and act on this by creating that file and wait
>> for some time before killing the job to let the application finish up.
>> 
>> My question is: How would I achieve this in a good way?
>> 
>> I've read something about trapping SIGKILL or SIGTERM, but I'd figure
>> there are good ideas out there....
>> 
>> I'm on SGE 6.0u8
>> 
>> Regards
>> /Erik
-- 
Vorstand/Board of Management:
Dr. Bernd Finkbeiner, Dr. Florian Geyer,
Dr. Roland Niemeier, Dr. Arno Steitz, Dr. Ingrid Zech
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Prof. Dr. Hanns Ruder
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196

