[GE users] Pb w/ suspending job ...

TRAN Chanh chanh.tran at dassault-aviation.fr
Fri May 20 16:35:58 BST 2005


Reuti wrote:

> If -notify is set, the jobs are warned with a SIGUSR1 before SIGSTOP. 
> The default action of SIGUSR1 is to terminate the process, unless you 
> trap the signal to do some custom procedures on your own (and this 
> means to trap it in each program in the process group).

How do you change the default action of these SIGUSR signals?
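
One way this might be done inside a shell job script is with `trap`. The following is only a minimal sketch, assuming a POSIX-style shell and not tested under SGE; the handler name `on_usr1` and the variable `got_usr1` are made up for illustration. It sends SIGUSR1 to itself to simulate the '-notify' warning:

```shell
#!/bin/sh
# Sketch: install a handler so SIGUSR1 no longer terminates this script.
# on_usr1 and got_usr1 are hypothetical names used only in this example.
got_usr1=0

on_usr1() {
    # Custom procedure to run when the warning signal arrives.
    got_usr1=1
    echo "caught SIGUSR1, a suspend may follow"
}

# Replace the default action (terminate) with the handler above.
trap on_usr1 USR1

# Simulate the warning by signalling this shell itself.
kill -USR1 $$

# The script is still alive at this point.
echo "still running, got_usr1=$got_usr1"
```

A trap like this only covers the shell itself: programs started from the script get the default action back, so they would need their own handlers. Using `trap '' USR1` instead makes the shell ignore the signal, and an ignored signal stays ignored in child programs unless they change it themselves.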

>
> For parallel jobs, the parallel lib may face some timeouts. But just try.

I've successfully suspended my 'multi-proc' jobs but haven't had the
chance to try with 'multi-node' jobs yet ...

>
>
> CU - Reuti
>
>
> TRAN Chanh wrote:
>
>>
>>
>> Reuti wrote:
>>
>>> Were these just plain serial jobs? There is indeed the possibility 
>>> to change the suspend/resume method, but the built-in:
>>
>>
>>
>> Currently, all the jobs are plain serial ones.
>> BTW, I just discovered that I had the '-notify' option in my 'qsub' and,
>> after removing it, my 'suspend' problem is gone.
>> I must say I'm happy with this, but I'd still be interested in
>> an explanation of why it had this effect ...
>>
>> Actually, the next step for me is to suspend 'multi-proc' jobs and
>> 'multi-node' jobs, and hope everything will work out fine.
>>
>> Thanks again,
>> Chanh
>>
>>>
>>> kill -stop -- -<pid>
>>>
>>> should stop the whole process group. Did you define any procedures 
>>> on your own? Are some forks/threads of your application jumping out 
>>> of the process group? - Reuti
>>>
>> Otherwise, I don't have any specific procedure of my own ...
>>
>>>
>>> TRAN Chanh wrote:
>>>
>>>> Hi Reuti,
>>>>
>>>> Actually, I tried both of these:
>>>> 1. via 'qmon->jobs->suspend ....'
>>>> 2. qmod -s job_id
>>>>
>>>> Both give the same result.
>>>>
>>>> Chanh
>>>>
>>>> Reuti wrote:
>>>>
>>>>> Chanh,
>>>>>
>>>>> which SGE commands did you use in detail to suspend and unsuspend 
>>>>> your jobs? - Reuti
>>>>>
>>>>> TRAN Chanh wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I'm using SGE 5.3p6 and trying to suspend my running jobs and then
>>>>>> put them 'back to work' via 'resume'.
>>>>>> What happens is that these jobs, instead of being suspended as with
>>>>>> 'kill -SIGSTOP', are all aborted as with 'kill -9'.
>>>>>> Is there any way to change this behavior?
>>>>>>
>>>>>> Thanks a lot for any help,
>>>>>> Chanh
>>>>>>
>>>>>> --------------------------------------------------------------------- 
>>>>>>
>>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net






More information about the gridengine-users mailing list