[GE users] question on custom 'suspend method' ...

TRAN Chanh chanh.tran at dassault-aviation.fr
Thu May 26 13:16:07 BST 2005


Reuti wrote:

> One thing I just saw: changes to the queue are only picked up 
> before the job starts to run on the node. Changing the queue's 
> suspend method definition while the job is already running 
> will not cause the new method to be invoked. - Reuti
>
I also tried 'qmod -s job_id', with the same result.
The way I proceeded is:

1. set up the method in the queue
2. 'qsub' a job to the queue
3. test with 'qmod -s queue' and 'qmod -s job_id'

All I observed in both cases is that my job's state changed from 
'running' to 'suspended' ....
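
To tell "the method is never invoked" apart from "the method runs but leaves no trace", a suspend method can log its own invocation before stopping the job. The following is only a sketch under stated assumptions: the queue passes $job_pid as the first argument (as in Reuti's example further down), the script name and the log path are hypothetical, and the job is submitted after the queue has been changed.

```shell
#!/bin/sh
# Hypothetical debugging suspend method. Assumed queue setting:
#   suspend_method /nfs/suspend_debug.sh $job_pid
# Script name and log path are illustrative, not part of Grid Engine.

# Log file; can be overridden through the environment.
SUSPEND_LOG=${SUSPEND_LOG:-/tmp/suspend_debug.log}

# Append one line recording when, and for which PID, we were called.
log_invocation() {
    echo "$(date) suspend invoked for pid=$1" >> "$SUSPEND_LOG"
}

# Send SIGSTOP to the whole process group led by the given PID,
# which is what the built-in default is described as doing.
stop_process_group() {
    kill -s STOP -- "-$1"
}

# Log first, so a trace exists even if the kill fails.
suspend_job() {
    log_invocation "$1"
    stop_process_group "$1"
}

# Act only when a PID argument was supplied.
if [ $# -ge 1 ]; then
    suspend_job "$1"
fi
```

If the log file gains a line after 'qmod -s', the method is being called on the execution host; if it does not, the queue configuration seen by that host is the next thing to check.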

> TRAN Chanh wrote:
>
>> Reuti wrote:
>>
>>> Mmh, for me it's working (both the default and custom procedures). 
>>> What exactly do you observe? E.g., with a running job, issue a 
>>> 'qmod -s ...' and log in to the node. Then 'ps -e f' should 
>>> list the status of the job as 'T' for stopped (on Linux).
>>>
>>> With a custom procedure, can you try to echo something to a file 
>>> in your home directory? This way we can check whether the 
>>> procedure is invoked at all.
>>>
>>> What platform are you on? - Reuti
>>>
>> My 'qmaster' is on 'AIX 5.2' and my execution platforms are 
>> on 'Linux RedHat Enterprise 3.0'.
>> I just double-checked my test case, which is:
>>
>> - a queue named 'queue.q' in which I defined a suspend method called 
>> '/nfs/suspend.sh'
>> - /nfs/suspend.sh :
>> #!/bin/ksh
>> output=/nfs/test.out
>> date >| $output
>> echo suspend  >> $output
>>
>> - /nfs/suspend.sh is set to 777
>> - /nfs/test.out is set to 777
>> - I ran this script directly to make sure it works, and saw the traces 
>> produced by 'date' and 'echo' in /nfs/test.out
>> - With 'qmod -s queue.q' executed from my 'qmaster' platform, I 
>> saw no change in /nfs/test.out ...
>>
>> Chanh
>>
>>> TRAN Chanh wrote:
>>>
>>>> Reuti,
>>>>
>>>> Sorry for not having said that I did try 'qmod -s' to trigger the 
>>>> 'suspend method' but saw no effect ... That's why I posted ...
>>>>
>>>> Reuti wrote:
>>>>
>>>>> You can use it in 5.3p6, but there the command is always 'qmod -s ...'. 
>>>>> It's only that syntax which is deprecated in 6.0. - Reuti
>>>>>
>>>>> TRAN Chanh wrote:
>>>>>
>>>>>> If I get you right, this means I can't have this behavior under SGE 
>>>>>> 5.3p6 ....
>>>>>>
>>>>>> Thanks a lot anyway,
>>>>>> Cheers
>>>>>>
>>>>>> Reuti wrote:
>>>>>>
>>>>>>> Yes, for 6.0 there are the new options 'qmod -sj <job_id>' and
>>>>>>> 'qmod -sq <queue_name>'. Also, any configured suspend thresholds or 
>>>>>>> subordinations might invoke the suspend method. - Reuti
>>>>>>>
>>>>>>> Quoting TRAN Chanh <chanh.tran at dassault-aviation.fr>:
>>>>>>>
>>>>>>>  
>>>>>>>
>>>>>>>> Reuti wrote:
>>>>>>>>
>>>>>>>>  
>>>>>>>>
>>>>>>>>> Chanh,
>>>>>>>>>
>>>>>>>>> the methods will be invoked when, e.g., a job has to be suspended.
>>>>>>>>
>>>>>>>> Reuti,
>>>>>>>>
>>>>>>>> What leads a job to the state "has to be suspended"? Can this be 
>>>>>>>> triggered by 'qmod -s job_id', 'qmod -s queue_name', or something 
>>>>>>>> similar?
>>>>>>>>
>>>>>>>>  
>>>>>>>>
>>>>>>>>> The default action in this case is to send a SIGSTOP to the whole 
>>>>>>>>> process group. If you define your own procedure, you can use 
>>>>>>>>> some special variables, which give you e.g. the PID, and do 
>>>>>>>>> any cleanup or other things that are necessary (see 
>>>>>>>>> man queue_conf):
>>>>>>>>>
>>>>>>>>> suspend_method /usr/sge/mysuspend $job_pid
>>>>>>>>>
>>>>>>>>> and the script:
>>>>>>>>>
>>>>>>>>> #!/bin/sh
>>>>>>>>> # $1 is the job's PID; the negated PID addresses its whole process group
>>>>>>>>> kill -s STOP -- "-$1"
>>>>>>>>> exit 0
>>>>>>>>>
>>>>>>>>> This should behave like the built-in default when you suspend a job. 
>>>>>>>>> - Reuti
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> TRAN Chanh wrote:
>>>>>>>>>
>>>>>>>>>  
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I have a problem understanding how the custom 'suspend / resume / 
>>>>>>>>>> terminate' methods in a queue configuration work.
>>>>>>>>>> How are these methods related to the 'suspend/resume' action on a 
>>>>>>>>>> queue via 'qmon'?
>>>>>>>>>> More precisely, what I'm trying to do is to have these methods 
>>>>>>>>>> triggered via 'suspend / resume' from 'qmon' ...
>>>>>>>>>>
>>>>>>>>>> Will someone please give me some insight on this matter?
>>>>>>>>>>
>>>>>>>>>> Thanks in advance,
>>>>>>>>>> Chanh
>>>>>>>>>>
>>>>>>>>>> --------------------------------------------------------------------- 
>>>>>>>>>>
>>>>>>>>>> To unsubscribe, e-mail: 
>>>>>>>>>> users-unsubscribe at gridengine.sunsource.net
>>>>>>>>>> For additional commands, e-mail: 
>>>>>>>>>> users-help at gridengine.sunsource.net



