[GE users] question on custom 'suspend method' ...

Reuti reuti at staff.uni-marburg.de
Thu May 26 13:59:52 BST 2005


Is there any error message in the messages file (of the qmaster or of the 
execution node)? Did you also try your script interactively on the execution 
nodes? Maybe /nfs is mounted there without "exec". - Reuti
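
For the "exec" question, here is a minimal sketch of a check that could be run on an execution node. It assumes a Linux host whose mount table is readable at /proc/mounts; the function name has_noexec is made up for illustration and is not part of Grid Engine.

```shell
#!/bin/sh
# Succeeds (exit 0) if the mount point $1 carries the "noexec" option.
# $2 is the mount table to read; it defaults to /proc/mounts (Linux only).
has_noexec() {
    mount_point=$1
    table=${2:-/proc/mounts}
    # Fields per line: device mountpoint fstype options dump pass
    awk -v mp="$mount_point" '
        $2 == mp { opts = $4 }          # remember options of the last match
        END {
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == "noexec") exit 0
            exit 1
        }' "$table"
}
```

On the node this might be used as: has_noexec /nfs && echo "/nfs is mounted noexec". A script on such a mount can still be read and passed to an interpreter by hand, but it cannot be executed directly, which is how a suspend method given as a plain path would normally be started.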

Quoting TRAN Chanh <chanh.tran at dassault-aviation.fr>:

> Reuti wrote:
> 
> > One thing I just noticed: changes to the queue are only picked up 
> > before the job starts to run on the node. Changing the queue 
> > definition of the suspend method while the job is already running 
> > will not cause the new method to be invoked. - Reuti
> >
> I also tried 'qmod -s job_id', with the same result.
> This is how I proceeded:
> 
> 1. set up the method in the queue
> 2. 'qsub' a job to the queue
> 3. test with 'qmod -s queue' and 'qmod -s job_id'
> 
> All I observed in both cases is that my job's state changed from 
> 'running' to 'suspended'.
> 
> > TRAN Chanh wrote:
> >
> >> Reuti wrote:
> >>
> >>> Hmm, for me it's working (both the default and custom procedures). 
> >>> What exactly do you observe? E.g., with a job running, issue a 
> >>> 'qmod -s ...' and log in to the node. Then 'ps -e f' should 
> >>> list the status of the job as 'T' for stopped (on Linux).
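
The 'T' state check can also be done per process instead of scanning the whole listing. This is only a sketch: the function name is_stopped is invented here, and it assumes a ps that accepts '-o stat=' and '-p', as the common Linux and BSD versions do.

```shell
#!/bin/sh
# Succeeds (exit 0) if the process with PID $1 is currently stopped.
is_stopped() {
    # "stat=" prints only the state column, without a header line;
    # tr removes any padding blanks around it.
    state=$(ps -o stat= -p "$1" 2>/dev/null | tr -d ' ')
    case $state in
        T*) return 0 ;;   # stopped by a signal such as SIGSTOP
        *)  return 1 ;;   # running, sleeping, or no such process
    esac
}
```

Called with the PID of the job script on the execution node, this should succeed after a working suspend and fail again after a resume.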
> >>>
> >>> With a custom procedure, can you try to echo something to a file 
> >>> in your home directory? This way we can check whether the 
> >>> procedure is invoked at all.
> >>>
> >>> What platform are you on? - Reuti
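
A sketch of such a trace, under stated assumptions: log_method_call is a made-up helper name, the log path is whatever the caller passes in, and the commented usage lines assume the method is configured as 'suspend_method /path/to/script $job_pid' so that the PID arrives as $1.

```shell
#!/bin/sh
# Append one line per invocation to the file named in $1, recording the
# time, the host and the remaining arguments. Appending (>>) keeps earlier
# entries, so repeated suspends stay visible.
log_method_call() {
    logfile=$1
    shift
    printf '%s host=%s args=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$(uname -n)" "$*" >> "$logfile"
}

# Possible use inside a custom suspend script (paths are assumptions):
#   log_method_call "$HOME/suspend_method.log" suspend "$@"
#   kill -STOP -- "-$1"
```

If the log file stays empty after a 'qmod -s', the method was not started at all, which points at the queue configuration or the mount rather than at the script body.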
> >>>
> >> My 'qmaster' is on 'AIX 5.2' and my execution platforms are on 
> >> 'Linux RedHat Enterprise 3.0'.
> >> I just double-checked my test case, which is:
> >>
> >> - a queue named 'queue.q' in which I defined a suspend method called 
> >> '/nfs/suspend.sh'
> >> - /nfs/suspend.sh :
> >> #!/bin/ksh
> >> output=/nfs/test.out
> >> date >| $output
> >> echo suspend  >> $output
> >>
> >> - /nfs/suspend.sh is set to 777
> >> - /nfs/test.out is set to 777
> >> - I ran this script by hand to make sure it works, checking the traces 
> >> produced by 'date' and 'echo' in /nfs/test.out
> >> - With 'qmod -s queue.q' executed from my 'qmaster' platform, I saw 
> >> no change in /nfs/test.out.
> >>
> >> Chanh
> >>
> >>> TRAN Chanh wrote:
> >>>
> >>>> Reuti,
> >>>>
> >>>> Sorry for not having said that I did try 'qmod -s' to trigger the 
> >>>> 'suspend method' but saw no effect. That's why I posted.
> >>>>
> >>>> Reuti wrote:
> >>>>
> >>>>> You can use it in 5.3p6, but there the command is always 'qmod -s ...'. 
> >>>>> It is only that syntax which is deprecated in 6.0. - Reuti
> >>>>>
> >>>>> TRAN Chanh wrote:
> >>>>>
> >>>>>> If I understand you correctly, this means I can't have this behavior 
> >>>>>> under SGE 5.3p6.
> >>>>>>
> >>>>>> Thanks a lot anyway,
> >>>>>> Cheers
> >>>>>>
> >>>>>> Reuti wrote:
> >>>>>>
> >>>>>>> Yes, for 6.0 there are the new options 'qmod -sj <job_id>' and
> >>>>>>> 'qmod -sq <queue_name>'. Any configured suspend thresholds or 
> >>>>>>> subordinations may also invoke the suspend method. - Reuti
> >>>>>>>
> >>>>>>> Quoting TRAN Chanh <chanh.tran at dassault-aviation.fr>:
> >>>>>>>
> >>>>>>>> Reuti wrote:
> >>>>>>>>
> >>>>>>>>> Chanh,
> >>>>>>>>>
> >>>>>>>>> the methods will be invoked when, e.g., a job has to be suspended.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Reuti,
> >>>>>>>>
> >>>>>>>> What puts a job into the state "has to be suspended"? Can this be 
> >>>>>>>> triggered by 'qmod -s job_id', 'qmod -s queue_name', or something 
> >>>>>>>> similar?
> >>>>>>>>
> >>>>>>>>> The default action in this case is to send a SIGSTOP to the whole 
> >>>>>>>>> process group. If you define a procedure of your own, you can 
> >>>>>>>>> use some special variables, which give you e.g. the PID, and do 
> >>>>>>>>> any cleanup or other things that are necessary (see 
> >>>>>>>>> man queue_conf):
> >>>>>>>>>
> >>>>>>>>> suspend_method /usr/sge/mysuspend $job_pid
> >>>>>>>>>
> >>>>>>>>> and the script:
> >>>>>>>>>
> >>>>>>>>> #!/bin/sh
> >>>>>>>>> kill -STOP -- "-$1"
> >>>>>>>>> exit 0
> >>>>>>>>>
> >>>>>>>>> This should behave like the default built-in when you suspend a 
> >>>>>>>>> job. - Reuti
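
A slightly more defensive variant of the same idea, as a sketch only. The helper name signal_job_group and the script paths in the comments are hypothetical; the attribute names suspend_method and resume_method and the $job_pid variable are the ones described in queue_conf.

```shell
#!/bin/sh
# Send signal $1 to the whole process group led by PID $2.
# Usage: signal_job_group SIGNAL JOB_PID
signal_job_group() {
    sig=$1
    pid=$2
    # Accept only a plain number, so that a missing or unexpanded
    # $job_pid never reaches kill as an empty or odd target.
    case $pid in
        ''|*[!0-9]*) echo "invalid job pid: '$pid'" >&2; return 2 ;;
    esac
    # PIDs 0 and 1 would address the caller's own group or init.
    [ "$pid" -gt 1 ] || { echo "refusing pid $pid" >&2; return 2; }
    # A negative target tells kill to signal the process group.
    kill -s "$sig" -- "-$pid"
}

# Possible pairing in the queue configuration (paths are assumptions):
#   suspend_method /usr/sge/mysuspend $job_pid   -> signal_job_group STOP "$1"
#   resume_method  /usr/sge/myresume  $job_pid   -> signal_job_group CONT "$1"
```

The validation is there because a custom method replaces the built-in behaviour entirely: if the script fails before sending the signal, the job keeps running even though its state is shown as suspended.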
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> TRAN Chanh wrote:
> >>>>>>>>>
> >>>>>>>>>  
> >>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> I have a problem understanding how the custom 'suspend / resume 
> >>>>>>>>>> / terminate' methods in a queue configuration work.
> >>>>>>>>>> How are these methods related to the 'suspend/resume' action on a 
> >>>>>>>>>> queue via 'qmon'?
> >>>>>>>>>> More precisely, what I'm trying to do is to have this method 
> >>>>>>>>>> triggered via 'suspend / resume' from 'qmon'.
> >>>>>>>>>>
> >>>>>>>>>> Could someone please give me some insight on this matter?
> >>>>>>>>>>
> >>>>>>>>>> Thanks in advance,
> >>>>>>>>>> Chanh
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> ---------------------------------------------------------------------
> >>>>>>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >>>>>>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net






