[GE users] question on custom 'suspend method' ...

Reuti reuti at staff.uni-marburg.de
Thu May 26 17:07:05 BST 2005



Quoting TRAN Chanh <chanh.tran at dassault-aviation.fr>:

> Reuti wrote:
> 
> >Is there any error message in the messages file (of the qmaster or the
> >node)?
> >
> Sorry to ask, but where can I check these messages?

By default these are located in $SGE_ROOT/default/spool/qmaster and
$SGE_ROOT/default/spool/<name_of_node>, unless you have defined a custom
spool directory. - Reuti
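
As a rough sketch of how one might look at those files: the helper names below are made up for illustration, the cell is assumed to be "default", the file is assumed to be called "messages" inside each spool directory, and the /usr/sge fallback for SGE_ROOT is only a placeholder. With a custom spool directory the paths will differ.

```sh
#!/bin/sh
# Sketch: build the default paths of the SGE "messages" files and show
# their last lines. Assumes cell "default" and no custom spool directory.

# messages_path ROLE [NODE]: ROLE is "qmaster" or "node".
messages_path() {
    sge_root=${SGE_ROOT:-/usr/sge}   # fallback is an assumption
    case $1 in
        qmaster) echo "$sge_root/default/spool/qmaster/messages" ;;
        node)    echo "$sge_root/default/spool/$2/messages" ;;
        *)       echo "usage: messages_path qmaster|node [name]" >&2
                 return 1 ;;
    esac
}

# show_messages FILE: print the last 20 lines if the file is readable.
show_messages() {
    if [ -r "$1" ]; then
        tail -n 20 "$1"
    else
        echo "cannot read $1" >&2
        return 1
    fi
}
```

On the qmaster host this could be called as show_messages "$(messages_path qmaster)", and for an execution host as show_messages "$(messages_path node node01)", where node01 is a hypothetical host name.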

> 
> >Did you also try your script interactively on the execution nodes? Maybe
> >/nfs is mounted there without "exec"? - Reuti
> >
> I ran my script on the execution nodes; it works.
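
Besides running the script by hand, the mount options can be checked directly. A minimal sketch, assuming a Linux execution node where /proc/mounts is available; the function name has_noexec is made up for illustration:

```sh
#!/bin/sh
# has_noexec MOUNTPOINT [MOUNTS_FILE]
# Exit status 0 if MOUNTPOINT is listed with the "noexec" option, 1 otherwise.
# MOUNTS_FILE defaults to /proc/mounts, whose fields are:
#   device mountpoint fstype options dump pass
has_noexec() {
    awk -v mp="$1" '
        $2 == mp {
            # the last matching entry wins, as it reflects the newest mount
            found = 0
            n = split($4, opt, ",")
            for (i = 1; i <= n; i++)
                if (opt[i] == "noexec") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}
```

For example, has_noexec /nfs && echo "/nfs is mounted noexec" would flag the case asked about above. It is best run on the execution node itself, since the mount options there may differ from those on the qmaster host.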
> 
> >Quoting TRAN Chanh <chanh.tran at dassault-aviation.fr>:
> >
> >  
> >
> >>Reuti wrote:
> >>
> >>    
> >>
> >>>One thing I just saw: changes to the queue are only picked up by jobs
> >>>that start after the change. Changing the suspend method in the queue
> >>>definition while the job is already running will not cause it to be
> >>>invoked for that job. - Reuti
> >>>
> >>>      
> >>>
> >>I also tried 'qmod -s job_id', with the same result.
> >>The way I proceeded is:
> >>
> >>1. set up the method in the queue
> >>2. 'qsub' a job to the queue
> >>3. test with 'qmod -s queue' and 'qmod -s job_id'
> >>
> >>All I observed in both cases is that my job's state changed from
> >>'running' to 'suspended' ...
> >>
> >>    
> >>
> >>>TRAN Chanh wrote:
> >>>
> >>>      
> >>>
> >>>>Reuti wrote:
> >>>>
> >>>>        
> >>>>
> >>>>>Mmh, for me it's working (the default and also custom procedures).
> >>>>>What do you observe in detail? E.g., with a running job, issue a
> >>>>>'qmod -s ...' and log in to the node. Then 'ps -e f' should
> >>>>>list the status of the job as 'T' for stopped (on Linux).
> >>>>>
> >>>>>With a custom procedure, can you try to echo something to a file
> >>>>>in your home directory? This way we can check whether the
> >>>>>procedure is invoked at all.
> >>>>>
> >>>>>What platform are you on? - Reuti
> >>>>>
> >>>>>          
> >>>>>
> >>>>My 'qmaster' is on 'AIX 5.2' and my execution platforms are on
> >>>>'Linux RedHat Enterprise 3.0'.
> >>>>I just double-checked my test case, which is:
> >>>>
> >>>>- a queue named 'queue.q', in which I defined a suspend method called
> >>>>'/nfs/suspend.sh'
> >>>>- /nfs/suspend.sh:
> >>>>#!/bin/ksh
> >>>>output=/nfs/test.out
> >>>>date >| $output
> >>>>echo suspend  >> $output
> >>>>
> >>>>- /nfs/suspend.sh is set to 777
> >>>>- /nfs/test.out is set to 777
> >>>>- I ran this script by hand to make sure it works, checking the output
> >>>>written by 'date' and 'echo' to /nfs/test.out
> >>>>- With 'qmod -s queue.q' executed from my 'qmaster' platform, I saw
> >>>>no change in /nfs/test.out ...
> >>>>
> >>>>Chanh
> >>>>
> >>>>        
> >>>>
> >>>>>TRAN Chanh wrote:
> >>>>>
> >>>>>          
> >>>>>
> >>>>>>Reuti,
> >>>>>>
> >>>>>>Sorry for not having said that I did try 'qmod -s' to trigger the
> >>>>>>'suspend method' but saw no effect ... That's why I posted ...
> >>>>>>
> >>>>>>Reuti wrote:
> >>>>>>
> >>>>>>            
> >>>>>>
> >>>>>>>You can use it in 5.3p6, but there the command is always 'qmod -s ...'.
> >>>>>>>It's that syntax which is deprecated in 6.0. - Reuti
> >>>>>>>
> >>>>>>>TRAN Chanh wrote:
> >>>>>>>
> >>>>>>>              
> >>>>>>>
> >>>>>>>>If I understand you correctly, this means I can't have this behavior
> >>>>>>>>under SGE 5.3p6 ...
> >>>>>>>>
> >>>>>>>>Thanks a lot anyway,
> >>>>>>>>Cheers
> >>>>>>>>
> >>>>>>>>Reuti wrote:
> >>>>>>>>
> >>>>>>>>                
> >>>>>>>>
> >>>>>>>>>Yes, for 6.0 there are the new options 'qmod -sj <job_id>' and
> >>>>>>>>>'qmod -sq <queue_name>'. Any configured suspend thresholds or
> >>>>>>>>>subordinations might also invoke the suspend method. - Reuti
> >>>>>>>>>
> >>>>>>>>>Quoting TRAN Chanh <chanh.tran at dassault-aviation.fr>:
> >>>>>>>>>
> >>>>>>>>> 
> >>>>>>>>>
> >>>>>>>>>                  
> >>>>>>>>>
> >>>>>>>>>>Reuti wrote:
> >>>>>>>>>>
> >>>>>>>>>> 
> >>>>>>>>>>
> >>>>>>>>>>                    
> >>>>>>>>>>
> >>>>>>>>>>>Chanh,
> >>>>>>>>>>>
> >>>>>>>>>>>the methods will be invoked when a job, e.g., has to be suspended.
> >>>>>>>>>>>    
> >>>>>>>>>>>                      
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>Reuti,
> >>>>>>>>>>
> >>>>>>>>>>What puts a job into the state "has to be suspended"? Can this be
> >>>>>>>>>>triggered by 'qmod -s job_id', 'qmod -s queue_name', or something
> >>>>>>>>>>similar?
> >>>>>>>>>>
> >>>>>>>>>> 
> >>>>>>>>>>
> >>>>>>>>>>                    
> >>>>>>>>>>
> >>>>>>>>>>>The default action is to send a SIGSTOP to the whole process
> >>>>>>>>>>>group in this case. If you define a procedure of your own, you
> >>>>>>>>>>>can use some special variables, which will give you e.g. the
> >>>>>>>>>>>PID, and do any cleanup or other things that are necessary (see
> >>>>>>>>>>>man queue_conf):
> >>>>>>>>>>>
> >>>>>>>>>>>suspend_method /usr/sge/mysuspend $job_pid
> >>>>>>>>>>>
> >>>>>>>>>>>and the script:
> >>>>>>>>>>>
> >>>>>>>>>>>#!/bin/sh
> >>>>>>>>>>># Stop the job's whole process group; $1 is the $job_pid argument.
> >>>>>>>>>>>kill -STOP -- "-$1"
> >>>>>>>>>>>exit 0
> >>>>>>>>>>>
> >>>>>>>>>>>This should behave like the built-in default if you suspend a job.
> >>>>>>>>>>>- Reuti
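
To see whether such a method is invoked at all, the script above could be combined with the "echo something to a file" idea from later in the thread. This is only a sketch: the function names, the SUSPEND_LOG variable, and the default log location are assumptions, not part of Grid Engine. The queue would be configured as before, e.g. suspend_method /usr/sge/mysuspend $job_pid, and the job has to be started after that change.

```sh
#!/bin/sh
# Sketch of a logging suspend method. $1 is expected to be the value of
# $job_pid passed in from the queue configuration.

# log_invocation LOGFILE PGID: append one timestamped line to LOGFILE.
log_invocation() {
    echo "$(date) suspend method invoked for process group $2" >> "$1"
}

# stop_group PGID: send SIGSTOP to the whole process group, as the
# built-in default does.
stop_group() {
    kill -STOP -- "-$1"
}

# Act only when a PID argument was given, so the file can also be
# sourced without side effects.
if [ $# -ge 1 ]; then
    # SUSPEND_LOG and its default are hypothetical; pick any file the
    # job owner can write to on the execution node.
    log_invocation "${SUSPEND_LOG:-$HOME/suspend_method.log}" "$1"
    stop_group "$1"
fi
```

If the log file gains a line after 'qmod -s ...', the method is being called. If the job changes to 'suspended' but nothing is logged, the messages file on the execution node is the next place to look.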
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>TRAN Chanh wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> 
> >>>>>>>>>>>
> >>>>>>>>>>>                      
> >>>>>>>>>>>
> >>>>>>>>>>>>Hi all,
> >>>>>>>>>>>>
> >>>>>>>>>>>>I have a problem understanding how the custom 'suspend / resume
> >>>>>>>>>>>>/ terminate' methods in a queue configuration work.
> >>>>>>>>>>>>How are these methods related to the 'suspend/resume' action on a
> >>>>>>>>>>>>queue via 'qmon'?
> >>>>>>>>>>>>More precisely, what I'm trying to do is to have this method
> >>>>>>>>>>>>triggered via 'suspend / resume' from 'qmon' ...
> >>>>>>>>>>>>
> >>>>>>>>>>>>Could someone please give me some insight into this matter?
> >>>>>>>>>>>>
> >>>>>>>>>>>>Thanks in advance,
> >>>>>>>>>>>>Chanh
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>                        
> >>>>>>>>>>>>
> >>>>>>>>>>>>---------------------------------------------------------------------
> >>>>>>>>>>>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >>>>>>>>>>>>For additional commands, e-mail: users-help at gridengine.sunsource.net