[GE users] Can SGE kill all the subprocesses created by pilot jobs even when they have different UIDs?

Gonçalo Borges goncalo at lip.pt
Mon Sep 17 14:31:39 BST 2007


Hi Reuti,

Some of my colleagues have already tested what we were asking about.
The results are below.
(/usr/local/bin/glexec_test_idmeu is what we refer to as the pilot job:
it runs as user A and executes commands submitted by other users):


== Writing a script to submit the test ==
[esfreire@svgd ~]$ cat glexec_test.sh
#!/bin/bash
/usr/local/bin/glexec_test_idmeu sleep 600


== Submit the job ==
[esfreire@svgd ~]$ qsub -l arch=32,num_proc=1,s_rt=01:00:00,s_vmem=1G,h_fsize=1G -q GRID glexec_test.sh


== Monitoring On The WN ==
root     31768  0.3  0.1  5656 1880 ?        S     2006 2003:26 /opt/cesga/sge60/bin/lx26-x86/sge_execd.new
root      6620  0.0  0.0  2596 1008 ?        S    13:53   0:00  \_ sge_shepherd-1379602 -bg
esfreire  6621  0.0  0.0  4164  964 ?        Ss   13:53   0:00      \_ /bin/bash /opt/cesga/sge60/default/spool/compute-1-0/job_scripts/1379602
jlopez    6622  0.0  0.0  2264  328 ?        S    13:53   0:00          \_ /usr/local/bin/glexec_test_idmeu sleep 600
jlopez    6623  0.0  0.0  5108  536 ?        S    13:53   0:00              \_ sleep 600
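
For anyone who wants to repeat this check, here is a minimal sketch of a helper that walks the process tree below a given PID and prints the owner of every descendant, which is what matters in the listing above (esfreire for the job script, jlopez below it). The function name `descendants` is made up for illustration, and the sketch assumes a Linux procps `ps`, since `--ppid` is a procps option.

```shell
#!/bin/bash
# Print owner, PID, PPID and command line of every descendant of a PID.
# "descendants" is a hypothetical helper name; --ppid requires procps ps.
descendants() {
    local parent=$1 child
    # ps --ppid lists the direct children; an empty header (pid=) suppresses
    # the title line so only the PIDs are printed.
    for child in $(ps -o pid= --ppid "$parent"); do
        ps -o user= -o pid= -o ppid= -o args= -p "$child"
        # Recurse so that grandchildren (e.g. the sleep) are listed too.
        descendants "$child"
    done
}

# Usage on the worker node, with the sge_execd PID from the listing:
#   descendants 31768
```

Running it before and after the qdel should show whether anything owned by either account is still hanging below sge_execd.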


== Killing the job ==

[esfreire@svgd ~]$ qstat -u esfreire
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
1379602 3.47413 glexec_tes esfreire     r     09/17/2007 13:53:57 GRID@compute-1-0.local             1
[esfreire@svgd ~]$ qdel 1379602
esfreire has registered the job 1379602 for deletion


== Watching the results ==
[esfreire@svgd ~]$ cat glexec_test.sh.e1379602
[esfreire@svgd ~]$ cat glexec_test.sh.o1379602


== Monitoring on the WN again (there are no processes running for the users) ==
root     31768  0.3  0.1  5656 1880 ?        S     2006 2003:27 /opt/cesga/sge60/bin/lx26-x86/sge_execd.new
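
The "no processes left" check can also be scripted instead of read off the ps listing. The following is only a sketch: `leftovers` is a hypothetical helper name, and it assumes that the accounts involved have no other jobs running on the node, because `pgrep -u` reports every process owned by a user, not only those of one job.

```shell
#!/bin/bash
# Report processes still owned by any of the given users.
# Returns 0 if something is left over, 1 if nothing was found.
# "leftovers" is a hypothetical helper name, not part of SGE.
leftovers() {
    local user found=1
    for user in "$@"; do
        # pgrep -u matches on the effective user (name or numeric UID);
        # -l prints the process name next to the PID.
        if pgrep -l -u "$user"; then
            found=0
        fi
    done
    return "$found"
}

# Usage on the worker node, with the two accounts from the test:
#   leftovers esfreire jlopez || echo "no processes left"
```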


*** Other Case ***

== Testing without killing the job ==
== Monitoring on the WN after the job finished (there are no processes running for the users) ==

root     31768  0.3  0.1  5656 1880 ?        S     2006 2003:27 /opt/cesga/sge60/bin/lx26-x86/sge_execd.new
 
== Watching the results ==
[esfreire@svgd ~]$ cat glexec_test.sh.e1379609
[esfreire@svgd ~]$ cat glexec_test.sh.o1379609
Notice: identity changed to uid 10946
times utime/stime: real/user/sys (300.030000/0.000000/0.000000)
times cutime/cstime: real/user/sys (300.030000/0.000000/0.000000)
Child return code: 0


Cheers
Goncalo


Reuti wrote:
> Hi,
>
> Am 14.09.2007 um 14:58 schrieb Gonçalo Borges:
>
>> There is a group responsible for deploying SGE in EGEE Grid project, 
>> and to properly interface it with gLite, the EGEE middleware.
>> I send you a question asked by one of EGEE gLite staff:
>>
>>> there is an idea to use glexec on the WN to allow for generic
>>> pilot jobs submitted by the VO for all its users.  The question is:
>>> can SGE still do a proper cleanup of such jobs, or does it
>>> expect all subprocesses (and files) to have the UID of the
>>> Would SGE kill all the subprocesses created by
>>> the pilot job, even when they have different UIDs?
>>
>> I think we need the opinion of the experts on this...
>
> what do you mean by "pilot job" (the jobscript?) and "different UIDs"? 
> If SGE runs a job as user xyz, it can't change during its lifetime to 
> abc.
>
> -- Reuti
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
