[GE users] Jobs leave orphaned processes behind

rayson rayrayson at gmail.com
Tue Aug 31 00:30:43 BST 2010



On Mon, Aug 30, 2010 at 7:20 PM, isakrejda <isakrejda at lbl.gov> wrote:
> Hi,
>
> This did the trick, it looks like, but why isn't this the default?
> What are the possible drawbacks?

According to Sun, NFS daemons can run with the group id of the user. So
if every process that carries the additional group id added by SGE is
killed when you kill a job, those NFS daemons will be killed too, which
is not a good thing.

However, I am not sure if newer NFS daemons are like that or not...
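
To make the concern concrete, here is a rough Python sketch of how one could list which processes currently carry a given additional group id, so you can check whether anything besides the job (an NFS helper, say) would be hit. It assumes a Linux exec host with /proc mounted; the function names and the gid in the usage note are made up for illustration, and this is not how sge_execd itself does the kill.

```python
import os

def parse_groups(status_text):
    """Return the group ids listed on the 'Groups:' line of a /proc status file."""
    for line in status_text.splitlines():
        if line.startswith("Groups:"):
            return {int(tok) for tok in line.split()[1:]}
    return set()

def pids_with_group(gid, proc_root="/proc"):
    """List pids whose supplementary groups include gid (Linux /proc layout assumed)."""
    matches = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                if gid in parse_groups(fh.read()):
                    matches.append(int(entry))
        except OSError:
            # The process exited or its status file is unreadable; skip it.
            continue
    return sorted(matches)
```

For example, pids_with_group(20017) would return the pids tagged with the hypothetical gid 20017; the real value for a job would come from the range configured as gid_range in SGE's configuration. If a daemon that does not belong to the job shows up in that list, that is the case the default is guarding against.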


> I was trying to google up more info about it, but I cannot find any
> comprehensive description beyond the short note in the man pages...

Search the list archive directly; it is hard to find via Google because
the project's robots.txt blocks Google from indexing the site.

Rayson


>
> Thanks again,
>
> Iwona
>
> On Mon, Aug 30, 2010 at 9:38 AM, reuti <reuti at staff.uni-marburg.de> wrote:
>> Hi,
>>
>> On 30.08.2010 at 18:24, isakrejda wrote:
>>
>>> I am trying to understand, without diving into the user's rather complex workflow,
>>> what might be causing orphaned processes to be left behind when the user's
>>> job is killed by a time limit.
>>>
>>> Usually SGE kills all the child processes cleanly, but in this one case we
>>> are having problems. Are there any well-known loopholes that I am not aware of?
>>
>> Some applications start a new process group, so their processes can't be caught by the usual kill of the job's process group. Setting:
>>
>> execd_params                 ENABLE_ADDGRP_KILL=TRUE
>>
>> in SGE's configuration should help. If not, it must be investigated why the job isn't tightly integrated into SGE.
>>
>> -- Reuti
>>
>>
>>> Grateful for hints,
>>>
>>> Iwona
>>>
>>> ------------------------------------------------------
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=278262
>>>
>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=278268
>>
>>
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=278331
>
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=278332
