[GE users] Cross referencing a grid ID and process ID

Reuti reuti at staff.uni-marburg.de
Tue Jun 19 14:35:40 BST 2007


Hi,

On 19.06.2007 at 14:56, Colin Thomas wrote:

> Many thanks for reading my email.
>
> A normal suspend method is sigstop, and resume is sigcont.
>
> My reading therefore is that we can have a script here - is that
> correct?

yes.

> If so, would the script run on the exec machine

Yes.

> (or the grid master)?
> Could the script that I am writing be added here?

Yes. But remember that the script must still send SIGSTOP to the complete
process group (suspend_method /path/to/my/suspend.sh $job_pid):

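#!/bin/sh
# $1 is the PID of the job's main process, passed in via $job_pid
PID=$1
# The negative PID addresses the whole process group, not just one process
kill -s STOP -- "-$PID"
...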

and then release the license token. You can use the pseudo-variable
$job_pid to get the PID of the main process. You will find details in
`man queue_conf`, where the *_method parameters are explained.
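
For the way back, a matching resume script could mirror the suspend script. This is only a sketch under assumptions: the name resume.sh and the function resume_job_group are made up for illustration, and how the job gets its license token back is site-specific and not shown here.

```shell
#!/bin/sh
# resume.sh (hypothetical name): counterpart to the suspend script.
# Sends SIGCONT to the whole process group of the job's main PID.

resume_job_group() {
    pid=$1
    # Accept only a plain positive integer, so that a missing or
    # malformed argument never turns into a stray "kill".
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: resume.sh <job_pid>" >&2
            return 2
            ;;
    esac
    # Any site-specific license handling would go here (not shown).
    # The negative PID addresses the process group, as in suspend.sh.
    kill -s CONT -- "-$pid"
}

# Act only when called with an argument, e.g. from resume_method.
if [ "$#" -gt 0 ]; then
    resume_job_group "$1"
fi
```

By analogy with the suspend_method line above, this would be configured in the queue as resume_method /path/to/my/resume.sh $job_pid, where the path is again just a placeholder.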

-- Reuti


> The scenario therefore becomes: grid-"suspend" a job, which runs this
> script. The script identifies the pid on the execution machine (the
> child of the sge_shepherd), which is the id in the flexlm licence entry
> that can be "lmremoved".
>
> The grid job 123456 creates the shepherd pid 18440
>
> sgeadm   18440 10645  0 12:23 ?        00:00:00 sge_shepherd-123456 -bg
>
> and the child of pid 18440 is 18844, which is found in the flexlm lmstat
> data, so I can "lmremove" 18844, knowing that it is the correct one.
>
> i.e.
>
> lmremove -c <portID>@<licenseServer> token_a <user> mymachine mymachine_18844:0
>
> All slowly making sense..
>
> many thanks
>
> /colin/
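
The lookup described in this scenario (job ID to shepherd PID to child PID) can be sketched with two small filters. This is an illustration only: the function names shepherd_pid and children_of are made up, and it assumes the shepherd shows up in the process list as sge_shepherd-<jobid>, as in the ps line quoted above.

```shell
#!/bin/sh
# Hypothetical helpers for the job ID -> shepherd PID -> child PID lookup.

# Reads "pid command [args]" lines on stdin and prints the PID of the
# process whose command is sge_shepherd-<jobid>.
shepherd_pid() {
    awk -v name="sge_shepherd-$1" '$2 == name { print $1 }'
}

# Reads "pid ppid" pairs on stdin and prints every PID whose parent is $1.
children_of() {
    awk -v parent="$1" '$2 == parent { print $1 }'
}

# Possible use on the execution host (commented out, as it needs a
# running job; 123456 is the job ID from the example above):
#   shep=$(ps -e -o pid= -o args= | shepherd_pid 123456)
#   child=$(ps -e -o pid= -o ppid= | children_of "$shep")
#   echo "license handle candidate: $(hostname)_${child}:0"
```

The handle format in the last comment is copied from the lmremove example above and should be checked against the actual lmstat output before it is used. A shepherd may also have more than one child, in which case children_of prints several PIDs and the right one still has to be picked.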
>
> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de]
> Sent: 19 June 2007 13:30
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Cross referencing a grid ID and process ID
>
> Hi,
>
> On 19.06.2007 at 14:01, Colin Thomas wrote:
>
>> We have some applications which run for days and consume valuable
>> license tokens. At peak license usage, we would want to grid-suspend
>> the long-running job and give its license token to another short job.
>> Once the short job has finished, the long job gets grid-resumed and
>> gets the license back.
>>
>> I have manually "proved" the system, so we want a little script that
>> will suspend a named job and free up its license automatically.
>>
>> If the long job runs on an execution machine "A", I can trace the
>> sge_shepherd to the code that calls the license (so I know which
>> license to free). From the grid side though I know the grid ID, and
>> that it is running on machine A, but I need the
> What do you mean by grid ID? The best place to put such code
> would be the suspend_method and resume_method for a queue. Inside
> them the environment variables might provide enough information.
>
> -- Reuti
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>




