[GE users] semaphore leftovers

Ron Chen ron_chen_123 at yahoo.com
Fri Feb 11 00:02:53 GMT 2005


--- Reuti <reuti at staff.uni-marburg.de> wrote:
> Proof of concept:

Nice.

A while ago I was helping someone migrate from PBS to
SGE, and I found that they had patched MPICH to log the
semaphore IDs to a file. At the end of the job they ran
some kind of cleanup process.
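
A minimal sketch of what such a cleanup step might look
like, assuming the logfile format from Reuti's snippet
(one semaphore ID per line). The function name
remove_logged_semaphores is made up for illustration; it
is not taken from the MPICH patches mentioned above.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Remove every semaphore set whose ID is listed (one per line) in the
   given logfile. Returns the number of sets removed, or -1 if the
   file cannot be opened. IDs that no longer exist are skipped. */
int remove_logged_semaphores(const char *logfile_name)
{
    FILE *fp = fopen(logfile_name, "r");
    int semid, removed = 0;

    if (fp == NULL)
        return -1;
    while (fscanf(fp, "%d", &semid) == 1) {
        /* The semnum argument (0 here) is ignored for IPC_RMID. */
        if (semid >= 0 && semctl(semid, 0, IPC_RMID) == 0)
            removed++;
    }
    fclose(fp);
    return removed;
}
```

The same ID can show up several times in the log, since
every process that calls semget() records it. The repeated
removals simply fail and are not counted.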

BTW, one small suggestion: add the Job ID to the file
name.
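
Roughly like this, as a sketch only. It assumes the
wrapper runs inside an SGE job, where JOB_ID is set in
the environment. The helper name build_logfile_name and
the ".<JOB_ID>" suffix are my own choices for
illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build "<TMPDIR>/semaphore_logfile.<JOB_ID>" into buf.
   Returns 0 on success, -1 if a variable is unset or buf is too small.
   The ".<JOB_ID>" suffix layout is an assumption for illustration. */
int build_logfile_name(char *buf, size_t buflen)
{
    const char *tmpdir = getenv("TMPDIR");
    const char *job_id = getenv("JOB_ID");  /* set by SGE inside a job */
    int n;

    if (tmpdir == NULL || job_id == NULL)
        return -1;
    n = snprintf(buf, buflen, "%s/semaphore_logfile.%s", tmpdir, job_id);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

The wrapper would call this in place of the
strcpy()/strcat() pair and skip logging when it returns
-1.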

Also, if we place a loop in stop_proc_args, we need to
use rsh, and in some cluster environments rsh is not
allowed. One way around this is to allow prolog/epilog
for parallel slave tasks. The code change should be
small.

(In case you are interested, see SGE 6.0u3,
source/daemons/execd/exec_job.c:990: just removing the
check there should work.)

 -Ron


> 
> You find a small code snippet (no error checking etc.) to
> create a logfile of all the semget()s during a job for
> dynamically linked applications. Compiled with:
> 
> gcc -shared -o ipc_wrapper.so ipc_wrapper.c -ldl -lc
> 
> Loaded in the .bashrc with:
> 
> export LD_PRELOAD=/home/reuti/ipc_wrapper/ipc_wrapper.so
> 
> After the job, a loop in stop_proc_args over all the used
> nodes (well, rsh must be used here as qrsh is no longer
> allowed) could delete all of the stuff belonging to this
> job, using the local $TMPDIR on each node (which reflects
> the nodename).
> 
> The demo is only for semaphores, just as a test.
> 
> Is this an idea to be further followed? - Reuti
> 
>  
> 
> Quoting John Hearns <john.hearns at streamline-computing.com>:
> 
> > On Wed, 2005-02-09 at 18:38 -0600, David Farrell wrote:
> > 
> > > > But instead of a cron job, the cleanipcs could be put in the
> > > > stop_proc_args or queue_epilog.
> > > Yes this is MPICH, I will give this bit a try. The issue here is
> > > that the users tend to use ctrl-C sorts to kill a job when
> > > running in interactive mode,
> > David,
> >     there is a simple solution to this.
> > It involves cutting off the left thumb and fourth finger of
> > offending users.
> > 
> > 
> > Seriously though, putting some sort of cleanipcs script in the
> > stop_proc is a good idea.
> > I looked at this briefly myself.
> > As I recall, the problem I found was that you can deal with the
> > shared memory segments. Either you can parse the output of ipcs -m
> > or grep /proc/sysvipc/shm and delete the relevant segments.
> > 
> > The problem I found was the semaphore queues, which you can't
> > associate with a particular process ID.
> > 
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> > 
> 
> 
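> #define _GNU_SOURCE   /* for RTLD_NEXT */
> #include <dlfcn.h>
> #include <sys/types.h>
> #include <sys/ipc.h>
> #include <sys/sem.h>
> #include <stdio.h>
> #include <stdlib.h>
> 
> /* Wrapper for semget(): calls the real function, then appends the
>    returned semaphore ID to $TMPDIR/semaphore_logfile. */
> int semget(key_t key, int nsems, int semflg)
> {
>     static int (*real_semget)(key_t, int, int);
>     char semaphore_logfile_name[4096];
>     const char *tmpdir;
>     FILE *semaphore_logfile;
>     int my_semid;
> 
>     /* Look up the next semget() in load order, i.e. the libc one. */
>     if (real_semget == NULL)
>         real_semget = (int (*)(key_t, int, int))dlsym(RTLD_NEXT, "semget");
>     if (real_semget == NULL)
>         return -1;
> 
>     my_semid = real_semget(key, nsems, semflg);
> 
>     /* Log only successful calls, and only if TMPDIR is set. */
>     tmpdir = getenv("TMPDIR");
>     if (my_semid < 0 || tmpdir == NULL)
>         return my_semid;
> 
>     snprintf(semaphore_logfile_name, sizeof(semaphore_logfile_name),
>              "%s/semaphore_logfile", tmpdir);
>     semaphore_logfile = fopen(semaphore_logfile_name, "a");
>     if (semaphore_logfile != NULL) {
>         fprintf(semaphore_logfile, "%d\n", my_semid);
>         fclose(semaphore_logfile);
>     }
>     return my_semid;
> }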



		




