[GE users] Runtime Design Automation?

Johnny Layne laynejg at vcu.edu
Mon Jul 30 15:48:46 BST 2007


Hi everyone,
    I'm playing around with MPICH2, running some VASP jobs.  I'm 
noticing that occasionally some rsh processes become zombies.  Is anybody 
else seeing this?  Right now I suspect it's because I'm not using a 
job-specific .smpd file, so I'm going to see whether creating a 
separate one for each job helps.  My guess is that launching a 
bunch of these jobs in quick succession causes problems when the 
jobs finish and the shared .smpd file has changed in the meantime.

    I've got everything set up following Reuti's tight integration howto 
for MPICH2 
(http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html) 
and in general it works great.  I've just noticed this happening a couple 
of times, and so far I couldn't find any similar postings in the mailing 
list archive.

    I could add this guy's solution to my stopmpich2.sh to kill any 
zombies I suppose:  
https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2004-January/004113.html 
or do something along those lines anyway in the kill code.
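
    For anyone who wants to check for the same thing, here is a rough 
sketch (Python, not part of my actual stopmpich2.sh) of how leftover 
defunct rsh processes could be spotted from `ps` output.  The function 
names and the sample process table are made up for illustration, and it 
assumes a `ps` that understands the `stat` and `comm` output columns.  
Note that a zombie itself cannot be killed; it only goes away once its 
parent reaps it or exits, so the sketch reports the parent PID as well.

```python
import os
import subprocess


def list_processes():
    """Return raw `ps` output with pid, ppid, state and command name.

    Assumes a ps that supports the `stat` and `comm` columns
    (procps on Linux and the BSD-style ps both do).
    """
    result = subprocess.run(
        ["ps", "-A", "-o", "pid=", "-o", "ppid=", "-o", "stat=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_zombies(ps_output, command="rsh"):
    """Return (pid, ppid) pairs for defunct processes named `command`."""
    zombies = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # skip blank lines, headers, or truncated rows
        pid, ppid, stat, comm = fields
        # procps prints e.g. "rsh <defunct>"; keep only the program name.
        name = os.path.basename(comm.split()[0])
        if stat.startswith("Z") and name == command:
            zombies.append((int(pid), int(ppid)))
    return zombies


# Hypothetical process table, for illustration only.
sample = """\
 4300     1 Ss   sge_shepherd
 4321  4300 Z    rsh <defunct>
 4322  4300 S    rsh
 4400  4300 Z    vasp <defunct>
"""

if __name__ == "__main__":
    for pid, ppid in find_zombies(sample):
        print(f"zombie rsh pid={pid}, parent pid={ppid}")
```

    On a real node one would pass `list_processes()` instead of the 
sample text, and then decide what to do about the parent process.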

    It's not a big problem for me, as I'll hunt down zombie processes & 
kill 'em, but I hardly trust our users to do that when we turn this 
stuff loose on them!  Thanks in advance for any advice & info.  I'll 
continue playing around with things & post if something seems to work.
     johnny
