[GE users] qsub acting up and using 99% CPU.

Stefan.O.Nordlander at astrazeneca.com Stefan.O.Nordlander at astrazeneca.com
Mon Apr 11 12:04:45 BST 2005



OK, so it wasn't a standard issue then. Dang.


This is what I've done:

1. Added an entry for my cluster in /opt/Schrodinger/schrodinger.hosts:
name: test28
tmpdir: /usr/tmp
host: master
queue: SGE
processors: 28

2. Configured the files in /opt/Schrodinger/queues/SGE/:
QPATH=/opt/az/hpc/SunONEGridEngine/current/bin/lx24-x86/
QSUB=qsub
QDEL=qdel
QSTAT=qstat

etc..

3. Started the job with:
# $SCHRODINGER/para_glide -i molsdock.inp -n 2 -HOST test28
This in turn calls a wrapper script called submit, which contains (excerpt):

# ...
# Submit job
$QSUB -S /bin/sh $* $script > $qsubout
# ...
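
For reference, here is a minimal sketch of that submit step with quoted expansions. It is not the actual Schrodinger script: the function name submit_job and the echo stand-in for QSUB are hypothetical, chosen so the sketch runs without a cluster. Quoting "$@" rather than using a bare $* keeps arguments that contain spaces intact.

```shell
#!/bin/sh
# Hypothetical sketch of the wrapper's submit step; not the real script.
# QSUB defaults to echo so the sketch can be tried without a cluster.
QSUB=${QSUB:-echo}

submit_job() {
    # $1 = batch script, $2 = file that captures qsub's output;
    # any remaining arguments are passed through to qsub unchanged.
    script=$1
    qsubout=$2
    shift 2
    $QSUB -S /bin/sh "$@" "$script" > "$qsubout"
}
```

With the real qsub in place of echo, a call such as `submit_job "$script" "$qsubout"` would run the same command line as the excerpt above.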

And this is the result:

# top (on the master):
PID   USER PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
31590 me   25   0  1496 1496  1264  R    96.9  0.0   0:29   0 qsub

# ps -ef | grep qsub
me    2552  2544  0 12:47 pts/7    00:00:00 qsub -S /bin/sh
/home/me/.schrodinger/.jobdb/master-0-425a55d7.batch

(With an strace as described below.)

If I kill off this job and resubmit it manually with:

# qsub -S /bin/sh /home/me/.schrodinger/.jobdb/master-0-425a55d7.batch
It starts ok on a node!
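
Since the strace quoted below shows qsub calling select() on file descriptor 0, one difference worth checking between the wrapper run and the manual run is what stdin is connected to. This is only a guess at where to look, not a confirmed cause. A small sketch for Linux that reads /proc (the helper name show_stdio is made up for illustration):

```shell
#!/bin/sh
# Print what stdin, stdout and stderr of a running process point to.
# Relies on the Linux /proc filesystem; show_stdio is a hypothetical name.
show_stdio() {
    pid=$1
    for fd in 0 1 2; do
        # readlink resolves /proc/<pid>/fd/<n> to a tty, pipe, socket or file.
        target=$(readlink "/proc/$pid/fd/$fd" 2>/dev/null) || target="(unavailable)"
        echo "fd $fd -> $target"
    done
}
```

Running `show_stdio <pid>` against the spinning qsub and against one started by hand would show whether fd 0 differs between the two cases.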

I guess this is a problem with the inner workings of this application and
not an SGE issue.


/Stefan

> -----Original Message-----
> From: Stephan Grell - Sun Germany - SSG - Software Engineer
> [mailto:stephan.grell at sun.com]
> Sent: den 11 april 2005 11:06
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] qsub acting up and using 99% CPU.
> 
> 
> Hi,
> 
> more information would be nice. Which version of SGE? Which OS? What was
> the qsub command?
> 
> Stephan
> 
> Stefan.O.Nordlander at astrazeneca.com wrote:
> 
> >Hi,
> >
> >I'm hoping this is a no brainer like "erum, well you forgot to turn on the
> >main auxiliary power relay.."
> >
> >I'm trying to run para_glide through SGE and I think I have everything set
> >up ok. But when I submit a job qsub starts and uses 99.9% cpu. An strace
> >shows me this:
> >
> >select(1, [0], [0], NULL, {1, 0})       = 1 (out [0], left {1, 0})
> >gettimeofday({1113205289, 87082}, NULL) = 0
> >gettimeofday({1113205289, 87121}, NULL) = 0
> >gettimeofday({1113205289, 87159}, NULL) = 0
> >gettimeofday({1113205289, 87197}, NULL) = 0
> >select(1, [0], [0], NULL, {1, 0})       = 1 (out [0], left {1, 0})
> >gettimeofday({1113205289, 87547}, NULL) = 0
> >gettimeofday({1113205289, 87585}, NULL) = 0
> >gettimeofday({1113205289, 87624}, NULL) = 0
> >gettimeofday({1113205289, 87662}, NULL) = 0
> >select(1, [0], [0], NULL, {1, 0})       = 1 (out [0], left {1, 0})
> >gettimeofday({1113205289, 88012}, NULL) = 0
> >gettimeofday({1113205289, 88051}, NULL) = 0
> >gettimeofday({1113205289, 88089}, NULL) = 0
> >gettimeofday({1113205289, 88127}, NULL) = 0
> >...
> >...
> >...
> >
> >What's going on?
> >(More details about the job are available if necessary.)
> >
> >
> >Stefan Nordlander - Linux System Manager
> >________________________________________________
> >AstraZeneca R&D Mölndal
> >Pepparedsleden 1
> >S-431 83 Mölndal, Sweden
> >Phone:    +46 (0)31 706 49 14
> >Email:    Stefan.O.Nordlander at astrazeneca.com
> >________________________________________________
> >Unix _is_ user friendly, its just picky about who its friends are...
> >
> >  
> >
> >------------------------------------------------------------------------
> >
> >---------------------------------------------------------------------
> >To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> >For additional commands, e-mail: users-help at gridengine.sunsource.net
> >  
> >
> 
> 
> 


