[GE users] parallel jobs aren't running

Orion Poplawski orion at cora.nwra.com
Tue Apr 15 18:52:02 BST 2008


This seems to be a new issue for me.

I'm setting up some new parallel environments, but anything I submit to 
them stays pending (qw) and never runs, and I'm not sure why.

$ qsub -q coop.q -pe mpi 2 mpi.csh pi3
Your job 1569 ("mpi.csh") has been submitted
$ qstat -j 1569
==============================================================
job_number:                 1569
exec_file:                  job_scripts/1569
submission_time:            Tue Apr 15 11:33:20 2008
owner:                      orion
uid:                        1744
group:                      cora
gid:                        1001
sge_o_home:                 /home/orion
sge_o_log_name:             orion
sge_o_path:                 /home/orion/bin:/opt/local/intel/fce/10.1.013/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/orion/bin:/opt/local/bin:/usr/sbin:/opt/local/rsi/idl/bin:/usr/local/NAGWare/bin:/opt/local/pathscale/bin:/opt/local/mpich/bin:/home/orion/src/nek5000/trunk//bin:/home/orion/src/nek5000/trunk//pre:/home/orion/src/nek5000/trunk//post:/home/orion/src/nek5000/trunk//bin/site_specific/cora
sge_o_shell:                /bin/bash
sge_o_workdir:              /home/orion/sge/mpi
sge_o_host:                 apollo
account:                    sge
cwd:                        /home/orion/sge/mpi
mail_list:                  orion at apollo.cora.nwra.com
notify:                     FALSE
job_name:                   mpi.csh
jobshare:                   0
hard_queue_list:            coop.q
env_list:                   MANPATH=/opt/local/intel/fce/10.1.013/man:/usr/kerberos/man:/usr/local/share/man:/usr/share/man/en:/usr/share/man:/opt/local/man:/usr/local/NAGWare/man:/opt/local/pathscale/man:/opt/local/mpich/man,HOSTNAME=apollo,NEK5000_HOME=/home/orion/src/nek5000/trunk/,ORGANIZATION=Colorado Research Associates,INTEL_LICENSE_FILE=/opt/local/intel/fce/10.1.013/licenses:/opt/intel/licenses:/home/orion/intel/licenses:/Users/Shared/Library/Application Support/Intel/Licenses,HOST=apollo,TERM=xterm,SHELL=/bin/bash,IDL_STARTUP=/home/orion/.idlrc,HISTSIZE=1000,MAILCHEC=30,SSH_CLIENT=192.168.0.72 56234 22,IDL_DIR=/opt/local/rsi/idl,CVSROOT=/data/sw1/cvsrep,OLDPWD=/home/orion/sge,OS=Linux,SSH_TTY=/dev/pts/7,KDENODEBUG=true,NAME=Orion E. Poplawski,USER=orion,LD_LIBRARY_PATH=/opt/local/intel/fce/10.1.013/lib,LS_COLORS=no=00:fi=00:di=00;34:ln=00;36:pi=40;33:so=00;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=00;32:*.cmd=00;32:*.exe=00;32:*.com=00;32:*.btm=00;32:*.bat=00;32:*.sh=00;32:*.csh=00;32:*.tar=00;31:*.tgz=00;31:*.arj=00;31:*.taz=00;31:*.lzh=00;31:*.zip=00;31:*.z=00;31:*.Z=00;31:*.gz=00;31:*.bz2=00;31:*.bz=00;31:*.tz=00;31:*.rpm=00;31:*.cpio=00;31:*.jpg=00;35:*.gif=00;35:*.bmp=00;35:*.xbm=00;35:*.xpm=00;35:*.png=00;35:*.tif=00;35:,KDEDIR=/usr,PAGER=less,MACH=x86_64,XPRINTER=poe,PGI=/opt/local/pgi,FTP_PASSIVE=1,MAIL=/var/spool/mail/orion,PATH=/home/orion/bin:/opt/local/intel/fce/10.1.013/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/home/orion/bin:/opt/local/bin:/usr/sbin:/opt/local/rsi/idl/bin:/usr/local/NAGWare/bin:/opt/local/pathscale/bin:/opt/local/mpich/bin:/home/orion/src/nek5000/trunk//bin:/home/orion/src/nek5000/trunk//pre:/home/orion/src/nek5000/trunk//post:/home/orion/src/nek5000/trunk//bin/site_specific/cora,INPUTRC=/etc/inputrc,PWD=/home/orion/sge/mpi,NCARG_ROOT=/usr,EDITOR=vi,LANG=en_US.UTF-8,MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles:,SGE_ROOT=/usr/share/gridengine,LOADEDMODULES,JITC_PROCESSOR_TYPE=6,NCARG_LIB=/usr/lib64/ncarg,SHLVL=1,HOME=/home/orion,GDL_PATH=+/usr/share/gdl:+/opt/local/rsi/idl/lib,IDL_PATH=+/home/orion/idl:<IDL_DEFAULT>,DYLD_LIBRARY_PATH=/opt/local/intel/fce/10.1.013/lib,LOGNAME=orion,PRINTER=poe,VISUAL=vi,CVS_RSH=ssh,SSH_CONNECTION=192.168.0.72 56234 192.168.0.118 22,MODULESHOME=/usr/share/Modules,LESSOPEN=|/usr/bin/lesspipe.sh %s,LPDEST=poe,DISPLAY=localhost:11.0,G_BROKEN_FILENAMES=1,GS_DEVICE=x11,_=/usr/bin/qsub
job_args:                   pi3
script_file:                mpi.csh
scheduling info:            queue instance "cynosure.q@cynosure.cora.nwra.com" dropped because it is temporarily not available
                            queue instance "all.q@araucano.cora.nwra.com" dropped because it is temporarily not available
                            queue instance "all.q@irimi.cora.nwra.com" dropped because it is temporarily not available
                            queue instance "all.q@machias.cora.nwra.com" dropped because it is temporarily not available
                            queue instance "all.q@mercury.cora.nwra.com" dropped because it is temporarily not available
                            queue instance "all.q@radar.cora.nwra.com" dropped because it is temporarily not available
                            queue instance "all.q@vault.cora.nwra.com" dropped because it is temporarily not available
                            queue instance "all.q@wind.cora.nwra.com" dropped because it is temporarily not available
                            queue instance "all.q@antero.cora.nwra.com" is in suspend alarm: iidle=326.000000 (no load adjustment) <= 0:15:0
                            queue instance "all.q@iago.cora.nwra.com" is in suspend alarm: iidle=775.000000 (no load adjustment) <= 0:15:0
                            queue instance "all.q@kolea.cora.nwra.com" is in suspend alarm: iidle=341.000000 (no load adjustment) <= 0:15:0
                            queue instance "all.q@marie.cora.nwra.com" is in suspend alarm: iidle=112.000000 (no load adjustment) <= 0:15:0
                            queue instance "all.q@orca.cora.nwra.com" is in suspend alarm: iidle=224.000000 (no load adjustment) <= 0:15:0
                            queue instance "all.q@pyramid.cora.nwra.com" is in suspend alarm: iidle=290.000000 (no load adjustment) <= 0:15:0
                            queue instance "all.q@ranier.cora.nwra.com" is in suspend alarm: iidle=125.000000 (no load adjustment) <= 0:15:0
                            queue instance "all.q@shavano.cora.nwra.com" is in suspend alarm: iidle=0.000000 (no load adjustment) <= 0:5:0

$ qstat -f -pe mpi
queuename                      qtype used/tot. load_avg arch          states
----------------------------------------------------------------------------
all.q@apapane.cora.nwra.com    BIPC  0/4       2.00     lx26-amd64
----------------------------------------------------------------------------
compute.q@apollo.cora.nwra.com BIPC  0/4       0.10     lx26-amd64
----------------------------------------------------------------------------
compute.q@castor.cora.nwra.com BIPC  1/4       2.00     lx26-amd64
----------------------------------------------------------------------------
compute.q@pollux.cora.nwra.com BIPC  0/4       0.00     lx26-amd64
----------------------------------------------------------------------------
coop.q@coop00.cora.nwra.com    BIPC  0/2       0.01     lx26-amd64
----------------------------------------------------------------------------
coop.q@coop01.cora.nwra.com    BIPC  0/2       0.00     lx26-amd64

############################################################################
  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
    1569 0.60125 mpi.csh    orion        qw    04/15/2008 11:33:20     2
    1570 0.60062 mpi.csh    orion        qw    04/15/2008 11:44:41     2
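
One way to read the long scheduling info list is to tally which cluster queues it mentions at all. The following is only a rough sketch, assuming qstat prints queue instances as queue@host inside double quotes; sched_info_queues is a hypothetical helper name, not a Grid Engine command.

```shell
#!/bin/sh
# sched_info_queues is a hypothetical helper name, not a Grid Engine command.
# It reads `qstat -j <jobid>` output on stdin and prints, for each cluster
# queue named in a "queue instance" message, how many messages mention it.
sched_info_queues() {
    awk '
        /queue instance/ && match($0, /"[^"@]+@/) {
            # Strip the leading quote and the trailing @ to get the queue name.
            count[substr($0, RSTART + 1, RLENGTH - 2)]++
        }
        END { for (q in count) printf "%s %d\n", q, count[q] }
    ' | sort
}

# Example use on a live system (assumes qstat is on PATH):
#   qstat -j 1569 | sched_info_queues
```

For job 1569, the scheduling info shown above names only all.q and cynosure.q instances; none of the messages refers to coop.q, the queue the job was submitted to.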
-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA/CoRA Division                    FAX: 303-415-9702
3380 Mitchell Lane                  orion at cora.nwra.com
Boulder, CO 80301              http://www.cora.nwra.com
