[GE users] NEEDING HELP IN ADMINISTRATING SGE6.1

Nourhéne Alaya nourhenalaya at gmail.com
Tue Jul 24 11:03:04 BST 2007



These are the SGE processes running on the qmaster host:
sge@hilbert:/opt/sge/examples/jobs$ ps -ax | grep sge
12976 ?        S      0:00 /opt/sge//utilbin/lx24-x86/berkeley_db_svc -L /opt/sge/default/spooldb/bdb_messages -h /opt/sge/default spooldb
13233 ?        Sl     1:15 /opt/sge/bin/lx24-x86/sge_qmaster
13260 ?        Sl     1:08 /opt/sge/bin/lx24-x86/sge_schedd
10009 pts/4    S      0:00 su sge
10934 pts/4    S+     0:00 grep sge

And these are the processes on one of the exec hosts:
sge@parallele04:/root$ ps -ax | grep sge
 4965 ?        S      0:00 /opt/sge/bin/lx24-x86/sge_execd
22994 pts/0    S      0:00 su sge
23008 pts/0    S+     0:00 grep sge


These are the latest warning and error messages logged on the qmaster host:
07/23/2007 10:22:16|qmaster|hilbert|W|rescheduling job 32.1
07/23/2007 10:22:16|qmaster|hilbert|E|queue queue_test marked QERROR as result of job 32's failure at host parallele07.pasteur.rns.tn
07/23/2007 10:25:13|qmaster|hilbert|W|Modify operation can not be applied on job-array task 50.1 in pending/hold state
07/23/2007 10:25:36|qmaster|hilbert|W|Modify operation can not be applied on job-array task 50.1 in pending/hold state
07/23/2007 10:27:29|qmaster|hilbert|E|unable to find job 54 from the scheduler order package
07/23/2007 10:30:44|qmaster|hilbert|W|scheduler tries to change pending tickets of a non pending job 48 task 1
07/23/2007 10:55:03|qmaster|hilbert|E|acknowledge timeout after 600 seconds for event client (schedd:1) on host "hilbert.pasteur.rns.tn"
07/23/2007 12:35:11|qmaster|hilbert|W|Modify operation can not be applied on job-array task 56.1 in pending/hold state
07/23/2007 12:35:35|qmaster|hilbert|W|Modify operation can not be applied on job-array task 51.1 in pending/hold state
07/23/2007 12:42:26|qmaster|hilbert|E|Job was rejected because job requests unknown queue "hilbert"
07/23/2007 14:39:39|qmaster|hilbert|E|no event client known with id 1 to deliver events immediately
07/23/2007 14:45:31|qmaster|hilbert|E|submithost "hilbert.pasteur.rns.tn" already exists
07/23/2007 15:32:03|qmaster|hilbert|E|no event client known with id 1 to deliver events immediately
07/23/2007 15:35:40|qmaster|hilbert|E|no event client known with id 1 to deliver events immediately
07/23/2007 16:45:44|qmaster|hilbert|E|job rejected: no access to project "livgm" for user "sge"
07/23/2007 16:46:50|qmaster|hilbert|E|no event client known with id 1 to deliver events immediately
07/23/2007 16:52:54|qmaster|hilbert|E|no event client known with id 1 to deliver events immediately
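
To make the messages above easier to read at a glance, here is a minimal Python sketch that counts them by severity and by message text. It assumes only the pipe-separated layout visible in the lines above (timestamp|daemon|host|level|message). The names parse_line, summarize and SAMPLE are illustrative and are not part of SGE.

```python
from collections import Counter

# A few lines copied from the log above, used as sample input.
SAMPLE = [
    '07/23/2007 10:22:16|qmaster|hilbert|W|rescheduling job 32.1',
    '07/23/2007 14:39:39|qmaster|hilbert|E|no event client known with id 1 to deliver events immediately',
    '07/23/2007 15:32:03|qmaster|hilbert|E|no event client known with id 1 to deliver events immediately',
    '07/23/2007 16:45:44|qmaster|hilbert|E|job rejected: no access to project "livgm" for user "sge"',
]


def parse_line(line):
    """Split one log line into its five fields, or return None if it does not match."""
    # maxsplit=4 keeps any '|' inside the message text intact.
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    timestamp, daemon, host, level, message = parts
    return {
        "timestamp": timestamp,
        "daemon": daemon,
        "host": host,
        "level": level,
        "message": message,
    }


def summarize(lines):
    """Count log lines per severity level and per distinct message text."""
    levels = Counter()
    messages = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip blank or malformed lines
        levels[record["level"]] += 1
        messages[record["message"]] += 1
    return levels, messages


if __name__ == "__main__":
    levels, messages = summarize(SAMPLE)
    print(dict(levels))
    for text, count in messages.most_common(3):
        print(count, text)
```

To run it against the real log, pass the open messages file to summarize() instead of SAMPLE; the location of that file depends on the installation.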

Thank you.




