[GE users] rcmd: socket: Cannot assign requested address

John_Tai John_Tai at smics.com
Thu Jan 13 06:43:34 GMT 2005



Hi All,
 
I keep getting the error in the subject line when I try to submit a job through qrsh. Has anyone else encountered this before? What is the problem?
 
Below is the error message that was e-mailed to the admin.
 
Thanks.
John
 
Job 61013 caused action: none
 User        = johnt
 Queue       = mem4gb.q@designserver
 Host        = designserver
 Start Time  = <unknown>
 End Time    = <unknown>
failed before writing exit_status:can't read usage file for job 61013.1
 
Shepherd trace:
01/13/2005 14:38:12 [1010:26112]: shepherd called with uid = 0, euid = 1010
01/13/2005 14:38:12 [1010:26112]: starting up 6.0u1
01/13/2005 14:38:12 [1010:26112]: setpgid(26112, 26112) returned 0
01/13/2005 14:38:12 [1010:26112]: no prolog script to start
01/13/2005 14:38:12 [1010:26112]: forked "job" with pid 26114
01/13/2005 14:38:12 [1010:26112]: child: job - pid: 26114
01/13/2005 14:38:12 [1010:26114]: processing qlogin job
01/13/2005 14:38:12 [1010:26114]: pid=26114 pgrp=0 sid=0 old pgrp=26112 getlogin()=<no login set>
01/13/2005 14:38:12 [1010:26114]: setosjobid: uid = 0, euid = 1010
01/13/2005 14:38:12 [1010:26114]: RLIMIT_CPU setting: (soft 18446744073709551613 hard 18446744073709551613) resulting: (soft 18446744073709551613 hard 18446744073709551613)
01/13/2005 14:38:12 [1010:26114]: RLIMIT_FSIZE setting: (soft 18446744073709551613 hard 18446744073709551613) resulting: (soft 18446744073709551613 hard 18446744073709551613)
01/13/2005 14:38:12 [1010:26114]: RLIMIT_DATA setting: (soft 18446744073709551613 hard 18446744073709551613) resulting: (soft 18446744073709551613 hard 18446744073709551613)
01/13/2005 14:38:12 [1010:26114]: RLIMIT_STACK setting: (soft 18446744073709551613 hard 18446744073709551613) resulting: (soft 18446744073709551613 hard 18446744073709551613)
01/13/2005 14:38:12 [1010:26114]: RLIMIT_CORE setting: (soft 18446744073709551613 hard 18446744073709551613) resulting: (soft 18446744073709551613 hard 18446744073709551613)
01/13/2005 14:38:12 [1010:26114]: RLIMIT_VMEM setting: (soft 18446744073709551613 hard 18446744073709551613) resulting: (soft 18446744073709551613 hard 18446744073709551613)
01/13/2005 14:38:12 [162:26114]: closing all filedescriptors
01/13/2005 14:38:12 [162:26114]: further messages are in "error" and "trace"
01/13/2005 14:38:12 [0:26114]: calling qlogin_starter(/s72g/sge/spool/designserver/active_jobs/61013.1, /usr/sbin/in.rlogind);
01/13/2005 14:38:12 [0:26114]: uid = 0, euid = 0, gid = 1, egid = 1
01/13/2005 14:38:12 [0:26114]: uid = 0, euid = 0, gid = 0, egid = 0
01/13/2005 14:38:12 [0:26114]: using sfd 1
01/13/2005 14:38:12 [0:26114]: bound to port 36605
 
01/13/2005 14:38:12 [0:26114]: write_to_qrsh - data = 0:36605:/home/edamgr/GridEngine/sge6-1/utilbin/sol-sparc64:/s72g/sge/spool/designserver/active_jobs/61013.1:designserver
01/13/2005 14:38:12 [0:26114]: write_to_qrsh - address = designserver:36595
01/13/2005 14:38:12 [0:26114]: write_to_qrsh - host = designserver, port = 36595
01/13/2005 14:38:12 [0:26114]: waiting for connection.
01/13/2005 14:39:12 [0:26114]: nobody connected to the socket
01/13/2005 14:39:12 [0:26114]: forked "job" with pid 0
01/13/2005 14:39:12 [0:26114]: child: job - pid: 0
01/13/2005 14:39:12 [0:26114]: wait3 returned -1
01/13/2005 14:39:12 [0:26114]: reaped "job" with pid 0
01/13/2005 14:39:12 [0:26114]: job exited not due to signal
01/13/2005 14:39:12 [0:26114]: job exited with status 0
01/13/2005 14:39:12 [0:26114]: now sending signal KILL to pid 0
 
Shepherd pe_hostfile:
designserver 1 mem4gb.q@designserver UNDEFINED
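
Reading the trace, the shepherd appears to bind a listening port, report it back to qrsh, and then wait a minute for a connection that never arrives. In case it helps anyone compare traces, here is a minimal Python sketch that pulls those three facts out of a shepherd trace. It assumes the "date time [uid:pid]: message" layout shown above; the function name summarize and the embedded sample are my own, not part of Grid Engine.

```python
import re
from datetime import datetime

# Sample lines copied from the shepherd trace above.
TRACE = """\
01/13/2005 14:38:12 [0:26114]: bound to port 36605
01/13/2005 14:38:12 [0:26114]: write_to_qrsh - address = designserver:36595
01/13/2005 14:38:12 [0:26114]: waiting for connection.
01/13/2005 14:39:12 [0:26114]: nobody connected to the socket
"""

# Assumed layout: "MM/DD/YYYY HH:MM:SS [uid:pid]: message"
LINE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) \[\d+:\d+\]: (.*)$")


def summarize(trace):
    """Return (listen_port, qrsh_address, seconds_waited) from a shepherd trace.

    Any value that cannot be found in the trace is returned as None.
    """
    port = address = started = waited = None
    for raw in trace.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue  # skip blank or unrecognized lines
        stamp = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S")
        msg = match.group(2)
        if msg.startswith("bound to port "):
            port = int(msg.rsplit(" ", 1)[1])
        elif msg.startswith("write_to_qrsh - address = "):
            address = msg.split(" = ", 1)[1]
        elif msg.startswith("waiting for connection"):
            started = stamp
        elif msg.startswith("nobody connected") and started is not None:
            waited = (stamp - started).total_seconds()
    return port, address, waited


if __name__ == "__main__":
    print(summarize(TRACE))
```

For the trace in this message it reports listening port 36605, qrsh address designserver:36595, and a 60-second wait before the shepherd gave up.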



