[GE users] Trying to use starter_method with sge 6.0

Robert Olson olson at mcs.anl.gov
Mon Nov 8 16:33:26 GMT 2004


Ah, I see.

Now this is interesting. With the keep-exec stuff you pointed me at, I 
find that if I set the starter_method to a script that doesn't exist, I 
get this in the error file:

11/08/2004 10:14:29 [762:12301]: unable to find shell "/home/olson/bin/run_gendb_tool_foo"

However, if it's set to something valid, it doesn't appear to get used:

11/08/2004 10:12:29 [762:12212]: execvp(/Users/fig/FIGdisk/env/mac/bin/blastall, "/Users/fig/FIGdisk/env/mac/bin/blastall" "-d" "/Users/fig/FIGdisk/gendb/databases/fasta/nr" "-i" "/dev/fd/0" "-p" "blastp" "-m" "9")
11/08/2004 10:12:29 [762:12206]: wait3 returned 12212 (status: 6912; WIFSIGNALED: 0,  WIFEXITED: 1, WEXITSTATUS: 27)
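For reference, the "status" value in the wait3 line is the raw wait status, and the three fields after it are what the usual POSIX macros report for it. A small Python sketch (not part of SGE; the helper name describe_wait_status is made up for illustration, and it assumes a Unix-like system where the os.W* functions are available) decodes the same number:

```python
import os


def describe_wait_status(status: int) -> dict:
    """Decode a raw wait status into the fields the shepherd trace prints."""
    exited = os.WIFEXITED(status)
    signaled = os.WIFSIGNALED(status)
    return {
        "WIFEXITED": int(exited),
        "WIFSIGNALED": int(signaled),
        # The exit code is only meaningful when the child exited normally.
        "WEXITSTATUS": os.WEXITSTATUS(status) if exited else None,
    }


# 6912 is the status from the trace; 6912 == 27 << 8.
print(describe_wait_status(6912))
# → {'WIFEXITED': 1, 'WIFSIGNALED': 0, 'WEXITSTATUS': 27}
```

So the job process exited on its own with code 27 rather than being killed by a signal.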

The config file in the active_jobs dir does show the starter_method, but 
it's not being used. The full trace is attached at the end. I also don't 
see the log message that the starter_method script writes.
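To make clear what kind of script is meant, here is a minimal sketch of a starter wrapper, written in Python rather than Perl. It is not the actual script from this setup. It assumes, as I understand the queue_conf documentation, that the job command and its arguments are handed to the starter as its own arguments; the function names and the log text are invented for illustration.

```python
import os
import sys


def starter_command(argv):
    """Return the job command to exec: everything after the starter's own name.

    Assumption: the job command and its arguments arrive as argv[1:].
    """
    if len(argv) < 2:
        raise SystemExit("starter: no job command given")
    return list(argv[1:])


def main(argv):
    cmd = starter_command(argv)
    # Write a marker to stderr so the job's error file shows the starter ran.
    print("starter invoked for: " + " ".join(cmd), file=sys.stderr, flush=True)
    # Replace this process with the job itself; this call does not return
    # on success.
    os.execvp(cmd[0], cmd)


# Only act when called with a job command; with no arguments it does nothing.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

If a wrapper like this were in effect, the execvp line in the trace would be expected to name the wrapper, and the marker line would show up in the job's stderr.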

Is it at all significant that the starter script is a Perl script? I tried 
adding perl to the shells list; no difference.

--bob


11/08/2004 10:25:43 [762:12553]: shepherd called with uid = 762, euid = 762
11/08/2004 10:25:43 [762:12553]: starting up 6.0u1
11/08/2004 10:25:43 [762:12553]: warning: starting not as root (uid=762)
11/08/2004 10:25:43 [762:12553]: setpgid(12553, 12553) returned 0
11/08/2004 10:25:43 [762:12554]: pid=12554 pgrp=12554 sid=12554 old pgrp=12553 getlogin()=<no login set>
11/08/2004 10:25:43 [762:12553]: forked "prolog" with pid 12554
11/08/2004 10:25:43 [762:12553]: using signal delivery delay of 120 seconds
11/08/2004 10:25:43 [762:12553]: child: prolog - pid: 12554
11/08/2004 10:25:43 [762:12554]: tried to change uid/gid without being root
11/08/2004 10:25:43 [762:12554]: try running further with uid=762
11/08/2004 10:25:43 [762:12554]: closing all filedescriptors
11/08/2004 10:25:43 [762:12554]: further messages are in "error" and "trace"
11/08/2004 10:25:43 [762:12554]: using "/bin/bash" as shell of user "olson"
11/08/2004 10:25:43 [762:12554]: execvp(/home/olson/SGE/transfer-prolog, "/home/olson/SGE/transfer-prolog" "1" "a6" "/Users/fig/FIGdisk/FIG/Tmp/cCydH4EW/1.fas" "/tmp/208.1.tg/1.fas")
11/08/2004 10:25:44 [762:12553]: wait3 returned 12554 (status: 0; WIFSIGNALED: 0,  WIFEXITED: 1, WEXITSTATUS: 0)
11/08/2004 10:25:44 [762:12553]: prolog exited with exit status 0
11/08/2004 10:25:44 [762:12553]: reaped "prolog" with pid 12554
11/08/2004 10:25:44 [762:12553]: prolog exited not due to signal
11/08/2004 10:25:44 [762:12553]: prolog exited with status 0
11/08/2004 10:25:44 [762:12559]: pid=12559 pgrp=12559 sid=12559 old pgrp=12553 getlogin()=<no login set>
11/08/2004 10:25:44 [762:12553]: forked "job" with pid 12559
11/08/2004 10:25:44 [762:12559]: setosjobid: uid = 762, euid = 762
11/08/2004 10:25:44 [762:12553]: child: job - pid: 12559
11/08/2004 10:25:44 [762:12559]: RLIMIT_CPU setting: (soft 18446744073709551615 hard 18446744073709551615) resulting: (soft 18446744073709551615 hard 18446744073709551615)
11/08/2004 10:25:44 [762:12559]: RLIMIT_FSIZE setting: (soft 18446744073709551615 hard 18446744073709551615) resulting: (soft 18446744073709551615 hard 18446744073709551615)
11/08/2004 10:25:44 [762:12559]: RLIMIT_DATA setting: (soft 18446744073709551615 hard 18446744073709551615) resulting: (soft 18446744073709551615 hard 18446744073709551615)
11/08/2004 10:25:44 [762:12559]: RLIMIT_STACK setting: (soft 18446744073709551615 hard 18446744073709551615) resulting: (soft 18446744073709551615 hard 18446744073709551615)
11/08/2004 10:25:44 [762:12559]: RLIMIT_CORE setting: (soft 18446744073709551615 hard 18446744073709551615) resulting: (soft 18446744073709551615 hard 18446744073709551615)
11/08/2004 10:25:44 [762:12559]: RLIMIT_VMEM/RLIMIT_AS setting: (soft 18446744073709551615 hard 18446744073709551615) resulting: (soft 18446744073709551615 hard 18446744073709551615)
11/08/2004 10:25:44 [762:12559]: RLIMIT_RSS setting: (soft 18446744073709551615 hard 18446744073709551615) resulting: (soft 18446744073709551615 hard 18446744073709551615)
11/08/2004 10:25:44 [762:12559]: tried to change uid/gid without being root
11/08/2004 10:25:44 [762:12559]: try running further with uid=762
11/08/2004 10:25:44 [762:12559]: closing all filedescriptors
11/08/2004 10:25:44 [762:12559]: further messages are in "error" and "trace"
11/08/2004 10:25:44 [762:12559]: execvp(/Users/fig/FIGdisk/env/mac/bin/blastpgp, "/Users/fig/FIGdisk/env/mac/bin/blastpgp" "-d" "/Users/fig/FIGdisk/gendb/databases/fasta/sprot.fas" "-i" "/dev/fd/0" "-j" "5" "-m" "9")
11/08/2004 10:25:44 [762:12553]: wait3 returned 12559 (status: 6912; WIFSIGNALED: 0,  WIFEXITED: 1, WEXITSTATUS: 27)
11/08/2004 10:25:44 [762:12553]: job exited with exit status 27
11/08/2004 10:25:44 [762:12553]: reaped "job" with pid 12559
11/08/2004 10:25:44 [762:12553]: job exited not due to signal
11/08/2004 10:25:44 [762:12553]: job exited with status 27
11/08/2004 10:25:44 [762:12553]: now sending signal KILL to pid -12559
11/08/2004 10:25:44 [762:12553]: no tasker to notify
11/08/2004 10:25:44 [762:12553]: failed starting job
11/08/2004 10:25:44 [762:12560]: pid=12560 pgrp=12560 sid=12560 old pgrp=12553 getlogin()=<no login set>
11/08/2004 10:25:44 [762:12553]: forked "epilog" with pid 12560
11/08/2004 10:25:44 [762:12553]: using signal delivery delay of 120 seconds
11/08/2004 10:25:44 [762:12553]: child: epilog - pid: 12560
11/08/2004 10:25:44 [762:12560]: tried to change uid/gid without being root
11/08/2004 10:25:44 [762:12560]: try running further with uid=762
11/08/2004 10:25:44 [762:12560]: closing all filedescriptors
11/08/2004 10:25:44 [762:12560]: further messages are in "error" and "trace"
11/08/2004 10:25:44 [762:12560]: using "/bin/bash" as shell of user "olson"
11/08/2004 10:25:44 [762:12560]: execvp(/home/olson/SGE/transfer-epilog, "/home/olson/SGE/transfer-epilog" "1" "a6" "/Users/fig/FIGdisk/FIG/Tmp/cCydH4EW/1.stdout" "/tmp/208.1.tg/1.stdout" "1" "a6" "/Users/fig/FIGdisk/FIG/Tmp/cCydH4EW/1.stderr" "/tmp/208.1.tg/1.stderr")
11/08/2004 10:25:46 [762:12553]: wait3 returned 12560 (status: 0; WIFSIGNALED: 0,  WIFEXITED: 1, WEXITSTATUS: 0)
11/08/2004 10:25:46 [762:12553]: epilog exited with exit status 0
11/08/2004 10:25:46 [762:12553]: reaped "epilog" with pid 12560
11/08/2004 10:25:46 [762:12553]: epilog exited not due to signal
11/08/2004 10:25:46 [762:12553]: epilog exited with status 0
11/08/2004 10:25:46 [762:12553]: no tasker to notify
