[GE users] jobs in queue always going to "transfer" status

Sean Davis sdavis2 at mail.nih.gov
Thu Oct 2 19:41:12 BST 2008


On Thu, Oct 2, 2008 at 2:20 PM, Rayson Ho <rayrayson at gmail.com> wrote:
> Looks like OpenLDAP bug #215904:
>
> https://bugs.launchpad.net/ubuntu/+source/openldap2.3/+bug/215904

The machine is using openldap-2.4.9.  It looks like this bug was fixed
some time ago (unless it has reemerged), or am I reading the bug
report incorrectly?
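
For what it's worth, frames #13 to #16 of the backtrace below are an
ordinary getpwnam_r() call going through nss_compat and nss_ldap, so the
same lookup can be exercised outside sge_execd before rebuilding anything.
A minimal sketch in Python, assuming the standard pwd module resolves
names through the same NSS configuration as execd on that host;
lookup_user and repeat_lookup are hypothetical helper names, and "root"
is only a placeholder for the job owner's account:

```python
import pwd
import sys


def lookup_user(name):
    """Return the passwd entry for name, or None if NSS reports no such user."""
    try:
        # pwd.getpwnam wraps the libc passwd lookup, so it goes through
        # whatever sources nsswitch.conf lists (files, compat, ldap, ...).
        return pwd.getpwnam(name)
    except KeyError:
        return None


def repeat_lookup(name, count=100):
    """Resolve name `count` times; return how many lookups found an entry."""
    found = 0
    for _ in range(count):
        if lookup_user(name) is not None:
            found += 1
    return found


if __name__ == "__main__":
    # Placeholder default; pass the job owner's login as the first argument.
    user = sys.argv[1] if len(sys.argv) > 1 else "root"
    hits = repeat_lookup(user)
    print(f"{user}: {hits} of 100 lookups returned an entry")
```

If this runs cleanly on the exec host while execd still aborts, that
would suggest the problem is specific to the execd process rather than
to LDAP lookups in general, though it would not prove it.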

Sean


> On 10/2/08, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>> Program received signal SIGABRT, Aborted.
>> [Switching to Thread 0x7fbcfc4fa6f0 (LWP 18677)]
>> 0x00007fbcfba5b5c5 in raise () from /lib64/libc.so.6
>> (gdb) bt
>> #0  0x00007fbcfba5b5c5 in raise () from /lib64/libc.so.6
>> #1  0x00007fbcfba5cbb3 in abort () from /lib64/libc.so.6
>> #2  0x00007fbcfba541e9 in __assert_fail () from /lib64/libc.so.6
>> #3  0x00007fbcfad91613 in ber_flush2 () from /usr/lib64/liblber-2.4.so.2
>> #4  0x00007fbcfafbb34c in ldap_int_flush_request ()
>>   from /usr/lib64/libldap-2.4.so.2
>> #5  0x00007fbcfafbb75f in ldap_send_server_request ()
>>   from /usr/lib64/libldap-2.4.so.2
>> #6  0x00007fbcfafbba10 in ldap_send_initial_request ()
>>   from /usr/lib64/libldap-2.4.so.2
>> #7  0x00007fbcfafab360 in ldap_search () from /usr/lib64/libldap-2.4.so.2
>> #8  0x00007fbcfafab47a in ldap_search_st () from /usr/lib64/libldap-2.4.so.2
>> #9  0x00007fbcfb1e4703 in ?? () from /lib64/libnss_ldap.so.2
>> #10 0x00007fbcfb1e3a13 in ?? () from /lib64/libnss_ldap.so.2
>> #11 0x00007fbcfb1e44ce in ?? () from /lib64/libnss_ldap.so.2
>> #12 0x00007fbcfb1e4b5f in ?? () from /lib64/libnss_ldap.so.2
>> #13 0x00007fbcfb1e5197 in _nss_ldap_getpwnam_r () from /lib64/libnss_ldap.so.2
>> #14 0x00007fbcfb61814b in ?? () from /lib64/libnss_compat.so.2
>> #15 0x00007fbcfb618417 in _nss_compat_getpwnam_r ()
>>   from /lib64/libnss_compat.so.2
>> #16 0x00007fbcfbaca01d in getpwnam_r () from /lib64/libc.so.6
>> #17 0x000000000050a3cc in sge_getpwnam_r ()
>> #18 0x00000000004280de in sge_exec_job ()
>> #19 0x000000000042e60c in exec_job_or_task ()
>> #20 0x000000000042e160 in sge_start_jobs ()
>> #21 0x000000000042def0 in do_ck_to_do ()
>> #22 0x0000000000427835 in sge_execd_process_messages ()
>> #23 0x0000000000424b6d in main ()
>>
>> I didn't mention that we are running openSUSE 11 on this machine.
>>
>> uname -a
>> Linux mahfouz 2.6.25.16-0.1-default #1 SMP 2008-08-21 00:34:25 +0200
>> x86_64 x86_64 x86_64 GNU/Linux
>>
>> And the libc major version is 2.8, if I recall.
>>
>> Any other ideas before I try to compile a debugging version with some
>> print statements?
>>
>> Thanks,
>> Sean
>>
>>
>> > On Wed, Oct 1, 2008 at 8:34 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>> >> And a couple more lines of interest, all from qmaster:
>> >>
>> >> 10/01/2008 20:24:17| timer|shakespeare|W|failed to deliver job 3265.1
>> >> to queue "all.q@grass.nci.nih.gov"
>> >> 10/01/2008 20:24:17| timer|shakespeare|E|got max. unheard timeout for
>> >> target "execd" on host "grass.nci.nih.gov", can't deliver job "3265"
>> >>
>> >> The eight jobs before this one went into "run" status, one completed,
>> >> and the next one was job 3265; it remains in "transfer" status.
>> >>
>> >> Sean
>> >>
>> >>> Thanks, Rayson.  This looks suspicious.  I'm not sure what to do with
>> >>> this.  How does one end up with an unknown queue?  The timing was such
>> >>> that I had submitted several jobs for testing to one of the machines
>> >>> in question (i.e., qsub -q all.q@machine sleeper.sh).
>> >>>
>> >>> Sean
>> >>>
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Sean
>> >>>>>
>> >>>>> ---------------------------------------------------------------------
>> >>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> >>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>> >>>>>
>> >>>>>