[GE users] need help interpreting drmaa errors in messages

Andreas.Haas at Sun.COM
Mon Mar 12 09:34:12 GMT 2007


On Fri, 9 Mar 2007, John Saalwaechter wrote:

> I have several developers who have recently started using DRMAA.
> Now I'm getting lots and lots of errors reported each day in
> qmaster/messages of the form:
>
> 03/07/2007 11:09:07|qmaster|XXX|E|can't send asynchronous
> message to commproc (drmaa:20997) on host "YYY": no valid
> port number
>
> and also,
>
> 03/07/2007 11:09:02|qmaster|XXX|E|commlib error: got read
> error (closing "YYY/drmaa/21000")
>
> Any ideas on what's causing this?  The developers report that
> things seem fine from the application.
>
> SGE version:  N1GE 6.0u4

Hi John,

Offhand I can't tell where this comes from, but you should consider
upgrading to the more recent version, u10. The list of (DRMAA) bugs
fixed since u4 can be found here:

    http://gridengine.sunsource.net/project/gridengine/60patches.txt

Some of them could in fact cause awkward problems when DRMAA
is used at large scale. For an example, see:

    http://gridengine.sunsource.net/issues/show_bug.cgi?id=2125
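
To see whether the errors come from a few DRMAA clients or from many, the
qmaster/messages entries can be tallied per client id. The following is only
a rough sketch and not part of Grid Engine: the field layout
(timestamp|daemon|host|level|message) is inferred from the two lines quoted
above, each entry is assumed to sit on a single line in the real file, and
the function name tally_drmaa_errors is made up for illustration.

```python
import re
from collections import Counter

# Sample entries copied from the quoted messages file, rejoined onto one
# line each (the mail wrapped them; one line per entry is an assumption).
SAMPLE = (
    "03/07/2007 11:09:07|qmaster|XXX|E|can't send asynchronous message "
    'to commproc (drmaa:20997) on host "YYY": no valid port number\n'
    "03/07/2007 11:09:02|qmaster|XXX|E|commlib error: got read error "
    '(closing "YYY/drmaa/21000")\n'
)

# The two places where a DRMAA client id shows up in the quoted messages.
COMMPROC_RE = re.compile(r"\(drmaa:(\d+)\)")
ENDPOINT_RE = re.compile(r'/drmaa/(\d+)"')


def tally_drmaa_errors(lines):
    """Count error-level entries per DRMAA client id."""
    counts = Counter()
    for line in lines:
        # Assumed layout: timestamp|daemon|host|level|message
        parts = line.rstrip("\n").split("|", 4)
        if len(parts) != 5 or parts[3] != "E":
            continue  # skip malformed lines and non-error levels
        match = COMMPROC_RE.search(parts[4]) or ENDPOINT_RE.search(parts[4])
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for client_id, n in tally_drmaa_errors(SAMPLE.splitlines()).most_common():
        print(client_id, n)
```

For a real check, pass the open messages file instead of SAMPLE. If a
handful of ids account for most entries, that points to specific
applications rather than a general problem.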

Regards,
Andreas
