[GE users] job was rejected cause it can't be written: Bad address

Krzysztof Witkowski krzywit at man.poznan.pl
Sat Mar 25 10:29:24 GMT 2006


Hi,

I'm seeing strange problems when submitting jobs to SGE u7_1.
I run this simple script:

for a in `seq 1 256`
do
	qsub /bin/date
done

Result:
Your job 1575 ("date") has been submitted.
//1575 to 1608 are the same
Your job 1609 ("date") has been submitted.
Unable to run job: job 1610 was rejected cause it can't be written: Bad address.
Exiting.
Your job 1611 ("date") has been submitted.
Unable to run job: job 1612 was rejected cause it can't be written: Bad address.
Exiting.
Your job 1613 ("date") has been submitted.
Unable to run job: job 1614 was rejected cause it can't be written: Bad address.
Exiting.
Your job 1615 ("date") has been submitted.
//1615-1631 same
Your job 1632 ("date") has been submitted.
Unable to run job: job 1633 was rejected cause it can't be written: Bad address.
Exiting.
Unable to run job: job 1634 was rejected cause it can't be written: Bad address.
Exiting.
Your job 1635 ("date") has been submitted.
Unable to run job: job 1636 was rejected cause it can't be written: Bad address.
Exiting.
//1637 to 1662 same error
Unable to run job: job 1663 was rejected cause it can't be written: Bad address.
Exiting.
Your job 1664 ("date") has been submitted.
Your job 1665 ("date") has been submitted.
Unable to run job: job 1666 was rejected cause it can't be written: Bad address.
Exiting.
//1667 to 1684 same error
Unable to run job: job 1685 was rejected cause it can't be written: Bad address.
Exiting.
Your job 1686 ("date") has been submitted.
//same thing here
Your job 1792 ("date") has been submitted.
Unable to run job: job 1793 was rejected cause it can't be written: Bad address.
Exiting.
Unable to run job: job 1794 was rejected cause it can't be written: Bad address.
Exiting.
Your job 1795 ("date") has been submitted.
//same thing here
Your job 1805 ("date") has been submitted.
Unable to run job: job 1806 was rejected cause it can't be written: Bad address.
Exiting.
Your job 1807 ("date") has been submitted.
Unable to run job: job 1808 was rejected cause it can't be written: Bad address.
Exiting.
error: commlib error: got read error (closing "hedera.man.poznan.pl/qmaster/1")
error: commlib error: can't connect to service (Connection refused)
Unable to run job: unable to contact qmaster using port 536 on host "hedera.man.poznan.pl".
Exiting.
//same errors here
error: commlib error: can't connect to service (Connection refused)
Unable to run job: unable to contact qmaster using port 536 on host "hedera.man.poznan.pl".
Exiting.

This line seems to be important, since every submission after it fails with
"Connection refused":
error: commlib error: got read error (closing "hedera.man.poznan.pl/qmaster/1")
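
To see how many submissions fell into each of the three outcomes shown above, the captured output could be tallied roughly like this. It is only a sketch: the function name tally and the idea of saving the qsub output to a file are assumptions for illustration, and the patterns are copied from the messages in this post.

```shell
#!/bin/sh
# Sketch: count the three kinds of outcome in a saved copy of the qsub output.
# "tally" and its file argument are hypothetical; the grep patterns are
# taken from the messages quoted in this post.
tally() {
    log=$1
    # grep -c prints the number of matching lines (0 if there are none).
    printf 'submitted: %s\n' "$(grep -c 'has been submitted' "$log")"
    printf 'rejected: %s\n' "$(grep -c 'was rejected cause' "$log")"
    printf 'qmaster unreachable: %s\n' "$(grep -c 'unable to contact qmaster' "$log")"
}
```

For example, after redirecting the loop's output with 2>&1 into a file, tally on that file prints one count per outcome, which shows how many jobs were rejected before the qmaster stopped answering.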

Is this an SGE bug or a misconfiguration? When I add "sleep 1" between
submissions, everything seems fine.
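
For reference, the throttled variant looks roughly like the sketch below. The variables SUBMIT_CMD, JOB, COUNT and DELAY and the function name submit_all are made up for illustration, and the failure count assumes that qsub exits with a non-zero status when it prints "Unable to run job".

```shell
#!/bin/sh
# Sketch: the same submission loop, with a pause between submissions.
# SUBMIT_CMD, JOB, COUNT and DELAY are hypothetical knobs, not SGE settings.
SUBMIT_CMD=${SUBMIT_CMD:-qsub}
JOB=${JOB:-/bin/date}
COUNT=${COUNT:-256}
DELAY=${DELAY:-1}

submit_all() {
    failed=0
    a=1
    while [ "$a" -le "$COUNT" ]; do
        # Assumption: a non-zero exit status means the job was not accepted.
        if ! $SUBMIT_CMD "$JOB"; then
            failed=$((failed + 1))
        fi
        # Pause so that submissions do not arrive back to back.
        sleep "$DELAY"
        a=$((a + 1))
    done
    echo "failed submissions: $failed of $COUNT"
    # Return success only if every submission was accepted.
    [ "$failed" -eq 0 ]
}
```

Calling submit_all after these definitions runs the loop. With the default DELAY of 1 it matches the "sleep 1" case described above; setting DELAY=0 reproduces the original back-to-back submissions.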

Cheers
Krzysztof Witkowski
