[GE users] shepherd problem

John Hearns john.hearns at streamline-computing.com
Fri Mar 16 17:30:18 GMT 2007


Philippe Caussignac wrote:
> Hello,
> 
>
> 
> error:
> cannot get connection to "shepherd" at host "node06"
> error:
> cannot get connection to "shepherd" at host "node02"
> error:
> cannot get connection to "shepherd" at host "node03"

This is a stupid question, but do ANY jobs run on nodes 06, 02 and 03?
What I mean is - your eight-processor jobs may be running on nodes
01 (four cores) and 04 (four cores).

Could you try running lots of serial jobs, each just a long sleep, to
make sure all the cores (slots) on nodes 02, 03 and 06 are being used?
Also, in your queue definition, have you set slots = 4?
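
Something along these lines might do for the test. It is only a rough
sketch: the job count of 24 assumes six nodes with four slots each, and
the names NJOBS, SLEEP_SECS, DRY_RUN and submit_sleepers are made up for
illustration. The queue name all.q in the comments is also an assumption,
so substitute your own. By default the script only prints the qsub
commands; set DRY_RUN=0 to actually submit them.

```shell
#!/bin/sh
# Sketch: fill every slot with long-running serial "sleep" jobs.
# NJOBS, SLEEP_SECS and DRY_RUN are illustrative names (assumptions).
NJOBS=${NJOBS:-24}            # assumed: 6 nodes x 4 slots
SLEEP_SECS=${SLEEP_SECS:-600} # long enough to inspect the cluster
DRY_RUN=${DRY_RUN:-1}         # 1 = only print the commands

submit_sleepers() {
    i=1
    while [ "$i" -le "$NJOBS" ]; do
        # -b y submits the command as a binary, -N sets the job name
        cmd="qsub -b y -N sleeper$i sleep $SLEEP_SECS"
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
        i=$((i + 1))
    done
}

submit_sleepers

# Once the jobs are running, check where they landed and how many
# slots each queue instance offers (all.q is an assumed queue name):
#   qstat -f
#   qconf -sq all.q | grep slots
```

If the sleep jobs never start on nodes 02, 03 and 06, then the problem
is with those hosts in general rather than with the parallel jobs.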

-- 
      John Hearns
      Senior HPC Engineer
      Streamline Computing,
      The Innovation Centre, Warwick Technology Park,
      Gallows Hill, Warwick CV34 6UW
      Office: 01926 623130 Mobile: 07841 231235
