[GE users] "cannot execute binary file"

Reuti reuti at staff.uni-marburg.de
Mon Dec 19 11:11:03 GMT 2005


On 19.12.2005 at 11:19, Paul Smith wrote:

> Reuti wrote:
>> Well, with an intermediate shell things work, but not when running
>> it directly. What were the $PATH and other environment settings at
>> the time the execd was started on the node?
>
> Possible, but if so it is consistent across 64 nodes.
>
>> Maybe something was defined (incorrectly) there, which is overridden
>> by a started shell but inherited by a direct execution? - Reuti
>
> sgeadmin has the same PATH
> /usr/local/bin:/usr/bin:/bin:/gridware/sge/bin/lx26-amd64:/opt/mpich/ch-p4/bin:/usr/local/lam/bin:.

The sge_execd should run as root (although you might see a different
effective UID). What does

ps -e f -o pid,ppid,pgrp,ruser,user,rgroup,group,command

show on your node? It should look something like:

reuti at node47:~> ps -e f -o pid,ppid,pgrp,ruser,user,rgroup,group,command
  PID  PPID  PGRP RUSER    USER     RGROUP   GROUP    COMMAND
 1795     1  1795 root     sgeadmin root     gridware /usr/sge/bin/lx24-x86/sge_execd
 9212  1795  1795 root     root     root     root      \_ /bin/sh /usr/sge/tools/tmpspace.sh
28221  1795 28221 root     sgeadmin root     gridware  \_ sge_shepherd-5711 -bg
28222 28221 28222 root     root     users    users         \_ /usr/sbin/in.rlogind
28223 28222 28223 root     root     users    users             \_ login -- reuti
28224 28223 28224 reuti    reuti    users    users                 \_ -bash
...

If this is the case, the question would be which $PATH root had
when the execd was started.
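
One way to check this on a Linux node, as a rough sketch only: the kernel exposes the environment a process was started with in /proc/<pid>/environ. The helper name proc_env_var below is made up for illustration, and the usage example assumes pgrep is installed and that only one sge_execd runs per node. Reading the file for a daemon started by root will normally require root.

```shell
#!/bin/sh
# Print the value of one environment variable as a running process
# received it at startup. Linux-only: reads /proc/<pid>/environ, whose
# entries are NUL-separated NAME=value pairs.
# proc_env_var is a hypothetical helper name, not part of Grid Engine.
proc_env_var() {
    pid=$1
    name=$2
    # Turn NUL separators into newlines, then print only the value of
    # the requested variable (nothing is printed if it is not set).
    tr '\0' '\n' < "/proc/$pid/environ" | sed -n "s/^$name=//p"
}

# Usage (run as root on the exec node; assumes a single sge_execd):
#   proc_env_var "$(pgrep -x sge_execd)" PATH
```

Note that this shows the environment as it was when the daemon was started, not any changes the process made to its own environment afterwards, which is the state in question here.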

>> BTW: I'd suggest not putting /opt/mpich/ch-p4/bin and
>> /usr/local/lam/bin in the PATH at the same time, as only the MPICH
>> version of mpirun will be found, I think.
>
> The mpirun the users get is actually the one in /usr/local/bin.
> I wish I did not need two MPI environments, but I spent two months
> trying to get an application, Rmpi, to use MPICH and, having failed,
> installed LAM, which it had been written for.
>
>> Are you replacing the PATH there completely, instead of extending
>> it, as the definition of $TMPDIR is missing there (which you
>> would need for a Tight Integration of parallel jobs)?
>
> Yes, we are replacing the PATH. We find problems easier to debug if
> we always set the user's PATH to something 'reasonable'.
>
> I am trying to get tight integration to work but I can't do so  
> until I can get rsh to work.
>
> Paul
>
> -- 
> ************************************************************
> Paul Smith, System Manager, HPCF, http://www.hpcf.cam.ac.uk/
> pas50 at cam.ac.uk, 01223 763517, Computing Service,
> New Museums Site, Pembroke Street, Cambridge CB2 3QH
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net

