[GE users] sgeexecd doesn't know find libraries at boot time

Reuti reuti at staff.uni-marburg.de
Tue Sep 11 10:30:07 BST 2007



On 11.09.2007 at 10:48, Maxime Kinet wrote:

> The environment variable LD_LIBRARY_PATH is indeed set when the  
> system boots, in a script located in /etc/profile.d/

I wouldn't count on this being sourced for every task started during
the boot process; it is sourced when a user logs in. You could add an
echo command to the sgeexecd script that writes to a file, to check
which PATH is actually set during startup.
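
A minimal sketch of such a check, assuming it is pasted near the top of the sgeexecd startup script. The function name `log_boot_env` and the default log path are hypothetical, not part of Grid Engine:

```shell
#!/bin/sh
# Hypothetical debugging helper (name and default log path are assumptions).
# Appends the environment seen by the startup script to a log file.
log_boot_env() {
    logfile="${1:-/tmp/sgeexecd_env.log}"
    {
        echo "=== sgeexecd start: $(date) ==="
        echo "PATH=${PATH}"
        # Print a marker instead of an empty string when the variable is unset.
        echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"
    } >> "$logfile"
}
```

Calling `log_boot_env` before the daemon is launched, then comparing the log entry written at boot with the one written after a manual restart, should show whether LD_LIBRARY_PATH is missing at boot.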

-- Reuti

> Although I was thinking that sgeexecd might be started before
> that variable is set, and wondering whether I could change that.
> This is not a big deal, though, since I know how to fix the problem
> and the cluster is not restarted very often.
> Thanks for your help.
>
> ------------------
> Maxime Kinet
> Université Libre de Bruxelles
> Physique Statistique et Plasmas, CP 231
> Campus Plaine - Boulevard du Triomphe,
> 1050 Bruxelles.
>
> Tel.   : +32-2-650.59.08
> e-mail : mkinet at ulb.ac.be
>
>
> On 10 Sep 2007, at 22:01, Beadles, Jeff wrote:
>
>> By default (and imo it's a horrible default), sge_execd inherits  
>> any environment variables set when it is run.
>>
>> I would bet that you have an environment variable set, say
>> LD_LIBRARY_PATH, that contains the path to libimf.so.  When the
>> system boots, sge_execd doesn't have this variable set, and your
>> jobs fail.  When you restart it, it picks the variable up from your
>> environment and passes it through to your jobs.
>>
>> The fix is to have your job script set LD_LIBRARY_PATH, or if you
>> only have one platform type in your grid, you could use:
>>
>> $ qsub -v LD_LIBRARY_PATH ...
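
For the first option (setting the variable in the job script), a minimal sketch could look like the following. The directory `/opt/intel/lib` and the variable name `INTEL_LIB_DIR` are assumptions; the thread never states where libimf.so is installed, so replace them with the real location:

```shell
#!/bin/sh
#$ -S /bin/sh
#$ -cwd

# Hypothetical location of libimf.so (an assumption, not from the thread).
INTEL_LIB_DIR="${INTEL_LIB_DIR:-/opt/intel/lib}"

# Prepend the directory, keeping any value the job already inherited.
# The ${VAR:+...} form avoids a trailing colon when the variable was empty.
LD_LIBRARY_PATH="${INTEL_LIB_DIR}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
export LD_LIBRARY_PATH

# The mpirun call from the original post would follow here, e.g.:
# $path_to_openmpi/bin/mpirun ...
```

With this approach the job no longer depends on whatever environment sge_execd happened to inherit when it was started.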
>>
>> Regards,
>>  -Jeff
>>
>> From: Maxime Kinet [mailto:mkinet at ulb.ac.be]
>> Sent: Mon 9/10/2007 6:43 AM
>> To: users at gridengine.sunsource.net
>> Subject: [GE users] sgeexecd doesn't know find libraries at boot time
>>
>> Hello,
>> At boot time the nodes don't seem to know the path to the shared
>> library. When launching a job I get the following error:
>>
>> $path_to_openmpi/bin/mpirun: error while loading shared libraries:
>> libimf.so: cannot open shared object file:
>> No such file or directory
>>
>> The problem disappears when I restart sgeexecd on each node. Is there
>> a way to avoid this restart?
>>
>> Thanks for helping.
>> ------------------
>> Maxime Kinet
>> Université Libre de Bruxelles
>> Physique Statistique et Plasmas, CP 231
>> Campus Plaine - Boulevard du Triomphe,
>> 1050 Bruxelles.
>>
>> Tel.   : +32-2-650.59.08
>> e-mail : mkinet at ulb.ac.be
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>
