[GE users] prolog script for parallel jobs

Neeraj Chourasia neeraj at crlindia.com
Thu Jan 31 08:32:24 GMT 2008



Hi Reuti,

    Please ignore the previous mail. It was a bit incomplete.


   I disabled user login on the compute nodes and patched OpenSSH so
that SGE could be compiled with the -tight-ssh option. The compilation
went fine, but I have some doubts about its configuration and use.
   I copied the modified sshd into the "/usr/local/sge/utilbin/lx26-amd64"
directory. But running "qrsh id" gives this error:
                ssh_exchange_identification: Connection closed by remote host
                can't open file /tmp/552.1.mpipg.q/pid: No such file or directory

  Since my existing sshd is version 3.9p1, I downloaded and modified
the same version.

 Here are my questions:
   1) Do I need to replace the existing sshd in /usr/sbin with the
SGE-patched sshd, or do I only have to change rsh_daemon, qlogin_daemon
and rlogin_daemon in the cluster configuration to the appropriate path,
say "/usr/local/sge/utilbin/lx26-amd64/sshd"?
   2) My current configuration looks like this:

    [root@n0 lx26-amd64]# qconf -sconf
    global:
    execd_spool_dir              /var/spool/sge
    mailer                       /bin/mail
    xterm                        /usr/bin/X11/xterm
    load_sensor                  /usr/local/sge/util/load.sh
    prolog                       none
    epilog                       none
    shell_start_mode             posix_compliant
    login_shells                 sh,ksh,csh,tcsh
    min_uid                      0
    min_gid                      0
    user_lists                   none
    xuser_lists                  none
    projects                     none
    xprojects                    none
    enforce_project              false
    enforce_user                 auto
    load_report_time             00:00:40
    max_unheard                  00:05:00
    reschedule_unknown           00:00:00
    loglevel                     log_warning
    administrator_mail           none
    set_token_cmd                none
    pag_cmd                      none
    token_extend_time            none
    shepherd_cmd                 none
    qmaster_params               none
    execd_params                 none
    reporting_params             accounting=true reporting=false \
                                 flush_time=00:00:15 joblog=false sharelog=00:00:00
    finished_jobs                100
    gid_range                    20000-21000
    qlogin_command               /usr/bin/qlogin_wrapper
    qlogin_daemon                /usr/local/sge/utilbin/lx26-amd64/sshd -i
    rlogin_daemon                /usr/local/sge/utilbin/lx26-amd64/sshd -i
    max_aj_instances             2000
    max_aj_tasks                 75000
    max_u_jobs                   0
    max_jobs                     0
    max_advance_reservations     0
    auto_user_oticket            0
    auto_user_fshare             0
    auto_user_default_project    none
    auto_user_delete_time        86400
    delegated_file_staging       false
    rsh_daemon                   /usr/local/sge/utilbin/lx26-amd64/sshd -i
    rsh_command                  /usr/bin/ssh -X
    rlogin_command               /usr/bin/ssh -X
    reprioritize                 false
   3) Even if I get the modified sshd working, does mpirun use the
modified sshd when establishing connections to remote hosts? If not,
the program will still fail, because we have disabled user login.
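
For reference, a minimal sketch of how the entries asked about in 1) and 2) could be listed together. The helper name show_remote_startup is hypothetical and not part of SGE; qconf -sconf is the command used above, and the filter assumes each parameter and its value sit on one line, as qconf prints them.

```shell
#!/bin/sh
# Hypothetical helper (not part of SGE): keep only the remote-startup
# parameters from cluster configuration text read on standard input.
show_remote_startup() {
    grep -E '^[[:space:]]*(rsh|rlogin|qlogin)_(daemon|command)[[:space:]]'
}

# On a host with SGE installed, the live configuration could be filtered with:
#   qconf -sconf | show_remote_startup
```

The paths on the matched rsh_daemon, rlogin_daemon and qlogin_daemon lines are the daemons configured for SGE's own remote startup, so this gives a quick check that they point at the patched binary. The global configuration itself can be edited with qconf -mconf.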

Regards
Neeraj

Reuti wrote:
> Hi,
>
> Am 30.01.2008 um 14:39 schrieb Neeraj Chourasia:
>
>>    My intention is to discourage people from running MPI jobs directly
>> on the compute nodes. If I modify ssh, and ssh is common to all kinds
>> of jobs (with or without SGE), I won't be able to enforce the user
>> restriction.
>
> My usual solution to this is to disable rsh and ssh in the cluster
> completely (or to limit ssh login to admin staff in /etc/ssh/sshd_config).
>
> As Andy said, SGE will start its own daemons, and parallel jobs will
> run happily with them if you have a Tight Integration of the parallel
> programs. In this case the default rsh might also be sufficient, as it
> will only allow connections inside the cluster.
>
> -- Reuti
>
>
>> -Neeraj
>>
>>
>> Andy Schwierskott wrote:
>>> Hi,
>>>
>>> You can't run prolog scripts on the slave nodes.
>>>
>>> Since SGE starts its own sshd (and does not use the system-wide sshd),
>>> perhaps it is possible to use a different limits.conf for the sshd
>>> started by SGE by passing a command-line argument at sshd startup?
>>>
>>> If this is not possible, and you are already using a modified sshd (or
>>> are considering one) to get proper accounting when ssh is used for
>>> starting parallel jobs, it might be possible to modify sshd further so
>>> that it disregards that setting in limits.conf.
>>>
>>> In case you are unaware of the motivation for an sshd patch, please
>>> see Ron's and Rayson's presentation on it:
>>>
>>>   
>>> http://gridengine.sunsource.net/download/workshop10-12_09_07/SGE-WS2007-openSSHTightIntegration_RonChen.pdf 
>>>
>>>
>>> Regards,
>>> Andy
>>>
>>>
>>>
>>> On Wed, 30 Jan 2008, Neeraj Chourasia wrote:
>>>
>>>> Hello all,
>>>>
>>>>   Is there a way to run a prolog or some other script with root
>>>> privileges for a parallel job? What I have observed is that the
>>>> prolog script runs on the master node and not on the slaves.
>>>> I want to disable direct user login to the compute nodes and
>>>> allow it only via SGE. Since the prolog script runs with root
>>>> privileges, I could change, say, /etc/security/limits.conf for the
>>>> duration of the job and thereby restrict normal SSH (without SGE).
>>>>
>>>> Can I run a prolog on the slaves, or is there another way to restrict
>>>> normal user login?
>>>>
>>>> -Neeraj
>>>>
>>>> The information contained in this electronic message and any 
>>>> attachments to this message are intended for the exclusive use of 
>>>> the addressee(s) and may contain proprietary, confidential or 
>>>> privileged information. If you are not the intended recipient, you 
>>>> should not disseminate, distribute or copy this e-mail. Please 
>>>> notify the sender immediately and destroy all copies of this 
>>>> message and any attachments contained in it.
>>>>
>>>> Contact your Administrator for further information.
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>>
>>>>
>>>
>
>
>
>





