Re: [GE users] Help: PE details

Amy Lee openlinuxsource at gmail.com
Thu Nov 1 15:22:08 GMT 2007



Ochtrup, Carsten wrote:
> Amy,
>
> as John said, SGE is a batch/load levelling system. It selects the hosts for you and can enforce some runtime limits (such as maximum CPU time or memory).
>
> The event chain when a job starts is:
>
> - prolog method of the queue
> - starter method of the queue
> - start command from the PE, which will call the start program of your parallel implementation
>
> - stop method of the PE
> - epilog of the queue
>
> What do you mean by "improve the parallel programs efficiency"?
>
> As said before, starting the underlying MPI, PVM, ... is up to you. SGE will not do any optimisation for you.
>
> Regards,
> Carsten Ochtrup
>
>
>
>
> -----Original Message-----
> From: Amy Lee [mailto:openlinuxsource at gmail.com]
> Sent: Thursday, 1 November 2007 14:49
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Help: PE details
>
> John Hearns wrote:
>   
>> On Thu, 2007-11-01 at 21:26 +0800, Amy Lee wrote:
>>   
>>     
>>> Hello,
>>>
>>> I would like to know more about how a PE works and what principle it
>>> follows when running a parallel program.
>>>
>>>     
>>>       
>> Amy,
>> fundamentally (I may be wrong) SGE is a batch/load levelling system.
>> When you wish to run a parallel program, using a Parallel Environment, 
>> SGE will allocate you a set of machines, which is passed on as the 
>> file $PE_HOSTFILE
>>
>> It is up to you, as the administrator, to determine how a particular 
>> parallel program should be started, and stopped, on all the hosts in 
>> $PE_HOSTFILE.
>>
>> The sge_pe man page lists all the parameters for a given PE, but the main
>> things you need to get started are:
>>
>> start_proc_args    defines how to start the job
>> stop_proc_args     how to kill a job
>>
>>
>>
>> The best thing to do is to go to $SGE_ROOT/mpi or $SGE_ROOT/pvm and look at the
>> README files
>>
>>
>> Make sure to put your new PE name into pe_list for one of your 
>> existing queues, or your jobs will never run!
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>>
>>   
>>     
> Thank you very much. I still have some questions. As you said, when I use a PE to run parallel programs, SGE just allocates the set of machines I've selected, and everything after that is handled by MPI, right? Furthermore, can I use a PE to improve the efficiency of parallel programs under SGE? If so, how?
>
> Thank you again.
>
> Regards,
>
> Amy Lee
>
>
>
>   
Thank you. I don't know whether SGE can automatically schedule resources
for parallel programs through a PE. For example, suppose I run a parallel
program on its own and it takes X seconds, then I run the same program
under SGE with tight parallel integration and it takes Y seconds. Can Y be
smaller than X? Can I tune the PE to improve performance?

Thank you very much~

Regards,

Amy Lee
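
For context, the hand-off described in the quoted replies (SGE allocates hosts and passes them on in $PE_HOSTFILE, and the start procedure turns that into whatever the parallel library expects) can be sketched roughly as below. This is only an illustration, assuming the usual hostfile layout of one line per host with host name, slot count, queue name and processor range. The function name hostfile_to_machines and the sample hosts are made up for this sketch, and the start scripts shipped under $SGE_ROOT/mpi are shell scripts, not Python.

```python
import os


def hostfile_to_machines(text):
    """Expand PE hostfile text into one host name per allocated slot.

    Assumed line layout: <host> <slots> <queue> <processor range>.
    Only the first two fields are used here.
    """
    machines = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, slots = fields[0], int(fields[1])
        machines.extend([host] * slots)
    return machines


if __name__ == "__main__":
    # Inside a parallel job, SGE sets PE_HOSTFILE to the path of the file.
    path = os.environ.get("PE_HOSTFILE")
    if path:
        with open(path) as fh:
            print("\n".join(hostfile_to_machines(fh.read())))
```

The resulting list is the kind of machine file an MPI launcher would be given. SGE decides which hosts appear in it, but how fast the program then runs on those hosts is up to the parallel library and the program itself.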




