[GE users] SGE 6.2: qsub -sync y option for large number of jobs

elauzier elauzier2 at perlstar.com
Mon Dec 14 02:25:22 GMT 2009


Reuti, thanks for the feedback...

I'll try to be clearer with my inquiries...

Here are a couple of work flows that I am working with:

(1)

Simple work flow in the foreground:

========================================

./setup.sh
./do_something.sh
qsub ... -sync y -t 1-1000 ./fan_out.sh
./do_something_else.sh
./cleanup.sh

========================================
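As a rough sketch only, flow (1) could be wrapped so that a failed step stops the chain. This assumes that qsub with -sync y returns a non-zero exit status when the array job fails or the submission is rejected, which is my reading of the qsub man page. The QSUB variable and the run_flow function are hypothetical names introduced for illustration, and running cleanup unconditionally is a design choice of the sketch, not something the flow above requires.

```shell
#!/bin/sh
# QSUB is a hypothetical override hook; it defaults to the real qsub.
# The site-specific options elided as "..." above are left out here.
QSUB=${QSUB:-qsub}

# Run the foreground steps in order and stop at the first failure.
# qsub -sync y blocks until the whole array job has finished.
run_flow() {
    ./setup.sh &&
    ./do_something.sh &&
    $QSUB -sync y -t 1-1000 ./fan_out.sh &&
    ./do_something_else.sh
    status=$?

    # Assumption: cleanup should run even when an earlier step failed.
    ./cleanup.sh

    # Report the status of the chain, not of the cleanup step.
    return $status
}
```

Calling run_flow from the directory that holds the scripts would run the whole flow; its exit status tells the caller whether any step before cleanup failed.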

(2)

Alternative simple work flow using pure batch:

==================================

qsub ... -N Setup_unique_name ./setup.sh
qsub ... -t 1-1000 -hold_jid "Setup_unique_name" -N fan_out_unique_name ./fan_out.sh
qsub ... -N do_something_else_unique_name -hold_jid "fan_out_unique_name" ./do_something_else.sh
qsub ... -N cleanup_unique_name -hold_jid "do_something_else_unique_name" ./cleanup.sh

=================================
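A minimal sketch of how the unique names in flow (2) might be generated and chained, so that many users can submit the same flow without their -hold_jid names colliding. The make_tag and submit_flow functions, the QSUB variable, and the user-name-plus-PID tag format are all assumptions made for illustration; any string that is unique per run would do.

```shell
#!/bin/sh
# QSUB is a hypothetical override hook; set QSUB=echo for a dry run
# that prints the submissions instead of sending them to SGE.
# The site-specific options elided as "..." above are left out here.
QSUB=${QSUB:-qsub}

# Build a per-run tag from the user name and the shell's PID
# (an assumed scheme, not taken from the original flow).
make_tag() {
    echo "$(id -un)_$$"
}

# Submit the four stages; each one is held until the job named
# in -hold_jid (the previous stage) has finished.
submit_flow() {
    tag=$1
    $QSUB -N "setup_$tag" ./setup.sh
    $QSUB -N "fan_out_$tag" -hold_jid "setup_$tag" -t 1-1000 ./fan_out.sh
    $QSUB -N "do_something_else_$tag" -hold_jid "fan_out_$tag" ./do_something_else.sh
    $QSUB -N "cleanup_$tag" -hold_jid "do_something_else_$tag" ./cleanup.sh
}
```

With this in place, submit_flow "$(make_tag)" would queue the whole chain, and QSUB=echo gives a quick way to inspect the generated command lines first.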

I guess you could say that the first flow is more of an interactive flow, and the second is a pure batch flow.

Considering scalability, with say 500 people running similar flows at the same time, I would tend to go with (2), especially if the flows are large and long-running; (1) can be used for smaller, shorter flows.

The main reason I would choose (2) over (1) is the stability of the system, but I'm still looking into the pros and cons of such flows and how they are implemented.

For example, what happens if the SGE system becomes unresponsive while users are running flows like (1)?  How will the system behave?  Will these flows break?  Likewise, if the SGE system becomes unresponsive, will flows like (2) cope better with a relatively short SGE interruption?

thanks,

Ed Lauzier

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=233169
