[GE users] parallel environment, advanced reservation

peeter phimmelf at hotmail.com
Tue Oct 26 01:27:31 BST 2010

Users submit single-slot jobs that spawn/fork CPU-bound threads, in some cases eventually taking up 16 cores on a 24-core host. Often a job will run for 5 days, yet use 16 threads (slots) for only 10 hours of the entire 5-day run.

I understand how to configure a PE and then have the user pass, for example, '-pe smp 16' on the command line for the job described above. But, obviously, that will tie up all 16 slots for the entire 5 days.
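
To put a rough number on that cost, here is a back-of-the-envelope sketch in Python. It assumes, as a simplification, that the job uses all 16 slots for 10 hours in total and a single slot for the rest of the 120-hour run; the function name slot_hours is only for illustration.

```python
def slot_hours(pe_slots, run_hours, busy_hours):
    """Return (reserved, used, idle) slot-hours for one job.

    Assumption: the job needs all pe_slots for busy_hours in total
    and exactly one slot for the remainder of the run.
    """
    reserved = pe_slots * run_hours                       # held by the PE request
    used = pe_slots * busy_hours + (run_hours - busy_hours)  # actually consumed
    idle = reserved - used                                # held but doing nothing
    return reserved, used, idle


reserved, used, idle = slot_hours(pe_slots=16, run_hours=5 * 24, busy_hours=10)
print(f"reserved={reserved} used={used} idle={idle} "
      f"idle share={idle / reserved:.0%}")
```

Under those assumptions the request holds 1920 slot-hours while the job consumes 270, so roughly 86% of what is held sits idle.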

I'm looking for techniques/strategies that would allow other jobs to use the other 15 cores while the job isn't spawning/forking. Some further characteristics of the workload, and related questions:

1. The job cycles between 1 and 16 threads over the course of the 5 days.

2. If advance reservation is used, what if the user can't provide exact times for when that single job's thread count increases and decreases?

3. When advance reservation is used, and, let's say, a job reserves 16 slots on a single host for 23 hours on Saturday, do all other submitted jobs have to provide start and stop times so that the reservation system knows which jobs to prevent from running? For instance, what if a user submits a job on Friday night for 20 slots on the same host, but doesn't know how long it's going to run and doesn't provide a start/stop time? What will happen to the Friday night job if it runs into Sunday?

Forever indebted.


