[GE users] complex use of complexes
ikaufman at ucsd.edu
Tue May 11 21:47:30 BST 2010
On Tue, May 11, 2010 at 1:37 PM, reuti <reuti at staff.uni-marburg.de> wrote:
On 11.05.2010 at 21:52, gragghia wrote:
> Are you suggesting to break the job up into two jobs with different
> resource requests? They would have to be running at the same time
> (something that I don't think you can guarantee), and MPI wouldn't
> know how to communicate with the processes of a different job.
In principle it's possible to hijack slots from another parallel job.
So you could submit one job with a request for 128 GB, and one
parallel job with e.g. 7 slots, requesting 2 GB for each slot as
usual. The parallel job would only have a `sleep` or the like inside
and use "job_is_first_task FALSE" (it could also wait for a file
"+DONE" written by the master job, so that it quits automatically).
The master job can then start tasks with `qrsh -inherit` on the slots
of the other job if you change $JOB_ID to the ID of the 7-slot job.
Depending on the MPI version in use, it might be tricky anyway.
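
A rough sketch of the two pieces described above, written in Python only for illustration. The helper names (`wait_for_done`, `inherit_command`, `inherit_env`) are made up for this example; only `qrsh -inherit`, $JOB_ID and the "+DONE" marker file come from the description. It assumes that overriding JOB_ID in the environment of the `qrsh` call is enough, which may depend on the SGE and MPI versions in use.

```python
import os
import time


def wait_for_done(marker_path, poll_seconds=30.0, timeout=None):
    """Body of the placeholder job: block until the marker file exists.

    Returns True once the file is seen, False if `timeout` seconds pass first.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not os.path.exists(marker_path):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True


def inherit_command(host, command):
    """Build the argv for starting `command` on `host` via `qrsh -inherit`."""
    return ["qrsh", "-inherit", host] + list(command)


def inherit_env(placeholder_job_id, base_env=None):
    """Copy the environment, pointing JOB_ID at the placeholder (7-slot) job."""
    env = dict(os.environ if base_env is None else base_env)
    env["JOB_ID"] = str(placeholder_job_id)
    return env
```

In the master job this could be used as `subprocess.run(inherit_command(host, cmd), env=inherit_env(job_id))`, where `host` is one of the hosts granted to the placeholder job and `job_id` is its job ID; how the master job learns both values is left open here.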
The bigger problem, as you mentioned, is how to force SGE to run both
jobs at the same time or not at all.
I wonder if it would be possible to use DRMAA calls to launch the jobs, with each MPI call using DRMAA to request the appropriate amount of RAM.
But it sounds as though, although the job is highly parallel, the rank 0 process is more of a master process to which the other processes are subservient. Maybe a better understanding of how the process works would help.
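
For what it's worth, a minimal sketch of what launching through DRMAA with per-job memory requests might look like, using the third-party drmaa-python binding. Everything site-specific here is an assumption: `h_vmem` as the memory resource, a parallel environment called `mpi`, and the helper names `native_spec` and `submit`. Jobs submitted this way are still independent jobs, so this does not by itself make the scheduler start them together.

```python
def native_spec(mem_gb, slots=1, pe_name="mpi"):
    """Build the qsub-style option string passed through DRMAA.

    `h_vmem` and the PE name "mpi" are assumptions; sites may use other names.
    """
    parts = ["-l", "h_vmem=%dG" % mem_gb]
    if slots > 1:
        parts += ["-pe", pe_name, str(slots)]
    return " ".join(parts)


def submit(command, args, mem_gb, slots=1):
    """Submit one job through DRMAA and return its job ID."""
    import drmaa  # third-party drmaa-python; needs the cluster's DRMAA library

    with drmaa.Session() as session:
        jt = session.createJobTemplate()
        jt.remoteCommand = command
        jt.args = list(args)
        jt.nativeSpecification = native_spec(mem_gb, slots)
        try:
            return session.runJob(jt)
        finally:
            session.deleteJobTemplate(jt)
```

With these helpers, the 128 GB job and the 7-slot job from the earlier suggestion would be two `submit` calls with `mem_gb=128` and `mem_gb=2, slots=7` respectively.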
Research Systems Administrator
UC San Diego, Jacobs School of Engineering ikaufman AT ucsd DOT edu