[GE users] Scaling up GE for huge number of jobs
rayrayson at gmail.com
Fri Jan 4 20:19:04 GMT 2008
What does qstat -j <job_id> say for one of the pending jobs?
Its output should include information from the scheduler on why new jobs
are not starting on the nodes.
On Jan 4, 2008 3:06 PM, Gary L Fox <garylfox at hotmail.com> wrote:
> However, our 2-core nodes remain half empty, in spite of the fairly low load
> values. Always in the past, the ideal has been to have 2 jobs per node (1
> job per CPU/core). We are using classic spooling with version 6.0u10. Is
> there something that may have gotten corrupted by this really large number
> of queued jobs? Is there any easy way to clear things out and reset SGE?
> Thank you,
> To: users at gridengine.sunsource.net
> From: Brett_W_Grant at raytheon.com
> Date: Wed, 2 Jan 2008 14:03:36 -0700
> Subject: Re: [GE users] Scaling up GE for huge number of jobs
> I think that this corresponds, but maybe not. I have a number of what I
> call semi-large clusters that I use to run simulations. Our IT dept goes by
> cores, not nodes, but one has 332 cores, one 268 cores, one 256 cores, and
> one 192 cores. The 268-core cluster is Macs; everything else is RH4 Linux.
> Basically, I have a simulation that takes a number of inputs, two of which
> are x & y positions, and calculates a result at that x,y location. Depending
> upon the other inputs, there are between 500 and 20,000 x,y positions for
> each set of inputs. Each x,y point takes between 5 seconds and 5 minutes to
> simulate. The important thing to know here is that all of the inputs except
> for x & y remain the same.
> The first thing that IT noticed was that due to the fast finish times of
> some of the sims, cores would sit idle. They hypothesized that the jobs are
> finishing before the scheduler can get back to that node, so what they did
> was greatly increase the number of slots per queue instance, to something
> like 3X the number of cores per node. This keeps the nodes from ever sitting idle. That
> isn't really the approach that I would take, but that is what they did. I
> don't know what the proper term for this is, but I call it, "way overloading
> the processor".
> I wrote a script that uses the SGE_TASK_ID parameters so that I could submit
> array jobs. From the command line, this makes a 500k job look like one
> line, more of a user convenience than anything else, although it does make
> the job submission easier and faster.
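The array-job pattern described above might look roughly like the following on the run-script side. This is a minimal sketch, not Brett's actual script: it assumes one set of inputs per line of a text file, and the helper names `line_for_task` and `current_task_id` are made up for illustration. The only SGE-specific piece is the SGE_TASK_ID environment variable, which SGE sets (1-based) for each task of an array job.

```python
import os

def line_for_task(lines, task_id):
    """Return the input line owned by a 1-based array task ID."""
    if not 1 <= task_id <= len(lines):
        raise IndexError(f"task {task_id} is outside 1..{len(lines)}")
    return lines[task_id - 1].strip()

def current_task_id(environ=os.environ):
    """Read the task ID that SGE exports to each array task."""
    return int(environ["SGE_TASK_ID"])
```

With a script like this, a single submission such as `qsub -t 1-500000 run.sh` (run.sh being a hypothetical wrapper) covers every input line, which is why the whole batch shows up as one line in the pending list.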
> We have had issues with file locking, slow nfs response, running out of
> inodes, and a myriad of other issues that crept up when we scaled up, so
> watch out for that, too.
> In the end, we rewrote the simulation, so rather than each instance of the
> simulation simulating 1 x,y pair, each instance would simulate all of the
> x,y pairs for that input condition. This works fairly well if you make
> sure not to send the jobs to queues that overload the processors. It does
> change the amount of time a grid job takes, though. We went from short
> jobs, to jobs that take hours and days to complete. The actual cpu time is
> the same, but if one user submits 70 48-hour jobs, user B has to wait 48
> hours before his jobs start. I know that people have set up priority queues,
> but our IT dept has not.
> Rewriting the simulation was rather drastic, so I later developed a submit
> script that would look at the input file, which for us is one set of input
> conditions per line, and submit every 20th line. The runscript was changed
> to run the line provided by SGE_TASK_ID and the next 19 past it. I picked
> 20 because that gave a job run time of about 15 minutes, which seems to work
> well in our situation.
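The "every 20th line" scheme above can be sketched as follows. Again, this is an illustration under assumptions rather than the actual submit script: the function names are hypothetical, and it relies on the step form of the array option, `qsub -t first-last:step`, under which SGE_TASK_ID takes the values 1, 21, 41, and so on.

```python
CHUNK = 20  # lines per task; the value picked in the post above

def task_range(n_lines, chunk=CHUNK):
    """Build the argument for qsub -t so tasks start at lines 1, 21, 41, ..."""
    return f"1-{n_lines}:{chunk}"

def lines_for_task(lines, task_id, chunk=CHUNK):
    """Return the lines a task owns: its own line plus the next chunk-1.

    The slice simply stops at the end of the file, so the last task
    may get fewer than `chunk` lines.
    """
    start = task_id - 1  # SGE_TASK_ID is 1-based
    return [ln.strip() for ln in lines[start:start + chunk]]
```

For a 45-line input file, `task_range(45)` gives "1-45:20", which yields three tasks; the third starts at line 41 and processes only the remaining five lines. The chunk size is the knob to tune: it trades scheduler load (fewer, longer tasks) against how long other users wait for a free slot.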
> I am sure that there are a lot of other ways to do this, but this is the path
> that we have taken.
> Good Luck,
> Brett Grant
> Gary L Fox <garylfox at hotmail.com> 01/02/2008 01:24 PM
> Please respond to
> users at gridengine.sunsource.net
> To <users at gridengine.sunsource.net>
> Subject [GE users] Scaling up GE for huge number of jobs
> I have a Linux cluster that is running RH4update 4 across all nodes (about
> 70 nodes total).
> We have SGE 6.0u10 running and have had very few problems for quite a while.
> However, our users have recently added a new type of job they run and they
> run these new jobs by the tens of thousands at a time.
> Currently, the queue contains 160K jobs.
> Well, needless to say, things seem to be running in slow motion now. The
> scheduler is running at around 100% CPU constantly.
> We were not getting any meaningful response from qmon or from the qsub and qstat
> commands, so I restarted SGE. I increased the schedule_interval from 15 seconds
> to 2 minutes. Between the restart and the increased interval, things seem to
> be working better, as we can now get a response from qmon and qstat and we
> can submit jobs too. But things are still very much like slow motion.
> The cluster does not seem to remain full with jobs. Some nodes have only
> one job running and a few even have no jobs. (Each node has 2 CPUs and normally
> would have 2 jobs running.)
> We also have noticed that jobs from different users do not balance out
> (through fair share) as they have in the past. Newly submitted jobs remain
> at the bottom of the queue with a priority of 0.0000. Earlier queued jobs
> from another user have a priority around 0.55 to 0.56.
> I have always had reservations turned off with max_reservation=0. I have
> the default value for max_functional_jobs_to_schedule set to 200. I also
> just changed maxujobs to 136 from a value of 0.
> What can I do to optimize the settings for this scenario and get better
> Thank you,
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net