[GE users] how do you use -dl with qrsh? (qrsh will flag an error for the value of the flag, but if the value is valid it disowns the flag)
bill at baddogconsulting.com
Wed Dec 9 18:22:00 GMT 2009
On Wed, Dec 9, 2009 at 3:11 AM, reuti <reuti at staff.uni-marburg.de> wrote:
> Am 09.12.2009 um 03:37 schrieb bdbaddog:
>> On Tue, Dec 8, 2009 at 6:10 PM, reuti <reuti at staff.uni-marburg.de> wrote:
>>> Hi Bill,
>>> Am 09.12.2009 um 00:47 schrieb bdbaddog:
>>>> I'm running 6.2U4
>>>> I believe the argument to -dl here is valid, but then it errors with
>>>> -dl not known
>>>> qrsh -dl 12081339 -P lp -l s_rt=1200 -l h_rt=1380 -l coupons=2 -now n
>>>> -nostdin -cwd -N blah /bin/sh
>>>> error: Unknown option -dl
>>>> Here I expect it's invalid, but it admits there's a -dl flag.
>>>> qrsh -dl 1339 -P lp -l s_rt=1200 -l h_rt=1380 -l coupons=2 -now n
>>>> -nostdin -cwd -N blah /bin/sh
>>>> Invalid format of date/hour-minute field.
>>>> error: ERROR! Wrong date/time format "1339" specified to -dl option
>>>> Any idea what's going on here?
>>> It seems that SGE first checks the format of the supplied value.
>>> Only when that is fine does it discover that -dl is not a valid
>>> option for qrsh at all. There is a bug in either the documentation or
>>> the behavior of qrsh. I suggest filing a bug.
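For reference, qsub(1) documents the date_time argument taken by -dl as [[CC]YY]MMDDhhmm[.SS], which is consistent with 12081339 passing the format check and 1339 failing it. A minimal sketch for generating such a value in a script follows. It assumes GNU date (the "-d @SECONDS" form is a GNU extension), and the helper name sge_date_time is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print an epoch timestamp in SGE's date_time
# layout CCYYMMDDhhmm, interpreted in the local timezone (TZ).
# Assumes GNU date; "-d @SECONDS" is not portable to BSD date.
sge_date_time() {
    date -d "@$1" +%Y%m%d%H%M
}

# Example: a deadline two hours from now.
# deadline=$(sge_date_time "$(( $(date +%s) + 7200 ))")
# echo "$deadline"
```

This only produces a well-formed value; whether qrsh accepts -dl at all is the separate problem shown by the error above.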
>>> I'm not sure, whether a deadline is best for an interactive job while
>>> you are waiting for the results in front of the terminal, as only the
>>> priority up to the given date will rise - there is no guarantee that
>>> it will start for sure at that time. When you want something to run
>>> for sure, it's best to submit an advance reservation first - when
>>> it's granted the slots are reserved for your job. Then you can submit
>>> the actual (interactive) job into this advance reservation.
>>> In SGE, an interactive job is handled more like an immediate job:
>>> $ qrsh ... => will run in an interactive queue
>>> $ qsub ... => will run in a batch queue
>>> $ qrsh -now no ... => will run in a batch queue
>>> $ qsub -now yes ... => will run in an interactive queue
>>> So, to keep your qrsh job hanging around until the advance reservation
>>> starts, you need:
>>> $ qrsh -now no -ar 1234 ...
>> For historical reasons, the scripts I'm working on use qrsh in order
>> to wait for the completion of what is effectively a batch job.
>> They do use -now no.
>> The jobs will be asking for some consumable resources (Max for a given
>> type of host, as we're not yet running 6.2U3 or above) and can't use
>> exclusive access. The idea was to use -dl to have the priority bump up
>> and ensure at some point the job would be able to get exclusive access
>> to a node.
> Then I would suggest using an urgency policy with an attached
> complex for this followup job, so that this job is more important
> than others. I assume you use already -hold_jid, and as long as the
> predecessor isn't finished, it won't reserve anything. As soon as the
> first job finishes, the followup job will be on top of the waiting list.
We trickle up to 70 jobs per run per user via qrsh at a time (limited
by our scripting), no "-hold_jid", but "-now no" is on the command line.
We're trying to set up a mechanism to run benchmarks, ensuring the same
node type and no other jobs on the machine. Because we're running
6.2u1, and don't have a maintenance window in the near term to upgrade
to 6.2u4, I'm looking for a way to enable this.
We currently have a consumable resource on each node "coupons" which
represents the number of GB of RAM each machine has (I did see the
recent thread on a better way to implement this, and we'll be using
that for our next rev of cluster config), with each job requesting
its expected memory footprint, to prevent oversubscribing memory.
To ensure the benchmark jobs get a machine to themselves the plan is
to request the max # of coupons for the node type the benchmark will
run on. I was hoping to use -dl to ensure the test job doesn't get
resource starved and never get dispatched, and not have to change the
waiting time weight.
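A minimal sketch of the "request every coupon on the node" step. It assumes the per-host capacity appears in the complex_values line printed by qconf -se, and that the line is not wrapped onto continuation lines. The helper name complex_value and the host name node01 are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: read `qconf -se <host>` output on stdin and print
# the configured value of the named complex (e.g. "coupons").
# Assumes a single unwrapped line of the form:
#   complex_values coupons=16,slots=8
complex_value() {
    awk -v name="$1" '
        $1 == "complex_values" {
            n = split($2, kv, ",")
            for (i = 1; i <= n; i++) {
                split(kv[i], pair, "=")
                if (pair[1] == name) { print pair[2]; exit }
            }
        }'
}

# Usage (node01 is a placeholder host name):
#   max=$(qconf -se node01 | complex_value coupons)
#   qrsh -now no -l coupons="$max" ...
```

Reading the value from the host configuration avoids hard-coding one number per node type in the benchmark scripts.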
Any guidance you have on how to achieve these goals would be most helpful.
I'm hoping we have a window in late January to update to the latest
SGE at that point.