[GE users] Problems with PEs and resource quotas
reuti
reuti at staff.uni-marburg.de
Tue Dec 14 20:34:14 GMT 2010
On 14.12.2010 at 21:02, mdsteeves wrote:
> On 12/14/10 5:18 AM, reuti wrote:
>
> [SNIP]
>
>> I don't see any resource reservation in the above lines: #$ -R
>>
>> And for it to have an effect, "max_reservation 20" (or another appropriate value) must be set in the scheduler configuration. Slots should then be reserved for this job, so that it won't starve.
>>
>> Does this fix the issue?
>
>
> Resource reservation for the resource quota piece? We don't use that at
> the moment -- the moe_limit that's currently in place limits each user
> to only be able to have 20 jobs running, which is the behavior that we
> want. The problem we're having is that other jobs, that don't need or
> use these licenses, get stuck in a "qw" state, and reference the
> moe_limit resource quota. If we go in and disable the resource quota,
> then the job gets dispatched to a node and runs without problem.
AFAICS, every example you listed as not working limits the number of eligible queue instances:
## #$ -l q=mpi.q
## #$ -l hostname="compute-0-2"
## #$ -l hostname...
Hence SGE has fewer options for scheduling the job. Or does it also happen in an empty cluster?
Nevertheless, one bug worth mentioning is that you can't use -q in combination with -l h=. The workaround is to put the hostname into the -q request:
-q mpi.q@compute-0-2
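As a rough sketch of how a submission using this form might be put together: the queue and host names are the ones from your examples, while the PE name "mpi", the slot count 4 and the script name job.sh are only placeholders, since I don't know your actual values. The command is printed rather than executed, so it can be checked first.

```shell
#!/bin/sh
# Sketch only. queue and host are taken from the examples in this thread;
# the PE name "mpi", the slot count 4 and job.sh are placeholders.
queue="mpi.q"
host="compute-0-2"

# Fold the host into the -q request instead of adding -l hostname=...
qreq="${queue}@${host}"

# -R y asks for a reservation; it only matters if max_reservation
# in the scheduler configuration is greater than 0.
cmd="qsub -pe mpi 4 -R y -q ${qreq} job.sh"

# Print the command instead of running it.
echo "$cmd"
```

If the printed line looks right, it can be run as is, or the same options can go into the job script as #$ lines.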
-- Reuti
> If we don't use either "-l qname=...." or "-l hostname=...." when we
> submit the job, then it launches without problem.
>
> If we don't specify a parallel environment, but leave the -l requests in
> the job submission, then it launches without a problem.
>
> While I haven't tested each and every resource that could be requested
> when a job is submitted, the jobs only seem to stick in a qw state if we
> try to request either a queue or a host.
>
>
> -Mike
> --
> Michael Steeves (mdsteeves at gmail.com)
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=305578
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=305586