[GE users] qalter-ing consumable resource requests

olesen Mark.Olesen at emconTechnologies.com
Tue Dec 1 09:39:47 GMT 2009


> > that someone can check the behavior of the job. Otherwise you might  
> > get confused when you look at "qhost -F" and discover some  
> > inconsistency.
> 
> I don't really understand. I mean - being able to change any other
> option (that can be changed) of a running job, you introduce
> inconsistency into qstat output anyway, so what's the big deal?

If the allocated resources are things like licenses, you may well run
into interesting problems by 'lying' about how many resources a job
actually needs or uses.


> > (ssh to the headnode with hostbased authentication can help to avoid
> > that every machine is also a submit-host, you could also submit locally
> > on the node of course.)
> 
> I know what you mean, but we chose not to allow job submission from
> within the job. Right now, all computations, no matter how many
> interdependent jobs they consist of, are a static set of jobs bound to
> one DRMAA session that share certain configurations. Such a set is not
> altered very easily, so that's why the choice.

Since you are using exit 99 to resubmit the job anyhow, you could try
something like this approach:

- use the job context to 'remember' information between various stages.
- submit with a context = estimate
- estimate the true resource requirements
- place these requirements in the context string, flag with context = alter
- use qalter to place a hold on the job
- exit 99
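
The job-side steps above could be sketched roughly as follows. This is
an untested illustration, not a verified recipe: the context variable
name "stage", the "req_" prefix for remembered resources and the helper
names are all made up for the example, and it assumes the execution host
is allowed to run qalter at all.

```python
import subprocess
import sys


def job_side_commands(job_id, requirements):
    """Build the qalter calls for the estimating stage (nothing is run here)."""
    # Hypothetical encoding: one context variable per resource, prefixed
    # with "req_", plus stage=alter as the flag for the external script.
    pairs = ["stage=alter"]
    pairs += [f"req_{name}={value}" for name, value in sorted(requirements.items())]
    return [
        ["qalter", "-ac", ",".join(pairs), job_id],  # remember the estimate
        ["qalter", "-h", "u", job_id],               # place a user hold
    ]


def finish_estimate_stage(job_id, requirements):
    """Run the commands, then exit 99 so the job is rescheduled."""
    for argv in job_side_commands(job_id, requirements):
        subprocess.run(argv, check=True)
    sys.exit(99)
```

Inside the job, finish_estimate_stage(os.environ["JOB_ID"], {"lic": 2})
would be the last call of the estimating stage (with os imported).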

The job is now in a hold state.
An external script that runs regularly (e.g., bound into a load sensor or
as a separate daemon) checks for jobs in the hold state with context
'alter'. It adjusts the resource requirements with qalter, marks the
change in the job context and releases the hold.
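
A rough sketch of what that external script might do for one held job is
shown here. It is untested and rests on assumptions: the context is taken
to be a "name=value,..." string (for example as copied from the context
line of "qstat -j"), the variable "stage" and the "req_" prefix are
hypothetical names, and finding the held jobs (e.g. via "qstat -s h") is
left out.

```python
import subprocess


def parse_context(text):
    """Turn 'a=1,b=2' into {'a': '1', 'b': '2'}."""
    context = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        context[name] = value
    return context


def release_commands(job_id, context):
    """Build the commands for a job flagged stage=alter, else return []."""
    if context.get("stage") != "alter":
        return []
    resources = sorted(
        (name[len("req_"):], value)
        for name, value in context.items()
        if name.startswith("req_")
    )
    if not resources:
        return []
    request = ",".join(f"{name}={value}" for name, value in resources)
    return [
        # Note: as far as I know, qalter -l replaces the whole request list.
        ["qalter", "-l", request, job_id],
        ["qalter", "-ac", "stage=altered", job_id],  # mark the change
        ["qrls", job_id],                            # release the hold
    ]


def process_job(job_id, context_text):
    """Apply the adjustment to one held job."""
    for argv in release_commands(job_id, parse_context(context_text)):
        subprocess.run(argv, check=True)
```

Marking the context as "altered" before releasing the hold keeps the
script from adjusting the same job twice on its next pass.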

I haven't tested if this actually works -- it's just an idea.

/mark






