[GE users] integrating matlab distributed computing engine

Ben Singer bdsinger at Princeton.EDU
Thu Mar 17 01:33:02 GMT 2005

On Mar 16, 2005, at 5:55 PM, Rayson Ho wrote:

>> In DCE,
>> matlab worker processes execute tasks submitted by users. However,
>> either the workers are run as a fixed user, and do work on behalf
>> of multiple users, or users can launch workers reserved for
>> themselves as needed.
> After reading your mail and visiting the MathWorks site, I am still not
> clear how things work in the distributed computing engine.
> You need to explain how MDCE works since not everyone here has access
> to Matlab.

MDCE is a new product that came out in November 2004, I believe, so 
not many have heard of it. It is essentially a job submission and 
scheduling system that lives in the matlab domain. Matlab clients on 
workstations submit matlab commands to a job manager, which schedules 
them to run on matlab execution hosts. The execution hosts run engine 
daemons that are essentially headless matlab processes ("workers") that 
only accept tasks from the job manager, execute them, and send the 
results back.
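
To make the model above concrete for readers without Matlab, here is a minimal sketch in Python. It is not the MDCE API: the names worker and run_job are hypothetical, threads stand in for the headless matlab workers, and an in-memory queue stands in for the job manager. The point it illustrates is that each task carries its submitting user even though every worker runs under one owner.

```python
import queue
import threading


def worker(tasks, results):
    """Headless worker: accept tasks from the job manager, run them, send results back."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: the job manager is shutting this worker down
            break
        task_id, user, func, args = item
        results.put((task_id, user, func(*args)))


def run_job(submissions, n_workers=2):
    """Toy job manager: queue the submitted tasks and let any free worker take them.

    submissions: list of (user, func, args) tuples (hypothetical format).
    Returns {task_id: (user, result)}.
    """
    tasks, results = queue.Queue(), queue.Queue()
    pool = [threading.Thread(target=worker, args=(tasks, results))
            for _ in range(n_workers)]
    for t in pool:
        t.start()
    for task_id, (user, func, args) in enumerate(submissions):
        tasks.put((task_id, user, func, args))
    for _ in pool:
        tasks.put(None)  # one shutdown sentinel per worker
    for t in pool:
        t.join()
    out = {}
    while not results.empty():
        task_id, user, value = results.get()
        out[task_id] = (user, value)
    return out
```

All workers here share one process owner, so anything that only looks at process ownership sees a single user, whichever user submitted the task.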

> Does MDCE need to be started as root??

That's the way I'm running it now, but if I want to hold users 
accountable for the tasks they send to the daemon, I think I need to 
do things differently.

> "locks workers to users" - what does this mean?

Sorry for the vagueness. What I meant is that if users start up an mdce 
worker themselves, then that worker is only going to do work for that 
user. Since each worker uses up a matlab license, and is often idle 
between running tasks, it is better to make that worker available to 
all users and have it run as some fixed user. But that leads to the 
accounting problem.

As I say, this isn't a problem specific to mdce; I'm sure there are 
other cases like it. I can think of solutions such as having a watchdog 
process change the owner of the worker processes dynamically via 
setuid(2) or something similar, but I don't think SGE checks process 
ownership after submission, is that right? Another idea is to charge 
users post hoc for the cpu cycles each task consumed on the worker, 
given the user and the start/stop time of each task, but I don't know 
how or in what form usage information could be inserted into SGE.
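
The post-hoc idea can be sketched in a few lines of Python, assuming the job manager can produce a log of (user, start, stop) records. The log format, the function name usage_by_user and the sample entries are all made up for illustration, and wall-clock time on the worker is used as a stand-in for cpu time. How the resulting totals would be fed into SGE is still the open question.

```python
from collections import defaultdict
from datetime import datetime


def usage_by_user(task_log):
    """Sum the seconds each user's tasks occupied a worker.

    task_log: iterable of (user, start, stop) with ISO 8601 timestamps
    (hypothetical format). Wall-clock time approximates cpu time, which
    is an assumption: an idle or waiting task is charged as if it were busy.
    """
    totals = defaultdict(float)
    for user, start, stop in task_log:
        elapsed = datetime.fromisoformat(stop) - datetime.fromisoformat(start)
        totals[user] += elapsed.total_seconds()
    return dict(totals)


# Made-up sample records, for illustration only.
log = [
    ("alice", "2005-03-16T10:00:00", "2005-03-16T10:05:00"),
    ("bob",   "2005-03-16T10:05:00", "2005-03-16T10:06:30"),
    ("alice", "2005-03-16T10:07:00", "2005-03-16T10:08:00"),
]
totals = usage_by_user(log)
```

Per-user totals like these are the "usage information" in question; they would still all have been accounted to the fixed user inside SGE.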

So perhaps I should have framed the question this way: how can I tell 
SGE about cycles users have consumed outside of SGE, so that policies 
such as share-tree by project can be as accurate as possible?

> Rayson
>> The former is more convenient and efficient, and one will never
>> run out of licenses, but SGE cannot track usage by user-- it is all
>> accounted to the fixed user. The latter would allow user-level usage
>> accounting, but defeats much of the purpose of DCE since it locks
>> workers to users.
>> This seems like a general problem-- services running on clusters that
>> do work on behalf of users that connect to them. How can the resources
>> consumed by such processes be correctly accounted for in SGE policy?
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
