[GE users] array job accounting or scheduling problem

Pascal Wassam pascal at blur.compbio.ucsf.edu
Tue Jun 19 22:48:26 BST 2007



I am seeing an interesting problem for which I have not been able to find a remedy or any information:

Array job usage seems to be accounted for incorrectly.

Example:

100 CPUs on the cluster, OS fairshare policy, evenly balanced share tree, SGE 6.1 (I have also seen this on 6.0u10).
User a submits 1000 jobs.
User b submits 1000 jobs.
User c submits 1 array job, with 1000 members.

Results look something like this:
48 of user a's jobs run at any time.
48 of user b's jobs run at any time.
4 of user c's array job members run at any time.
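
For comparison, here is a rough sketch of what I would naively expect from an evenly balanced share tree with three equally active users: about a third of the slots each. This is only illustrative arithmetic, not how the SGE scheduler actually computes shares (it ignores usage decay, tickets, and any other policy weighting), and the function names are hypothetical.

```python
# Hypothetical illustration: compare the observed running-slot counts
# against a naive even split. This is NOT the SGE share tree algorithm.
TOTAL_SLOTS = 100

# Observed number of running jobs/tasks per user, from the example above.
observed = {"a": 48, "b": 48, "c": 4}


def even_share(total_slots, users):
    """Slots per user if every active user received an equal share."""
    return total_slots / len(users)


def deviation(observed, total_slots):
    """Observed minus expected slots for each user (negative = under-served)."""
    expected = even_share(total_slots, observed)
    return {user: running - expected for user, running in observed.items()}


if __name__ == "__main__":
    expected = even_share(TOTAL_SLOTS, observed)
    for user, delta in deviation(observed, TOTAL_SLOTS).items():
        print(f"user {user}: running={observed[user]:3d} "
              f"expected~{expected:.1f} delta={delta:+.1f}")
```

By this naive measure, user c is running roughly 29 slots fewer than an even split would give.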

If the queue is empty except for user c's jobs, they all begin executing.

Looking at fairshare usage (via qmon) shows that user c's "Actual Resource
Share" (Policy Configuration -> Share Tree Policy) is very high (roughly 50-80%).

I can provide detailed configuration on request.

Has anyone seen this before, or have any ideas what this is?

Thanks,
-Pascal
