[GE users] Accounting of Parallel Jobs

Bradford, Matthew matthew.bradford at EDS.COM
Tue Jan 29 21:24:28 GMT 2008



Hi all,

This is an old problem, and I don't think there is a solution, but I thought I'd ask in case anybody has any ideas. We are running a loose integration with the SCore parallel environment, and SGE is unable to record accurate CPU usage for a job. We are looking at the ACCT_RESERVED_USAGE and SHARETREE_RESERVED_USAGE flags in the execd_params, which improve the reporting by giving us (wallclock time x slots). The problem is that all the SCore parallel jobs use only 1 slot per node, even though they use all 4 cores on a node.

This would be OK if every job were a parallel SCore job, but some of the jobs are simple serial jobs, which run within a serial queue and use 1 slot per core. The accounting problem is then that a serial job using 1 slot is reported as using the same amount of CPU as a parallel job using 1 slot but 4 cores. This will cause problems with a share tree setup, as a group which tends to run serial jobs will be penalised compared to a group that tends to run parallel jobs.
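
To make the anomaly concrete, here is a small sketch of the arithmetic. It is only an illustration: the function name and the one-hour run time are made up for the example, and it assumes reserved usage is simply wallclock time multiplied by slots, as described above.

```python
CORES_PER_NODE = 4  # our nodes have 4 cores


def reserved_cpu_seconds(wallclock_seconds, slots):
    """Reserved usage as described above: wallclock time multiplied by slots.

    Hypothetical helper for illustration; not an SGE function.
    """
    return wallclock_seconds * slots


# A one-hour serial job: 1 slot, 1 core.
serial_reported = reserved_cpu_seconds(3600, slots=1)

# A one-hour SCore job on one node: 1 slot, but all 4 cores are busy.
parallel_reported = reserved_cpu_seconds(3600, slots=1)
parallel_actual = 3600 * CORES_PER_NODE

print(serial_reported, parallel_reported, parallel_actual)
```

Both jobs are charged the same reserved usage, although the parallel job occupies four times as many cores.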

Is there any way of scaling the usage of the slots on a per-cluster-queue basis, so that a single slot within a parallel queue is equivalent to 4 slots within a serial queue?
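
As a sketch of the behaviour I am asking about (not something I know SGE to support): the queue names and the SLOT_WEIGHT table below are hypothetical, and the weights assume 4 cores per node.

```python
# Hypothetical per-queue slot weights; the queue names are made up.
SLOT_WEIGHT = {
    "serial.q": 1,    # 1 slot = 1 core
    "parallel.q": 4,  # 1 slot = 1 node = 4 cores
}


def scaled_usage(queue, wallclock_seconds, slots):
    """Wallclock time x slots, scaled by the weight of the queue."""
    return wallclock_seconds * slots * SLOT_WEIGHT[queue]


serial_usage = scaled_usage("serial.q", 3600, slots=1)
parallel_usage = scaled_usage("parallel.q", 3600, slots=1)
print(serial_usage, parallel_usage)
```

With a weighting like this, a one-slot job in the parallel queue would be charged four times the usage of a one-slot job in the serial queue, which matches the cores actually occupied.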

Alternatively, and in the longer term, is there any intention of providing functionality where a user can request a number of nodes and then a number of cores per node, rather than the single "slots" parameter? This would mean that our current configuration, where the parallel queues only offer 1 slot, could be changed so that SGE understands that a user is requesting multiple cores, which would reduce the reporting anomaly.

Any advice would be much appreciated.

Thanks,

Mat


Matthew Bradford
Information Analyst
Applications Services Field Operations EMEA
UKIMEA RABU 
EDS c/o Rolls-Royce Plc, Moor Lane
PO Box 31
Derby
DE24 8BJ

email:	matthew.bradford at eds.com
Office:	+44 01332 2 22059
