[GE users] SGE Job Memory Usage
mhanby at uab.edu
Wed Apr 14 17:44:50 BST 2010
Grid Engine 6.2u5
CentOS 5 x86_64 with kernel 2.6.18-128.7.1.el5
We are working on getting our users to consider how much memory their job tasks will use. Currently, we are requiring that they request mem_free, but ultimately I'd like to get them using h_vmem.
Before we can switch to h_vmem, we need to answer the obvious question the users have been asking: how do I tell how much memory my job used, so that I know how much to request the next time I run it?
For a currently running job, if I look at:
$ qstat -j 230781 |grep ^usage
usage 1: cpu=01:23:58, mem=18004.07080 GBs, io=0.00000, vmem=3.859G, maxvmem=4.077G
Is this telling me that the job currently has used no more than 4.077GB of RAM (virtual and physical)?
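For reference, here is a minimal sketch of pulling vmem and maxvmem out of the usage line shown above so they can be compared with other tools. The helper names parse_usage and size_to_bytes are hypothetical, and treating the trailing "G" as a 1024-based multiplier is an assumption, not something the qstat output itself confirms.

```python
import re

# Usage line copied from the qstat -j output above.
USAGE_LINE = ("usage 1: cpu=01:23:58, mem=18004.07080 GBs, io=0.00000, "
              "vmem=3.859G, maxvmem=4.077G")

# Assumption: a trailing K/M/G suffix is a binary (1024-based) multiplier.
MULTIPLIERS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_usage(line):
    """Split a qstat usage line into a dict of key -> raw value string."""
    # Everything after the first colon is a comma-separated key=value list.
    _, _, fields = line.partition(":")
    usage = {}
    for item in fields.split(","):
        key, _, value = item.strip().partition("=")
        usage[key] = value
    return usage


def size_to_bytes(text):
    """Convert a value such as '4.077G' to bytes (hypothetical helper)."""
    match = re.fullmatch(r"([0-9.]+)([KMG]?)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, suffix = match.groups()
    return int(float(number) * MULTIPLIERS.get(suffix, 1))


usage = parse_usage(USAGE_LINE)
for key in ("vmem", "maxvmem"):
    print(key, usage[key], size_to_bytes(usage[key]), "bytes")
```

The same two helpers could be pointed at any other usage line copied from qstat -j, as long as it uses the same key=value layout.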
If I look at the process for this job on the compute node, ps reports it as using 48% of the system's RAM (on a 16GB system), which would be roughly 7.6GB.
$ ssh compute-1-5 ps auxf|grep jsmith
jsmith 28100 0.0 0.0 67992 1336 ? Ss 10:04 0:00 | \_ -bash /opt/gridengine/default/spool/compute-1-5/job_scripts/230781
jsmith 28206 0.0 0.0 65900 1208 ? S 10:04 0:00 | \_ sh /share/apps/R/R-2.9.0/gnu/lib/R/bin/Rcmd BATCH mainCode_Xthin_5.R run_thin_5.out
jsmith 28210 99.4 48.5 8106856 7972440 ? R 10:04 90:11 | \_ /share/apps/R/R-2.9.0/gnu/lib/R/bin/exec/R -f mainCode_Xthin_5.R --restore --save --no-readline
And free -m reports:
$ ssh compute-1-5 free -m
             total       used       free     shared    buffers     cached
Mem:         16050      12063       3986          0          7       1258
-/+ buffers/cache:      10798       5251
Swap:          996          0        996
If I sum up the other processes' % usage reported by the ps command, they add up to the 10GB usage reported by 'free', so it appears that this R process really is using that much memory.
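The arithmetic behind this comparison can be checked with a short sketch. It uses only the numbers quoted above: the VSZ and RSS columns from ps (in KiB), the total from free -m (in MiB) and maxvmem from qstat. Reading the "G" in maxvmem as GiB is an assumption, and the variable names are purely illustrative.

```python
KIB_PER_MIB = 1024
KIB_PER_GIB = 1024 ** 2

# Values for the R process (PID 28210) from the ps output above, in KiB.
vsz_kib = 8106856
rss_kib = 7972440

# "Mem: total" from free -m, in MiB.
mem_total_mib = 16050

# maxvmem reported by qstat -j; assumption: the G suffix means GiB.
maxvmem_gib = 4.077

vsz_gib = vsz_kib / KIB_PER_GIB
rss_gib = rss_kib / KIB_PER_GIB

# Resident memory as a share of total RAM, to compare with ps %MEM (48.5).
rss_percent = rss_kib / KIB_PER_MIB / mem_total_mib * 100

print(f"ps VSZ:        {vsz_gib:.2f} GiB")
print(f"ps RSS:        {rss_gib:.2f} GiB ({rss_percent:.1f}% of RAM)")
print(f"qstat maxvmem: {maxvmem_gib:.3f} GiB")
print(f"VSZ - maxvmem: {vsz_gib - maxvmem_gib:.2f} GiB")
```

Under these assumptions, both the virtual and the resident size seen by ps come out well above the maxvmem figure from qstat, which is the discrepancy being asked about.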
So I guess my question is really: are the resource usage values reported by qstat and the job email accurate, or do I need to gather the metrics elsewhere?
mhanby at uab.edu
Information Systems Specialist II
IT HPCS / Research Computing