[GE users] 6.0 qstat - possible RFE

Andreas Haas Andreas.Haas at Sun.COM
Wed Nov 10 18:47:02 GMT 2004


On Fri, 5 Nov 2004, Pacey, Mike wrote:

> Hi folks,
>
> I've been trying out 6.0 with a view to upgrading our system, and
> there are a couple of features that I think might be useful to add.
> Thought I'd post here to see if there's any general call for these
> before I make any request for enhancement.
>
> 1) Qtop / qstat -ext
>
> I'm running the old 5.2.2 here, and after a while I started to think
> it'd be really useful for users to be able to monitor how their jobs are
> doing in terms of memory and % cpu usage. Users can then see if their
> jobs are, e.g. consuming too much memory, have stalled (no cpu usage),
> or bottlenecking on i/o (low cpu usage). My solution was a fairly crude
> script "qtop" which simply skims from qstat the list of hosts that are
> running a user's jobs, and then invokes top remotely on each host in
> turn to show the user's processes. It's a little messy from a user's
> perspective, but it's still fairly readable, though it can be a little
> slow if an execution node is heavily loaded.
>
> Looking at the docs for 6.0 the "qstat -ext" caught my eye - having the
> execds keep track of this kind of info is an excellent idea. But looking
> at it, I see that these are cumulative stats (i.e. cpu seconds consumed,
> and gigabyte-seconds). Maybe this has great use in some sort of billing,
> but I'd think it much more useful for users to see % cpu usage and
> memory usage in the vein of top and prstat. I think such figures can be
> easily computed from the existing -ext stats; I guess some more
> digging in pdc.c will confirm that. Would anyone else find that a useful
> feature?
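A rough sketch of that computation, assuming two samples of the
cumulative cpu (seconds) and mem (gigabyte-seconds) figures taken a
known interval apart. The UsageSample type and the interval_rates name
are made up for illustration and are not part of Grid Engine; parsing
the actual qstat -ext output is left out.

```python
from dataclasses import dataclass


@dataclass
class UsageSample:
    """One reading of a job's cumulative usage (hypothetical structure)."""
    wallclock: float        # time the sample was taken, in seconds
    cpu_seconds: float      # cumulative cpu time consumed so far
    mem_gb_seconds: float   # cumulative memory integral so far


def interval_rates(prev: UsageSample, curr: UsageSample) -> tuple[float, float]:
    """Return (% cpu, average memory in GB) over the interval prev -> curr."""
    elapsed = curr.wallclock - prev.wallclock
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    # cpu seconds gained per wallclock second, expressed as a percentage
    cpu_percent = 100.0 * (curr.cpu_seconds - prev.cpu_seconds) / elapsed
    # gigabyte-seconds gained per second is the average resident GB
    avg_mem_gb = (curr.mem_gb_seconds - prev.mem_gb_seconds) / elapsed
    return cpu_percent, avg_mem_gb
```

A job that gains 30 cpu seconds and 30 gigabyte-seconds over a 60 second
interval would come out at 50% cpu and an average of 0.5 GB.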
>
> 2) qstat output
>
> More a minor niggle this, but qstat output is something my users look at
> a lot, so it needs to be easily readable. Looking at my test qstat
> output, qstat now runs to 112 columns - not good for users with standard
> 80-column displays. Coming from the old 5.2.2 output, this is a bit of a
> leap, and there seem to be a couple of ways to rationalise this:
>
>    - the FQDN of an execution host isn't really necessary, even on
> systems with execution hosts in different domains, and it takes up a
> good few chars
>
>    - the year part of the submission/start date could be removed, as I
> can't imagine many users having year-long jobs (though of course it
> needs to be preserved for qacct output)
>
>    - the task id field takes up a lot of space simply because of the field
> name, the rather lengthy "ja-task-ID"
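
The first two suggestions in the list above could be sketched as a small
post-processing step. This assumes queue instances are printed as
queue@host and dates as MM/DD/YYYY HH:MM:SS; both function names are
hypothetical, and this is not how qstat itself is implemented.

```python
from datetime import datetime


def shorten_queue_instance(name: str) -> str:
    """Drop the domain part of the host in a 'queue@host.domain' string."""
    queue, sep, host = name.partition("@")
    if not sep:
        # No host part: leave it alone, since queue names may contain dots.
        return name
    return queue + "@" + host.split(".", 1)[0]


def drop_year(timestamp: str) -> str:
    """Turn 'MM/DD/YYYY HH:MM:SS' into 'MM/DD HH:MM:SS' (assumed format)."""
    parsed = datetime.strptime(timestamp, "%m/%d/%Y %H:%M:%S")
    return parsed.strftime("%m/%d %H:%M:%S")
```

With a made-up host, shorten_queue_instance("all.q@node01.example.com")
gives "all.q@node01", which saves the whole domain suffix on every line.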

Thanks for the feedback! I filed it under

   #1326 Default qstat output should not exceed 80 columns


>
> I could write a screen scraper to modify this stuff more to my taste - or
> modify the 'status' script which appears to be an attempt to do
> something similar by someone who also thought the current output was a
> bit unwieldy. I already have a few scripts to summarise my cluster
> output based on qstat output, but I'm wondering if the current qstat
> output needs rationalising a bit more, e.g. hiding some of the extra
> info like FQDNs away in extra command line options, or giving site admins
> higher level control over field selection and length (e.g. I'd be quite
> happy to have just 8 chars for username, and I've never had much use for
> the 'prior' field), or maybe even switching to multi-line output?

This is about customizing qstat output. It's already covered by RFE

   #77 output format of commands like qstat should configurable and easier to parse

The ideal solution would be a qstat -o option that lets you specify the
columns to be printed, very much like the ps -o option.

Regards,
Andreas
