[GE users] 6.0 qstat - possible RFE

Charu Chaubal Charu.Chaubal at Sun.COM
Wed Nov 10 18:55:58 GMT 2004


Hello,

On Nov 10, 2004, at 10:47 AM, Andreas Haas wrote:

> On Fri, 5 Nov 2004, Pacey, Mike wrote:
>
>> Hi folks,
>>
>> I've been trying out 6.0 with a view to upgrading our system, and
>> there are a couple of features that I think might be useful to add.
>> Thought I'd post here to see if there's any general call for these
>> before I make any request for enhancement.
>>
>> 1) Qtop / qstat -ext
>>
>> I'm running the old 5.2.2 here, and after a while I started to think
>> it'd be really useful for users to be able to monitor how their jobs
>> are doing in terms of memory and % cpu usage. Users can then see if
>> their jobs are, e.g., consuming too much memory, have stalled (no cpu
>> usage), or are bottlenecking on i/o (low cpu usage). My solution was a
>> fairly crude script "qtop" which simply skims from qstat the list of
>> hosts running a user's jobs, and then iteratively invokes a remote
>> call to top to show the user's processes. It's a little messy from a
>> user's perspective, but it's still fairly readable, though it can be a
>> little slow if an execution node is heavily loaded.
>>
>> Looking at the docs for 6.0, the "qstat -ext" option caught my eye:
>> having the execds keep track of this kind of info is an excellent
>> idea. But looking at it, I see that these are cumulative stats (i.e.
>> cpu seconds consumed, and gigabyte-seconds). Maybe this has great use
>> in some sort of billing, but I'd think it much more useful for users
>> to see % cpu usage and memory usage in the vein of top and prstat. I
>> think such figures can be easily computed from the existing -ext
>> stats; I guess some more digging in pdc.c will confirm that. Would
>> anyone else find that a useful feature?
>>
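As a rough illustration of the computation suggested above, the sketch below turns two successive readings of the cumulative counters into top-style figures: % cpu is the change in cpu seconds divided by the elapsed wallclock time, and average memory in use is the change in gigabyte-seconds divided by the same interval. The UsageSample type, its field names, and usage_rates are hypothetical names chosen for this example; how the readings are actually obtained (qstat -ext, pdc.c, or otherwise) is not addressed here.

```python
from dataclasses import dataclass


@dataclass
class UsageSample:
    """One reading of a job's cumulative usage (hypothetical structure)."""
    wallclock: float  # time of the reading, in seconds
    cpu: float        # cumulative cpu seconds consumed so far
    mem: float        # cumulative memory usage so far, in gigabyte-seconds


def usage_rates(prev: UsageSample, curr: UsageSample) -> tuple[float, float]:
    """Return (% cpu, average memory in GB) over the interval prev -> curr."""
    elapsed = curr.wallclock - prev.wallclock
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    cpu_percent = 100.0 * (curr.cpu - prev.cpu) / elapsed
    mem_gb = (curr.mem - prev.mem) / elapsed
    return cpu_percent, mem_gb


if __name__ == "__main__":
    earlier = UsageSample(wallclock=0.0, cpu=100.0, mem=50.0)
    later = UsageSample(wallclock=60.0, cpu=130.0, mem=80.0)
    # 30 cpu seconds over 60 s of wallclock is 50% cpu;
    # 30 GB-seconds over 60 s is an average of 0.5 GB in use.
    print(usage_rates(earlier, later))
```

A stalled job would show up as roughly 0% cpu between two samples, and an i/o-bound job as a low but non-zero figure.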
>> 2) qstat output
>>
>> More a minor niggle this, but qstat output is something my users look
>> at a lot, so it needs to be easily readable. Looking at my test qstat
>> output, qstat now runs to 112 columns, which is not good for users
>> with standard 80-column displays. Coming from the old 5.2.2 output,
>> this is a bit of a leap, and there seem to be a couple of ways to
>> rationalise this:
>>
>>    - the FQDN of an execution host isn't really necessary, even on
>> systems with execution hosts in different domains, and it takes up a
>> good few chars
>>
>>    - the year part of the submission/start date could be removed, as I
>> can't imagine many users having year-long jobs (though of course it
>> needs to be preserved for qacct output)
>>
>>    - the task id field takes up a lot of space simply because of the
>> field name, the rather lengthy "ja-task-ID"
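The first two suggestions in the list above could be prototyped in a wrapper script before any change to qstat itself. The sketch below is only an illustration: short_host and drop_year are hypothetical helper names, and the "queue@host.domain" and "MM/DD/YYYY HH:MM:SS" field formats are assumptions that should be checked against real qstat output.

```python
from datetime import datetime


def short_host(queue_instance: str) -> str:
    """Strip the domain from a 'queue@host.domain' field (assumed format)."""
    queue, sep, host = queue_instance.partition("@")
    if not sep:
        # No host part present, so leave the field untouched.
        return queue_instance
    return queue + sep + host.split(".", 1)[0]


def drop_year(timestamp: str) -> str:
    """Drop the year from an 'MM/DD/YYYY HH:MM:SS' field (assumed format)."""
    parsed = datetime.strptime(timestamp, "%m/%d/%Y %H:%M:%S")
    return parsed.strftime("%m/%d %H:%M:%S")


if __name__ == "__main__":
    print(short_host("all.q@node1.example.com"))
    print(drop_year("11/05/2004 09:30:00"))
```

Between them these two changes save about a dozen or more characters per row, depending on the length of the domain name.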
>
> Thanks for the feedback! I filed it under
>
>    #1326 Default qstat output should not exceed 80 columns
>
>
>>
>> I could write a screen scraper to modify this stuff more to my taste,
>> or modify the 'status' script, which appears to be an attempt to do
>> something similar by someone who also thought the current output was a
>> bit unwieldy. I already have a few scripts to summarise my cluster
>> output based on qstat output, but I'm wondering if the current qstat
>> output needs rationalising a bit more, e.g. hiding some of the extra
>> info like FQDNs away in extra command line options, or giving site
>> admins higher level control over field selection and length (e.g. I'd
>> be quite happy to have just 8 chars for username, and I've never had
>> much use for the 'prior' field), or maybe even switching to multi-line
>> output?
>

I think it's important to point out that N1GE 6 qstat already has an
option (-xml) to output information in XML format.  You can then feed
this to an XML parser in your favorite language (Perl, Java, etc.) and
get exactly the information you need without worrying about field sizes,
line lengths, multi-line records, etc.
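
To make the XML route concrete, here is a minimal sketch in Python that picks a chosen set of fields out of XML job records. The embedded sample document and its element names (job_list, JB_job_number, JB_owner, state, queue_name) are assumptions modeled on the kind of output qstat -xml produces and should be checked against the real output of your installation; job_rows is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample standing in for real qstat XML output (assumed layout).
SAMPLE_XML = """<job_info>
  <queue_info>
    <job_list state="running">
      <JB_job_number>101</JB_job_number>
      <JB_name>sim_a</JB_name>
      <JB_owner>mike</JB_owner>
      <state>r</state>
      <queue_name>all.q@node1.example.com</queue_name>
    </job_list>
  </queue_info>
  <job_info>
    <job_list state="pending">
      <JB_job_number>102</JB_job_number>
      <JB_name>sim_b</JB_name>
      <JB_owner>mike</JB_owner>
      <state>qw</state>
    </job_list>
  </job_info>
</job_info>"""


def job_rows(xml_text, fields=("JB_job_number", "JB_owner", "state")):
    """Return one tuple per job containing only the requested fields."""
    root = ET.fromstring(xml_text)
    rows = []
    for job in root.iter("job_list"):
        # Fields a job does not carry (e.g. no queue while pending) become "".
        rows.append(tuple(job.findtext(name, default="") for name in fields))
    return rows


if __name__ == "__main__":
    for row in job_rows(SAMPLE_XML):
        print(" ".join(row))
```

In practice the XML text would come from the command itself, for example via subprocess.run(["qstat", "-xml"], capture_output=True, text=True).stdout, and the fields tuple decides which columns appear and in what order.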

Regards,
	Charu


> This is on customizing qstat output. It's already covered by RFE
>
>    #77 output format of commands like qstat should be configurable and
> easier to parse
>
> The ideal solution would be a qstat -o option that lets you specify
> the columns to be printed, very much like the ps -o option.
>
> Regards,
> Andreas
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
###############################################################
# Charu V. Chaubal				# Phone: (650) 786-7672 (x87672)
# Grid Computing Technologist	# Fax:   (650) 786-4591
# Sun Microsystems, Inc.			# Email: charu.chaubal at sun.com
###############################################################

