[GE users] Accounting

Reuti reuti at staff.uni-marburg.de
Wed Feb 23 02:37:49 GMT 2005



Here is a 6-CPU job running on 3 machines, each granted 2 slots. Tight
integration with LAM/MPI on 5.3p6 (I can try it on 6.0 later today). That means
two forks per lamd daemon, so CPU = ru_utime = 2 * wallclock (okay, nearly).

With the other cluster having 56 CPUs granted for the job, there should be 56
qrsh commands (or fewer when using shared memory on some machines), with all
three entries equal.
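
To check this against the accounting file directly, a rough Python sketch along
these lines could sum the per-qrsh records of one job. The field positions are
an assumption taken from the accounting(5) layout (0-based: job_number 5,
ru_wallclock 13, ru_utime 14, ru_stime 15, slots 34, cpu 36), the function names
are made up for illustration, and the sample line is the record for job 12981
quoted further down in this thread.

```python
# Field positions (0-based) in a colon-separated accounting record.
# Assumption: layout as described in accounting(5) for SGE 5.3/6.0.
JOB_NUMBER, RU_WALLCLOCK, RU_UTIME, RU_STIME, SLOTS, CPU = 5, 13, 14, 15, 34, 36

# The record for job 12981 as quoted in this thread, joined back into one line.
RECORD = (
    "parallel:sub04n169:udo:udo:mpi_parallel:12981:sge:0:1108595203:"
    "1108626737:1108642501:0:0:15764:14427:16:0.000000:0:0:0:0:63669:"
    "83329:0:0.000000:0:0:0:0:0:0:NONE:Dept_GK:mpi:56:0:14443.000000:"
    "62.884431:0.000000:-U parallel,opteron,deadlineusers,Dept_GK "
    "-q parallel -pe mpi 20-122:0.000000:NONE:106033152.000000"
)


def parse_record(line):
    """Return the numeric fields of interest from one accounting line.

    Naive split on ':'; assumes no field before 'cpu' contains a colon.
    """
    f = line.rstrip("\n").split(":")
    return {
        "job": int(f[JOB_NUMBER]),
        "wallclock": float(f[RU_WALLCLOCK]),
        "utime": float(f[RU_UTIME]),
        "stime": float(f[RU_STIME]),
        "slots": int(f[SLOTS]),
        "cpu": float(f[CPU]),
    }


def sum_job(lines, job):
    """Add up utime and cpu over every record (one per qrsh) of a job."""
    recs = []
    for line in lines:
        # Skip blank lines and '#' comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        rec = parse_record(line)
        if rec["job"] == job:
            recs.append(rec)
    return {
        "records": len(recs),
        # Longest single record is taken as the job's wallclock.
        "wallclock": max((r["wallclock"] for r in recs), default=0.0),
        "utime": sum(r["utime"] for r in recs),
        "cpu": sum(r["cpu"] for r in recs),
    }


if __name__ == "__main__":
    total = sum_job([RECORD], 12981)
    slots = parse_record(RECORD)["slots"]
    print(total)
    # Close to 1.0 means every granted slot was busy for the whole run.
    print("cpu / (wallclock * slots) = %.3f"
          % (total["cpu"] / (total["wallclock"] * slots)))
```

For the quoted record the ratio comes out at roughly 0.016, i.e. only about one
slot's worth of CPU time was accounted for a 56-slot job. With a real accounting
file, passing the open file object as `lines` should work the same way.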

Cheers - Reuti


Quoting Ron Chen <ron_chen_123 at yahoo.com>:

> Reuti, do you have a tight PE? Can you see if qacct is
> reporting the sum of the CPU time for each MPI task
> for ru_utime?
> 
> Thanks,
>  -Ron
> 
> --- Reuti <reuti at staff.uni-marburg.de> wrote:
> > Sorry, it was an aborted job I looked at. ru_utime is also correct for a
> > job that ended normally. And, like CPU, it is only twice the value of
> > wallclock when threads/forks were used. - Reuti
> > 
> > Quoting Reuti <reuti at staff.uni-marburg.de>:
> > 
> > > Is this MPI implementation using qrsh? I have an entry for each machine
> > > (i.e. qrsh) in the accounting file (with indeed different wallclock/cpu
> > > by a factor of 2 on dual-headed nodes) (BTW: they are not summarized
> > > when using -j; the raw values are displayed instead).
> > >
> > > But interesting: utime is always 0 for me; it's in the entry CPU.
> > >
> > > CU - Reuti
> > > 
> > > Quoting Rayson Ho <raysonho at eseenet.com>:
> > > 
> > > > I need someone who knows more about the qmaster to take a look...
> > > >
> > > > Vik sent me the accounting file, and qacct -j reports:
> > > > ...
> > > > granted_pe   mpi
> > > > slots        56
> > > > failed       0
> > > > exit_status  0
> > > > ru_wallclock 15764
> > > > ru_utime     14427
> > > > ...
> > > > 
> > > > And here's the section of the accounting file for the job:
> > > >
> > > > parallel:sub04n169:udo:udo:mpi_parallel:12981:sge:0:1108595203:
> > > > 1108626737:1108642501:0:0:15764:14427:16:0.000000:0:0:0:0:63669:
> > > > 83329:0:0.000000:0:0:0:0:0:0:NONE:Dept_GK:mpi:56:0:14443.000000:
> > > > 62.884431:0.000000:-U parallel,opteron,deadlineusers,Dept_GK
> > > > -q parallel -pe mpi 20-122:0.000000:NONE:106033152.000000
> > > > 
> > > > qacct is reporting the right thing, so I think there is a problem in
> > > > qmaster when it adds the ru_utime for all the tasks in the parallel job.
> > > > Since the job is tightly integrated, shouldn't the ru_utime be much
> > > > larger than the ru_wallclock?
> > > > 
> > > > Rayson
> > > > 
> > > > 
> > > > 
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > > > For additional commands, e-mail: users-help at gridengine.sunsource.net
> > > > 




    [ Part 2, Application/X-MACBINARY (Name: "qacct.output") 11 KB. ]
    [ Unable to print this part. ]




