[GE users] Altix Linux 64 ia64 cpus

Wal walid.shaari at gmail.com
Tue Mar 22 15:42:40 GMT 2005



On Tue, 22 Mar 2005 07:23:04 -0800 (PST), Ron Chen
<ron_chen_123 at yahoo.com> wrote:
 
> Have you read the user guide and admin guide? I went
> to the web training many years ago... when it was
> free. The content was OK, but I think now it is for
> SGE 6.0.

I am reading the admin guide slowly, as there are many distractions
at work, and this is just one of the machines I have to administer and
know nothing at all about, but I am coming to terms with it.
 

> OpenPBS has no failover, and PBSPro has 1 level of
> failover. In SGE, you can have many shadow masters
> running. BTW, what features are you looking for? And
> what kind of jobs are you planning to run on the
> cluster?

MPI-OpenMP jobs that use large memory models to do reservoir
simulations. We do not run that many on that machine: around 4-5 jobs
are queued per day, with run times of 5, 18, 48 or 72 hours. I have
not read about the failover capabilities yet. Can I even run that on a
low-end machine, and would it be useful to have? I am not sure yet.

> Also, anything specific they don't like about SGE? Or
> if anything you think PBS is better, send to this list
> and someone will help you :)

We are a team of 8 Linux admins managing around 4000 nodes in
different clusters of 128 nodes each. Almost all of the admins know
PBS inside out, and LoadLeveler too, but have problems with introducing
new software to the production environment. The main reason I
installed SGE in the first place was the slow response from Altair in
giving me an evaluation license for the Altix.

> > 3- qmon segfaults in sge5.3, does any one know the
> > fix?
> 
> What motif version do you have?

openmotif-2.2.2-16
openmotif-devel-2.2.2-16
openmotif21-2.1.30-8
 
> > 4- how do i find out the stats of how many cpus, and
> > memory were used,
> > elapsed time from something like qstat/qmon?
> 
> To find accounting of jobs, use qacct -j <job id>

Thanks, that was very useful.
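
For anyone else after the same numbers, here is a minimal sketch of
pulling slots, memory and elapsed time out of `qacct -j`. It assumes the
output is plain "key value" lines with records separated by a row of
"=" characters, and that fields such as slots, ru_wallclock, cpu, mem
and maxvmem are present. Those field names are an assumption about a
typical install and may differ between SGE versions, so check them
against your own output. The helper names parse_qacct and job_usage are
made up for illustration.

```python
import subprocess

# Field names assumed from typical `qacct -j` output; verify on your version.
FIELDS = ("slots", "ru_wallclock", "cpu", "mem", "maxvmem")


def parse_qacct(text):
    """Split qacct -j output into one dict per accounting record.

    Records are assumed to be separated by lines starting with '====='.
    """
    records, current = [], {}
    for line in text.splitlines():
        if line.startswith("====="):
            if current:
                records.append(current)
            current = {}
            continue
        parts = line.split(None, 1)  # key, then the rest of the line
        if len(parts) == 2:
            current[parts[0]] = parts[1].strip()
    if current:
        records.append(current)
    return records


def job_usage(job_id):
    """Run `qacct -j <job id>` and keep only the fields of interest."""
    out = subprocess.run(
        ["qacct", "-j", str(job_id)],
        capture_output=True, text=True, check=True,
    ).stdout
    return [{k: r.get(k) for k in FIELDS} for r in parse_qacct(out)]
```

Calling job_usage with a job id returns one dict per record, with None
for any field that the local qacct does not print.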

best regards

Walid.




