[GE users] We get "GE maintrunk" when we start the daemons

Esteban Freire Garcia esfreire at cesga.es
Mon Dec 3 10:56:29 GMT 2007



Hi Andy,

Thanks for your answer. I'm sorry, I didn't know about the new features
in the latest version of SGE, nor about the NPTL threading library, but
I'm glad they aren't bugs. We have a paravirtualized Xen machine with
this kernel:

[root@sa3-ce ~]# uname -a
Linux sa3-ce.egee.cesga.es 2.6.18-1.2798.fc6xen #1 SMP Mon Oct 16 14:59:01 EDT 2006 i686 i686 i386 GNU/Linux
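
As a side note, on glibc-based systems the threading library in use can
usually be queried with `getconf GNU_LIBPTHREAD_VERSION`. The sketch below
wraps that check. The function name `threading_model` is made up for
illustration, and the example strings in the comments are typical values,
not output taken from this host.

```shell
#!/bin/sh
# Classify a GNU_LIBPTHREAD_VERSION string, e.g. "NPTL 2.5" or
# "linuxthreads-0.10". threading_model is a hypothetical helper name.
threading_model() {
    case "$1" in
        NPTL*)         echo "nptl" ;;
        linuxthreads*) echo "linuxthreads" ;;
        *)             echo "unknown" ;;
    esac
}

# Ask glibc which implementation is active. On systems where getconf does
# not know this variable, the string stays empty and we report "unknown".
version=$(getconf GNU_LIBPTHREAD_VERSION 2>/dev/null || true)
echo "threading library: $(threading_model "$version")"
```

If this reports linuxthreads, seeing one `ps` entry per thread is expected.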

I talked with the person who compiled the code. He checked the source
tree and found the following:
 
cat gridengine/CVS/Root
:pserver:guest@cvs.sunsource.net:/cvs
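
CVS/Root only records the repository location. Whether a working copy is
pinned to a release tag is recorded in a separate CVS/Tag file, which is
absent for a plain trunk checkout. A minimal sketch of that check follows;
`checkout_kind` is a hypothetical helper name, and the directory name
`gridengine` is taken from the command above.

```shell
#!/bin/sh
# Report which sticky tag (if any) a CVS working copy is pinned to.
# checkout_kind is a hypothetical helper; pass the working-copy directory.
checkout_kind() {
    dir="$1"
    if [ ! -d "$dir/CVS" ]; then
        echo "not a CVS working copy"
    elif [ -f "$dir/CVS/Tag" ]; then
        # The first character is the tag type (T branch, N non-branch tag,
        # D date); the rest of the line is the tag name or date.
        sed -n '1s/^.//p' "$dir/CVS/Tag"
    else
        echo "trunk (no sticky tag)"
    fi
}

checkout_kind gridengine
```

A checkout made with `-r V61u2_TAG` would be expected to print that tag
name here, while a trunk checkout has no CVS/Tag file at all.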

Thanks,
Esteban

Andy Schwierskott wrote:
> Hi Esteban,
>
> first, congratulations on your successful compile! See my answers below.
>
>> We have installed SGE 6.1u2, compiled from the source, on a virtual
>> machine with SL4 (Red Hat Enterprise 4), and we see some strange
>> things, for example:
>>
>> - If we execute the command 'qstat -s z' as user root, we don't see
>> the finished jobs. However, as the user who submitted the job,
>> 'qstat -s z' shows the finished jobs for this user.
>
> This is the new 6.1 behavior. By default a user only sees his own jobs.
> Add a '-u "*"' to the command line or add
>
>    -u *
>
> to the system-wide or local "sge_request/.sge_request" file to see all
> jobs by default.
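
The file-based variant can be sketched as a small idempotent helper. The
name `add_all_users_default` is hypothetical, and the file path is passed
in on purpose: the message above names sge_request/.sge_request, while
some SGE versions document sge_qstat/.sge_qstat for qstat defaults, so
check the man pages of your installation before choosing the file.

```shell
#!/bin/sh
# Append the default option "-u *" to a request file unless it is already
# present. add_all_users_default is a hypothetical helper name.
add_all_users_default() {
    file="$1"
    # -x matches whole lines only, -F treats the pattern as a fixed string,
    # -e keeps grep from reading the leading "-u" as an option.
    if ! grep -qxF -e '-u *' "$file" 2>/dev/null; then
        printf '%s\n' '-u *' >> "$file"
    fi
}

# One-off alternative that needs no file change:
#   qstat -s z -u "*"
# File-based variant (path is an assumption, see the note above):
#   add_all_users_default "$HOME/.sge_request"
```

Running the helper twice leaves a single "-u *" line in the file.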
>
>> - When we execute 'qconf -help', we don't see the installed version
>> but "GE maintrunk" instead. We also get "GE maintrunk" when we
>> start the daemons.
>> ------------------------------------------------------------------------------------------------------------------------------------------------- 
>>
>> [root@sa3-ce ~]# /etc/init.d/sgemaster start
>>  starting sge_qmaster
>>  starting sge_schedd
>> starting up GE maintrunk (lx26-x86)
>> ------------------------------------------------------------------------------------------------------------------------------------------------- 
>>
>
> Did you use the source tarball (tar.gz) from the Documents & files
> download page
>
>    http://gridengine.sunsource.net/servlets/ProjectDocumentList
>
> or did you check out the code yourself:
>
>    cvs co -r V61u2_TAG
>    cvs co -r V61u3_TAG   (released two days ago)
>
> This is obviously not SGE 6.1u2, but code from the maintrunk (or
> there's a small bug in the code which gives the wrong version number).
>
> What does your
>
>    CVS/Root
>
> file contain? Or does it not exist at all?
>
>> [root@sa3-ce ~]# qconf -help | head -n 4
>> GE maintrunk
>> usage: qconf [options]
>>  [-aattr obj_nm attr_nm val obj_id_lst]   add to a list attribute of an object
>>  [-Aattr obj_nm fname obj_id_lst]         add to a list attribute of an object
>> ------------------------------------------------------------------------------------------------------------------------------------------------- 
>>
>>
>> - On the master node and on the worker node (WN), many copies of the
>> same SGE daemons are running after we start the daemons:
>> ------------------------------------------------------------------------------------------------------------------------------------------------- 
>>
>> [root@sa3-ce etc]# ps -ef | grep sge
>> root     26512     1  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_qmaster
>> root     26518 26512  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_qmaster
>> root     26520 26518  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_qmaster
>> root     26523 26518  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_qmaster
>> root     26524 26518  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_qmaster
>> root     26525 26518  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_qmaster
>> root     26526 26518  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_qmaster
>> root     26527 26518  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_qmaster
>> root     26528 26518  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_qmaster
>> root     26529 26518  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_qmaster
>> root     26530 26518  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_qmaster
>> root     26533     1  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_schedd
>> root     26534 26533  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_schedd
>> root     26535 26534  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_schedd
>> root     26536 26534  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_schedd
>> root     26537 26534  0 15:59 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_schedd
>>
>> ------------------------------------------------------------------------------------------------------------------------------------------------- 
>>
>> [root@sa3-wn001 ~]# ps -ef | grep sge
>> root     28757     1  0 13:55 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_execd
>> root     28758 28757  0 13:55 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_execd
>> root     28759 28758  0 13:55 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_execd
>> root     28760 28758  0 13:55 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_execd
>> root     28761 28758  0 13:55 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_execd
>> root     28762 28758  0 13:55 ?        00:00:00 /usr/local/sge/pro/bin/lx26-x86/sge_execd
>> ------------------------------------------------------------------------------------------------------------------------------------------------- 
>>
>
> On older Linux systems without the NPTL threading library (and kernel
> support), each thread appears as an individual process in "ps". I'm
> somewhat surprised that RH4 is still using the old threading
> kernel/library. But it's not a bug.
>
> Andy
>
>> On the other hand, the jobs submitted with "qsub" are executed
>> correctly. Does anybody know what might be happening?
>>
>>
>> Thank you very much,
>> Esteban
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>

