[GE users] keystore/passwd issue when adding GE adaptive service

cbyun cbyun at ll.mit.edu
Fri Jul 17 17:53:12 BST 2009


I've cleaned up my test environment and reinstalled everything from scratch. But I got the following issue when I tried to add the GE adaptive service:

# sdmadm ss
host            service    cstate  sstate
llgriddev.local gesvc      STOPPED ERROR
                spare_pool STARTED RUNNING

The log shows:

07/17/2009 11:31:44|21|e.impl.ge.GEServiceAdapterImpl.doStartService|I|Service gesvc: Starting Grid Engine service
07/17/2009 11:31:44|21|rm.service.impl.AbstractServiceAdapter$1.call|E|Service startup failed: Cannot create keystore from /var/sgeCA/port6444/default/userkeys/sge/keystore: Keystore was tampered with, or password was incorrect
07/17/2009 11:31:44|22|rm.impl.AbstractComponent$3.performTransition|W|Componentgesvc: Error in startup procedure: Service gesvc: Unexpected error in state transition UnknownStateHandler[UNKNOWN] -> StartingStateHandler[STARTING]: Service startup failed: Cannot create keystore from /var/sgeCA/port6444/default/userkeys/sge/keystore: Keystore was tampered with, or password was incorrect

The only thing I did differently this time was to generate the sge admin user keystore with a password, so that I can use the sge admin user account to access the SGE inspect tool. Apparently this new step caused the issue.

Is this not allowed?

Here is what I did:

1. Install SGE qmaster on llgriddev
2. Create SGE Admin user (sge) keystore with password [additional step]
   # $SGE_ROOT/util/sgeCA/sge_ca -ks sge -ksout \
      /var/sgeCA/port6444/default/userkeys/sge/keystore \
      -kspwf ./mysecret.txt

    This step will overwrite the existing admin keystore.
    Is this not allowed?
3. Install SGE inspect
4. Check that the SGE admin user can connect to the cluster via SGE inspect
   (This worked fine using the password from step 2)

5. Install SDM master hosts
6. Install SDM managed hosts
   For this procedure, I used the following sge keystore and cacert:
    /var/spool/sdm/sdm62u3/security/users/sge.keystore \
    /var/spool/sdm/sdm62u3/security/ca/ca_top/cacert.pem \

   All JVMs are up and running fine:

# sdmadm sj
name  host            state      used_mem  max_mem   message
cs_vm blade-0-1.local STARTED           9M       28M
      blade-0-2.local STARTED           3M       28M
      blade-0-3.local STARTED           5M       28M
      blade-0-4.local STARTED           8M       28M
      blade-0-5.local STARTED           5M       28M
      blade-0-6.local STARTED           8M       28M
      blade-0-8.local STARTED           8M       28M
      blade-0-9.local STARTED           4M       28M
      llgriddev.local STARTED          11M      341M

7. Add GE adaptive service:

    # sdmadm add_ge_service -h llgriddev -j cs_vm -s gesvc -start

    Then I added the following cluster information:

password=""  [empty if keystore used]
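
The message in the log ("Keystore was tampered with, or password was incorrect") is the standard Java error for a keystore that cannot be opened with the supplied password. One way to check outside of SDM whether the keystore at that path opens with a given password is keytool from the JDK. The following is only a sketch: the check_keystore function name is made up for illustration and is not part of SGE or SDM, and it assumes that keytool is on the PATH and that the password file holds just the password on a single line.

```shell
# Hypothetical helper, not part of SGE or SDM: report whether a Java
# keystore opens with the password stored in a file.
# Assumes keytool (shipped with the JDK) is on PATH and that the password
# file holds only the password on a single line.
check_keystore() {
    # $1 = keystore path, $2 = password file
    if keytool -list -keystore "$1" -storepass "$(cat "$2")" >/dev/null 2>&1; then
        echo "OK: $1 opens with the password in $2"
    else
        echo "FAILED: $1 does not open with the password in $2"
        return 1
    fi
}
```

With the paths from step 2 this would be called as

    # check_keystore /var/sgeCA/port6444/default/userkeys/sge/keystore ./mysecret.txt

Note that -storepass puts the password on the command line, where other users on the host can see it in the process list while keytool runs, so this is only suitable for a test environment like this one.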

- Chansup

