[GE users] SDM issues

zwierzak Ryszard.Macidlowski at sun.com
Wed Jul 22 21:12:38 BST 2009


Hi Chansup,

I'll just answer question 1 :) For question 2 you should get an answer 
from the adapter experts :)

cbyun wrote:
> I have set up grid engine and cloud services using the SGE 6.2u3 release. 
> Now I have a GE service in unknown state and I don't know how to delete it.
>
> Also, my cloud service is running but I see some issues from the log file.
>
>
> Issue 1: How to delete the service in unknown state but started?
>   
You need to stop the component. When a service is in the UNKNOWN state, it 
means it cannot be contacted for whatever reason. That is why you can stop 
the component and free its resources.
To stop the component, just use

sdmadm sdc -c gesvc

After that you should be able to remove the service.
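
The two steps can be sketched as a small shell helper. This is only a
sketch: the stop_and_remove and run functions and the SDMADM and DRY_RUN
variables are made up for illustration and are not part of sdmadm. It
assumes, as in this case, that the component has the same name as the
service (gesvc), and it uses only the two sdmadm calls already shown in
this thread.

```shell
#!/bin/sh
# Sketch: stop the component behind a service, then remove the service.
# SDMADM and DRY_RUN are assumptions of this sketch, not sdmadm features.
SDMADM="${SDMADM:-sdmadm}"

run() {
    # In dry-run mode only print the command; otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

stop_and_remove() {
    svc="$1"
    # Stop the component first; try the removal only if the stop succeeded.
    run "$SDMADM" sdc -c "$svc" && run "$SDMADM" remove_service -s "$svc"
}
```

Calling "DRY_RUN=1 stop_and_remove gesvc" only prints the two commands, so
you can check them before running them for real with "stop_and_remove gesvc".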

BTW, I believe sdmadm remove_service should also have a -force option to 
remove a running service.

Rys
> # sdmadm ss
> host            service    cstate  sstate
> ------------------------------------------
> llgriddev.local gesvc      STARTED UNKNOWN
>                 gesvc2     STARTED RUNNING
>                 power      STARTED RUNNING
>                 spare_pool STARTED RUNNING
>
> # sdmadm sds -s gesvc -fr
> service result message
> -----------------------------------------------------
> gesvc   ERROR  Can not stop service, it is not active
> Error: Command has generated error.
>
> # sdmadm remove_service -s gesvc
> Error: Operation on component cannot be performed. Component in illegal state: STARTED
>
> Issue 2: How to resolve the VPN server issue?
>
> I used the following config for cloud service:
>
>     <cloud_adapter:vpn xsi:type="cloud_adapter:OpenVPNConfig"
>                        vpnBinDir="/usr/sbin"
>                        vpnConfigDir="/tmp"
>                        vpnRequired="true"/>
>
> Which is taken from the Richard's blog: http://blogs.sun.com/rhierlmeier/entry/using_sdm_cloud_adapter_to
>
> What does the following actually mean?
> Service power:Problem: VPN server is corrupted! Registered but server-less resources:
>
>
> 07/22/2009 13:37:22|48|vice.impl.cloud.CloudSnapshot.checkCloudState|W|Service power:Problem: VPN server is corrupted! Registered but server-less resources: [[hostname: blade-0-0.local, instanceId: i-blade-0-0, launchTime: 2009-07-21T09:56:03.000Z] , [hostname: blade-0-1.local, instanceId: i-blade-0-1, launchTime: 2009-07-21T09:56:03.000Z] , [hostname: blade-0-2.local, instanceId: i-blade-0-2, launchTime: 2009-07-21T09:56:03.000Z] , [hostname: blade-0-3.local, instanceId: i-blade-0-3, launchTime: 2009-07-21T09:56:03.000Z] , [hostname: blade-0-4.local, instanceId: i-blade-0-4, launchTime: 2009-07-21T09:56:03.000Z] , [hostname: blade-0-5.local, instanceId: i-blade-0-5, launchTime: 2009-07-21T09:56:03.000Z] , [hostname: blade-0-6.local, instanceId: i-blade-0-6, launchTime: 2009-07-21T09:56:03.000Z] , [hostname: blade-0-7.local, instanceId: i-blade-0-7, launchTime: 2009-07-21T09:56:03.000Z] , [hostname: blade-0-8.local, instanceId: i-blade-0-8, launchTime: 2009-07-21T09:56:03.000Z] , [hostname: blade-0-9.local, instanceId: i-blade-0-9, launchTime: 2009-07-21T09:56:03.000Z] ].
> 07/22/2009 13:37:22|48|ice.impl.cloud.CloudResourceAmountOptTask.run|I|Service power:Service is in error recovery mode. Skipping resource amount optimization cylce.
>
>
> # sdmadm sr
> service    id              state    type flags usage annotation
> ---------------------------------------------------------------------------------------------
> gesvc2     blade-0-1.local ASSIGNED host SA    1     Got execd update event
>            blade-0-2.local ASSIGNED host SA    1     Got execd update event
>            blade-0-3.local ASSIGNED host SA    1     Got execd update event
>            blade-0-4.local ASSIGNED host SA    1     Got execd update event
>            blade-0-5.local ASSIGNED host SA    1     Got execd update event
>            blade-0-6.local ASSIGNED host SA    1     Got execd update event
>            blade-0-8.local ASSIGNED host SA    1     Got execd update event
>            blade-0-9.local ASSIGNED host SA    1     Got execd update event
> power      blade-0-0.local ASSIGNED host A     2     Resource is used by two or more services
>            blade-0-1.local ASSIGNED host A     2     Resource is used by two or more services
>            blade-0-2.local ASSIGNED host A     2     Resource is used by two or more services
>            blade-0-3.local ASSIGNED host A     2     Resource is used by two or more services
>            blade-0-4.local ASSIGNED host A     2     Resource is used by two or more services
>            blade-0-5.local ASSIGNED host A     2     Resource is used by two or more services
>            blade-0-6.local ASSIGNED host A     2     Resource is used by two or more services
>            blade-0-7.local ASSIGNED host A     2     Resource is used by two or more services
>            blade-0-8.local ASSIGNED host A     2     Resource is used by two or more services
>            blade-0-9.local ASSIGNED host A     2     Resource is used by two or more services
> spare_pool blade-0-0.local ASSIGNED host A     1     Resource is used by two or more services
>            blade-0-7.local ASSIGNED host A     1     Resource is used by two or more services
>
>
> Thanks,
> - Chansup
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=208953
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=208990
