After operator restart, StatefulSet dependent resource fails to update #1464

@thomasdraebing

Bug Report

What did you do?

A StatefulSet is reconciled as a dependent resource in a managed reconciliation workflow. If the CustomResource is updated while the operator has been running continuously since the StatefulSet was first created, the StatefulSet is updated as expected. Once the operator has been restarted, subsequent updates fail. Other dependent resources in the workflow, e.g. ConfigMaps, still update as expected.

What did you expect to see?

The StatefulSet is updated with the new configuration.

What did you see instead? Under which circumstances?

The StatefulSet is not updated and the following exception is thrown:

16:41:47.062 [ERROR] io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher:61 [PID:57294] - Error during event processing ExecutionScope{ resource id: ResourceID{name='gerrit', namespace='gerrit'}, version: 10121738} failed.
io.javaoperatorsdk.operator.AggregatedOperatorException: Exception during workflow.
	at io.javaoperatorsdk.operator.processing.dependent.workflow.WorkflowReconcileResult.createFinalException(WorkflowReconcileResult.java:67) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.dependent.workflow.WorkflowReconcileResult.throwAggregateExceptionIfErrorsPresent(WorkflowReconcileResult.java:61) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:123) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:83) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.api.monitoring.Metrics.timeControllerExecution(Metrics.java:197) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.Controller.reconcile(Controller.java:82) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.reconcileExecution(ReconciliationDispatcher.java:135) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:115) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:86) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:59) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.event.EventProcessor$ControllerExecution.run(EventProcessor.java:395) ~[operator-framework-core-3.1.1.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:829) ~[?:?]
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://example.com/apis/apps/v1/namespaces/gerrit/statefulsets. Message: statefulsets.apps "gerrit" already exists. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=apps, kind=statefulsets, name=gerrit, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=statefulsets.apps "gerrit" already exists, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=AlreadyExists, status=Failure, additionalProperties={}).
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:684) ~[kubernetes-client-5.12.3.jar:?]
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:664) ~[kubernetes-client-5.12.3.jar:?]
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:615) ~[kubernetes-client-5.12.3.jar:?]
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:558) ~[kubernetes-client-5.12.3.jar:?]
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:521) ~[kubernetes-client-5.12.3.jar:?]
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:308) ~[kubernetes-client-5.12.3.jar:?]
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:644) ~[kubernetes-client-5.12.3.jar:?]
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:83) ~[kubernetes-client-5.12.3.jar:?]
	at io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61) ~[kubernetes-client-5.12.3.jar:?]
	at io.javaoperatorsdk.operator.processing.dependent.kubernetes.KubernetesDependentResource.create(KubernetesDependentResource.java:128) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.dependent.kubernetes.CRUDKubernetesDependentResource.create(CRUDKubernetesDependentResource.java:17) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.dependent.AbstractDependentResource.handleCreate(AbstractDependentResource.java:83) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.dependent.kubernetes.KubernetesDependentResource.handleCreate(KubernetesDependentResource.java:108) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.dependent.kubernetes.KubernetesDependentResource.handleCreate(KubernetesDependentResource.java:32) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.dependent.AbstractDependentResource.reconcile(AbstractDependentResource.java:39) ~[operator-framework-core-3.1.1.jar:?]
	at io.javaoperatorsdk.operator.processing.dependent.workflow.WorkflowReconcileExecutor$NodeReconcileExecutor.run(WorkflowReconcileExecutor.java:165) ~[operator-framework-core-3.1.1.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	... 3 more

Environment

Kubernetes cluster type:

Development cluster; Kubernetes 1.24.3. The cluster is managed by Gardener [1].

$ Mention java-operator-sdk version from pom.xml file

3.1.1

$ java -version

openjdk version "17.0.3" 2022-04-19 LTS
OpenJDK Runtime Environment SapMachine (build 17.0.3+6-LTS-sapmachine)
OpenJDK 64-Bit Server VM SapMachine (build 17.0.3+6-LTS-sapmachine, mixed mode

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.3", GitCommit:"aef86a93758dc3cb2c658dd9657ab4ad4afc21cb", GitTreeState:"clean", BuildDate:"2022-07-13T14:21:56Z", GoVersion:"go1.18.4", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.3", GitCommit:"aef86a93758dc3cb2c658dd9657ab4ad4afc21cb", GitTreeState:"clean", BuildDate:"2022-07-13T14:23:26Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}

Possible Solution

Additional context

Some debugging showed that the operator tries to create the StatefulSet instead of updating it. The reason is that the DefaultPrimaryToSecondaryIndex for the dependent StatefulSet resource is empty: the index's onAddOrUpdate() method is never called for it when the operator starts. For the other dependent resources in the workflow, onAddOrUpdate() is called.
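
To illustrate the suspected mechanism, here is a minimal sketch in plain Java. It is not the SDK's code: the classes IndexSketch and PrimaryToSecondaryIndex, the string keys, and chooseOperation() are hypothetical stand-ins, and only the method name onAddOrUpdate() is borrowed from the issue. It assumes the decision between create and update depends solely on whether the index holds an entry for the primary resource.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IndexSketch {

    /** Hypothetical, simplified index: primary resource key -> keys of its secondary resources. */
    static class PrimaryToSecondaryIndex {
        private final Map<String, Set<String>> index = new ConcurrentHashMap<>();

        /** Assumed to be invoked when a secondary resource is seen (added or updated). */
        void onAddOrUpdate(String primaryKey, String secondaryKey) {
            index.computeIfAbsent(primaryKey, k -> ConcurrentHashMap.newKeySet()).add(secondaryKey);
        }

        Set<String> getSecondaryResources(String primaryKey) {
            return index.getOrDefault(primaryKey, Collections.emptySet());
        }
    }

    /** Picks the operation based only on the index content (an assumption of this sketch). */
    static String chooseOperation(PrimaryToSecondaryIndex index, String primaryKey) {
        return index.getSecondaryResources(primaryKey).isEmpty() ? "CREATE" : "UPDATE";
    }

    public static void main(String[] args) {
        PrimaryToSecondaryIndex index = new PrimaryToSecondaryIndex();

        // Fresh start: nothing has populated the index yet, so a create is attempted
        // even though the StatefulSet may already exist in the cluster.
        System.out.println(chooseOperation(index, "gerrit/gerrit")); // prints "CREATE"

        // Once the existing secondary resource has been registered, an update is chosen.
        index.onAddOrUpdate("gerrit/gerrit", "gerrit/gerrit");
        System.out.println(chooseOperation(index, "gerrit/gerrit")); // prints "UPDATE"
    }
}
```

Under these assumptions, the first call corresponds to the situation after a restart: the empty index leads to a create request against an existing StatefulSet, which would match the 409 AlreadyExists response in the stack trace above.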

[1] https://gardener.cloud/
