From legacy approach to server-side apply

Starting with version 5 of Java Operator SDK, server-side apply (SSA) is a first-class feature and is used by default to update resources. As we will see, unfortunately (or fortunately), using it requires changes to your reconciler implementation.

For this reason, we prepared a feature flag, which you can flip if you are not prepared to migrate yet: ConfigurationService.useSSAToPatchPrimaryResource

Setting this flag to false makes UpdateControl perform its operations using the former (non-SSA) approach. Similarly, finalizer handling won’t use SSA. The plan is to keep this flag and to allow the former (non-SSA) approach in future releases as well.

For dependent resources, a separate flag (which already existed before v5) controls whether SSA is used: ConfigurationService.ssaBasedCreateUpdateMatchForDependentResources

Resource handling without and with SSA

Until version 5, changing primary resources through UpdateControl did not use server-side apply. So usually, the implementation of the reconciler looked something like this:


@Override
public UpdateControl<WebPage> reconcile(WebPage webPage, Context<WebPage> context) {

    reconcileLogicForManagedResources(webPage);
    webPage.setStatus(updatedStatusForWebPage(webPage));

    return UpdateControl.patchStatus(webPage);
}

In other words, after the reconciliation of managed resources, the reconciler updates the status of the primary resource passed as an argument to the reconciler. Such changes to the primary resource are fine because the argument is already a clone, so we never work directly with the cached object.

So, how does this change with SSA? For SSA, the updates should contain (only) the “fully specified intent”. In other words, we should only fill in the values we care about. In practice, it means creating a fresh copy of the resource and setting only what is necessary:


@Override
public UpdateControl<WebPage> reconcile(WebPage webPage, Context<WebPage> context) {

    reconcileLogicForManagedResources(webPage);

    WebPage statusPatch = new WebPage();
    statusPatch.setMetadata(new ObjectMetaBuilder()
            .withName(webPage.getMetadata().getName())
            .withNamespace(webPage.getMetadata().getNamespace())
            .build());
    statusPatch.setStatus(updatedStatusForWebPage(webPage));

    return UpdateControl.patchStatus(statusPatch);
}

Note that we filled in only the status here, because we patch the status (not the resource spec). Since the status is a subresource in Kubernetes, the patch updates only the status part.

Every controller you register will have its default field manager. You can override the field manager name using ControllerConfiguration.fieldManager. That will set the field manager for the primary resource and dependent resources as well.
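The interplay between a field manager and the “fully specified intent” can be illustrated with a toy model. The sketch below is not how Kubernetes or Java Operator SDK implement SSA: the class SsaOwnershipSketch, the flat field paths, and the manager names are all hypothetical, each field has a single owner, and conflicts are assumed to be always forced.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of server-side apply field ownership; NOT the Kubernetes implementation. */
final class SsaOwnershipSketch {

    private final Map<String, String> values = new LinkedHashMap<>(); // field path -> value
    private final Map<String, String> owners = new LinkedHashMap<>(); // field path -> manager

    /**
     * Applies the intent of one manager with conflicts forced: fields in the
     * intent are set and become owned by the manager; fields the manager owned
     * before but left out of the intent are removed; fields owned by other
     * managers and absent from the intent stay untouched.
     */
    void apply(String manager, Map<String, String> intent) {
        Iterator<Map.Entry<String, String>> it = owners.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> e = it.next();
            if (e.getValue().equals(manager) && !intent.containsKey(e.getKey())) {
                values.remove(e.getKey());
                it.remove();
            }
        }
        for (Map.Entry<String, String> e : intent.entrySet()) {
            values.put(e.getKey(), e.getValue());
            owners.put(e.getKey(), manager);
        }
    }

    String valueOf(String path) { return values.get(path); }

    String ownerOf(String path) { return owners.get(path); }

    public static void main(String[] args) {
        SsaOwnershipSketch webPage = new SsaOwnershipSketch();
        // A user creates the resource; the hypothetical manager "kubectl" owns the spec.
        webPage.apply("kubectl", Map.of("spec.html", "<h1>Hello</h1>"));
        // The controller applies only what it cares about: the status.
        webPage.apply("webpage-controller", Map.of("status.ready", "true"));

        System.out.println(webPage.ownerOf("spec.html"));    // kubectl
        System.out.println(webPage.ownerOf("status.ready")); // webpage-controller

        // A later apply that omits status.ready drops the field the controller owned.
        webPage.apply("webpage-controller", Map.of("status.message", "synced"));
        System.out.println(webPage.valueOf("status.ready")); // null
        System.out.println(webPage.valueOf("spec.html"));    // <h1>Hello</h1>
    }
}
```

In this model, a controller that sent the whole object under its own field manager would end up owning every field it sent, which is one way to picture why a fresh, minimal copy is preferable.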

Migrating to SSA

Both the legacy and the new SSA way of managing resources work well on their own. However, migrating existing resources to SSA might be a challenge. We strongly recommend testing the migration by implementing an integration test in which a custom resource is created using the legacy approach and is then managed by the new approach.

We prepared an integration test to demonstrate how such a migration, even in a simple case, can go wrong, and how to fix it.

To fix some cases, you might need to strip managed fields from the custom resource.

See StatusPatchSSAMigrationIT for details.

Feel free to report common issues, so we can prepare some utilities to handle them.

Optimistic concurrency control

When you create a resource for SSA as described above, the framework applies the changes even if the underlying resource or status subresource changed while the reconciliation was running. There are two reasons: first, it always forces conflicts in the background, as advised in the Kubernetes docs; second, since the resource version is not set, no optimistic locking takes place. If you still want optimistic locking for the patch, use the resource version of the original resource:

@Override
public UpdateControl<WebPage> reconcile(WebPage webPage, Context<WebPage> context) {

    reconcileLogicForManagedResources(webPage);

    WebPage statusPatch = new WebPage();
    statusPatch.setMetadata(new ObjectMetaBuilder()
            .withName(webPage.getMetadata().getName())
            .withNamespace(webPage.getMetadata().getNamespace())
            .withResourceVersion(webPage.getMetadata().getResourceVersion())
            .build());
    statusPatch.setStatus(updatedStatusForWebPage(webPage));

    return UpdateControl.patchStatus(statusPatch);
}
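The difference between patching with and without a resource version can be sketched with a small stand-alone model. This is only an illustration under simplifying assumptions: OptimisticLockSketch and its methods are hypothetical, the resource version is a plain counter (in Kubernetes it is an opaque string), and a rejected patch is signalled by a boolean instead of an HTTP 409 Conflict response.

```java
import java.util.OptionalLong;

/** Toy model of a stored resource; NOT the Kubernetes API server. */
final class OptimisticLockSketch {

    // Kubernetes resource versions are opaque strings; a counter keeps the sketch short.
    private long resourceVersion = 1;
    private String status = "initial";

    long resourceVersion() { return resourceVersion; }

    String status() { return status; }

    /**
     * Applies a status patch. An empty expectedVersion means no optimistic
     * locking, so the patch always wins. Otherwise the patch is accepted only
     * if the expected version equals the stored one.
     *
     * @return true if the patch was applied, false on a version conflict
     */
    boolean patchStatus(String newStatus, OptionalLong expectedVersion) {
        if (expectedVersion.isPresent() && expectedVersion.getAsLong() != resourceVersion) {
            return false; // the real API server would answer with HTTP 409 Conflict
        }
        status = newStatus;
        resourceVersion++;
        return true;
    }

    public static void main(String[] args) {
        OptimisticLockSketch stored = new OptimisticLockSketch();
        long seenByReconciler = stored.resourceVersion(); // version read when reconcile started

        // Another actor changes the resource while the reconciliation is running.
        stored.patchStatus("changed-by-someone-else", OptionalLong.empty());

        // With the stale version set, the patch is rejected.
        System.out.println(stored.patchStatus("ready", OptionalLong.of(seenByReconciler))); // false
        // Without a version, the patch is applied regardless.
        System.out.println(stored.patchStatus("ready", OptionalLong.empty())); // true
        System.out.println(stored.status()); // ready
    }
}
```

Whether a rejected patch should lead to a retry or simply wait for the next reconciliation is a design decision the sketch leaves open.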

Using k8s' ETCD as your application DB

FAQ: Is Kubernetes’ ETCD the Right Database for My Application?

Answer

While the idea of moving your application data to Custom Resources (CRs) aligns with the “Cloud Native” philosophy, it often introduces more challenges than benefits. Let’s break it down:


Top Reasons Why Storing Data in ETCD Through CRs Looks Appealing

  1. Storing application data as CRs enables treating your application’s data like infrastructure:
    • GitOps compatibility: Declarative content can be stored in Git repositories, ensuring reproducibility.
    • Infrastructure alignment: Application data can follow the same workflow as other infrastructure components.

Challenges of Using Kubernetes’ ETCD as Your Application’s Database

Technical Limitations:

  • Data Size Limitations 🔴:

    • Each CR is capped at 1.5 MB by default. Raising this limit is possible but impacts cluster performance.
    • Kubernetes ETCD has a storage cap of 2 GB by default. Adjusting this limit affects the cluster globally, with potential performance degradation.
  • API Server Load Considerations 🟡:

    • The Kubernetes API server is designed to handle infrastructure-level requests.
    • Storing application data in CRs might add significant load to the API server, requiring it to be scaled appropriately to handle both infrastructure and application demands.
    • This added load can impact cluster performance and increase operational complexity.
  • Guarantees 🟡:

    • There is no built-in support for efficient queries, and they are hard to implement on top of the Kubernetes API.
    • ACID guarantees are hard to rely on; they mostly hold only for read-only access.

Operational Impact:

  • Lost Flexibility 🟡:

    • Modifying application data requires complex YAML editing and full redeployment.
    • This contrasts with traditional databases that often feature user-friendly web UIs or APIs for real-time updates.
  • Infrastructure Complexity 🟠:

    • Backup, restore, and lifecycle management for application data are typically separate from deployment workflows.
    • Storing both in ETCD mixes these concerns, complicating operations and standardization.

Security:

  • Governance and Security 🔴:
    • Sensitive data stored in plain YAML may lack adequate encryption or access controls.
    • Applying governance policies over text-based files can become a significant challenge.

When Might Using CRs Make Sense?

For small, safe subsets of data—such as application configurations—using CRs might be appropriate. However, this approach requires a detailed evaluation of the trade-offs.


Conclusion

While it’s tempting to unify application data with infrastructure control via CRs, this introduces risks that can outweigh the benefits. For most applications, separating concerns by using a dedicated database is the more robust, scalable, and manageable solution.


A Practical Example

A typical “user” described in JSON:

{
  "username": "myname",
  "enabled": true,
  "email": "myname@test.com",
  "firstName": "MyFirstName",
  "lastName": "MyLastName",
  "credentials": [
    {
      "type": "password",
      "value": "test"
    },
    {
      "type": "token",
      "value": "oidc"
    }
  ],
  "realmRoles": [
    "user",
    "viewer",
    "admin"
  ],
  "clientRoles": {
    "account": [
      "view-profile",
      "change-group",
      "manage-account"
    ]
  }
}

This example represents about 0.5 KB of data, meaning (with standard settings) a maximum of ~2000 users can be defined in the same CR. Additionally:

  • It contains sensitive information, which should be securely stored.
  • Regulatory rules (like GDPR) apply.
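A back-of-the-envelope check of the capacity figure can be written down as a short calculation. The sketch assumes binary units (1.5 MiB per CR, 2 GiB for ETCD) and exactly 512 bytes per user; CrCapacitySketch and maxRecords are hypothetical names. The raw quotient is only an upper bound, since a real CR also carries metadata and encoding overhead, so it comes out above the ~2000 quoted above; reading that lower figure as leaving headroom is an assumption, not something stated here.

```java
/** Upper-bound arithmetic for how many fixed-size records fit under a size limit. */
final class CrCapacitySketch {

    /** Returns how many records of recordBytes fit into limitBytes, ignoring any overhead. */
    static long maxRecords(long limitBytes, long recordBytes) {
        if (recordBytes <= 0) {
            throw new IllegalArgumentException("recordBytes must be positive");
        }
        return limitBytes / recordBytes;
    }

    public static void main(String[] args) {
        long crLimitBytes = 1_572_864L;                 // 1.5 MiB, assumed per-CR limit
        long etcdQuotaBytes = 2L * 1024 * 1024 * 1024;  // 2 GiB, assumed ETCD storage cap
        long userBytes = 512;                           // about 0.5 KB per user

        System.out.println(maxRecords(crLimitBytes, userBytes));   // 3072
        System.out.println(maxRecords(etcdQuotaBytes, userBytes)); // 4194304
    }
}
```

The second number is a bound for the whole cluster, and ETCD also has to hold every other Kubernetes object, so the space actually available for application data is smaller.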
