
Support StorageClass overrides in PVC definition #209

Open
boedy opened this issue Aug 10, 2023 · 4 comments

Comments

@boedy

boedy commented Aug 10, 2023

I would like to suggest a feature that I believe would substantially augment the adaptability and usability of the LINSTOR CSI plugin within Kubernetes environments.

Feature:

Introduce the ability for users to override certain StorageClass parameters directly within a Persistent Volume Claim (PVC) using labels.

Motivation:

The current setup mandates that once a StorageClass is defined and referenced by PVCs, any change to its parameters necessitates the creation of a new StorageClass, since StorageClasses are immutable. Not only can this become cumbersome, but it also poses significant challenges when there's a need to avoid workload downtime. Being unable to adjust storage parameters without redeploying the workload or redefining a StorageClass is a limitation, especially since it's hard to predict whether an initial StorageClass definition will remain suitable over time. By permitting parameter overrides at the PVC label level, we can introduce flexibility without requiring workload disruptions.
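
To make the problem concrete, here is a minimal sketch of the status quo (class names and the single differing parameter are hypothetical): the only way to "change" a parameter is to create a second class under a new name and re-point every affected workload at it:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-two-replicas
provisioner: linstor.csi.linbit.com
parameters:
  linstor.csi.linbit.com/placementCount: "2"
---
# Raising the replica count means a whole new class, plus recreating
# every PVC (and redeploying every workload) that should use it.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-three-replicas
provisioner: linstor.csi.linbit.com
parameters:
  linstor.csi.linbit.com/placementCount: "3"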

Benefits:

  1. Adaptability: Ability to fine-tune storage parameters on the fly for specific needs without the creation of multiple StorageClass definitions.
  2. Simplified Management: Mitigates the need to proliferate StorageClasses for minor variations.
  3. Seamless Workflow Continuity: Eliminates the requirement to redeploy workloads when StorageClass parameters need to change, thus preventing potential downtime.
  4. Optimized Resource Allocation: Enables precision in storage resource allocation based on evolving workload needs.

Proposed Usage Example

Using PVC labels to override StorageClass parameters:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-dynamic-pvc
  labels:
    linstor.csi.linbit.com/placementCount: "2"
    linstor.csi.linbit.com/allowRemoteVolumeAccess: "true"
    # Add other parameters as required...
spec:
  storageClassName: my-default-linstor-storageclass
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

Some parameters that come to mind as eligible:

  • linstor.csi.linbit.com/placementCount
  • linstor.csi.linbit.com/allowRemoteVolumeAccess
  • linstor.csi.linbit.com/replicasOnSame
  • linstor.csi.linbit.com/replicasOnDifferent
  • property.linstor.csi.linbit.com/DrbdOptions/auto-quorum
  • property.linstor.csi.linbit.com/DrbdOptions/Resource/on-no-data-accessible
  • property.linstor.csi.linbit.com/DrbdOptions/Resource/on-suspended-primary-outdated
  • property.linstor.csi.linbit.com/DrbdOptions/Net/rr-conflict

A Personal Note on StorageClass Design:

I'd like to share a genuine concern I face when working with Linstor's storage layer within Kubernetes. Designing a StorageClass always feels like a high-stakes commitment. I am frequently apprehensive about the configurations I set, knowing that they can impact performance, resilience, and even the very accessibility of the data. It's a daunting task to anticipate all the future needs and behaviours of the workloads we deploy.

This fear is exacerbated by the understanding that, once a workload begins using a specific StorageClass, making adjustments or corrections can be a complex and potentially disruptive process. Being bound to an initial StorageClass configuration, especially when one discovers it's not optimal, is a significant pain point.

It is this very concern that makes the proposed feature so appealing. The ability to override parameters on a per-PVC basis provides a safety net, allowing for adjustments and refinements without the weight of recreating or significantly altering existing infrastructure. It would be a game-changer in alleviating the pressures and apprehensions of StorageClass design.

Incorporating this feature would offer a much-appreciated layer of flexibility for managing storage with LINSTOR CSI in Kubernetes, without the need to recreate StorageClasses or redeploy workloads.

@WanzenBug
Member

Something like this is already being worked on upstream: There is this alpha-level call in the CSI spec: https://github.com/container-storage-interface/spec/blob/master/spec.md#controllermodifyvolume
And here is the related KEP: https://github.com/kubernetes/enhancements/tree/fcc6bbdf1161e369ea156e6221a3f5820acbb749/keps/sig-storage/3751-volume-attributes-class
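
For reference, a VolumeAttributesClass as proposed in that KEP would look roughly like the sketch below; the kind and its driverName/parameters fields come from the KEP, while the class name and the LINSTOR-side parameter key are assumptions:

apiVersion: storage.k8s.io/v1alpha1
kind: VolumeAttributesClass
metadata:
  name: fast-replicated
driverName: linstor.csi.linbit.com
parameters:
  # Hypothetical key; which keys are supported would be up to the driver.
  placementCount: "3"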

That being said, it might be a long time until this is ready. What I could see instead is making LINSTOR CSI less strict about matching resource groups to parameters. What do I mean by that?

Right now, when you create a StorageClass, LINSTOR CSI will create a Resource Group matching the parameters you set. It will also keep that resource group in sync with all the implicit defaults it has (like the layer list "drbd storage", even if you have not set a layer list).

What we could do instead is care only about those parameters in the SC that are actually set by the user, letting the rest be defaulted. This also means a user could change RG-level settings without interfering with the CSI driver.
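
A sparse StorageClass under that scheme might look roughly like this (the name is hypothetical); anything omitted would stay under the Resource Group's control and could still be changed there later:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-sparse-linstor-sc
provisioner: linstor.csi.linbit.com
parameters:
  # The only parameter the driver would pin on the Resource Group.
  linstor.csi.linbit.com/placementCount: "2"
  # No layer list, storage pool, etc.: those defaults would remain
  # editable on the Resource Group itself.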

I would like to keep the CSI Driver as generic as possible, which also means I do not want to introduce any dependencies on K8s if I don't have to. The CSI driver can also be used in other projects, such as Nomad. Watching the labels on a PV would defeat that.

@boedy
Author

boedy commented Aug 25, 2023

Thanks for the swift response @WanzenBug and for pointing out the ongoing work in the CSI spec and the related KEP. It's encouraging to hear that the broader ecosystem is also recognising the need for this kind of flexibility.

I appreciate your suggestion to make LINSTOR CSI less strict when it comes to matching Resource Groups to parameters set in the StorageClass. This certainly adds a layer of flexibility, allowing users to alter Resource Group-level settings without affecting the CSI driver's behavior. However, this approach does still leave us with the problem that certain parameters are effectively "set in stone" once they're defined. I wonder if this might encourage a behavior where users avoid setting defaults just to retain the ability to modify parameters later on.

Another angle to consider is perhaps Linstor's existing feature that allows moving a volume definition from one resource group to another. This could be a powerful mechanism to more dynamically manage storage parameters. For example, one could define multiple Resource Groups with various configurations and move volume definitions between them as needs change. This would offer a path to adjust storage parameters dynamically post-deployment.

Could this feature be integrated into the CSI driver's behavior to offer a more adaptive storage management experience? I understand that you want to keep the driver as generic and flexible as possible, especially to be compatible with platforms beyond Kubernetes, like Nomad. However, if this existing Linstor feature could be leveraged, it might offer a solution that aligns with your design goals while also addressing this need for greater adaptability.

@js185692

js185692 commented Apr 5, 2024

Hi @WanzenBug, I was wondering if there was an update on this feature? Is there an issue tracking this work, or are we waiting for the VolumeAttributesClass object to become available in Kubernetes? From what I can understand, in LINSTOR it's possible to define the placement count (to take one example) on a resource-by-resource basis, but this doesn't seem to be supported in Piraeus?

@WanzenBug
Member

There hasn't been much to report. We will eventually implement VolumeAttributesClass, but we will need to take a look at which parameters we can actually support.

Regarding something like a per-resource placement count, for example: this is no longer viable, as more recent LINSTOR versions actually have a periodic task that enforces the placement count of the RG on all derived resource definitions (RDs).

With VolumeAttributesClass, this isn't a problem, as that again works on the StorageClass level. I believe "per resource" options will always run the risk of being later overridden by a change in the RG, with perhaps the exception of some DRBD options.
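
For completeness, once VolumeAttributesClass is available, switching parameters on a live claim would come down to patching the claim's volumeAttributesClassName, which the KEP defines as mutable (the class name here is hypothetical, matching the earlier sketch):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-dynamic-pvc
spec:
  storageClassName: my-default-linstor-storageclass
  # Mutable per KEP-3751: can later be patched to point at another class.
  volumeAttributesClassName: fast-replicated
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi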
