Statefulset vs deployment (HA-readiness) #426

Closed
opened 2023-04-01 07:46:03 +00:00 by pat-s · 7 comments
pat-s commented 2023-04-01 07:46:03 +00:00 (Migrated from gitea.com)

As per https://gitea.com/gitea/helm-chart/pulls/350#issuecomment-717098

Need to better understand possible migration consequences and other implications.

Statefulset

What already does not work with StatefulSets (even though I had hoped it would): when someone writes issue/PR text that has not been submitted yet, i.e. it only exists in a temporary state, and the pod is then recreated, all content is lost.

This applies to any site reload though, even if the pod is not touched at all.
This is different to GH for example, which preserves the content when the site is reloaded. Though this looks like a general Gitea shortcoming that is not related to whether a statefulset or deployment is being used: https://github.com/go-gitea/gitea/issues/23290

Deployment

justusbunsi commented 2023-04-01 12:05:57 +00:00 (Migrated from gitea.com)

What would be the benefit of a Deployment over an RWX volume used by the StatefulSet?

And I'm not sure how the site reload topic is related to StatefulSets. Isn't this the browser "caching/saving" the form state while not submitted? Or am I misunderstanding you?

justusbunsi commented 2023-04-01 12:22:50 +00:00 (Migrated from gitea.com)

Never mind my question about the benefit. I found your discussion on that in #350.

pat-s commented 2023-04-01 12:34:31 +00:00 (Migrated from gitea.com)

> What would be the benefit of a Deployment over an RWX volume used by the StatefulSet?

I don't know yet for sure, doing some research myself.
I've found this article helpful: https://www.portainer.io/blog/why-are-stateful-containers-so-confusing

I think with a Deployment one can have a central RWX volume for all pods (if the app supports it, i.e. it doesn't place a lock while doing something that then blocks the other replica), whereas with StatefulSets each replica always has its own dedicated PV with its own state.

Yet I think Gitea would allow us to use deployments with a central RWX as most of the "problematic" lock-related actions are happening in possibly external components (DB, memcache, etc).

Yet please take these statements with a grain of salt as I am still doing research, hence they might be wrong.
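
For illustration, the Deployment variant described above could look roughly like the following. This is a minimal sketch and not taken from the chart: the names gitea and gitea-shared-storage, the image tag, the /data mount path, and the 10Gi size are all assumptions, and ReadWriteMany only works if the cluster's storage backend supports it.

```yaml
# Sketch only: names, image tag, mount path, and size are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gitea-shared-storage        # hypothetical claim name
spec:
  accessModes:
    - ReadWriteMany                 # RWX: read-write mountable from several nodes
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gitea
spec:
  replicas: 2
  selector:
    matchLabels:
      app: gitea
  template:
    metadata:
      labels:
        app: gitea
    spec:
      containers:
        - name: gitea
          image: gitea/gitea:latest # placeholder tag
          volumeMounts:
            - name: data
              mountPath: /data      # assumed data directory
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: gitea-shared-storage  # every replica mounts the same claim
```

All replicas see the same files here, so this only helps if the application tolerates concurrent access to that directory, which is exactly the open question in this thread.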

> And I'm not sure how the site reload topic is related to StatefulSets. Isn't this the browser "caching/saving" the form state while not submitted? Or am I misunderstanding you?

Yes, this is something which is unrelated to the helm chart behavior and an "issue" of Gitea itself. I commented on that in the OP already a bit. (At first I thought this could be related to k8s behavior).

justusbunsi commented 2023-04-01 12:44:55 +00:00 (Migrated from gitea.com)

Don't we already achieve centralized storage with a StatefulSet when we require an RWX volume claim to be provisioned externally and use it as existingClaim in the persistence object? Sure, letting the StatefulSet create PVCs for each replica would create separate storage per replica. But the last time I checked out the HA PR, I think this did the trick. IIRC I even verified it by creating tmp files in one pod and checking that they exist in another pod.
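
As a sketch of that setup: the StatefulSet mounts one pre-provisioned RWX claim through a plain volumes entry instead of using volumeClaimTemplates. Only the existingClaim key under persistence comes from the comment above; the claim name, labels, image tag, mount path, and the way the chart's templates would render this are assumptions.

```yaml
# values.yaml fragment (sketch)
persistence:
  existingClaim: gitea-shared-storage   # hypothetical, externally provisioned RWX PVC
---
# Roughly what the rendered StatefulSet would then need to contain (assumed)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: gitea
spec:
  serviceName: gitea
  replicas: 2
  selector:
    matchLabels:
      app: gitea
  template:
    metadata:
      labels:
        app: gitea
    spec:
      containers:
        - name: gitea
          image: gitea/gitea:latest     # placeholder tag
          volumeMounts:
            - name: data
              mountPath: /data          # assumed data directory
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: gitea-shared-storage  # shared by all replicas
  # No volumeClaimTemplates: nothing is created per replica.
```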

justusbunsi commented 2023-04-01 12:51:32 +00:00 (Migrated from gitea.com)

Additional note: PVCs created by StatefulSets won't be deleted when the StatefulSet resource itself is deleted. IIRC this is different for Deployment-related storage, which could cause data loss if one is not careful.

StatefulSets also provide an ordered rollout for updates and for the initial installation: one pod after another, not all together. This could make leader election easier if it is not built into Gitea itself.
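
Both behaviours map to fields in the StatefulSet spec. A rough sketch with placeholder names and sizes: podManagementPolicy: OrderedReady and the Retain values are the Kubernetes defaults and are spelled out only for clarity, and persistentVolumeClaimRetentionPolicy is only available on newer Kubernetes versions.

```yaml
# Sketch only: names, image tag, mount path, and size are placeholders.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: gitea
spec:
  serviceName: gitea
  replicas: 2
  podManagementPolicy: OrderedReady     # start gitea-0 first, then gitea-1
  updateStrategy:
    type: RollingUpdate                 # pods are replaced one at a time
  persistentVolumeClaimRetentionPolicy: # newer Kubernetes versions only
    whenDeleted: Retain                 # keep PVCs when the StatefulSet is deleted
    whenScaled: Retain                  # keep PVCs when scaling down
  selector:
    matchLabels:
      app: gitea
  template:
    metadata:
      labels:
        app: gitea
    spec:
      containers:
        - name: gitea
          image: gitea/gitea:latest     # placeholder tag
          volumeMounts:
            - name: data
              mountPath: /data          # assumed data directory
  volumeClaimTemplates:                 # one PVC per replica: data-gitea-0, data-gitea-1
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
```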

pat-s commented 2023-04-01 18:13:51 +00:00 (Migrated from gitea.com)

> Don't we already achieve a centralized storage with StatefulSet when requiring a RWX volume claim to be provisioned externally and use it as existingClaim in persistence object? Surely, letting the StatefulSet create PVCs for each replica would create different storages. But last time I checked on the HA PR and checked it out, I think this did the trick.

I don't know. Could be. It most likely depends on the specific definition in the YAML? From what I've seen in other charts we use, most of those that use StatefulSets attach multiple RWO volumes, i.e. each replica has its own PV.

In contrast, most other charts that claim to be HA-ready work with Deployments and a central RWX volume for persistent storage. For these charts, the PVC is not deleted when the chart is uninstalled. I guess this is also a config setting (?).

Again note that all of these are just observations of mine, not facts I am sure about.

> Additional note: PVCs created by StatefulSets won't be deleted when deleting the StatefulSet as resource. IIRC this is different with Deployment related storage and could cause data loss if not cautious enough.

See above, I think this can also be achieved with Deployments. At least this is how it works for the Deployment-based charts we use in our clusters.
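
One mechanism that would give this behaviour for a PVC shipped with a Deployment-based chart is Helm's helm.sh/resource-policy: keep annotation, which tells Helm not to delete the resource on uninstall. Whether the charts mentioned above actually use it is not known; the claim name and size below are placeholders.

```yaml
# Sketch only: claim name and size are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gitea-shared-storage
  annotations:
    helm.sh/resource-policy: keep   # Helm leaves this resource in place on uninstall
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```

A claim kept this way is no longer managed by Helm after the uninstall, so it has to be cleaned up manually when it is really no longer needed.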

> And StatefulSets provide an ordered update rollout and initial installation. One pod after another. Not all together. Which could make leader election easier if not built into Gitea itself.

Yeah, that's also what I identified as the main difference. Yet I don't know if Gitea would really require that, and given that there is already a discussion about a built-in leader/cluster feature, I am not sure we should aim for a "manual" solution in the chart.
But again, I don't know if Gitea would really need this. All the DB and cache tasks would be outsourced to other chart deps or external resources anyhow in my currently envisioned setup?

I guess overall there's no way around testing all the different options a bit to get a better feeling for what is going on and how Gitea reacts 😄

EDIT:

> Kubernetes Deployment is usually used for stateless applications. However, we can save the state of Deployment by attaching a Persistent Volume to it and make it stateful. The deployed pods will share the same Volume, and the data will be the same across all of them.

https://www.baeldung.com/ops/kubernetes-deployment-vs-statefulsets#2-deployment-components-in-kubernetes

pat-s commented 2023-05-03 07:49:31 +00:00 (Migrated from gitea.com)

Conclusion for now: Deployments are fine if no internal leader selection is required and pods can act "standalone". Both are possible in HA; it is to some extent a matter of taste.

For the planned HA setup in #437 we will most likely go with a Deployment.

Reference: lunny/helm-chart#426