[Breaking] HA-support via Deployment
#437
Changes
A big shoutout to @luhahn for all his work in #205 which served as the base for this PR.
Documentation
Adds `docs/ha-setup.md` to avoid an overly long README (which is already quite long). Most of the information below should go into it, with more details and explanations behind all of the individual components.
Chart deps
- Adds `meilisearch` as a chart dependency for a HA-ready issue indexer. Only works with Gitea >= 1.20.
- Adds `redis` as a chart dependency for a HA-ready session and queue store.
- Adds `redis-cluster` as a chart dependency for a HA-ready session and queue store (alternative to `redis`). Only works with Gitea >= 1.19.2.
- Removes `memcached` in favor of `redis-cluster`.
- Adds `postgresql-ha` as the default DB dependency in favor of `postgres`.
Adds smart HA chart logic
The goal is to set smart config values that result in a HA-ready Gitea deployment if `replicaCount` > 1.

If `replicaCount` > 1:

- `gitea.config.session.PROVIDER` is automatically set to `redis-cluster`
- `gitea.config.indexer.REPO_INDEXER_ENABLED` is automatically set to `false` unless the value is `elasticsearch` or `meilisearch`
- `redis-cluster` is used for `[queue]`, `[cache]` and `[session]`, no matter whether in HA mode or not

Configuration of external instances of `meilisearch` and `minio` is documented in a new markdown doc.
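A minimal sketch of the kind of conditional template logic described in the list above (the helper name and value paths are assumptions for illustration, not the chart's actual code):

```yaml
{{- /* Illustrative sketch only: pick a session provider depending on replicaCount.
       Helper name and value paths are assumptions, not the chart's real helpers. */ -}}
{{- define "gitea.session.provider" -}}
{{- if gt (int .Values.replicaCount) 1 -}}
redis-cluster
{{- else -}}
{{- dig "session" "PROVIDER" "memory" .Values.gitea.config -}}
{{- end -}}
{{- end -}}
```

The same pattern would apply to the `[queue]` and `[cache]` sections mentioned above.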
Deployment vs Statefulset

Given all the discussions about this lately (#428), I think we could use both.
In the end, we do not have the requirement for a sequential pod scale up/scale down as it would happen in statefulsets.
On the other side, we do not have actual stateless pods as we are attaching a RWX to the deployment.
Yet I think, because we do not have a leader-election requirement, spawning the pods as a deployment makes "Rolling Updates" easier and also signals users that there is no "leader election" logic and each pod can just be "destroyed" at any time without causing interruption.
Hence I think we should be able to switch from a statefulset to a deployment, even in the single-replica case.
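For illustration, a heavily trimmed sketch of what this means in practice: interchangeable pods rolled via RollingUpdate, all mounting the same RWX-backed claim (names, image tag and labels are placeholders, not the chart's rendered output):

```yaml
# Illustrative only - names, labels and image tag are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gitea
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: gitea
  template:
    metadata:
      labels:
        app: gitea
    spec:
      containers:
        - name: gitea
          image: gitea/gitea:1.19
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: gitea-shared-storage   # RWX claim shared by all replicas
```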
This change also brought up a templating/linting issue: the definition of `.Values.gitea.config.server.SSH_LISTEN_PORT` in `ssh-svc.yaml` just "luckily" worked so far due to naming-related lint processing. Due to the change from StatefulSet to Deployment, the processing order changed and caused a failure complaining about `config.server.SSH_LISTEN_PORT` not being defined yet.

The only way I could see to fix this was to "properly" define the value in `values.yaml` instead of conditionally defining it in `helpers.tpl`. Maybe there's a better way?
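A sketch of what "properly" defining it in `values.yaml` could look like (the port value here is only an assumed default):

```yaml
gitea:
  config:
    server:
      SSH_LISTEN_PORT: 2222   # example value; previously only set conditionally in helpers.tpl
```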
Chart PVC Creation
I've adapted the automated PVC creation from another chart to be able to provide the `storageClassName`, as I couldn't get dynamic provisioning for EFS going with the current implementation. In addition, the naming and approach within the Gitea chart for PV creation is a bit unusual and aligning it might be beneficial.

This is a semi-unrelated change which will result in a breaking change for existing users, but this PR includes a lot of breaking changes already, so including another one might not make it much worse...

- `persistence.mount`: whether to mount an existing PVC (via `persistence.existingClaim`)
- `persistence.create`: whether to create a new PVC
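To make the split concrete, a hedged `values.yaml` fragment using the options named above (the storage class value is just an example):

```yaml
persistence:
  create: true              # let the chart create a new PVC
  mount: true               # mount it into the Gitea pods
  existingClaim: ""         # set this (and create: false) to reuse a pre-provisioned PVC
  storageClassName: efs-sc  # example; needed explicitly for EFS dynamic provisioning
```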
Testing

As this PR does a lot of things, we need proper testing.
The helm chart can be installed from the Git branch via `helm-git` as sketched below. It is highly recommended to test the chart in a dedicated namespace.
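Something along these lines, assuming the `helm-git` plugin is installed (the exact path/ref URL syntax should be double-checked against the helm-git README):

```bash
# assumes the helm-git plugin: https://github.com/aslafy-z/helm-git
helm plugin install https://github.com/aslafy-z/helm-git

# install straight from the PR branch into a dedicated namespace
helm install gitea "git+https://gitea.com/gitea/helm-chart@?ref=refs/pull/437/head" \
  --namespace gitea-testing --create-namespace
```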
I've tested this myself with both `redis` and `redis-cluster` and it seemed to work fine. I just did some basic operations though, and we should do more niche testing before merging.
Exemplary `values.yml` for testing (only needs a valid RWX storage class):
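A rough, hedged stand-in for such test values (the storage class, dependency toggles and indexer settings are illustrative, not the exact snippet used for this PR):

```yaml
replicaCount: 2

persistence:
  accessModes:
    - ReadWriteMany
  storageClassName: my-rwx-class   # any valid RWX storage class

redis-cluster:
  enabled: true

postgresql-ha:
  enabled: true

gitea:
  config:
    indexer:
      ISSUE_INDEXER_TYPE: db       # meilisearch only works with Gitea >= 1.20
      REPO_INDEXER_ENABLED: false
```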
Preferred setup
The preferred HA setup with respect to performance and stability might currently be as follows:
This will result in a ~ 10-pod HA setup overall.
All pods have very low resource requests.
fix #98
Your sample values seem wrong. Indexer settings below minio persistence?
I don't understand what you mean. Can you be more specific or comment directly in the diff?

Ah, got it now! Indeed, a c/p mistake.
We could drop the memcached dependency and use redis by default. I mean, this PR adds both redis and redis-cluster as dependencies. For non-HA, the redis dependency can be used where memcached is used today. What's your opinion, @pat-s?
Dropping memcached is a good idea; redis can do everything memcached is currently doing. And we would have to manage fewer dependencies. (Especially since this introduces redis + redis-ha as dependencies.)
Yeah I guess so. Also having less decisions to take makes things easier for users.
The only "downside" is that the default deployment would then have 5-6 pods by default instead of 2 (gitea + memcached).
I am not 100% sure, but atm I think both are HA-ready - the second option, i.e. redis-cluster, is just a different way of how the redis components/workers talk to each other and are set up in the first place. Just because you labeled the second one as `redis-ha`, you are implying that the other one would not be HA-ready 🙂 (but please correct me if I'm wrong!)

We should probably use the "normal" `redis` as the default, as `redis-cluster` was not functional until lately and just got fixed in Gitea HEAD. I did a backport, but this means one needs at least 1.19.2 - and to use 1.18 or older, one would need to use `redis` and not `redis-cluster`.

Oops. I meant redis-cluster instead of redis-ha. Both redis variants can have multiple replicas. But when it comes to truly high availability, I experienced downtime with the non-cluster redis when updating the deployment and replacing the pods. With the cluster setup I did not experience that. That's why I drew the thin line between redis and redis-cluster in my previous comment.
For non-HA Gitea the common redis would work fine. Both dependencies have their use cases.
Non-HA would have 3: 1xGitea, 1xPostgres, 1xRedis
HA would have: Gitea, Postgres, Redis(-cluster), MinIO. All replicated by at least 2. MinIO must have at least 3 or 4 to be stable. Not sure about the recommended replicas for the others.
Is that what you mean by 5-6 Pods?
Ah, that's interesting! I will double-check that, as I had in my mind that I had no disconnects with either of them. But if this were the case, we could also think of only keeping `redis-cluster`, backporting the Gitea-core support to 1.18, and just keeping the dependency list "small" then?

EDIT: just tested again and did not face any disconnects when using `redis` with 1 master and 2 replicas while destroying/deleting one of the two Gitea replicas. Was your setup different in some way?

I left minio out here as it's optional with a RWX keeping the storage (probably more performant but not mandatory). It's certainly recommended, but that's why I left it at 5-6 (Gitea + redis). In addition, I forgot to count a PG-HA deployment, as I always work with an external Postgres.
So I'd see the following pod counts:
All: Gitea HA (2) + PG HA (2) + Redis Cluster (4) + Minio (4) + Meilisearch (2) = 14
Minimum: Gitea HA (2) + PG HA (2) + Redis Cluster (4) + Meilisearch (2) = 10
I had disconnects to Redis when actually upgrading Redis, not replacing Gitea pods. IIRC, this was not the case when using redis-cluster. Although, I could not test Gitea itself due to missing cluster support back then. So I tested it otherwise. But maybe I did something wrong.
It would be great if we keep the simplicity, even if Gitea is HA capable. Consider a new user, ready to get Gitea up and running in a Kubernetes cluster, trying to get their code self-hosted. What you need for that:
Everything else is optional, IMO. Especially repository code search.
Therefore, we should keep both redis dependencies, IMO. Or is it possible to "misuse" `redis-cluster` as a single instance? In that case we could drop the `redis` dependency.

Additionally, I don't think we should add MinIO as a dependency. If I see it right, attachments and avatars would work within the RWX storage that is shared across the Deployment replicas. If someone wants to use MinIO for that, they should deploy it outside of this Helm Chart. MinIO itself is a massive setup - for a truly HA setup, you have to run at least 4 replicas in `distributed` mode (the absolute minimal requirement for leader election). The non-default production values set for that would be ~130 lines (no blank lines). This includes different buckets for avatars and attachments each. That's probably nothing we want to have in a Gitea Helm Chart that should be much more lightweight than those of unnamed alternatives 😉. Currently the proposed setup does not look like it's simpler.

Ah! Did you have `redis` running in HA with multiple replicas? I just tried it and the problem seems to be the `redis-master-0` pod. One can destroy the replicas without issues, but once the master pod goes down, Gitea has an issue. There is experimental support for multiple masters in the chart (https://github.com/bitnami/charts/blob/main/bitnami/redis/README.md#multiple-masters-experimental).
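For reproducing this, something along these lines should do (namespace, release and pod names are placeholders):

```bash
# simulate losing the redis master while Gitea is running
kubectl -n gitea-testing delete pod gitea-redis-master-0

# watch how the Gitea pods react
kubectl -n gitea-testing logs deploy/gitea -f
```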
Overall I guess it's just easier to go with `redis-cluster` only then and remove `redis` -> fewer choices and less work for both HA and non-HA.

Definitely agree. The question is only: do we want to provide a skeleton in the chart to "easily" add these additional components, or do we "just" provide documentation on how this can be done?
I guess we could just go with `redis-cluster` even for single-instance. Yes, it requires some additional pods, but users can then also easily go HA if they want to. And they don't need to switch between `redis` and `redis-cluster`.

I see your point. I am fine with not adding it and only documenting it as an alternative to the RWX storage.
That's true, yes.
Not sure if you always need distinct buckets for each and in my test install right now I don't have 130 config lines but again, I am fine with leaving minio out and just documenting it.
I think this is also a question how you look at it: "lightweight" in terms of defaults and/or resources but rich in terms of expansion options - or lightweight in terms of general capabilities and chart dependencies.
Overall I think we might make our life easier by not adding too many dependencies into the chart, but just documenting which services one needs to provision in addition to fully go HA. And only adding `redis-cluster` as a required dependency, which would also directly work for HA then. Sounds like we all agree on this? If so, I would strip the PR down to only `redis-cluster` and just document all other optional HA deps.

(For testing purposes, it was still good/nice to be able to bootstrap a HA setup with all deps in this PR.)
I tested both master/replica and sentinel approach. In both cases the k8s service did not properly switch from the downscaled leader to a replica or sentinel, causing the disconnect.
Sounds great. Let's go this way. 👍
+1 for documenting MinIO as an alternative.
A *yes* from my side. Stripping it down to `redis-cluster` with a simple templating skeleton to easily toggle HA and scale the necessary components properly will be a huge improvement for the Chart.

Such a large-scale skeleton would require the other dependencies, but it would be neat in the long run. Maybe we should start with documenting them first. As soon as we have more stability in the chart (e.g. tests, automated e2e deploys, values integrity validation, renovate automation), it will become much easier to add new dependencies and handle such complexity.
💯. I appreciate the time you already spent on this PR. It's awesome to see Gitea being a solid competitor to the well-known big players.
In other Charts I've seen example values for various use cases. Those, paired with documenting the options for MinIO and Meilisearch, may help users find the right setup for their needs. But such example values are nothing that needs to be done in a first step.
```
@@ -3,6 +3,26 @@
Expand the name of the chart.
*/}}
{{- /* multiple replicas assertions */ -}}
```
@justusbunsi I've found a neat (but hacky) solution to apply some assertions. Unfortunately helm does not provide a native way to check for certain incompatibilities of settings - or at least I couldn't find one.
I've tested these and they work as expected. I think in the long term these checks can save quite some users from faulty HA deployments, as they specifically look for settings which are incompatible in a HA setup.
What do you think?
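For reference, a minimal sketch of the kind of assertion meant here (the checked keys and the error message are illustrative, not the actual chart code):

```yaml
{{- /* _helpers.tpl sketch: fail fast on settings that cannot work with replicaCount > 1 */ -}}
{{- if gt (int .Values.replicaCount) 1 -}}
  {{- if eq (dig "session" "PROVIDER" "" .Values.gitea.config) "file" -}}
    {{- fail "replicaCount > 1 requires a shared session provider (e.g. redis-cluster), not 'file'" -}}
  {{- end -}}
{{- end -}}
```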
@justusbunsi I've exposed these settings in `values.yml` as these are their actual defaults. These are applied conditionally in `helpers.tpl` (https://gitea.com/gitea/helm-chart/src/branch/main/templates/_helpers.tpl#L262) without a clear exposure to the user. With these settings being defined, we could also remove the condition in `helpers.tpl`?

@justusbunsi @luhahn I wonder if we should also swap `postgres` for `postgres-ha`, similar to how we are switching to `redis-cluster` instead of `redis`.

This would make a functional HA deployment easier right from the start for new users, as they would not need to migrate from a single-instance PG deployment to a HA-ready PG one (and only move storage).
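If we went that way, the default toggle might conceptually look like this in `values.yaml` (the dependency condition keys are assumptions here):

```yaml
postgresql:
  enabled: false   # deprecated single-instance dependency, kept only for a transition period

postgresql-ha:
  enabled: true    # HA-ready default
```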
The downside is that it would be a hard break for all current users as they would be forced to migrate their DB. We could also have both side by side but then we are again entering "maintenance hell" to some degree.
This PR is breaking in so many ways (cache dependency, PVC creation, statefulset -> deployment) that introducing another breaking change would arguably not be too disruptive.
What are your thoughts?
Overall I am pretty much done (aside from the PG discussion above) and everything works "fine" so far in all of my tests. The biggest challenge will most likely be to provide migration instructions to users and also to test this ourselves.
@justusbunsi I made a brave move and switched our production instance to this branch (ofc after having tested in dev before) and I can report that everything seems to work well so far.

I am running with the following setup now:

'db'

The transition was not that complicated, I only had to back up and move `/data` to the new RWX PVC.

This PR also adds `topologySpreadConstraints`, which helps to distribute the replicas across nodes.

Site operations are slower now due to EFS compared to the former RWO, but that was expected, and it is nothing Gitea can do anything about.
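For reference, the kind of `topologySpreadConstraints` mentioned above (label selector and skew values are illustrative):

```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: gitea
```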
I have added a note about the required PV transition. Users will need to specify `existingClaim`, as otherwise the chart will (probably) create a new PVC due to the new resource.

Last, I'd like to point out again that I added a few assertions in `config.yaml` which apply some checks during startup and should prevent erroneous configurations. AFAICS there is no other, more elegant way to do this in charts at the moment.

Currently there are no blockers for merging this. Meilisearch will be supported with > 1.20, but 'db' for the ISSUE_INDEXER works.
The only open question is whether we should switch from postgres to postgres-HA as the former might cause issues when going HA. And given that we have already opted for redis-cluster by default, I think we should also switch to postgres-HA.
For backward-comp I think we need to keep the normal `postgres` chart as a dependency for a bit, so that users do not have to make a forced DB transition.

My suggestion would be to have `postgres-ha` as the default and remove the deprecated `postgres` dependency (which is off by default) at some point in the future.

Ref: https://github.com/go-gitea/gitea/issues/13791
There are still some problems when running in HA mode.
Good to know but without checking how it goes in a running instance, we won't know if they are "real" issues or just "would be good to have/clean up" 🙂️
The good news is that the first day(s) have been without issues in my org - haven't checked on the CRON things yet though.
Merging after confirmation from within Discord.