replicaCount fails on other values than 1 #604
Is this line intended?
https://gitea.com/gitea/helm-chart/blame/branch/main/templates/_helpers.tpl#L8
I'm not sure about the bleve check a few lines below, but I think it could work correctly when there is an RWX PV. Updates are synchronized via a queue and run on a single instance.
Multiple replicas with individual RWO PVs won't work, as there will be write conflicts between the different sessions. There is no central management within Gitea that would handle/queue those parallel writes, so conflicts or failures can occur whenever writes happen concurrently. This is why we require a single RWX volume when running multiple replicas.
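For illustration, the requirement described above could be expressed as a template guard roughly like the following sketch (value names such as persistence.accessModes are assumptions, not necessarily the chart's actual keys):

```yaml
{{- /* Hypothetical sketch: refuse more than one replica unless the
       shared data volume is ReadWriteMany. Value names are assumed. */ -}}
{{- if gt (int .Values.replicaCount) 1 }}
  {{- if not (has "ReadWriteMany" .Values.persistence.accessModes) }}
    {{- fail "replicaCount > 1 requires a ReadWriteMany (RWX) persistent volume" }}
  {{- end }}
{{- end }}
```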
This is hard to test on your own, as you would need to mock near-parallel write requests from multiple sessions to trigger the issue.
(Yet I am not fully sure I understand your second comment: what exactly "could work if there is a RWX PVC"?)
Closing as likely stale, feel free to continue commenting if needed.
Probably I'm wrong. Please explain.
Lines 9-23 seem to be dead code. I might have misunderstood their purpose, but I don't think you can actually trigger those lines.
If I'm wrong: what would a values.yaml look like that triggers the error messages from lines 9-23?
My suggestion is to remove the dead code or to refactor the misplaced failure (I can prepare that change, but I'm not sure whether I'm missing something).
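For concreteness, this is the kind of values.yaml one would expect to trigger those checks (a hypothetical example; key names are assumptions based on the discussion, not verified against the chart):

```yaml
# Hypothetical values excerpt: per the discussion, this should hit the
# checks in lines 9-23 of _helpers.tpl, yet apparently never does.
replicaCount: 2
persistence:
  accessModes:
    - ReadWriteOnce
```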
Regarding the second comment: I meant that Gitea uses jobs and queues to manage indexing, and that this is designed to handle multiple instances. As far as I know it is well-tested behavior, and I have used it in the past without errors (although maybe I was just lucky and there are known bugs).
Or let me put it in other words: this line means "if replicas is more than 1, fail with an irrelevant message".
What is the point of the rest of the checks if this chart is designed to work with a single instance only?
To answer the initial question: yep, looks like dead code.
The replicaCount: 2 check should go into lines 9-23, and then all the other conditions apply.

L8 is a mistake and should be removed, yes.
829bca241d/templates/_helpers.tpl (L8)
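A minimal sketch of the suggested refactor (illustrative Helm template, not the chart's actual code): instead of an unconditional fail on L8, the failure only runs inside the replica condition.

```yaml
{{- /* Illustrative only: replace the unconditional fail on L8 with a
       guard, so the HA prerequisite checks (lines 9-23) only run and
       only fail when more than one replica is requested. */ -}}
{{- if gt (int .Values.replicaCount) 1 }}
  {{- /* ...existing HA prerequisite checks go here, each calling
         fail with a message describing the violated requirement... */ -}}
{{- end }}
```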
EDIT: I just realized that none of the assertions in helpers.tpl are effective, only the ones in config.yaml. They were duplicated anyhow. I did a cleanup and added tests in the linked PR.

Given that I am running Gitea in HA and am not encountering the failure, something seems off, yes.
I did test all of the assertions in a DEV instance back when I added the code, though, so they work in general. I will add some tests.
As far as we/I know, bleve does not "simply" work in multi-instance mode, which is why there has been an effort to push for an alternative besides Elasticsearch that works in such scenarios. I cannot fully prove it, though. It would surely be great if bleve definitely worked in HA.
I don't understand that comment. The chart is clearly aiming for HA, see the README etc. L8 is just a mistake.
I wasn't sure whether the line I pointed at was on purpose, and I wasn't sure how to understand your previous comment (I was pointing at the line that forces this chart to run as a single instance, you said that was correct, and I didn't understand why).
I think I was misunderstood. I'm not saying bleve can handle multiple instances. As far as I know, Gitea uses queues and does not update bleve from multiple instances.
You do need queues that can handle multiple instances (like Redis).
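As a sketch, the relevant settings would look roughly like this in the chart's values (section and key names follow Gitea's app.ini [queue] documentation; the Redis address is a placeholder for your own deployment):

```yaml
gitea:
  config:
    queue:
      TYPE: redis                                   # shared queue backend for all replicas
      CONN_STR: "redis://gitea-redis-master:6379/0" # placeholder connection string
```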
Is there any documentation that states this is not thread safe and that those assertions are actually needed?
Classic "works on my machine"? 😆
It fails for me. That's why I created this issue.
Otherwise, the dead code should probably be removed.
I think this was a double-sided misunderstanding.
As the README prominently states, the chart provides (and aims for) a HA setup. And the setup works and has been in good use for many months (also on different clusters I am managing).
There were some initial discussions in issues of this repo and in Gitea core where people stated that bleve won't work in HA, which is why all the "elastic & friends" efforts were pushed. I don't have the links right now, though.

Hmm, strange. I couldn't trigger the failure when creating the unittest, and if it were effective, nobody would ever have been able to use >1 replica, since the failure is not conditioned on anything (it was likely a copy/paste mistake).
So you're saying helm install fails for you with replicaCount: 2, and the error points to the line in helpers.tpl?

I am using ArgoCD, so it is not a direct helm install.
It was failing when I created the issue.
I will set up a test environment, try to reproduce it once again, and compare it with plain helm. I will come back with the result.
Hmm, interesting: this would mean that the code in helpers.tpl is not being ignored. I wonder what the difference is for Argo compared to helm install or the unittest plugin we're using, since I couldn't trigger it there, and I also did not hit the failure in my own k8s install, which uses multiple replicas. Let's see what you can find out!
So far I have had no luck reproducing exactly the same issue.
Here is what ArgoCD says when I try to set replicaCount to 2:
I am able to reproduce the same result with
Same result for:
This is gitea-10.1.1
EDIT:
It seems like this is a different issue. I should probably create a separate ticket for it.
The error might come from Values.gitea.config.cron.GIT_GC_REPOS not being defined, so helm cannot check it.

In #611, specifically in
250214aa9e/templates/gitea/config.yaml (L28-L34)
I've updated the existence check for the value, and I think this should solve the issue you're seeing.

Can you run the checks on the PR? Otherwise we merge and you can check on the next release.
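The fix amounts to guarding the comparison behind an existence check, roughly like this sketch (using Sprig's hasKey; the surrounding template logic is simplified and not the PR's literal code):

```yaml
{{- /* Sketch: only inspect GIT_GC_REPOS when the user actually set it,
       so an undefined value no longer makes the template error out. */ -}}
{{- if .Values.gitea.config.cron }}
  {{- if hasKey .Values.gitea.config.cron "GIT_GC_REPOS" }}
    {{- /* safe to evaluate .Values.gitea.config.cron.GIT_GC_REPOS here */ -}}
  {{- end }}
{{- end }}
```

The nested ifs avoid calling hasKey on a nil map when the cron section is absent entirely.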