fix subchart dns macros #359

Closed
jouve wants to merge 1 commits from dns into main
jouve commented 2022-09-10 15:19:23 +00:00 (Migrated from gitea.com)

Description of the change

inspired from https://github.com/bitnami/charts/blob/master/bitnami/common/templates/_names.tpl#L35, add a macro to calculate subchart fullname and use it to calculate dns

trim all metadata.name to not go above 63 chars

Benefits

fix for https://gitea.com/gitea/helm-chart/issues/345

Possible drawbacks

Applicable issues

Additional information

⚠ BREAKING

Checklist

  • Parameters are documented in the values.yaml and added to the README.md using readme-generator-for-helm
  • Breaking changes are documented in the README.md
<!-- Before you open the request please review the following guidelines and tips to help it be more easily integrated: - Describe the scope of your change - i.e. what the change does. - Describe any known limitations with your change. - Please run any tests or examples that can exercise your modified code. Thank you for contributing! We will try to review, test and integrate the change as soon as we can. --> ### Description of the change <!-- Describe the scope of your change - i.e. what the change does. --> inspired from https://github.com/bitnami/charts/blob/master/bitnami/common/templates/_names.tpl#L35, add a macro to calculate subchart fullname and use it to calculate dns trim all metadata.name to not go above 63 chars ### Benefits <!-- What benefits will be realized by the code change? --> fix for https://gitea.com/gitea/helm-chart/issues/345 ### Possible drawbacks <!-- Describe any known limitations with your change --> ### Applicable issues <!-- Enter any applicable Issues here (You can reference an issue using #). Please remove this section if there is no referenced issue. --> - fixes #345 ### Additional information <!-- If there's anything else that's important and relevant to your pull request, mention that information here. Please remove this section if it remains empty. --> ### ⚠ BREAKING <!-- If there's a breaking change, please shortly describe in which way users are affected and how they can mitigate it. If there are no breakings, please remove this section. --> ### Checklist <!-- [Place an '[X]' (no spaces) in all applicable fields. Please remove unrelated fields.] --> - [x] Parameters are documented in the `values.yaml` and added to the `README.md` using [readme-generator-for-helm](https://github.com/bitnami-labs/readme-generator-for-helm) - [x] Breaking changes are documented in the `README.md`
pat-s (Migrated from gitea.com) reviewed 2023-03-23 21:49:31 +00:00
pat-s (Migrated from gitea.com) left a comment

I don't mind this change given the issues of too long namespace names etc.

Yet I wonder why three different lenghts were chosen here (49, 58, 59). Should we agree on one and maybe use a round number (50?)?

I don't mind this change given the issues of too long namespace names etc. Yet I wonder why three different lenghts were chosen here (49, 58, 59). Should we agree on one and maybe use a round number (50?)?
pat-s (Migrated from gitea.com) commented 2023-03-23 21:49:31 +00:00

Is there a strong reason for the naming change here? Otherwise I'd probably keep it as is?

Is there a strong reason for the naming change here? Otherwise I'd probably keep it as is?
pat-s commented 2023-03-28 21:44:39 +00:00 (Migrated from gitea.com)

Yet I wonder why three different lenghts were chosen here (49, 58, 59). Should we agree on one and maybe use a round number (50?)?

Ah sorry, I'm dumb 🤦 The different values should add up to 63 with the additional characters in the respective sections.

@jouve Can you resolve the current conflicts? I guess we could merge this then, I would also ask @justusbunsi for a review then.

> Yet I wonder why three different lenghts were chosen here (49, 58, 59). Should we agree on one and maybe use a round number (50?)? Ah sorry, I'm dumb 🤦 The different values should add up to 63 with the additional characters in the respective sections. @jouve Can you resolve the current conflicts? I guess we could merge this then, I would also ask @justusbunsi for a review then.
justusbunsi commented 2023-04-01 11:58:32 +00:00 (Migrated from gitea.com)

Reviewing these changes once more I noticed that this should be considered a breaking change due to potential name changes in RBAC related matters and cluster cross connectivity (changing service name), requiring cluster adjustments outside of the Helm Chart. First coming in mind: backup cronjob accessing the secrets or another application using the services.
Please adjust the PR description accordingly.

As there are different trimming limits for a reason, I also vote for explicit unit tests here to detect changes of those values and ensure correct templating behavior.

Reviewing these changes once more I noticed that this should be considered a breaking change due to potential name changes in RBAC related matters and cluster cross connectivity (changing service name), requiring cluster adjustments outside of the Helm Chart. First coming in mind: backup cronjob accessing the secrets or another application using the services. Please adjust the PR description accordingly. As there are different trimming limits for a reason, I also vote for explicit unit tests here to detect changes of those values and ensure correct templating behavior.
jouve (Migrated from gitea.com) reviewed 2023-04-02 22:19:37 +00:00
jouve (Migrated from gitea.com) commented 2023-04-02 22:19:37 +00:00

this is the service defined in http-svc.yaml, otherwise it doesn't make sense

this is the service defined in http-svc.yaml, otherwise it doesn't make sense
jouve (Migrated from gitea.com) commented 2023-04-02 22:40:04 +00:00

this is the service defined in http-svc.yaml, to me it makes sense as %s-gitea reference nothing defined in this chart

this is the service defined in http-svc.yaml, to me it makes sense as `%s-gitea` reference nothing defined in this chart
jouve commented 2023-04-02 22:37:31 +00:00 (Migrated from gitea.com)

I removed change to the secret as it does not have the 63 length limit.

As for the services, I would say the change is backward compatible:

  • all services length was <=63 chars, nothing changes
  • any services length was >63 chars is impossible as it would have been rejected by the apiserver:
    The Service "abcdefghijklmno-abcdefghijklmno-abcdefghijklmno-abcdefghijklmnop" is invalid: metadata.name: Invalid value: "abcdefghijklmno-abcdefghijklmno-abcdefghijklmno-abcdefghijklmnop": must be no more than 63 characters
    
I removed change to the secret as it does not have the 63 length limit. As for the services, I would say the change is backward compatible: * all services length was <=63 chars, nothing changes * any services length was >63 chars is impossible as it would have been rejected by the apiserver: ``` The Service "abcdefghijklmno-abcdefghijklmno-abcdefghijklmno-abcdefghijklmnop" is invalid: metadata.name: Invalid value: "abcdefghijklmno-abcdefghijklmno-abcdefghijklmno-abcdefghijklmnop": must be no more than 63 characters ```
pat-s (Migrated from gitea.com) reviewed 2023-04-15 21:01:41 +00:00
pat-s (Migrated from gitea.com) commented 2023-04-15 21:01:41 +00:00

@jouve Can you explain why you changed the name from gitea to http here?

It probably won't cause any issues but I would still like to understand it 🙂

@jouve Can you explain why you changed the name from `gitea` to `http` here? It probably won't cause any issues but I would still like to understand it 🙂
pat-s (Migrated from gitea.com) commented 2023-04-18 21:29:39 +00:00

there is no {{ include "gitea.fullname" }}-gitea service defined in the chart, while -http exists:

helm install -n gitea gitea gitea/gitea :

k -n gitea get svc
NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
gitea-http     ClusterIP   None            <none>        3000/TCP   49d
gitea-ssh      ClusterIP   None            <none>        22/TCP     49d
curl -fsSL http://gitea-gitea.gitea.svc:3000
curl: (6) Could not resolve host: gitea-gitea.gitea.svc

curl -fsSL http://gitea-http.gitea.svc:3000 | head -n2
<!DOCTYPE html>
<html lang="en-US" class="theme-auto">
...
there is no `{{ include "gitea.fullname" }}-gitea` service defined in the chart, while -http exists: `helm install -n gitea gitea gitea/gitea` : ``` k -n gitea get svc NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE gitea-http ClusterIP None <none> 3000/TCP 49d gitea-ssh ClusterIP None <none> 22/TCP 49d ``` ``` curl -fsSL http://gitea-gitea.gitea.svc:3000 curl: (6) Could not resolve host: gitea-gitea.gitea.svc curl -fsSL http://gitea-http.gitea.svc:3000 | head -n2 <!DOCTYPE html> <html lang="en-US" class="theme-auto"> ... ```
pat-s (Migrated from gitea.com) commented 2023-04-19 07:50:54 +00:00

Ah good catch! This looks like an uncaught bug then. Thanks for the reply!

Ah good catch! This looks like an uncaught bug then. Thanks for the reply!
pat-s (Migrated from gitea.com) commented 2023-04-19 15:39:32 +00:00

Awesome catch.

Awesome catch.
pat-s (Migrated from gitea.com) approved these changes 2023-04-19 07:55:58 +00:00
pat-s (Migrated from gitea.com) left a comment

LGTM, thanks @jouve!
I agree it won't cause a breaking change in it's current form.

Would be good to get a second review from @justusbunsi on it!

LGTM, thanks @jouve! I agree it won't cause a breaking change in it's current form. Would be good to get a second review from @justusbunsi on it!
justusbunsi (Migrated from gitea.com) reviewed 2023-04-19 15:36:31 +00:00
justusbunsi (Migrated from gitea.com) left a comment

Reject to prevent merge. Still testing.

Reject to prevent merge. Still testing.
justusbunsi (Migrated from gitea.com) commented 2023-04-19 15:36:31 +00:00

The proposed changes do not match the values specification for postgresql. Correct would be:

- {{- printf "%s.%s.svc.%s:%g" (include "gitea.dependency.fullname" (dict "chartName" "postgresql" "chartValues" .Values.postgresql "context" .)) .Release.Namespace .Values.clusterDomain .Values.postgresql.global.postgresql.servicePort -}}
+ {{- printf "%s.%s.svc.%s:%g" (include "gitea.dependency.fullname" (dict "chartName" "postgresql" "chartValues" .Values.postgresql "context" .)) .Release.Namespace .Values.clusterDomain .Values.postgresql.global.postgresql.service.ports.postgresql -}}
The proposed changes do not match the values specification for postgresql. Correct would be: ```diff - {{- printf "%s.%s.svc.%s:%g" (include "gitea.dependency.fullname" (dict "chartName" "postgresql" "chartValues" .Values.postgresql "context" .)) .Release.Namespace .Values.clusterDomain .Values.postgresql.global.postgresql.servicePort -}} + {{- printf "%s.%s.svc.%s:%g" (include "gitea.dependency.fullname" (dict "chartName" "postgresql" "chartValues" .Values.postgresql "context" .)) .Release.Namespace .Values.clusterDomain .Values.postgresql.global.postgresql.service.ports.postgresql -}} ```
justusbunsi (Migrated from gitea.com) commented 2023-04-19 15:38:43 +00:00

The proposed changes do not match the values specification for memcached. Correct would be:

- {{- printf "%s.%s.svc.%s:%g" (include "gitea.dependency.fullname" (dict "chartName" "memcached" "chartValues" .Values.memcached "context" .)) .Release.Namespace .Values.clusterDomain .Values.memcached.service.port -}}
+ {{- printf "%s.%s.svc.%s:%g" (include "gitea.dependency.fullname" (dict "chartName" "memcached" "chartValues" .Values.memcached "context" .)) .Release.Namespace .Values.clusterDomain .Values.memcached.service.ports.memcached -}}
The proposed changes do not match the values specification for memcached. Correct would be: ```diff - {{- printf "%s.%s.svc.%s:%g" (include "gitea.dependency.fullname" (dict "chartName" "memcached" "chartValues" .Values.memcached "context" .)) .Release.Namespace .Values.clusterDomain .Values.memcached.service.port -}} + {{- printf "%s.%s.svc.%s:%g" (include "gitea.dependency.fullname" (dict "chartName" "memcached" "chartValues" .Values.memcached "context" .)) .Release.Namespace .Values.clusterDomain .Values.memcached.service.ports.memcached -}} ```
justusbunsi commented 2023-04-19 17:31:00 +00:00 (Migrated from gitea.com)

Follow-up to my review. I've played a bit more with it.

From what I see in my tests, we don't need to truncate the app.ini settings in the first place. I was able to create a working release with FQDNs in app.ini that exceed the current limits by a lot:

helm upgrade -i -n some-really-long-namespace-with-lots-of-dashed-parts --create-namespace long-gitea-release-name ./ creates a memcached host with 104 characters (port excluded): long-gitea-release-name-memcached.some-really-long-namespace-with-lots-of-dashed-parts.svc.cluster.local:11211.

According to the Domain Name RFC, the size limit for FQDN is 255. With Helm limiting its release name to 53 characters and the Kubernetes internal label/name restriction of 63 characters, there is a much higher chance that such release will fail at a much earlier stage than seeing it fail due to a too long FQDN inside app.ini (if even possible with the mentioned restrictions). It seems much more important to ensure that the resource names and labels do not exceed their limits.

Especially when Kubernetes automatically adds suffixes to names.

When I stretch the limit of release and namespace names, I the release does not work, as I am getting errors from and Postgres StatefulSet.

helm upgrade -i --create-namespace --namespace some-really-long-namespace-with-lots-of-dashed-parts-in-it release-name-covering-so-much-more-components ./

This generates a Gitea StatefulSet that creates a pod with label controller-revision-hash=release-name-covering-so-much-more-components-gitea-f858b9499 which is 62 characters long. This is something we have under control by not allowing a very long release name. So far so good.
What we don't have under control is what the Postgres dependency does. In my case it tries to render a pod with the similar label controller-revision-hash=release-name-covering-so-much-more-components-postgresql-7c8cc97779, which is longer than 63 characters. This leads to a release that will never be ready-to-use.

I tested if tools like kubeval (or its successor kubeconform) would detect such cases. Unfortunately, this is Kubernetes internal naming and not detectable. We would need to reject name lengths that would potentially create names/labels longer than 63 characters.

Or a chart dependency trims on its own - incorrectly.

helm upgrade -i --create-namespace --namespace some-really-long-namespace-with-lots-of-dashed-parts-in-it release-name-covering-so-much-more-components-n-stuff ./ does not render the templates correctly. The Postgres chart trucates the two service names in such way that it leaves two identical service names.

Leading to

Error: 1 error occurred:
        * services "release-name-covering-so-much-more-components-n-stuff-postgresq" already exists

So my suggestion is to drop the trunc within gitea.default_domain (namespace limit restriction by Kubernetes catches that first) and memcached.dns helper functions while keeping the fix within gitea.default_domain. That is a valid fix. Good catch.

Those changes would be three lines and definitely no breaking change. Apologies to @jouve for not mentioning this earlier. I actually did not knew it behaves like it does before testing your Pull Request and trying to break Kubernetes. 😃

Follow-up to my review. I've played a bit more with it. From what I see in my tests, we don't need to truncate the `app.ini` settings in the first place. I was able to create a working release with FQDNs in `app.ini` that exceed the current limits by a lot: `helm upgrade -i -n some-really-long-namespace-with-lots-of-dashed-parts --create-namespace long-gitea-release-name ./` creates a memcached host with 104 characters (port excluded): `long-gitea-release-name-memcached.some-really-long-namespace-with-lots-of-dashed-parts.svc.cluster.local:11211`. According to the [Domain Name RFC](https://www.ietf.org/rfc/rfc1035.html#section-2.3.4), the size limit for FQDN is 255. With Helm limiting its release name to 53 characters and the Kubernetes internal label/name restriction of 63 characters, there is a much higher chance that such release will fail at a much earlier stage than seeing it fail due to a too long FQDN inside `app.ini` (if even possible with the mentioned restrictions). It seems much more important to ensure that the resource names and labels do not exceed their limits. <details> <summary>Especially when Kubernetes automatically adds suffixes to names.</summary> When I stretch the limit of release and namespace names, I the release does not work, as I am getting errors from and Postgres StatefulSet. `helm upgrade -i --create-namespace --namespace some-really-long-namespace-with-lots-of-dashed-parts-in-it release-name-covering-so-much-more-components ./` This generates a Gitea StatefulSet that creates a pod with label `controller-revision-hash=release-name-covering-so-much-more-components-gitea-f858b9499` which is 62 characters long. This is something we have under control by not allowing a very long release name. So far so good. What we don't have under control is what the Postgres dependency does. In my case it tries to render a pod with the similar label `controller-revision-hash=release-name-covering-so-much-more-components-postgresql-7c8cc97779`, which is longer than 63 characters. This leads to a release that will never be ready-to-use. I tested if tools like `kubeval` (or its successor [kubeconform](https://github.com/yannh/kubeconform)) would detect such cases. Unfortunately, this is Kubernetes internal naming and not detectable. We would need to reject name lengths that would potentially create names/labels longer than 63 characters. </details> <details> <summary>Or a chart dependency trims on its own - incorrectly.</summary> `helm upgrade -i --create-namespace --namespace some-really-long-namespace-with-lots-of-dashed-parts-in-it release-name-covering-so-much-more-components-n-stuff ./` does not render the templates correctly. The Postgres chart trucates the two service names in such way that it leaves two identical service names. Leading to ``` Error: 1 error occurred: * services "release-name-covering-so-much-more-components-n-stuff-postgresq" already exists ``` </details> --- So my suggestion is to drop the `trunc` within `gitea.default_domain` (namespace limit restriction by Kubernetes catches that first) and `memcached.dns` helper functions while keeping the fix within `gitea.default_domain`. That is a valid fix. Good catch. Those changes would be three lines and definitely _no_ breaking change. Apologies to @jouve for not mentioning this earlier. I actually did not knew it behaves like it does before testing your Pull Request and trying to break Kubernetes. 😃
justusbunsi (Migrated from gitea.com) reviewed 2023-04-19 20:16:37 +00:00
@ -158,2 +98,2 @@
{{- printf "%s-redis-headless.%s.svc.%s" .Release.Name .Release.Namespace .Values.clusterDomain -}}
{{- end -}}
{{- define "memcached.dns" -}}
{{- printf "%s.%s.svc.%s:%g" .Release.Name .Release.Namespace .Values.clusterDomain .Values.memcached.service.ports.memcached -}}
justusbunsi (Migrated from gitea.com) commented 2023-04-19 20:16:37 +00:00

What's the intention of using the common package from bitnami? Consistency with names?
This has the potential to break when updating memcached only and keep Postgres version, or vice versa. The "common" package is not a direct dependency of this chart.

What's the intention of using the common package from bitnami? Consistency with names? This has the potential to break when updating memcached only and keep Postgres version, or vice versa. The "common" package is not a direct dependency of this chart.
justusbunsi (Migrated from gitea.com) commented 2023-04-24 21:18:30 +00:00

it's to match how bitnami charts calculate the name :
own name with common.names.fullname and common.names.dependency.fullname when in the context of a subchart.

example in bitnami/wordpress

you don't need direct dependency to common, templates are shared between all charts / there is no scope: https://helm.sh/docs/chart_template_guide/subcharts_and_globals/#sharing-templates-with-subcharts

Same a you, I'd expect that a backward imcompatible of memcached or pg would break many things.

I'll revert that tomorrow anyway

it's to match how bitnami charts calculate the name : own name with [common.names.fullname](https://github.com/bitnami/charts/blob/main/bitnami/common/templates/_names.tpl#L21) and [common.names.dependency.fullname](https://github.com/bitnami/charts/blob/main/bitnami/common/templates/_names.tpl#L39) when in the context of a subchart. example in [bitnami/wordpress](https://github.com/bitnami/charts/blob/main/bitnami/wordpress/templates/_helpers.tpl#L7) you don't need direct dependency to common, templates are shared between all charts / there is no scope: https://helm.sh/docs/chart_template_guide/subcharts_and_globals/#sharing-templates-with-subcharts Same a you, I'd expect that a backward imcompatible of memcached or pg would break many things. I'll revert that tomorrow anyway
justusbunsi commented 2023-05-30 18:33:24 +00:00 (Migrated from gitea.com)

Dang it. I totally missed that you already committed changes.

Dang it. I totally missed that you already committed changes.
justusbunsi (Migrated from gitea.com) reviewed 2023-05-30 18:37:44 +00:00
justusbunsi (Migrated from gitea.com) left a comment

Sorry for not reviewing earlier.

Sorry for not reviewing earlier.
@ -99,15 +83,6 @@ version: {{ .Values.image.tag | default .Chart.AppVersion | quote }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
justusbunsi (Migrated from gitea.com) commented 2023-05-30 18:38:04 +00:00

Same question as above below.

EDIT: I expected this to be second comment 😆.

Same question as ~~above~~ below. EDIT: I expected this to be second comment 😆.
justusbunsi (Migrated from gitea.com) commented 2023-05-30 18:37:44 +00:00

Any reason you removed the -postgresql suffix? Running helm template --create-namespace --namespace gitea gitea-release ./ and looking at the generated Postgres service, its name is gitea-release-postgresql, leading to FQDN (+port) gitea-release-postgresql.gitea.svc.cluster.local:5432.

With these changes, the app.ini gets the value HOST=gitea-release.gitea.svc.cluster.local:5432, which is incorrect from what I see.

Any reason you removed the `-postgresql` suffix? Running `helm template --create-namespace --namespace gitea gitea-release ./` and looking at the generated Postgres service, its name is `gitea-release-postgresql`, leading to FQDN (+port) `gitea-release-postgresql.gitea.svc.cluster.local:5432`. With these changes, the app.ini gets the value `HOST=gitea-release.gitea.svc.cluster.local:5432`, which is incorrect from what I see.
justusbunsi commented 2023-09-22 15:24:50 +00:00 (Migrated from gitea.com)

@jouve Do you want to continue with this PR?

@jouve Do you want to continue with this PR?
pat-s commented 2023-11-06 08:15:45 +00:00 (Migrated from gitea.com)

This looks stale meanwhile. @justusbunsi Do you want to continue here and finish it? Since you've done so much reviewing already. Or should we close? You decide!

This looks stale meanwhile. @justusbunsi Do you want to continue here and finish it? Since you've done so much reviewing already. Or should we close? You decide!
justusbunsi commented 2023-11-06 19:23:39 +00:00 (Migrated from gitea.com)

@pat-s Yes, it contains valid fixes. I am going to create a superseding PR.

@pat-s Yes, it contains valid fixes. I am going to create a superseding PR.
justusbunsi commented 2023-11-11 16:39:53 +00:00 (Migrated from gitea.com)

Superseded by #560.

Superseded by #560.

Pull request closed

Sign in to join this conversation.
No description provided.