Init script fails without Logs #149
Closed · opened 2021-04-25 15:17:28 +00:00 by mattn · 13 comments
The init script has been causing my helm deploy to fail for some time. I ended up modifying the init script to print out some context to help, and then... it worked?
I don't know if this is useful information for anyone. I'm still looking for a long-term solution or an explanation of the actual issue. Will update here if I discover more.
Values.yaml
I'm running in a homelab environment, here's my values file:
Chart apply
and here's how I'm applying the chart
pod description
Unmodified init script results
Editing the init script (a rough sketch of this workflow follows below)
Download the script
Adding simple debug
Write script back and re-create pod
Wait a little while...
View output
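For anyone wanting to reproduce these debugging steps, here is a minimal sketch. It assumes the init script is rendered into a ConfigMap named gitea-init with a key init.sh, and that the init container is called init in pod gitea-0; the actual names depend on the release and chart version.

```sh
# Dump the rendered init script (ConfigMap name and key are assumptions;
# check `kubectl get configmaps` for the real ones).
kubectl get configmap gitea-init -o jsonpath='{.data.init\.sh}' > init.sh

# Add simple debug output, e.g. `set -x` or a few echo lines, near the top
# of init.sh with any editor, then write it back into the ConfigMap.
kubectl create configmap gitea-init --from-file=init.sh \
  --dry-run=client -o yaml | kubectl apply -f -

# Re-create the pod so the init container picks up the change, wait a bit,
# then view the output.
kubectl delete pod gitea-0
kubectl logs gitea-0 -c init -f
```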
Hi there,
it would be nice to have the logs of the failed init script.
Please keep in mind that the init container might fail repeatedly until the postgresql database is ready to accept connections.
That's what made me go so far down this path to edit the script with simple debug output - there were no logs! I used kubectl to get logs from the init container
kubectl logs gitea-0 -c init ...
and got nothing, and then dug into pulling logs from containerd on the node host. I was able to see logs for everything else (including postgresql confirming it was up and ready to accept connections) but nothing from the init container. Even the symlinked log files in /var/log were empty for this init container. As soon as I added printouts - it produced logs and everything worked!
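For completeness, these are the usual places to look for init container output; they can come up empty in this failure mode, but the --previous flag in particular is easy to overlook while the init container is crash-looping.

```sh
# Logs of the current init container instance.
kubectl logs gitea-0 -c init

# Logs of the previous (crashed) instance, useful while it is restarting.
kubectl logs gitea-0 -c init --previous

# Exit codes, restart counts and events for all containers in the pod.
kubectl describe pod gitea-0
```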
This is weird, normally you won't get logs until the postgresql is available, because of the
nc -v -w2 -z gitea-postgresql 5432
Once the postgresql has started you will receive logs from the sql and gitea admin commands.
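In other words (a simplified sketch, not the chart's exact script): the connectivity check is the first thing the init container does, and until it passes the container exits non-zero and is restarted by Kubernetes, so none of the later, noisier commands ever run.

```sh
# Simplified view of the init script's structure (illustrative only).
nc -v -w2 -z gitea-postgresql 5432 || exit 1   # fails until postgres is up

# Only after the check passes do the app.ini / migration / admin-user
# commands run and produce the logs one would normally expect to see.
```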
I am running into the same problem. I modified the init script to include a "while sleep" loop so that I could run a shell on the container.
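Something like the following, inserted at the point of interest in the init script, keeps the container alive long enough to be inspected (the container name init and pod gitea-0 are assumptions based on the thread):

```sh
# Park the init container so it can be examined interactively.
while true; do sleep 30; done
```

A shell can then be opened in the stuck init container with kubectl exec -it gitea-0 -c init -- sh.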
In my case, the database is configured separately rather than as part of the chart; however, I have also confirmed that the database is accepting connections from pods in the same namespace.
I then tested with a few arbitrary public facing endpoints and got a similar result. It seems that the container is unable to make external connections.
Any ideas?
Actually, @mattn, looking at your configs again: I'm also running Istio in this namespace.
It seems that the istio-proxy sidecar only starts after all init containers are done, which would prevent outbound connections from them. I'm seeing some discussion of it on the Istio forums.
Currently looking for a way around this, but the general recommendation is: "avoid network I/O in your init containers".
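One workaround that comes up in those discussions (untested here, and an assumption on my part rather than anything the chart supports) is to exempt the database port from the sidecar's outbound redirection so init containers can reach it directly:

```sh
# Exclude PostgreSQL traffic from Istio's outbound capture for this workload.
# traffic.sidecar.istio.io/excludeOutboundPorts is a standard Istio pod
# annotation; patching the StatefulSet directly is just one way to set it,
# and a helm upgrade may revert it.
kubectl patch statefulset gitea --type merge -p \
  '{"spec":{"template":{"metadata":{"annotations":{"traffic.sidecar.istio.io/excludeOutboundPorts":"5432"}}}}}'
```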
@EternalDeiwos Ah yes, thanks for the debug and references. This is absolutely the issue here!
It seems like it'll be a while before Istio can support this properly.
It seems like the only known-working hack would be to remove the init container and move the logic somewhere else... I'll do some experimenting and let you know if I come up with anything.
I'm not up to speed on developer practices or expectations here, but I would assume a one-off hack to support a specific use case isn't going to be something the devs at gitea want to support.
I also had some issues with Istio, but those were due to filesystem permissions. The solution was to use the chart's 3.0.0 version and enable the rootless image, since Istio had some issues setting the correct file permissions.
However, I did not test the current version with Istio.
To move the logic out of the init container we would need to run another command on the main container, running the init script and then invoking the original container's run command from that script.
Not sure if I really like this approach :D. Will think about this issue and see if I can come up with some other ideas.
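For illustration, a wrapper along these lines is roughly what that would mean (a sketch only; the script path and the image entrypoint are assumptions, not the chart's actual code):

```sh
#!/bin/sh
# Hypothetical wrapper used as the main container's command: run the former
# init logic first, then hand over to the image's original entrypoint.
set -e

/usr/sbin/init_gitea.sh      # former init-container script (assumed path)

exec /usr/bin/entrypoint     # original Gitea image entrypoint (assumed path)
```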
@luhahn I don't blame you, the approach would be a regression in code quality for sure.
I just ran into a version of this myself. I'm not running Istio, but I had the same problem (failing init container with no logs). I followed the steps in the OP and, sure enough, the init container ran and the process moved forward.
In my case, I think what's going on is that the last command of the init script is failing. Since it returns a non-zero exit code, the init container itself is marked as failed by Kubernetes. When I added the final
echo "Done"
to the script, the last command now succeeds and so Kubernetes thinks the init container succeeded and moves on. I was able to see some logs, and my issue appears to be that the database connection is being denied.
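That behaviour is easy to reproduce in isolation: a shell script's exit status is the status of its last command, so a trailing echo hides an earlier failure unless the script opts into set -e. A minimal demonstration:

```sh
#!/bin/sh
# Without `set -e`, only the last command's exit status counts.
false          # stand-in for a failing init step (exit code 1)
echo "Done"    # exits 0, so the script - and the init container - "succeed"
```

With set -e at the top, the script would abort at the first failure instead of masking it.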
I checked with an interactive container and I can log in with the password configured in the environment variable
GITEA__database__PASSWD
, but I don't know if the init script is using that password or the dummy one I have provided in the helm values.yaml file.
I'm using a Secret for the actual database password (see this issue), but if I don't also supply db details in gitea.config.database the chart fails to install. I understood from the linked issue and this page that the
GITEA__*
env variables took precedence over other config, but maybe that doesn't apply in the init container? Any ideas on how I can debug this further?
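One way to check where a given container's configuration actually came from is to compare its GITEA__ environment with the rendered app.ini. The container name gitea and the app.ini path below are the usual defaults, but may differ per chart version:

```sh
# Compare the GITEA__<SECTION>__<KEY> environment with what actually landed
# in app.ini (values redacted so the password never hits the terminal).
kubectl exec gitea-0 -c gitea -- sh -c '
  env | grep "^GITEA__" | sed "s/=.*/=<set>/"
  grep -A 5 "^\[database\]" /data/gitea/conf/app.ini | sed "s/PASSWD.*/PASSWD = <set>/"
'
```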
Also, I'm happy to make a PR to add some basic output to the init script if that would be appreciated. Just let me know.
After posting the last comment I realized I could just use the actual database password in
gitea.config.database
in the values.yaml file as a test... it was late. Sure enough, that worked perfectly.
So, I see three main issues here:
1. When running with Istio, the init container won't be able to use the network the way the init script needs.
2. The GITEA__ environment variables don't seem to be respected in the init script. In addition to the password issue, I noticed that the nc line didn't use the host configured in GITEA__database__HOST but defaulted to localhost:3306 until I configured gitea.config.database.HOST (see the sketch after this list).
3. The init script doesn't produce any output in some failure modes, which makes debugging challenging.
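On the second point, a sketch of what it could look like for the connectivity check to fall back from the GITEA__ variable to the chart-rendered value (illustrative only; variable handling in the real init script may differ):

```sh
# Prefer the GITEA__database__HOST override (host:port), falling back to the
# value rendered from gitea.config.database.HOST; host and port are split for nc.
DB_HOST="${GITEA__database__HOST:-gitea-postgresql:5432}"
nc -v -w2 -z "${DB_HOST%:*}" "${DB_HOST##*:}"
```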
As mentioned, I'm happy to help with the third one, but unfortunately the first two are a bit out of my league (I don't know Go at all).
Will get back to this issue soon, sorry for the delay!
This problem still persists in
4.x
; is it possible to move the database connection testing and migration to a Job rather than have it as an init container? Currently init containers run before sidecars are started, and Istio prevents connections to the database otherwise.
The init container relies on the database connection if a DB other than SQLite is used, so moving it completely out of the init container would risk instability of the init container itself.
Maybe it would be possible to use Chart hooks for that?