This reverts commit 7dc6e77bc2a03e660cab2c4cbf52f235bc52683e, reversing
changes made to bce47ea9d5fa962736ddd4a254a27a5fd2cdee9a.
Motivation for the revert in #67563
by adding targets and curl wait loops to services to ensure services
are not started before their depended services are reachable.
Extra targets cfssl-online.target and kube-apiserver-online.target
syncronize starts across machines and node-online.target ensures
docker is restarted and ready to deploy containers on after flannel
has discussed the network cidr with apiserver.
Since flannel needs to be started before addon-manager to configure
the docker interface, it has to have its own rbac bootstrap service.
The curl wait loops within the other services exists to ensure that when
starting the service it is able to do its work immediately without
clobbering the log about failing conditions.
By ensuring kubernetes.target is only reached after starting the
cluster it can be used in the tests as a wait condition.
In kube-certmgr-bootstrap mkdir is needed for it to not fail to start.
The following is the relevant part of systemctl list-dependencies
default.target
● ├─certmgr.service
● ├─cfssl.service
● ├─docker.service
● ├─etcd.service
● ├─flannel.service
● ├─kubernetes.target
● │ ├─kube-addon-manager.service
● │ ├─kube-proxy.service
● │ ├─kube-apiserver-online.target
● │ │ ├─flannel-rbac-bootstrap.service
● │ │ ├─kube-apiserver-online.service
● │ │ ├─kube-apiserver.service
● │ │ ├─kube-controller-manager.service
● │ │ └─kube-scheduler.service
● │ └─node-online.target
● │ ├─node-online.service
● │ ├─flannel.target
● │ │ ├─flannel.service
● │ │ └─mk-docker-opts.service
● │ └─kubelet.target
● │ └─kubelet.service
● ├─network-online.target
● │ └─cfssl-online.target
● │ ├─certmgr.service
● │ ├─cfssl-online.service
● │ └─kube-certmgr-bootstrap.service
to protect services from crashing and clobbering the logs when
certificates are not in place yet and make sure services are activated
when certificates are ready.
To prevent errors similar to "kube-controller-manager.path: Failed to
enter waiting state: Too many open files"
fs.inotify.max_user_instances has to be increased.
+ isolate etcd on the master node by letting it listen only on loopback
+ enabling kubelet on master and taint master with NoSchedule
The reason for the latter is that flannel requires all nodes to be "registered"
in the cluster in order to setup the cluster network. This means that the
kubelet is needed even at nodes on which we don't plan to schedule anything.
- All kubernetes components have been seperated into different files
- All TLS-enabled ports have been deprecated and disabled by default
- EasyCert option added to support automatic cluster PKI-bootstrap
- RBAC has been enforced for all cluster components by default
- NixOS kubernetes test cases make use of easyCerts to setup PKI
VMs were starving, many of the daemons were unable to complete their
tasks resulting in tests failures.
Turned off verbose output from k8s components as it consumes even more resources, and useful error messages actually drown in debug-clutter
* Fix reference CNI plugins
* The plugins were split out of the upstream cni repo around version
0.6.0
* Fix RBAC and DNS tests
* Fix broken apiVersion fields
* Change plugin linking to look in ${package}/bin rather than
${package.plugins}
* Initial work towards a working e2e test
* Test still fails, but at least the expression evaluates now
Continues @srhb's work in #37199Fixes#37199
- add flannel support
- remove deprecated authorizationRBACSuperAdmin option
- rename from deprecated poratalNet to serviceClusterIpRange
- add nodeIp option for kubelet
- kubelet, add br_netfilter to kernelModules
- enable firewall by default
- enable dns by default on node and on master
- disable iptables for docker by default on nodes
- dns, restart on failure
- update tests
and other minor changes