
HA enhancements #66

Open · 2 tasks
camflan opened this issue Jan 5, 2018 · 4 comments


camflan commented Jan 5, 2018

It looks like there are at least two more areas of improvement needed for kube-linode:

  • Multiple Masters
  • NodeBalancer + Traefik NodePort service

Is this something that should wait until Terraform provisioning is in progress?
Are these already possible?


kahkhang commented Jan 6, 2018

Thanks for the feedback! Multiple masters are possible by tainting a worker node with a master taint (see kubernetes-retired/bootkube#311; a sketch of the taint follows the list below). There are still some problems that need to be addressed in order to have an HA deployment:

  1. You'll need either a dedicated etcd cluster or multiple etcd instances deployed on different nodes. This was done and automated in previous commits of kube-linode with bootkube's self-hosted etcd, but that feature has been deprecated by the bootkube team.

  2. HA deployment of Traefik needs to be supported. This means storing the SSL certificates in a distributed manner and putting a NodeBalancer in front of all the worker nodes, pointing to port 80. However, I don't think the NodeBalancer supports automatic SSL renewal; if it doesn't, a service will need to be written to swap in new certificates whenever Traefik renews them.
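To illustrate the taint mentioned above, the worker's Node object would end up looking roughly like this (the node name is a placeholder; the usual one-liner equivalent is `kubectl taint nodes worker-2 node-role.kubernetes.io/master=:NoSchedule`):

```yaml
apiVersion: v1
kind: Node
metadata:
  name: worker-2                        # placeholder node name
  labels:
    node-role.kubernetes.io/master: ""  # lets control-plane manifests select it
spec:
  taints:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule                # only pods tolerating the taint schedule here
```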

NodePorts currently do not support low port numbers such as 80 (see kubernetes/kubernetes#9995). One workaround is to list the workers' addresses as externalIPs (https://serverfault.com/questions/801189/expose-port-80-and-443-on-google-container-engine-without-load-balancer), which has the drawback that the externalIPs need to be updated dynamically; another is to use a proxy service that exposes host port 80 directly to port 80 of the pod network (see kubernetes/kubernetes#10405). More discussion is in kubernetes/kubernetes#9995.
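For reference, the externalIPs workaround from the ServerFault answer looks roughly like this (the service name, selector, and worker IPs are placeholders; the IP list is the part that has to be kept in sync with the actual workers):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: traefik-ingress    # placeholder name
  namespace: kube-system
spec:
  selector:
    app: traefik
  ports:
    - name: http
      port: 80
      targetPort: 80
  # Traffic arriving at these node IPs on port 80 is forwarded to the pods;
  # the list must be updated whenever the workers' public IPs change.
  externalIPs:
    - 203.0.113.10
    - 203.0.113.11
```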

In summary, there are multiple issues that need to be addressed for an HA deployment. Other cloud providers solve these easily through a LoadBalancer service, but that is unfortunately not available on Linode (unless someone writes a plugin for it). Another possible alternative is to deploy an external custom load balancer (see https://chmod666.org/2017/11/Hosting-a-self-made-Kubernetes-infrastructure-on-Scaleway), with the caveat that Traefik's SSL certs would then have to be read from an external distributed key-value store (see https://docs.traefik.io/user-guide/cluster/); a sketch of that setup follows.
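As a rough sketch of that alternative (assumed Traefik 1.x flags and a placeholder Consul endpoint; the exact options should be checked against the cluster guide linked above), the replicas would share their ACME certificates through the KV store:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: traefik
  namespace: kube-system
spec:
  replicas: 3
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      labels:
        app: traefik
    spec:
      containers:
        - name: traefik
          image: traefik:1.5              # placeholder version
          args:
            # Assumed Traefik 1.x flags: read config from Consul and keep
            # ACME certs in the KV store so all replicas see the same certs.
            - --consul
            - --consul.endpoint=consul.kube-system.svc:8500
            - --acme.storage=traefik/acme/account
          ports:
            - containerPort: 80
```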

These are interesting challenges which I might attempt in the future (or, if anyone reading this is inclined, feel free to take on this project). It's a good idea to use Terraform to automate this process, since I've come to realise that bash scripts are somewhat messy to maintain. Thanks!


camflan commented Jan 6, 2018

Wow, you're way ahead of me on this! I didn't know about tainting a worker to act as a master, cool 👍

  1. Is the bootkube self-hosted feature deprecated as a whole, or just the self-hosting of etcd? If we move to Terraform, can we simply install/host etcd on the master nodes (assuming there are 3 or more)?
  2. Since Rook is running, couldn't we use either the Rook object store or a Rook filesystem storage pool to host the Traefik SSL certs? Would the NodeBalancer even need certs on it, or wouldn't Traefik be able to handle them?

I didn't know that about NodePorts :/ I'll read more about the ExternalIP issues you posted.

I'm loving this project; it got our cluster up and running on Linode so that I could customize it. I had been messing around with my own k8s-linode project using Terraform, but I couldn't get flannel or Weave to come online using VXLAN. My config was a bit more complicated, using Ubuntu as the base and then layering a VPN on top of the private networking interface. However, even without any firewall rules or VPN, I was still unable to get CNI working properly.

Anyway, your use of CoreOS let you avoid this issue entirely 👍 👍

Here's my project: https://github.com/camflan/linode-k8s. Feel free to use as much or as little of it as you like. I'm happy to help with this project; I'd love for a solid Linode/Kubernetes provisioning system to exist (if only for selfish reasons :P). It seems like Linode gets left out of these projects in favor of DigitalOcean.


kahkhang commented Jan 7, 2018

  1. The self-hosting of etcd is no longer in active development (see "Clarify status of bootkube experimental self-hosted etcd", kubernetes-retired/bootkube#738), but the other parts of the k8s stack (except the kubelet service) are self-hosted. I believe installing an odd number of etcd instances on the master nodes would work (though the count should probably be kept small because of the consensus protocol at play); a sketch follows below.

  2. Yep, that's a better idea, and it's possible: first generate the certificates, then mount the same PV in ReadOnlyMany mode across multiple Traefik pods (also sketched below).
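On point 1, a rough sketch of one member of a three-node etcd cluster, run as a static pod on each master (the names, private IPs, and image tag are all placeholders; each master gets its own --name and URLs):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd-master-1
  namespace: kube-system
spec:
  hostNetwork: true        # serve etcd on the node's private IP
  containers:
    - name: etcd
      image: quay.io/coreos/etcd:v3.2.14
      command:
        - etcd
        - --name=master-1
        # All three members must be listed so they can form a quorum:
        - --initial-cluster=master-1=http://10.0.0.1:2380,master-2=http://10.0.0.2:2380,master-3=http://10.0.0.3:2380
        - --initial-cluster-state=new
        - --initial-advertise-peer-urls=http://10.0.0.1:2380
        - --listen-peer-urls=http://10.0.0.1:2380
        - --listen-client-urls=http://10.0.0.1:2379,http://127.0.0.1:2379
        - --advertise-client-urls=http://10.0.0.1:2379
```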
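And on point 2, the shared-certificate idea could look like a single claim that every Traefik replica mounts read-only (the claim name and storage class are assumptions, and whether ReadOnlyMany actually works depends on the Rook storage type backing it):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: traefik-certs
  namespace: kube-system
spec:
  accessModes:
    - ReadOnlyMany               # one writer generates certs; Traefik pods only read
  storageClassName: rook-block   # assumed Rook storage class name
  resources:
    requests:
      storage: 1Gi
```

Each Traefik pod would then mount traefik-certs at its certificate directory.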

Thanks for linking your project! I don't have any experience with Terraform yet, but would love to try it out some time :)

kahkhang commented

I've thought more about this, and this HA setup is indeed possible, but we need some modifications:

  1. The K8S manifests/certificates should be generated outside of the cluster, then scp'ed in.
  2. We need to create an odd number (3 or more) of masters.
  3. The server IP in the kubeconfig file needs to be replaced with an internal NodeBalancer IP pointing to the 3 master nodes (see the kubeconfig sketch after this list).
  4. We need to bootstrap those masters, all with etcd installed, with the kubeconfig file scp'ed in.
  5. We need another, external NodeBalancer to forward traffic from outside to either only the masters or all the nodes (running Traefik as a DaemonSet).
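A sketch of the kubeconfig change from step 3 (192.0.2.5 stands in for the internal NodeBalancer IP, the cluster and user names are assumptions, and the credential fields are elided):

```yaml
apiVersion: v1
kind: Config
clusters:
  - name: kube-linode                   # assumed cluster name
    cluster:
      server: https://192.0.2.5:443     # internal NodeBalancer, not a single master
      certificate-authority-data: ...   # elided
contexts:
  - name: kube-linode
    context:
      cluster: kube-linode
      user: admin
current-context: kube-linode
users:
  - name: admin
    user:
      client-certificate-data: ...      # elided
      client-key-data: ...              # elided
```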

The internal NodeBalancer can be replaced with a setup similar to kubernetes-retired/bootkube#684. The total cost of the cluster will go up by at least $40/mo (2 extra 2GB instances + 1 NodeBalancer), but hopefully this will make it production-ready.

I'm less motivated to embark on this because I did it for hobby purposes, but if anyone reading this is so inclined, feel free to take it on and give any feedback :)
