Proton is a gateway that houses all of daystram's applications at daystram.com. It runs a Kubernetes cluster (the gateway runs the master node), allowing worker nodes hosted on other premises to be added to increase computing capacity while keeping the VPS cost low. The K3s distribution is selected for its lightweight resource requirements.
Proton also acts as a WireGuard VPN server, as the worker nodes attach to the cluster via this virtual network. This allows the worker nodes to lie behind a NAT'd network (e.g. homelabs or home servers) and removes the requirement of having a public IP or exposing any ports.
This guide is valid as of K3s/Kubelet version v1.20.2+k3s1.
This guide will set up a master node (tailored for setup on a VPS) and worker nodes, joined via a WireGuard VPN tunnel.
Install WireGuard.
$ apt install wireguard resolvconf
Ensure IPv4 forwarding is enabled.
$ sysctl -w net.ipv4.ip_forward=1
To persist this across reboots, also set net.ipv4.ip_forward = 1 in /etc/sysctl.conf. Use https://www.wireguardconfig.com/ to easily generate key pairs. Proton networks use the IP range 10.7.7.0/24. Save the configuration into /etc/wireguard/wg0.conf.
💡 Note the wg0 interface name. This is kept consistent in the next steps.
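For reference, a server-side wg0.conf generated this way has roughly the following shape. The keys and peer address below are placeholders, not the generated values:

```ini
# /etc/wireguard/wg0.conf -- server side (keys are placeholders)
[Interface]
Address = 10.7.7.1/24
ListenPort = 51820
PrivateKey = <server-private-key>

[Peer]
# a worker node client
PublicKey = <peer-public-key>
AllowedIPs = 10.7.7.2/32
```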
Add the following to the [Interface] block in wg0.conf, as described at https://www.reddit.com/r/WireGuard/comments/fqqqxz/connection_problem_with_wireguard_and_kubernetes/. (tl;dr: K8s kube-proxy in iptables mode interferes with the IP tables; this line masks K8s' configuration)
[Interface]
# ...
FwMark = 0x4000
Add PersistentKeepalive to prevent the VPN from shutting off on idle traffic.
[Peer]
# ...
PersistentKeepalive = 25
Enable the service.
$ systemctl enable wg-quick@wg0
Start the VPN.
$ wg-quick up wg0
Ensure that port 51820/udp is opened on the VPS firewall.
Conntrack is required for kube-proxy to work (implicit in their docs).
$ apt install conntrack
We'll use K3s; ensure that Traefik is disabled, as we will install it separately to get v2 (the default install is Traefik v1).
$ curl -sfL https://get.k3s.io | K3S_NODE_NAME=proton sh -s - --disable traefik --disable-cloud-controller # --docker
We add the --disable-cloud-controller flag to prevent K3s from running its own dummy CCM, which requires a large amount of resources. To further reduce the control plane's resource requirement (at the cost of performance), the GOGC=10 environment variable can be added to the K3s service at /etc/systemd/system/k3s.service.env (write permission restricted).
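The GOGC tweak mentioned above amounts to a one-line environment file:

```ini
# /etc/systemd/system/k3s.service.env
GOGC=10
```

Restart the k3s service afterwards for it to take effect.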
Use the --docker flag to use the Docker backend, which is required for the detailed metrics scraped by the built-in cAdvisor (as of writing, containerd is not yet supported by cAdvisor).
Set the KUBECONFIG variable at /etc/profile for other tools (including Helm) to default to.
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
Use this configuration to access the cluster from a client machine by putting it into the ~/.kube/config file and setting the cluster IP from localhost to 10.7.7.1 (as configured in the WireGuard server configuration).
⚠️ This config file has admin access to the cluster.
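Concretely, the copied client-side ~/.kube/config differs from /etc/rancher/k3s/k3s.yaml only in the server address; a sketch of the relevant part (certificate data elided):

```yaml
# ~/.kube/config on the client machine (abridged sketch)
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <copied-from-k3s.yaml>
    server: https://10.7.7.1:6443   # was a localhost address in k3s.yaml
  name: default
```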
From here onwards, the setup can be done on a client machine connected to the WireGuard VPN.
Helm is a package manager for Kubernetes. daystram's applications are also deployed via Helm charts, using the repository at https://charts.daystram.com. See https://github.com/daystram/helm-charts/ for more info about daystram's Helm charts. See https://helm.sh/ for more info about Helm.
Install Helm.
$ wget https://get.helm.sh/helm-v3.5.2-linux-amd64.tar.gz
$ tar -zxvf helm-v3.5.2-linux-amd64.tar.gz
$ mv linux-amd64/helm /usr/local/bin/helm
See https://github.com/traefik/traefik-helm-chart for more info about installing Traefik using their Helm chart.
Add the repository.
$ helm repo add traefik https://helm.traefik.io/traefik
$ helm repo update
Install Traefik. This also installs the CRDs by default.
$ helm -n ingress-traefik install traefik traefik/traefik --create-namespace --values ingress-traefik/values.yml
Note that we are overriding some of the default values from the original chart. Adjust service.spec.loadBalancerIP to the external IP of the traffic entrypoint node.
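As a sketch of the overrides involved (the IP below is a placeholder; check the chart's values for the authoritative keys):

```yaml
# ingress-traefik/values.yml (sketch, not the actual file)
service:
  spec:
    loadBalancerIP: 203.0.113.10    # external IP of the traffic entrypoint node
    externalTrafficPolicy: Local    # preserve client source IPs
```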
We set externalTrafficPolicy for the LoadBalancer service to Local (from the default Cluster). By default, LoadBalancer traffic is SNAT'd (Source NAT) to allow cross-node requests (which somehow still works in this use case); setting the policy to Local disables this behavior and thus preserves the client IP. We also have to ensure that the Traefik controller pod runs on the node where the traffic comes in (in our case, the master node, Proton); otherwise, the controller pod will see incoming IPs that have already been NAT'd. Thus, we can edit the deployment configuration as follows:
$ kubectl -n ingress-traefik edit deployment/traefik
and add the following to the template spec:
# ...
spec:
  # ...
  template:
    # ...
    spec:
      # ...
      nodeSelector:
        kubernetes.io/hostname: proton
Also ensure that the LoadBalancer is provisioned to use the external IP of the node where your traffic is coming from.
See https://kubernetes.io/docs/tutorials/services/source-ip/#source-ip-for-services-with-type-loadbalancer for more info.
💡 Note the traefik-cert-manager ingress class; this is used when we set up cert-manager in the next steps.
More Traefik endpoints can also be added in this values file, if required.
To view the Traefik dashboard, proxy it to your local machine.
$ kubectl -n ingress-traefik port-forward $(kubectl -n ingress-traefik get pods --selector "app.kubernetes.io/name=traefik" --output=name) 9000
Create a VPN whitelist Middleware. This is useful if we want to expose certain internal applications only to VPN clients.
$ kubectl -n ingress-traefik apply -f ingress-traefik/vpn-whitelist.yml
Note that the IP range is set to the VPN client IP range; keep these consistent. This Middleware will only work if Traefik's LoadBalancer externalTrafficPolicy is set to Local.
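A minimal sketch of what ingress-traefik/vpn-whitelist.yml might contain, assuming the Traefik v2 Middleware CRD:

```yaml
# ingress-traefik/vpn-whitelist.yml (sketch): allow only WireGuard clients
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: vpn-whitelist
  namespace: ingress-traefik
spec:
  ipWhiteList:
    sourceRange:
      - 10.7.7.0/24   # the VPN client IP range configured earlier
```

The Middleware can then be referenced from IngressRoute objects for internal-only applications.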
cert-manager helps with issuing new X.509 certificates for applications that require them. By creating Certificate objects (a CRD from cert-manager), a new certificate will be claimed from Let's Encrypt using the HTTP solver. This certificate is then used by Traefik's websecure endpoint. See https://cert-manager.io/docs/ for more info about cert-manager.
Add the repository.
$ helm repo add jetstack https://charts.jetstack.io
$ helm repo update
Install cert-manager, along with its CRDs.
$ helm -n cert-manager install cert-manager jetstack/cert-manager --create-namespace --version v1.1.0 --set installCRDs=true
Install the ClusterIssuer.
$ kubectl apply -f cert-manager/letsencrypt.yml
The ISRG Root X1 chain is used because Let's Encrypt is deprecating the old chain in late 2021. Note the traefik-cert-manager ingress class. This tells cert-manager to use Traefik's ingress as the endpoint when performing automatic certificate retrieval using the HTTP solver.
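A sketch of what cert-manager/letsencrypt.yml might look like (the email is a placeholder; preferredChain selects the ISRG Root X1 chain):

```yaml
# cert-manager/letsencrypt.yml (sketch)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com           # placeholder
    preferredChain: "ISRG Root X1"
    privateKeySecretRef:
      name: letsencrypt
    solvers:
      - http01:
          ingress:
            class: traefik-cert-manager
```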
Install the dashboard.
$ kubectl -n kubernetes-dashboard apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yml
Create a ServiceAccount with its ClusterRoleBinding.
⚠️ This service account has admin access to the cluster.
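A minimal sketch of kubernetes-dashboard/serviceaccount.yml, assuming the daystram account name used below and a cluster-admin binding:

```yaml
# kubernetes-dashboard/serviceaccount.yml (sketch)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: daystram
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: daystram
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: daystram
    namespace: kubernetes-dashboard
```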
$ kubectl -n kubernetes-dashboard apply -f kubernetes-dashboard/serviceaccount.yml
Get the access token from the secret.
$ kubectl -n kubernetes-dashboard get secret $(kubectl -n kubernetes-dashboard get sa/daystram -o jsonpath="{.secrets[0].name}") -o go-template="{{.data.token | base64decode}}"
To open the dashboard, start an API proxy using kubectl proxy, then visit http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/ and authenticate using the token from above.
Grafana with a Prometheus datasource is used to monitor cluster metrics. In this default configuration, Prometheus scrapes the cAdvisor built into the K3s distribution.
Add the repositories.
$ helm repo add grafana https://grafana.github.io/helm-charts
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
Create the namespace, persistent volume, and persistent volume claim.
$ kubectl create namespace metrics
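A sketch of what metrics/grafana/persistentvolume.yml might define, assuming a hostPath-backed volume (the path and size are placeholders):

```yaml
# metrics/grafana/persistentvolume.yml (sketch)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana
spec:
  capacity:
    storage: 10Gi            # placeholder size
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/grafana      # placeholder path on the node
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana
  namespace: metrics
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: grafana
```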
$ kubectl -n metrics apply -f metrics/grafana/persistentvolume.yml
Install the Grafana and Prometheus charts.
$ helm install -n metrics prometheus prometheus-community/prometheus
$ helm install -n metrics grafana grafana/grafana --values metrics/grafana/values.yml
We are using Ratify OAuth to replace the default Grafana login form. Users signing in via Ratify will be given the Viewer role. To promote a user to admin, the initially generated admin user has to be used. Set "grafana.ini".auth.disable_login_form: true to disable the default login form afterwards. Set the OAuth client_id and client_secret accordingly, based on the application created in Ratify.
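A sketch of the relevant metrics/grafana/values.yml section, assuming Grafana's generic OAuth integration is used for Ratify (endpoints and credentials below are placeholders):

```yaml
# metrics/grafana/values.yml (sketch)
grafana.ini:
  auth:
    disable_login_form: true    # after the admin role has been assigned
  auth.generic_oauth:
    enabled: true
    name: Ratify
    client_id: <client-id-from-ratify>
    client_secret: <client-secret-from-ratify>
    scopes: openid profile email
    auth_url: https://<ratify-host>/oauth/authorize   # placeholder
    token_url: https://<ratify-host>/oauth/token      # placeholder
```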
Same method as installing WireGuard on the master node. See 1. Install and Setup WireGuard.
Use one of the client configurations generated above.
Conntrack is required for kube-proxy to work (implicit in their docs).
$ apt install conntrack
Ensure the VPN is connected, the master node is set up, and the server is reachable at 10.7.7.1. Get the token from the master node at /var/lib/rancher/k3s/server/token. Install the K3s worker node.
$ curl -sfL https://get.k3s.io | K3S_URL=https://10.7.7.1:6443 K3S_TOKEN=TOKEN_FROM_MASTER K3S_NODE_NAME=tambun sh -s - --flannel-iface wg0 # --docker
Note the interface name wg0 set earlier. This pins the flannel CNI's IP bindings to the VPN interface (it defaults to the host's default interface, e.g. eth0), which wouldn't work since this node lies behind NAT.
Use the --docker flag to use the Docker backend, which is required for the detailed metrics scraped by the built-in cAdvisor (as of writing, containerd is not yet supported by cAdvisor).
Ensure that the K3s version used by the worker node is the same as the master node's (not newer). Set the INSTALL_K3S_VERSION environment variable if necessary.
See https://rancher.com/docs/k3s/latest/en/installation/install-options/server-config/#agent-networking for more info about K3s agent networking configuration.