Pi Farm @ the river
First Time
Create a credentials file named .vault-credentials in the devops home directory,
with permission 0600.
This file contains the passphrase for the Ansible vault.
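A minimal sketch for creating it (how you get the passphrase into the file is up to you; echoing it in place is just for illustration):
$ install -m 0600 /dev/null /home/devops/.vault-credentials
$ echo 'the-vault-passphrase' > /home/devops/.vault-credentials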
Download roles from Ansible Galaxy:
ansible-galaxy install -r roles/requirements.yml
New servers
To provision new (or existing) systems with the correct users and keys, do the following:
- Add the system to the hosts file
- Ensure the default user exists in
  group_vars/<distribution-name>.yaml;
  otherwise add a file with the variable ansible_user_first_run (see the sketch after this list)
- Run ansible with
  ansible-playbook add-user-ssh.yaml --limit k3sworker05
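A minimal group_vars sketch for illustration; the ubuntu filename and both user names are assumptions:
$ cat group_vars/ubuntu.yaml
ansible_user: devops              # default user the playbooks expect
ansible_user_first_run: ubuntu    # user to connect as on a freshly flashed system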
Deploy the site
ansible-playbook site.yml
Upgrade all servers to latest versions
ansible-playbook upgrade.yml
References
- https://github.com/prometheus/demo-site
Raspberry Pi Firmware Update
$ sudo apt-get install rpi-eeprom
$ sudo rpi-eeprom-update        # check for pending updates
$ sudo rpi-eeprom-update -a     # apply the latest firmware
$ sudo raspi-config
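After a reboot, the running bootloader version can be checked with vcgencmd (on Ubuntu this may require the libraspberrypi-bin package):
$ vcgencmd bootloader_version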
TODO:
setup requirements
- Update the hostname:
  sudo hostnamectl set-hostname k3s...
- Enable cgroups in /boot/firmware/cmdline.txt (path on Ubuntu; on Raspberry Pi OS it is /boot/cmdline.txt): append
  cgroup_enable=cpuset cgroup_enable=memory cgroup_memory=1 swapaccount=1
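cmdline.txt must stay a single line; a sketch that appends the flags in place and reboots:
$ sudo sed -i '1 s/$/ cgroup_enable=cpuset cgroup_enable=memory cgroup_memory=1 swapaccount=1/' /boot/firmware/cmdline.txt
$ sudo reboot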
install k3s server
$ export K3S_DATASTORE_ENDPOINT='postgres://k3s:12345678@192.168.42.221:5432/k3s?sslmode=disable'
$ curl -sfL https://get.k3s.io | sh -s - server \
    --node-taint CriticalAddonsOnly=true:NoExecute \
    --tls-san 192.168.42.60 \
    --datastore-endpoint postgres://k3s:12345678@192.168.42.221:5432/k3s?sslmode=disable \
    --tls-san k3s.framsburg.ch
Get the token:
$ sudo cat /var/lib/rancher/k3s/server/node-token
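Before joining further nodes, a quick sanity check (kubectl is bundled with k3s):
$ sudo k3s kubectl get nodes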
add second server
Export the database endpoint and the token:
$ export K3S_DATASTORE_ENDPOINT='postgres://k3s:12345678@192.168.42.221:5432/k3s?sslmode=disable'
$ export K3S_TOKEN=K10c72f4e5f9fd862f0a1e91f9d1f91b16b2580621273705ee76528a66ed45f9819::server:5401d36937aa612639acb4d1083c2800
$ curl -sfL https://get.k3s.io | sh -s - server \
--datastore-endpoint postgres://k3s:12345678@192.168.42.221:5432/k3s?sslmode=disable \
--tls-san k3s.framsburg.ch
install agent
Log into the agent node:
$ export K3S_URL=https://k3s.framsburg.ch:6443
$ export K3S_TOKEN=K10c72f4e5f9fd862f0a1e91f9d1f91b16b2580621273705ee76528a66ed45f9819::server:5401d36937aa612639acb4d1083c2800
$ curl -sfL https://get.k3s.io | sh -
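Optionally, from a server node, give the new node a role label so it shows up as a worker in kubectl get nodes (the label value is cosmetic; the node name is an example):
$ sudo k3s kubectl label node k3sworker05 node-role.kubernetes.io/worker=worker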
MetalLB
kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)" --dry-run=client -o yaml | kubectl apply -f -
Use Ansible Vault for the secretkey.
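A sketch of keeping the key in the vault instead of generating it ad hoc, reusing the .vault-credentials file from above (the secretkey filename is an assumption):
$ openssl rand -base64 128 > secretkey
$ ansible-vault encrypt secretkey --vault-password-file /home/devops/.vault-credentials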
Dual Stack Ingress
Install Longhorn
https://www.ekervhen.xyz/posts/2021-02/troubleshooting-longhorn-and-dns-networking/
New disk
Find the device: lsblk -f
On new devices: wipefs -a /dev/{{ var_disk }}
$ sudo fdisk -l
$ sudo fdisk /dev/sdb
Command: n
Partition number: 1 (default)
First sector: (default)
Last sector: (default)
Command: w
$ sudo fdisk -l
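Longhorn needs a formatted, mounted filesystem on top of the new partition; a minimal sketch (the /var/lib/longhorn mount point is Longhorn's default data path, but verify it for your install):
$ sudo mkfs.ext4 /dev/sdb1
$ sudo mkdir -p /var/lib/longhorn
$ sudo mount /dev/sdb1 /var/lib/longhorn
# persist the mount in /etc/fstab, e.g. by UUID from lsblk -f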
install cert-manager
Log into localhost or commander:
$ export KUBECONFIG="$HOME/.kube/config-k3s"
# If you installed the CRDs manually instead of with the --set installCRDs=true option on the Helm install command, upgrade your CRD resources before upgrading the Helm chart:
$ kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.5.4/cert-manager.crds.yaml
# Add the Jetstack Helm repository
$ helm repo add jetstack https://charts.jetstack.io
# Update your local Helm chart repository cache
$ helm repo update
# Install the cert-manager Helm chart
$ helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.5.4 \
--set installCRDs=true \
--debug
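Before moving on to Rancher, check that cert-manager is actually up:
$ kubectl -n cert-manager rollout status deploy/cert-manager
$ kubectl get pods -n cert-manager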
install rancher
Log into localhost or commander:
$ export KUBECONFIG="$HOME/.kube/config-k3s"
$ helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
$ kubectl create namespace cattle-system
# Update your local Helm chart repository cache
$ helm repo update
$ helm install rancher rancher-latest/rancher \
--namespace cattle-system \
--create-namespace \
--set hostname=rancher.framsburg.ch \
--set bootstrapPassword=admin \
--set ingress.tls.source=rancher --wait --debug --timeout 10m0s
$ kubectl -n cattle-system rollout status deploy/rancher
uninstall cert-manager:
- helm uninstall cert-manager -n cert-manager
- kubectl delete -f https://github.com/jetstack/cert-manager/releases/download/v1.5.4/cert-manager.crds.yaml
- kubectl delete job.batch/cert-manager-startupapicheck -n cert-manager
- kubectl delete rolebinding.rbac.authorization.k8s.io/cert-manager-startupapicheck:create-cert -n cert-manager
- kubectl delete role.rbac.authorization.k8s.io/cert-manager-startupapicheck:create-cert -n cert-manager
argo cd
prerequisites
- Install Go: https://golang.org/doc/install
- Install Docker: https://docs.docker.com/engine/install/ubuntu/
- Clone and build argo-cd according to ^argocdarm:
  $ git clone https://github.com/argoproj/argo-cd.git
  $ cd argo-cd
  $ make armimage
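The resulting image still has to reach the cluster; a hypothetical sketch assuming a private registry (both the image tag and registry.framsburg.ch are assumptions):
$ docker tag argocd:latest registry.framsburg.ch/argoproj/argocd:latest   # hypothetical registry
$ docker push registry.framsburg.ch/argoproj/argocd:latest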
Plex
The Plex image from k8s-at-home/plex initially has an issue recognising itself as a Plex Media Server. The reason is the claim token; the relevant part of plex-claim-server.sh (https://raw.githubusercontent.com/uglymagoo/plex-claim-server/master/plex-claim-server.sh):
if [ ! -z "${PLEX_CLAIM}" ] && [ -z "${token}" ]; then
    echo "Attempting to obtain server token from claim token"
    loginInfo="$(curl -X POST \
        -H 'X-Plex-Client-Identifier: '${clientId} \
        -H 'X-Plex-Product: Plex Media Server' \
        -H 'X-Plex-Version: 1.1' \
        -H 'X-Plex-Provides: server' \
        -H 'X-Plex-Platform: Linux' \
        -H 'X-Plex-Platform-Version: 1.0' \
        -H 'X-Plex-Device-Name: PlexMediaServer' \
        -H 'X-Plex-Device: Linux' \
        "https://plex.tv/api/claim/exchange?token=${PLEX_CLAIM}")"
    # extract the token from the XML response (pattern reconstructed; see the script linked above)
    token="$(echo "$loginInfo" | sed -n 's/.*<authentication-token>\(.*\)<\/authentication-token>.*/\1/p')"
    if [ -n "$token" ]; then
        setPref "PlexOnlineToken" "${token}"
        echo "Plex Media Server successfully claimed"
    fi
fi
Tips & Tricks
Get rid of a node:
kubectl drain <node> --ignore-daemonsets
kubectl delete node <node>
list all helm installations:
helm list -a -A
uninstall a helm installation:
helm uninstall <release> -n <namespace>
Remove dangling namespaces:
(
NAMESPACE=your-rogue-namespace
kubectl proxy &
kubectl get namespace $NAMESPACE -o json | jq '.spec = {"finalizers":[]}' > temp.json
curl -k -H "Content-Type: application/json" -X PUT --data-binary @temp.json 127.0.0.1:8001/api/v1/namespaces/$NAMESPACE/finalize
)
Remove old logs:
sudo journalctl --rotate --vacuum-time=5s
sudo journalctl --rotate --vacuum-size=500M
Remove SSH keys
sudo ssh-keygen -f "/root/.ssh/known_hosts" -R "k3sworker02"
ssh-keygen -f "/home/devops/.ssh/known_hosts" -R "k3sworker02"
Replace master node
- Drain the master node
- Move the master node to the end of the ansible master group list (if it is at the top, it will be initialized as a new cluster)
Install etcdctl
$ VERSION="v3.5.4"
$ curl -L https://github.com/etcd-io/etcd/releases/download/${VERSION}/etcd-${VERSION}-linux-arm64.tar.gz --output etcdctl-linux-arm64.tar.gz
$ sudo tar -zxvf etcdctl-linux-arm64.tar.gz --strip-components=1 -C /usr/local/bin etcd-${VERSION}-linux-arm64/etcdctl
$ sudo etcdctl --cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt --cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt --key=/var/lib/rancher/k3s/server/tls/etcd/client.key version
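With the same certificates, the etcd cluster state can then be inspected, for example:
$ sudo etcdctl --cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt --cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt --key=/var/lib/rancher/k3s/server/tls/etcd/client.key member list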
Unix Utils
Find listening ports:
$ sudo ss -tulpn
Next Steps:
- https://greg.jeanmart.me/2020/04/13/deploy-prometheus-and-grafana-to-monitor-a-k/
- https://www.civo.com/learn/monitoring-k3s-with-the-prometheus-operator-and-custom-email-alerts
- https://github.com/atoy3731/k8s-tools-app
References:
- https://rpi4cluster.com/k3s/k3s-hardware/
- argocd selfdeploy: https://www.arthurkoziel.com/setting-up-argocd-with-helm/