Overview
Welcome to the K3s Homelab GitOps Stack documentation. It serves as both a reference guide and a troubleshooting manual for a production-ready, GitOps-managed Kubernetes homelab cluster.
Purpose of This Documentation
This documentation has three primary goals, in order of priority:
- Troubleshooting & Operations - Quickly find solutions to common problems and execute maintenance tasks
- Architecture & Setup - Understand how the cluster is configured and why certain decisions were made
- Replication Guide - Follow step-by-step instructions to build a similar setup
Who Is This For?
Primarily for myself (Michael) as a reference when things break or when I need to remember how I set something up months ago. However, it's published on GitHub for anyone interested in building a similar homelab setup.
Design Philosophy
This homelab follows production-ready practices:
- Everything is automated - No manual kubectl commands
- Clear separation between public and private networks
- Secure connections - All HTTPS with valid Let's Encrypt certificates
- Disaster recovery - Easy backup and restore procedures
- High Availability - Critical components (control plane, networking) run in HA mode
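To make "no manual kubectl commands" a bit more concrete, here is a minimal sketch of the kind of ArgoCD Application that would live in Git and be synced automatically. It is not copied from this repository: the app name, repository URL, path, and target namespace are hypothetical placeholders, and the `argocd` namespace and `default` project are assumed ArgoCD defaults. The manifest is built as a plain Python dict and printed as JSON, which Kubernetes accepts alongside YAML.

```python
import json


def argocd_application(name, repo_url, path, namespace):
    """Return an ArgoCD Application manifest as a plain dict."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumption: ArgoCD is installed in the conventional "argocd" namespace.
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "targetRevision": "HEAD",
                "path": path,
            },
            "destination": {
                # In-cluster API server address used by ArgoCD for the local cluster.
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            # Automated sync: prune removed resources and revert manual drift.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


# Hypothetical values, for illustration only.
manifest = argocd_application(
    name="example-app",
    repo_url="https://github.com/example/homelab.git",
    path="apps/example-app",
    namespace="example",
)
print(json.dumps(manifest, indent=2))
```

With `selfHeal` enabled, a change made by hand in the cluster is reverted to whatever Git says, which is the practical meaning of the first bullet above.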
Quick Navigation
I Need to Fix Something Now
Go to Troubleshooting & Operations for immediate help with:
- Certificate issues
- Storage problems
- Network debugging
- Application failures
- Disaster recovery
I Want to Understand the Setup
Start with Architecture to understand the overall design.
I Want to Build This
Follow the Quick Start guide.
Technology Stack
Hardware
- Control Plane: 3x Raspberry Pi 4 (8GB RAM)
- Worker Nodes: 4x Raspberry Pi 4 (8GB RAM) with external NVMe storage
- Compute Servers: Lenovo ThinkCentre M720q, Lenovo ThinkCentre M75q, Minisforum MS-01 (all 64GB RAM)
- Storage: HL15 with TrueNAS and 6x HDDs
Core Technologies
- Container Orchestration: k3s (lightweight Kubernetes)
- GitOps: ArgoCD
- Ingress: Traefik
- Load Balancer: MetalLB
- Storage: Longhorn (distributed block storage)
- Secrets Management: HashiCorp Vault + External Secrets Operator
- Authentication: Authentik (SSO/OIDC)
- Certificates: cert-manager + Let's Encrypt
- Monitoring: Prometheus + Grafana
- Logging: Loki
- Databases: CloudNativePG (PostgreSQL operator)
Cluster Architecture at a Glance
    graph TB
        subgraph "External"
            Internet[Internet]
            DNS[Cloudflare DNS]
        end
        subgraph "Edge"
            Router[OPNsense Router]
            MetalLB[MetalLB LoadBalancer]
        end
        subgraph "Ingress Layer"
            Traefik[Traefik Ingress]
            CertManager[cert-manager]
        end
        subgraph "Security Layer"
            Authentik[Authentik SSO]
            Vault[HashiCorp Vault]
        end
        subgraph "Platform Services"
            ArgoCD[ArgoCD]
            Monitoring[Prometheus/Grafana]
            CNPG[PostgreSQL CNPG]
        end
        subgraph "Storage"
            Longhorn[Longhorn]
            TrueNAS[TrueNAS]
        end
        subgraph "Applications"
            Apps[Media/Network/Monitoring Apps]
        end
        Internet --> DNS
        DNS --> Router
        Router --> MetalLB
        MetalLB --> Traefik
        Traefik --> Authentik
        Traefik --> Apps
        Authentik --> Apps
        Vault --> Apps
        CertManager --> Traefik
        ArgoCD --> Monitoring
        ArgoCD --> CNPG
        ArgoCD --> Apps
        Apps --> Longhorn
        Apps --> CNPG
        Longhorn --> TrueNAS
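For a quick way to read the diagram above, the following sketch encodes its arrows as an adjacency list and uses breadth-first search to find the shortest chain of components between two nodes. It only illustrates the diagram: the node names are the diagram's own IDs, ArgoCD's management edges are left out, and nothing here queries a real cluster.

```python
from collections import deque

# Edges copied from the diagram above (ArgoCD's management edges omitted).
EDGES = {
    "Internet": ["DNS"],
    "DNS": ["Router"],
    "Router": ["MetalLB"],
    "MetalLB": ["Traefik"],
    "Traefik": ["Authentik", "Apps"],
    "Authentik": ["Apps"],
    "Vault": ["Apps"],
    "CertManager": ["Traefik"],
    "Apps": ["Longhorn", "CNPG"],
    "Longhorn": ["TrueNAS"],
}


def shortest_path(start, goal):
    """Breadth-first search; return the shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(" -> ".join(shortest_path("Internet", "Apps")))
# prints: Internet -> DNS -> Router -> MetalLB -> Traefik -> Apps
```

When an application is unreachable, walking this chain from left to right gives a reasonable order in which to check components.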
Getting Help
- Check Common Issues first
- Look for your specific component in the Cluster Core section
- Search the documentation using the search bar
- Review Useful Commands for quick CLI reference
Next Steps
New to this setup? Start with the Quick Start Guide.
Already familiar? Jump to what you need:
- Troubleshooting
- Deploy New App
- Configure Ingress