
Overview

Welcome to the K3s Homelab GitOps Stack documentation. This documentation serves as both a reference guide and troubleshooting manual for a production-ready, GitOps-managed Kubernetes homelab cluster.

Purpose of This Documentation

This documentation has three primary goals, in order of priority:

  1. Troubleshooting & Operations - Quickly find solutions to common problems and execute maintenance tasks
  2. Architecture & Setup - Understand how the cluster is configured and why certain decisions were made
  3. Replication Guide - Follow step-by-step instructions to build a similar setup

Who Is This For?

This is primarily for myself (Michael), as a reference when things break or when I need to remember how I set something up months ago. It is also published on GitHub for anyone interested in building a similar homelab setup.

Design Philosophy

This homelab follows production-ready practices:

  • ✅ Everything is automated - No manual kubectl commands
  • ✅ Clear separation between public and private networks
  • ✅ Secure connections - All HTTPS with valid Let's Encrypt certificates
  • ✅ Disaster recovery - Easy backup and restore procedures
  • ✅ High Availability - Critical components (control plane, networking) run in HA mode
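
The "valid Let's Encrypt certificates" goal is easy to spot-check from outside the cluster. The following is a minimal sketch using only the Python standard library. The hostname `app.example.com` and the 14-day warning threshold are placeholders chosen for illustration, not values from this setup.

```python
import socket
import ssl
import time


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between `now` (Unix seconds) and a certificate's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open a verified TLS connection and return the peer certificate's notAfter field."""
    context = ssl.create_default_context()  # verifies the chain and the hostname
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def check(host: str, warn_days: int = 14) -> None:
    """Print how long the certificate served by `host` remains valid."""
    remaining = days_until_expiry(fetch_not_after(host), time.time())
    status = "OK" if remaining > warn_days else "RENEW SOON"
    print(f"{host}: {remaining:.1f} days left ({status})")


# Example call (hypothetical hostname, replace with a real ingress host):
# check("app.example.com")
```

Because `ssl.create_default_context()` verifies the chain, an expired or self-signed certificate raises `ssl.SSLCertVerificationError` instead of printing a result, which is itself a useful signal when debugging.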

Quick Navigation

🔧 I Need to Fix Something Now

Go to Troubleshooting & Operations for immediate help with:

  • Certificate issues
  • Storage problems
  • Network debugging
  • Application failures
  • Disaster recovery

📖 I Want to Understand the Setup

Start with Architecture to understand the overall design, then explore:

🛠️ I Want to Build This

Follow the Quick Start guide, then proceed through:

  1. Hardware Setup
  2. OS Provisioning
  3. Cluster Bootstrap
  4. Core Services Installation

Technology Stack

Hardware

  • Control Plane: 3x Raspberry Pi 4 (8GB RAM)
  • Worker Nodes: 4x Raspberry Pi 4 (8GB RAM) with external NVMe storage
  • Compute Servers: Lenovo ThinkCentre M720q, M75q, Minisforum MS-01 (all 64GB RAM)
  • Storage: HL15 with TrueNAS and 6x HDDs
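
Three control plane nodes is the smallest count that survives the loss of one node, assuming the K3s servers use a majority-quorum datastore such as embedded etcd (the datastore is not stated on this page, so treat that as an assumption). A quick sketch of the arithmetic:

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting nodes."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority remains."""
    return members - quorum(members)


for n in (1, 2, 3, 5):
    print(f"{n} servers: quorum={quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Note that two servers tolerate no failures at all, which is why HA control planes use an odd member count.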

Core Technologies

  • Container Orchestration: K3s (lightweight Kubernetes)
  • GitOps: ArgoCD
  • Ingress: Traefik
  • Load Balancer: MetalLB
  • Storage: Longhorn (distributed block storage)
  • Secrets Management: HashiCorp Vault + External Secrets Operator
  • Authentication: Authentik (SSO/OIDC)
  • Certificates: cert-manager + Let's Encrypt
  • Monitoring: Prometheus + Grafana
  • Logging: Loki
  • Databases: CloudNativePG (PostgreSQL operator)

Cluster Architecture at a Glance

graph TB
    subgraph "External"
        Internet[Internet]
        DNS[Cloudflare DNS]
    end

    subgraph "Edge"
        Router[OPNsense Router]
        MetalLB[MetalLB LoadBalancer]
    end

    subgraph "Ingress Layer"
        Traefik[Traefik Ingress]
        CertManager[cert-manager]
    end

    subgraph "Security Layer"
        Authentik[Authentik SSO]
        Vault[HashiCorp Vault]
    end

    subgraph "Platform Services"
        ArgoCD[ArgoCD]
        Monitoring[Prometheus/Grafana]
        CNPG[PostgreSQL CNPG]
    end

    subgraph "Storage"
        Longhorn[Longhorn]
        TrueNAS[TrueNAS]
    end

    subgraph "Applications"
        Apps[Media/Network/Monitoring Apps]
    end

    Internet --> DNS
    DNS --> Router
    Router --> MetalLB
    MetalLB --> Traefik
    Traefik --> Authentik
    Traefik --> Apps
    Authentik --> Apps
    Vault --> Apps
    CertManager --> Traefik
    ArgoCD --> Monitoring
    ArgoCD --> CNPG
    ArgoCD --> Apps
    Apps --> Longhorn
    Apps --> CNPG
    Longhorn --> TrueNAS
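
To make the request path in the diagram concrete, the sketch below encodes its edges as a plain adjacency map and traces the shortest route from the internet to an application. It is only an illustration of the diagram, not tooling used by the cluster. The node names mirror the diagram, and ArgoCD's links to the platform services are left out for brevity.

```python
from collections import deque

# Edges follow the architecture diagram; ArgoCD's links to the
# platform services are left out for brevity.
EDGES = {
    "Internet": ["DNS"],
    "DNS": ["Router"],
    "Router": ["MetalLB"],
    "MetalLB": ["Traefik"],
    "Traefik": ["Authentik", "Apps"],
    "Authentik": ["Apps"],
    "Vault": ["Apps"],
    "CertManager": ["Traefik"],
    "ArgoCD": ["Apps"],
    "Apps": ["Longhorn", "CNPG"],
    "Longhorn": ["TrueNAS"],
}


def shortest_path(start, goal):
    """Breadth-first search; returns the list of hops, or None if unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in EDGES.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(" -> ".join(shortest_path("Internet", "Apps")))
```

The same helper can answer troubleshooting questions such as "what sits between Traefik and TrueNAS?" by calling `shortest_path("Traefik", "TrueNAS")`.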

Getting Help

Next Steps

New to this setup? Start with the Quick Start Guide.

Already familiar? Jump to what you need: