k3s: How a Steward Container and Podman Compose Turn VMs into Reproducible, Ephemeral Clusters
k3s with a steward container and Podman Compose lets operators bootstrap reproducible, ephemeral VMs fast—ideal for homelabs, edge clusters, and CI pipelines.
The steward container rethink: running k3s, Tailscale and more from purpose-built images
When you build a machine around containers rather than a container around a machine, k3s and the rest of your stack stop being tied to the host operating system and become portable artifacts you can version, test, and redeploy. That is the promise of the steward container pattern: a tiny, single-purpose orchestrator image, built with Podman tooling and a small shell script, that sequences and supervises upstream container images such as k3s and Tailscale so they behave like the services of a node. Pairing k3s with a steward container flips the infrastructure model: the VM becomes ephemeral cattle, the workloads live in immutable images, and restoring or replacing a node is a matter of image plus block volume rather than hand-editing host packages.
This article walks through why the host OS still matters just once, what “thinking with containers” actually looks like, how a steward container is architected in practice, the trade-offs you’ll face (including storage choices), and what this approach implies for developers, platform teams, and homelab operators.
Why the host OS matters one time
The original conceit, that the host operating system is irrelevant, holds only if the host consents to be irrelevant. In practice the host’s kernel, security controls, and init behavior set the conditions that let your privileged container take control. Security subsystems like SELinux or AppArmor are not bugs; they enforce constraints that prevent a container from hijacking the host. If your plan is to run containers with host networking, privileged mode, or nested container runtimes, the host must be configured to allow that bootstrapping step.
That means choosing a base image or cloud VM that “gets out of the way” at boot. Minimal Ubuntu builds, for example, typically do this: they let cloud-init drop in your startup scripts, enable unattended upgrades, and otherwise behave like predictable furniture. Several enterprise distributions ship with stricter defaults (SELinux enforcing, constrained init paths) that can block the steward container’s ability to manage the node. The one-time host decision is therefore a pragmatic trade: pick a host that allows your privileged orchestration to start, then treat the OS as an inert runtime for the lifecycle of your ephemeral VM.
What I learned by actually thinking with containers
A common mistake is to treat a container as a catch-all OS: put systemd, package managers, and dozens of services into one “everything” image and call it a day. That approach is closer to packaging a VM than leveraging containers. Purpose-built upstream images—rancher/k3s as a tiny k8s distribution, tailscale/tailscale as a single-purpose connectivity agent—exist for good reasons: they are minimal, maintained by their authors, and designed to run only their intended process set. Trying to extend or unpack those images often introduces more maintenance burden and fragility than it solves.
Thinking with containers means one container per concern. k3s should run from the k3s image. Tailscale should run from the Tailscale image. An overlay steward container should orchestrate their startup, sequencing, and configuration, but not attempt to subsume their internals. This separation reduces the surface area you maintain while retaining the flexibility to change sequencing, injection of secrets, or bootstrap arguments as your needs evolve.
The steward container pattern, explained
At its core a steward container is deliberately small: a lightweight Linux base with a handful of userland tools (a shell, a container client such as Podman, and a small orchestration script). It does one job: bring up and manage other containers that provide node-level functionality, such as networking, the Kubernetes runtime, storage agents, observability agents, and a GitOps bootstrapper like ArgoCD.
Architecturally, the steward pattern follows these responsibilities:
- Host integration: mount the necessary host paths (container runtime socket, /dev, and any privileged device mounts) so the steward can create truly host-integrated child containers.
- Sequencing and lifecycle: run child containers in a deterministic order, wait for readiness signals (API endpoints, sockets), and handle restarts or upgrades by stopping and reapplying containers.
- Configuration injection: render manifests, inject secrets from a secure store or block volume, and pass runtime flags to child images without modifying their layers.
- Minimal ownership: the steward owns orchestration logic and the small scripts you edit; it does not absorb responsibility for the internals of the upstream images it launches.
Because the steward is a container, you can run it locally for testing, in CI, and in cloud VMs—your entire node bootstrap process becomes testable on a laptop. The steward is the only artifact you truly “own”; everything else is consumed as versioned, upstream images.
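The sequencing responsibility can be sketched in a few lines. The article describes the steward as a small shell script; the Python below is only an illustration of the same idea, and the `Child` type, the `bring_up` function, and the timeout values are hypothetical names chosen for this example rather than anything the pattern prescribes.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Child:
    """One child container the steward is responsible for (hypothetical model)."""
    name: str
    start: Callable[[], None]     # launches the container, e.g. via a podman call
    is_ready: Callable[[], bool]  # readiness probe, e.g. an API endpoint or socket check


def bring_up(children: List[Child], timeout_s: float = 60.0, poll_s: float = 1.0,
             sleep: Callable[[float], None] = time.sleep,
             clock: Callable[[], float] = time.monotonic) -> List[str]:
    """Start children strictly in order, waiting for each to report ready.

    Returns the names that came up. Raises TimeoutError on the first child
    that does not become ready within timeout_s, so later children never
    start against a missing dependency.
    """
    started: List[str] = []
    for child in children:
        child.start()
        deadline = clock() + timeout_s
        while not child.is_ready():
            if clock() >= deadline:
                raise TimeoutError(f"{child.name} not ready after {timeout_s}s")
            sleep(poll_s)
        started.append(child.name)
    return started
```

Passing `sleep` and `clock` as parameters is a design choice for testability: the same loop can be exercised on a laptop with fake probes and a fake clock, which is the kind of local testing the pattern is meant to enable.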
How it looks in practice: sequencing k3s and Tailscale
In a steward-based VM, you typically see two or three purpose-built images brought up by Podman Compose or a similar compose runtime. The steward grants host networking and elevated privileges where required, then starts the following in order:
- A network/overlay agent (e.g., tailscale/tailscale) with host networking so the node gains secure connectivity and name/addressability.
- The k3s image (rancher/k3s) with privileged access and host networking so k3s can manage containers, the kubelet, and the cluster control plane.
- An operator bootstrapper (e.g., a lightweight agent that waits for cluster readiness and applies ArgoCD manifests).
Because each image is upstream and version-tagged, upgrades mean bumping image tags in your compose or bootstrap configuration. If you need to change how you inject secrets or change the startup order, you edit the steward’s small script and build a new steward image—no longer a project to maintain k3s internals.
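To make the "upgrades are tag bumps" point concrete, here is a minimal sketch that builds the `podman run` command lines for the first two children without executing them. The helper name `podman_run_argv`, the container names, and the `example-tag` values are assumptions for illustration; the `server` argument for k3s is also an assumption about how you would start a server node. A real deployment needs further flags (volumes, capabilities, environment variables) that are omitted here.

```python
from typing import List, Sequence


def podman_run_argv(name: str, image: str, tag: str, *,
                    privileged: bool = False,
                    host_network: bool = True,
                    command: Sequence[str] = ()) -> List[str]:
    """Build (but do not execute) a `podman run` command line for one child."""
    if not tag or tag == "latest":
        # Upgrades should be explicit, reviewable tag bumps, so refuse floating tags.
        raise ValueError(f"{name}: pin an explicit image tag")
    argv = ["podman", "run", "--detach", "--name", name]
    if host_network:
        argv += ["--network", "host"]
    if privileged:
        argv.append("--privileged")
    argv.append(f"{image}:{tag}")
    argv.extend(command)
    return argv


# Placeholder tags: substitute the versions you have actually tested.
TAGS = {"tailscale": "example-tag", "k3s": "example-tag"}

# Order matters: connectivity first, then the Kubernetes runtime.
PLAN = [
    podman_run_argv("tailscale", "docker.io/tailscale/tailscale", TAGS["tailscale"]),
    podman_run_argv("k3s", "docker.io/rancher/k3s", TAGS["k3s"],
                    privileged=True, command=["server"]),
]
```

With this shape, upgrading a child means changing one entry in `TAGS` and rebuilding the steward; nothing inside the upstream images is touched.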
Security and host defaults: why SELinux matters
A steward-style approach often requires privileged containers and host networking—capabilities that security subsystems are designed to restrict. SELinux in enforcing mode will rightly block operations that look like host takeover; the practical workaround is to select a host that either runs permissive or offers a clear, documented method for consenting to privileged container behavior at boot. That consent is the one-time negotiation I mentioned earlier. It isn’t about bypassing security best practices so much as choosing an architecture boundary: if your node’s role is to be controlled entirely by containers, you must accept the operational implications and document them in your provisioning pipeline.
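One way to document that consent in a provisioning pipeline is a preflight check that records the SELinux mode before the steward starts. The sketch below assumes the standard selinuxfs mount point at `/sys/fs/selinux`, where the `enforce` file holds `1` (enforcing) or `0` (permissive); the function name `selinux_mode` is hypothetical, and treating a missing mount as "disabled" is a simplification.

```python
from pathlib import Path


def selinux_mode(selinuxfs: Path = Path("/sys/fs/selinux")) -> str:
    """Return 'enforcing', 'permissive', or 'disabled' for the given selinuxfs root.

    A missing `enforce` file is treated as SELinux being absent or disabled.
    """
    try:
        value = (selinuxfs / "enforce").read_text().strip()
    except (FileNotFoundError, NotADirectoryError):
        return "disabled"
    return "enforcing" if value == "1" else "permissive"


if __name__ == "__main__":
    mode = selinux_mode()
    # Log the mode so the provisioning record shows what the host allowed at boot.
    print(f"selinux={mode}")
    if mode == "enforcing":
        print("privileged steward containers may be blocked without an explicit policy")
```

The check only reports the mode; it does not change it. Whether to switch hosts, add a policy, or stop the bootstrap is a decision to record in the provisioning pipeline, not something this snippet makes for you.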
Trade-offs: where this pattern hurts and where it shines
The steward pattern simplifies orchestration and improves reproducibility, but it trades off certain host-native capabilities. A concrete example is storage: some advanced Kubernetes storage systems (for example, solutions that require kernel modules, iSCSI drivers, or host-side tooling) expect deep host integration. Longhorn, a popular distributed block storage system for Kubernetes, expects host-level components such as an iSCSI initiator (open-iscsi) on every node, which does not fit neatly inside a container-only lifecycle. To adopt those systems you either:
- Build and maintain a custom k3s image that bundles the necessary host integrations, which undermines the “don’t own the internals” advantage; or
- Accept simpler host-aligned storage patterns—block volumes mounted at /data, local-path provisioners, or cloud provider-managed volumes—that satisfy many homelab and edge use cases without significant custom kernel modules.
For homelabs and many edge scenarios this trade-off is acceptable: local-path PVCs and block volumes provide reliable persistence without the maintenance tax of custom images. In production, teams often prefer managed control planes (EKS, GKE, AKS) and cloud-native storage, where the trade-off resolves differently.
Ephemerality in action: deployment speed and recoverability
One of the clearer benefits is operational velocity. When the steward pattern is combined with ephemeral boot volumes, provisioning a new node becomes a fast, automated task: cloud-init provisions the VM, the steward container starts, the node joins the cluster, and ArgoCD begins reconciling workloads. For a tested homelab build, that whole sequence can be measured in minutes, far faster than manual host configuration, and it is reproducible because the steward, Podman Compose files, and child images are all version-controlled.
Ephemerality also simplifies disaster recovery and fleet refreshes: set preserve_boot_volume = false (the attribute name used by the Oracle Cloud Infrastructure Terraform provider; use the equivalent setting in your provisioning tool of choice) and treat the boot volume as replaceable. Everything stateful lives on a separate block volume or managed storage; everything ephemeral lives in immutable images. The VM becomes cattle: replace it and let the steward rehydrate the node.
Who should use the steward pattern and when it makes sense
The steward approach is a practical fit for:
- Homelab operators who want reproducible, testable nodes without embracing full host automation tools like Ansible for every change.
- Edge deployments where the control plane must be lightweight and nodes should be rebuildable from code.
- CI testing environments that need disposable clusters created from an image and a block volume.
- Teams that value image-based reproducibility more than host-level customization.
It’s less suited to environments that require heavy host customization, specialized kernel modules, or strict security policies that forbid privileged containers with host networking. In those cases, consider either negotiating acceptable host baselines with platform security or using managed services where host-level responsibilities are handled by the cloud provider.
Operational checklist: how to implement a steward-based node
If you want to replicate this pattern, here’s a practical checklist to make adoption predictable:
- Choose a host OS image that allows cloud-init-driven changes and does not enforce restrictive security defaults out of the box (document any security consents).
- Build a minimal steward image containing: a lightweight shell, Podman (or Docker if you prefer), a compose tool, and any templating tools you need to render configs.
- Keep your steward’s logic focused on sequencing, configuration injection, and simple observability. Do not embed service internals in the steward image.
- Use upstream images for k3s, Tailscale, ArgoCD agents, and other node services, referencing explicit version tags.
- Mount block volumes for persistent data, and keep the boot volume disposable.
- Add readiness checks and backoff restart logic in the steward to handle network flaps or transient dependency ordering failures.
- Test bootstrap flows locally (QEMU, Vagrant, or Docker Desktop) to ensure the steward behaves identically across environments.
These steps make the entire bootstrap pipeline testable and auditable in source control, which is a major win for platform reliability.
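The "readiness checks and backoff restart logic" item is the one most often left vague, so here is a minimal sketch of capped exponential backoff around a start action. The names `backoff_delays` and `retry`, and the default delay values, are assumptions for illustration; the action is any callable that returns True once the child is up.

```python
import time
from typing import Callable, Iterator


def backoff_delays(base_s: float = 1.0, factor: float = 2.0,
                   cap_s: float = 30.0) -> Iterator[float]:
    """Yield an endless series of delays that grow by `factor` up to `cap_s`."""
    delay = min(base_s, cap_s)
    while True:
        yield delay
        delay = min(delay * factor, cap_s)


def retry(action: Callable[[], bool], attempts: int = 5,
          sleep: Callable[[float], None] = time.sleep,
          **backoff: float) -> bool:
    """Call `action` until it returns True or `attempts` is exhausted.

    Sleeps between attempts using capped exponential backoff, and does not
    sleep after the final failed attempt.
    """
    delays = backoff_delays(**backoff)
    for attempt in range(attempts):
        if action():
            return True
        if attempt < attempts - 1:
            sleep(next(delays))
    return False
```

Returning False instead of looping forever is deliberate: a steward that gives up loudly after a bounded number of attempts is easier to monitor than one that retries silently.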
Broader implications for developers and platform teams
The steward container pattern isn’t just a homelab curiosity; it signals a broader operational philosophy: treat platform behaviors as composable, versioned artifacts rather than opaque host states. For developer experience, this reduces “works on my machine” drift—developers can run an identical steward locally to validate cluster bootstrap behaviors before a CI job or cloud provisioning. For platform engineering, it reduces the burden of long-lived host maintenance: upgrades become tag bumps and compose updates instead of long maintenance windows involving configuration management.
However, the pattern also surfaces governance concerns. Platform teams must define clear boundaries for what can be delegated to container images versus what remains a host responsibility. Security teams must reconcile the need for privileged containers during bootstrap with auditability and least-privilege principles. Business stakeholders must weigh the gains in deployment speed and reproducibility against the potential limitations in host-level capabilities.
How this pattern interacts with adjacent ecosystems
This approach meshes naturally with several adjacent toolsets:
- CI/CD: ArgoCD and GitOps workflows dovetail with image-based node bootstrapping, enabling fully automated application deployments immediately after cluster readiness.
- Connectivity: tools like Tailscale or WireGuard run well as single-purpose containers and provide consistent networking across ephemeral nodes.
- Observability and security stacks: Prometheus exporters, logging shippers, and runtime security agents can be launched as containers and sequenced by the steward for immediate telemetry after boot.
- Developer tooling: local Kubernetes development tools—minikube, kind, or local Podman setups—become natural testbeds for steward logic.
Mentioning these ecosystems is not lip service; they are the complementary pieces that make an image-driven node lifecycle operationally useful.
Common pitfalls and how to avoid them
A few traps are worth calling out explicitly:
- Don’t turn upstream scratch images into DIY distributions. Use them as-is and compose around them.
- Don’t assume every storage or kernel-dependent feature will work without host changes. If you need kernel modules, evaluate whether custom host images or managed services are more sustainable.
- Avoid opaque privileged scripts: put steward logic in git, include clear readiness probes, and add idempotency so repeated runs are safe.
- Don’t forget monitoring for bootstrap failures: a steward that silently fails to start children will leave you with VMs that appear healthy but provide no services.
Addressing these pitfalls early keeps the steward approach maintainable and predictable.
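The last pitfall, a steward that fails silently, can be caught with a small check that compares the children you expect against what is actually running. The sketch below parses the output of `podman ps --format "{{.Names}}"` (one container name per line); the function names and the expected container names are hypothetical and should match whatever names your steward assigns.

```python
import subprocess
from typing import Iterable, List, Set


def running_names(ps_output: str) -> Set[str]:
    """Parse one-name-per-line output into a set of running container names."""
    return {line.strip() for line in ps_output.splitlines() if line.strip()}


def missing_children(expected: Iterable[str], ps_output: str) -> List[str]:
    """Return the expected children that are not running, in the order given."""
    running = running_names(ps_output)
    return [name for name in expected if name not in running]


def podman_ps_names() -> str:
    """Ask podman for the names of running containers (requires podman on PATH)."""
    result = subprocess.run(
        ["podman", "ps", "--format", "{{.Names}}"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


if __name__ == "__main__":
    missing = missing_children(["tailscale", "k3s", "bootstrap"], podman_ps_names())
    if missing:
        # Exit non-zero so an external monitor or cloud-init log flags the failure.
        raise SystemExit(f"steward children not running: {', '.join(missing)}")
```

Keeping the parsing separate from the podman call means the comparison logic can be tested without a container runtime, and the non-zero exit gives external monitoring something concrete to alert on.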
k3s, Podman, GitOps and the future of image-first nodes
The steward container pattern is a pragmatic distillation of container-first principles: delegate behavior to upstream, versioned images, orchestrate them with a small, owned control surface, and make boot volumes ephemeral. For teams that prioritize reproducibility, fast recovery, and testability, steering node behavior with a steward container and a compose runtime like Podman Compose is an effective operational model. It reduces configuration drift, enables rapid provisioning, and lets developers and operators reason about cluster behavior in a single, testable artifact.
Looking forward, expect this idea to influence how edge and CI environments are provisioned: smaller, immutable steward images; richer readiness and policy hooks for security teams; and tighter integration with GitOps workflows so that a node is fully declared in source before it ever boots. As container runtimes and lightweight Kubernetes distributions evolve, the steward pattern will likely become a common way to manage ephemeral compute in scenarios where full managed control planes are not desirable or possible.