← reefy.ai

Reefy.ai: An Operating System for AI Computers

Why we built Reefy

Reefy began as a solution to our own problem: a growing, diverse home lab that we wanted to run like a cloud instead of managing each machine by hand.

Setting all of that up on a regular distro like Ubuntu or Red Hat - drivers, updates, remote access, storage, backups, monitoring - is days of work even for an experienced sysadmin, and largely out of reach for everyone else. Reefy folds it all into the operating system, so any machine you own gets the cloud's operational ease.

A Reefy home lab: a stack of mini-PCs and two custom Nvidia GPU machines on a desk
Reefy home lab AI machines - mini-PCs and two Nvidia GPU machines.

What is Reefy.ai?

At its core it is a Linux distribution, like Ubuntu or Red Hat, but built from first principles for one job: running AI workloads on bare metal with zero fuss - like Android for phones.

It is built entirely from source (with Buildroot): a mainline long-term-support kernel (Linux 6.18 LTS, maintained upstream into 2028), a systemd init, and a read-only SquashFS root - no package manager, no dpkg/rpm database, none of the decades of accumulated daemons and defaults nobody remembers turning on. What is left is just enough OS to bring up the hardware and start the Docker containers where your app actually lives.

Reefy takes care of everything below the container - the hardware, the kernel, storage, networking, updates. Your apps run on top.

Architecture at a glance

Reefy architecture stack: user apps run in Docker containers, managed by reefy-control (MQTT/control) and reefy-reconciler (data/storage), on the Linux kernel, on PC bare metal

A Reefy device runs a deliberately small split of two privileged processes:

The control plane is kept deliberately simple and independent, and it comes up immediately on boot - so you can see a machine's state and control it even when something else has gone wrong. The data plane, meanwhile, keeps your apps running from the last-applied state if the control link drops. The unit of truth is a single desired-state.json, persisted on the device and re-applied on every boot.

How Reefy compares to other Linux distros

A traditional Linux server is a sequence of manual steps - install distro, install drivers, configure Docker, wire up networking, expose services, set up monitoring and backups, then babysit updates. Reefy collapses that into: flash a USB, boot, adopt the device, start an app.

Capability Other distros Reefy Why it matters
OS install Image any machine in minutes - nothing to set up per box
GPU drivers GPU apps run out of the box - no driver / CUDA version hell
App deployment Launch apps in one click instead of SSH + Docker wrangling
Remote access Reach devices anywhere - no VPN, port-forwarding, or static IP
Updates A bad update auto-rolls back; a remote box never bricks
Storage Encrypted, snapshot-ready storage without the LVM / LUKS dance
Backups Move to a new machine like restoring a phone
Monitoring CPU / GPU / temps from first boot - no Grafana to stand up

Here "other distros" means the usual do-it-yourself path: a general-purpose distro where each of these is your job to install, wire up, and maintain. The rest of this article is how Reefy delivers each of them instead.

Managed from anywhere, including your phone

Every Reefy device is managed from a web dashboard at reefy.ai - cloud-hosted, so it is always reachable and shows your whole fleet even when a device itself is offline.

The transport is MQTT, not a custom agent protocol. Browsers connect directly over MQTT-over-WebSocket (WSS) using a short-lived JWT minted per session, so the dashboard gets the same real-time device stream the backend does - status, metrics, and app lifecycle events arrive as they happen.

Remote shells and app web UIs ride Cloudflare named tunnels (cloudflared on the device), so you reach a box at <hostname>--ssh--<id>.reefy.ai without it having a public IP or an open inbound port. The in-browser terminal runs over the same MQTT bridge. The whole dashboard is responsive, so all of this works even from a phone while you are on the go.

Works fully offline once configured

Remote management is a convenience, not a dependency.

Because the reconciler persists desired-state.json and the apps' container images locally, a device boots and brings its entire stack back up from local images - no network required. MQTT connects asynchronously afterward; apps never wait on the cloud to start. Saved network configuration (including static IPs) is re-applied on boot by the same reconcile path. The result is a device that runs unattended in fully offline or air-gapped environments, and simply syncs with the dashboard - including firmware (OS) upgrades - whenever it can reach it.

On the local network the device needs no cloud at all to reach: it advertises itself over mDNS as <name>.local, and serves a small page on port 80 with its own CA certificate to download and trust. Install that CA once and you get authenticated HTTPS to the device and its apps across the LAN - completely offline.

A boot path engineered for bare metal

Reefy boots through a Unified Kernel Image (UKI) - the entire OS packed into a single EFI binary.

The kernel, initramfs, kernel command line, and OS-release metadata are bundled into that one binary via systemd's linuxx64.efi.stub. One file, assembled with objcopy, is the entire bootable OS. This removes the fragile multi-file bootloader-plus-config setups that plague traditional distros and gives a clean, measurable boot artifact. Build flavors (production vs. a verbose dev shell) differ only in the baked-in .cmdline section; the kernel and rootfs are identical.

Bundling the whole OS into one EFI binary also makes it the natural unit for Secure Boot: there is a single artifact to sign and measure rather than a kernel, initramfs, and command line that each need covering independently. The image is signing-ready, so hardening the boot chain further - signed UKIs with enrolled keys, and TPM-measured boot - is a direct path from here rather than a re-architecture.

Boot and run from a USB dongle

Reefy boots and runs from a USB stick - a surprisingly optimal way to run bare metal, though it runs just as well flashed to an internal drive.

The dongle has real upsides. It carries a GPT layout of two EFI System Partitions (the A/B slots, reefy-a / reefy-b) and a tiny raw key partition (a data partition is added only when a device is adopted to run entirely from the dongle). Because the OS lives on the stick, the machine's internal NVMe/SSD is left entirely for your encrypted data, the OS is trivially portable between machines, and re-imaging or recovering a box is as simple as swapping a stick.

USB dongle holds the OS (EFI slots A/B + a key partition); the key unlocks the internal NVMe/SSD, which is LUKS2-encrypted with an LVM thin pool and XFS holding app data and snapshots

A/B updates, like mission-critical hardware

OS updates use an A/B slot scheme - the new OS goes to the spare slot, is tried, and rolls back automatically if it misbehaves.

The reefy-efi updater fully recreates the inactive slot - it is reformatted from scratch on every update, so any residual filesystem damage on that partition is wiped and recovered automatically - then writes the new UKI there and sets the UEFI BootNext variable as a one-shot to try it.

The safety net is health-gated, not blind: after a switched boot, the system waits for storage and the control plane to actually come up healthy and only then commits the new slot to BootOrder. If they do not, the machine reboots and UEFI falls straight back to the previous, known-good slot. There is a second, lower-level net too: a hardware watchdog (pinged by systemd through /dev/watchdog) is armed across the update, so even if the new kernel hangs completely - never even reaching the health check - the watchdog resets the box and UEFI falls back. The system is designed so that a failed update should not brick the machine.

This approach was inspired by the fault-tolerant update systems used in projects like SpaceX's Starlink satellites, where a failed remote update simply is not an option. For the curious, Starlink's paper on over-the-air updates covers the same problem at the scale of thousands of satellites - over the vacuum, in their case.

Sixty Starlink satellites stacked atop a Falcon 9 rocket before launch
Reefy was inspired by fault-tolerant update systems like SpaceX's Starlink. Photo: SpaceX (public domain).

Encryption on by default

User data encryption is on from first boot - not an option you remember to enable.

Data partitions are LUKS2 with AES-256-XTS, and when a device runs from a dongle the encryption key lives on the dongle itself.

That has an unexpected upside: pull the stick - and with it the key - and the internal drives are unreadable ciphertext. There is no wipe step and no waiting - a machine can be safely handed to another owner, returned, or decommissioned the moment the dongle is removed.

A storage stack tuned for AI data and backups

On internal drives the data path is LUKS2 -> LVM thin pool -> XFS - encrypted, with instant snapshots for backups.

A single volume group (reefy) hosts a thin pool (reefy_pool) from which per-app thin volumes are carved. Thin provisioning gives us instant, space-efficient snapshots, which is exactly what makes consistent backups possible without taking apps offline.

Two choices in that stack come straight from benchmarking rather than folklore:

We benchmarked the stack on one of the fastest NVMe drives available (a PCIe 4.0 Samsung 990 EVO Plus, rated 7.25 / 6.3 GB/s) and confirmed that sequential read and write are essentially unaffected by the encryption + LVM + XFS layers. Full methodology and numbers are in the storage benchmark study.

Cloud backup and painless migration

Reefy backs selected app-data folders up to the cloud with Borg - deduplicated and encrypted before they leave the device.

It uses one encrypted repository per user and per-instance append-only SSH keys for access control. The reconciler takes an LVM thin snapshot first, so the archive is point-in-time consistent even while the app keeps writing.

The payoff is that moving to a new machine feels like migrating to a new iPhone: bring up a fresh Reefy box, point an app instance at an existing archive, and the reconciler runs the restore as part of applying desired state. Your app comes back with the same data and state. Hardware becomes disposable; your setup does not - so a failed machine is replaced in minutes rather than painstakingly recovered. That matters most with consumer-grade hardware and across larger fleets, where something is always failing somewhere.

GPUs that just work

Nvidia support is built into the OS - drivers, CUDA, and firmware ship in the image.

Reefy tracks the latest stable drivers: the GPU kernel module, GSP firmware, and CUDA userspace all ship in the build. At boot, Reefy generates a Container Device Interface (CDI) spec for the GPU - the modern way to expose accelerators to containers, which replaces the older Nvidia container runtime.

Apps do not deal with any of this. An app simply declares that it wants a GPU, and Reefy attaches the device, libraries, and mounts automatically whenever the machine has one. No driver-version roulette, no manual runtime surgery.

Built-in monitoring, from the first boot

Monitoring in the spirit of Grafana, there the instant you boot - nothing to install or wire up.

Devices push metrics over MQTT (Prometheus-style reefy_* names with labels - so disk and network are reported per-mount and per-interface); the backend ingests them into a TimescaleDB time-series store, with rollups that keep multi-day dashboard queries fast.

CPU, GPU, memory, disk, network, fan speeds, and temperatures are all visible immediately. For custom-built PCs the fan and temperature series are genuinely useful for judging how well your cooling actually holds up under load - and the one-click Bench app (sysbench, fio, gpu-fryer) lets you put the hardware under that load on purpose.

Estimated cloud savings

Reefy turns the hardware you already own into a number you can point at.

The dashboard continuously estimates what the same machine would cost as a cloud VPS and accumulates the savings against its actual uptime.

The estimate is grounded: your CPU, RAM, and disk are matched to median VPS pricing across major providers (rounded down to stay conservative), and every online hour credits its share. The running total is exactly what you are not paying a cloud.

The App Catalog

A catalog of popular open-source apps that install in one click - and you can package and publish your own.

Reefy's catalog spans inference servers, AI agents, dev environments, and media and home apps. Either way, the hard parts of bare metal (boot, storage, encryption, GPUs, networking, updates) are already solved underneath, so an app is just a container plus a manifest.

Starting one is a single click: the dashboard turns the app's manifest into a container service - storage carved from the thin pool, environment injected, GPU attached when present, and a public HTTPS route provisioned - and pushes it to the device to bring up.

That public route is a feature in itself. Every app you start is reachable from anywhere instantly: its ports are published through a Cloudflare named tunnel, so there is no port forwarding, no firewall holes, and no public IP - the app is fronted at a *.reefy.ai URL with TLS and short-lived access tokens. Think of it like Tailscale Funnel for exposing a service port, except it works out of the box with zero setup.

A few examples from the catalog:

Development environments

Ubuntu and Fedora dev environments are full distros - first-class Reefy apps you can run several at once on one box.

Each is a full, real distro - systemd and all - with full access to the host hardware, including KVM, so you can spin up virtual machines inside them. That means several independent Ubuntu or Fedora environments can run on a single Reefy PC at once - genuinely handy for development and experimentation.

A unified LLM proxy and router

Reefy includes an OpenAI-compatible LLM gateway that owns provider credentials and token refresh, so your apps don't have to.

Reefy includes this proxy and router out of the box (reefy-llm-proxy). It owns provider credentials and the OAuth token-refresh lifecycle, so individual apps stop reinventing that plumbing: they point OPENAI_BASE_URL at the local proxy, use a standard OpenAI client, and let Reefy handle refresh and routing. Credentials stay on the device and are never synced back to the cloud. More detail in the reefy-llm-proxy repo.

Liveness monitoring and alerts

Reefy watches every device and emails you the moment one drops offline - or comes back.

Device up/down is detected through MQTT's Last-Will-Testament: when a device's connection drops, the broker publishes its offline status, the backend marks it down and emails the admin (with a stale-online scrub as defense in depth, and a back-online email when it returns). You always know the state of your fleet without watching it.

What hardware is supported?

Just about any x86-64 machine today, with ARM on the roadmap.

Reefy is deliberately agnostic to hardware configuration: the same image boots a 2-core Celeron with 4 GB of RAM and a 128-core datacenter server, and scales the storage, GPU, and app stack to whatever it finds. A mini-PC, an old laptop, a custom build, a rack server - they all run Reefy.

We're at the beginning

AI computers today are where smartphones were around 2003 - the platform is just forming, and the OS for it is still being created.

Android turned commodity hardware into a phone platform, and the market went from tens of millions of units to roughly a billion within a decade. AI computers are at that same early inflection - and Reefy is here to power them.

Line chart of global smartphone sales rising from near zero in 2002 to about 968 million units in 2013, annotated with the founding of Android (2003), Google's acquisition of Android (2005), the first iPhone (2007), and the first Android phone (2008)
Smartphones went from near-zero to ~1 billion units within a decade of Android's start. AI computers are at a similar starting point.

A reef, not a cloud

Reefy is a new concept, like the cloud - but on hardware you own.

A split view of clouds above the waterline and a colorful coral reef below, dotted with glowing server-like structures
We love the cloud - Reefy isn't a replacement, just a third option.

The cloud was a genuinely new idea: stop hand-tending servers like pets and treat compute as a uniform, declarative, API-driven utility - cattle, not pets. The trade-off is that it lives in someone else's datacenter, with your data and your compute off-prem.

Reefy brings that same modern model to hardware you own and keep close. Your machines are described by desired state, not configured by hand; they are reproducible and disposable (lose one, restore onto another like migrating to a new phone); and they are all managed from a single dashboard. The old sysadmin-per-box ritual is gone - but the box is still yours, on your desk or in your rack.

And to be clear: we love the cloud, and it is not going anywhere. Reefy is not a replacement so much as a third option. That is the whole metaphor - clouds are vast, white, uniform, and far away; a reef is colorful, diverse, alive, and close to home. Reefy is on-prem compute run the modern way: your own diverse hardware, managed like a cloud, right where you are.

The apps are yours; Reefy makes the metal disappear.

Open source

Reefy OS is developed in the open. Browse the firmware, the storage benchmark study, and more at github.com/reefyai/reefy.

Try Reefy

Getting a machine onto Reefy is short: sign in at reefy.ai, flash the personalized image to a USB stick, boot, and adopt the device in the dashboard.