Reaper
Reaper is a lightweight Kubernetes container-less runtime that executes commands directly on cluster nodes without traditional container isolation.
Think of it as a way to run host-native processes through Kubernetes’ orchestration layer — standard Kubernetes API (Pods, kubectl logs, kubectl exec) with full host access.
What Reaper Provides
- Standard Kubernetes API (Pods, kubectl logs, kubectl exec)
- Process lifecycle management (start, stop, restart)
- Shared overlay filesystem that shields the host filesystem from workload writes
- Kubernetes volumes (ConfigMap, Secret, hostPath, emptyDir)
- Sensitive host file filtering (SSH keys, passwords, SSL keys)
- Interactive sessions (PTY support)
- UID/GID switching with securityContext
- Per-pod configuration via Kubernetes annotations
- Custom Resource Definitions: ReaperPod (simplified workloads), ReaperOverlay (overlay lifecycle management), ReaperDaemonJob (run jobs on every node with dependency ordering)
- Helm chart for one-command installation and configuration
What Reaper Does NOT Provide
- Container isolation (namespaces, cgroups)
- Resource limits (CPU, memory)
- Network isolation (uses host networking)
- Container image pulling
Use Cases
- HPC workloads: Slurm worker daemons that need direct CPU/GPU access
- Cluster maintenance: Ansible playbooks and system configuration tasks
- Privileged system utilities: Direct hardware access, device management
- Node monitoring: Host-level metric exporters (node_exporter, etc.)
- Legacy applications: Programs that require host-level access
- Development and debugging: Interactive host access via kubectl
Disclaimer
Reaper is an experimental, personal project built to explore what’s possible with AI-assisted development. It is under continuous development with no stability guarantees. Use entirely at your own risk.
Source Code
The source code is available at github.com/miguelgila/reaper.
Installation
Helm (Recommended)
The recommended way to install Reaper on a Kubernetes cluster is via the Helm chart:
helm upgrade --install reaper deploy/helm/reaper/ \
--namespace reaper-system --create-namespace \
--wait --timeout 120s
This installs:
- Node DaemonSet: Copies shim + runtime binaries to every node
- CRD Controller: Watches ReaperPod resources and creates Pods
- Agent DaemonSet: Health monitoring and Prometheus metrics
- RuntimeClass: Registers reaper-v2 with Kubernetes
- RBAC: Required roles and bindings
See Helm Chart Reference for configuration values.
Playground (Local Testing)
Spin up a 3-node Kind cluster with Reaper pre-installed. No local Rust toolchain needed — compilation happens inside Docker:
# Build from source
./scripts/setup-playground.sh
# Or use pre-built images from GHCR
./scripts/setup-playground.sh --release
# Use a specific release version
./scripts/setup-playground.sh --release v0.2.14
# Clean up
./scripts/setup-playground.sh --cleanup
Building from Source
Reaper requires Rust. The toolchain version is pinned in rust-toolchain.toml and installed automatically.
git clone https://github.com/miguelgila/reaper
cd reaper
cargo build --release
Binaries are output to target/release/.
Cross-Compilation (macOS to Linux)
Since Reaper runs on Linux Kubernetes nodes, cross-compile static musl binaries:
# For x86_64 nodes
docker run --rm -v "$(pwd)":/work -w /work \
messense/rust-musl-cross:x86_64-musl \
cargo build --release --target x86_64-unknown-linux-musl
# For aarch64 nodes
docker run --rm -v "$(pwd)":/work -w /work \
messense/rust-musl-cross:aarch64-musl \
cargo build --release --target aarch64-unknown-linux-musl
Requirements
Runtime (cluster nodes):
- Linux kernel with overlayfs support (standard since 3.18)
- Kubernetes cluster with containerd runtime
- Root access on cluster nodes
Playground:
- Docker and Kind
Building from source:
- All of the above, plus Rust
Quick Start
This guide assumes you have a Reaper-enabled cluster (see Installation).
Run a Command on the Host
The simplest way to run a command on the host is with a ReaperPod:
apiVersion: reaper.io/v1alpha1
kind: ReaperPod
metadata:
  name: my-task
spec:
  command: ["/bin/sh", "-c", "echo Hello from $(hostname) && uname -a"]
kubectl apply -f my-task.yaml
kubectl logs my-task
kubectl get reaperpods
With Volumes
apiVersion: reaper.io/v1alpha1
kind: ReaperPod
metadata:
  name: config-reader
spec:
  command: ["/bin/sh", "-c", "cat /config/settings.yaml"]
  volumes:
    - name: config
      mountPath: /config
      configMap: "my-config"
      readOnly: true
With Node Selector
apiVersion: reaper.io/v1alpha1
kind: ReaperPod
metadata:
  name: compute-task
spec:
  command: ["/bin/sh", "-c", "echo Running on $(hostname)"]
  nodeSelector:
    workload-type: compute
See ReaperPod CRD Reference for the full spec.
Using Raw Pods
For use cases that need the full Kubernetes Pod API — interactive sessions, DaemonSets, Deployments, exec, etc. — you can use standard Pods with runtimeClassName: reaper-v2.
Note: The image field is required by Kubernetes but ignored by Reaper. Use a small image like busybox.
Run a Command
apiVersion: v1
kind: Pod
metadata:
  name: my-task
spec:
  runtimeClassName: reaper-v2
  restartPolicy: Never
  containers:
    - name: task
      image: busybox
      command: ["/bin/sh", "-c"]
      args: ["echo Hello from host && uname -a"]
kubectl apply -f my-task.yaml
kubectl logs my-task # See output
kubectl get pod my-task # Status: Completed
Interactive Shell
kubectl run -it debug --rm --image=busybox --restart=Never \
--overrides='{"spec":{"runtimeClassName":"reaper-v2"}}' \
-- /bin/bash
Exec into Running Containers
kubectl exec -it my-pod -- /bin/sh
Volumes
ConfigMaps, Secrets, hostPath, emptyDir, and projected volumes all work:
apiVersion: v1
kind: Pod
metadata:
  name: my-task
spec:
  runtimeClassName: reaper-v2
  restartPolicy: Never
  volumes:
    - name: config
      configMap:
        name: my-config
  containers:
    - name: task
      image: busybox
      command: ["/bin/sh", "-c", "cat /config/settings.yaml"]
      volumeMounts:
        - name: config
          mountPath: /config
          readOnly: true
See Pod Compatibility for the full list of supported and ignored fields.
What’s Next
Reaper provides Custom Resource Definitions for higher-level workflows:
- ReaperPod — Simplified pod spec without container boilerplate
- ReaperOverlay — PVC-like overlay lifecycle management
- ReaperDaemonJob — Run jobs to completion on every matching node, with dependency ordering and shared overlays
See the CRD Reference for full documentation and the examples for runnable demos.
Architecture Overview
Reaper consists of three components arranged in a three-tier system:
Kubernetes/containerd
↓ (ttrpc)
containerd-shim-reaper-v2 (long-lived shim, implements Task trait)
↓ (exec: create/start/state/delete/kill)
reaper-runtime (short-lived OCI runtime CLI)
↓ (fork FIRST, then spawn)
monitoring daemon → spawns workload → wait() → captures exit code
Components
containerd-shim-reaper-v2
The shim is a long-lived process (one per container) that communicates with containerd via ttrpc. It implements the containerd Task service interface and delegates OCI operations to the runtime binary.
reaper-runtime
The runtime is a short-lived CLI tool called by the shim for OCI operations (create, start, state, kill, delete). It implements the fork-first architecture for process monitoring.
Monitoring Daemon
The daemon is forked by the runtime during start. It spawns the workload as its child, calls wait() to capture the real exit code, and updates the state file.
Fork-First Architecture
This is the most critical design decision in Reaper:
- Runtime forks → creates monitoring daemon
- Parent (CLI) exits immediately (OCI spec requires this)
- Daemon calls setsid() to detach
- Daemon spawns workload (daemon becomes parent)
- Daemon calls wait() on workload → captures real exit code
- Daemon updates state file, then exits
Why fork-first? Only a process’s parent can call wait() on it. The daemon must be the workload’s parent to capture exit codes. Spawning first, then forking, would leave the Child handle invalid in the forked process.
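The fork-first sequence can be sketched in Python (illustrative only; Reaper itself is Rust, and the start_workload/write_state helpers here are invented for the sketch):

```python
import json
import os
import subprocess

def start_workload(argv, state_path):
    """Fork-first sketch: fork the monitoring daemon FIRST, then let the
    daemon spawn the workload, so the daemon is the parent and can wait()
    for the real exit code."""
    pid = os.fork()
    if pid > 0:
        # Parent: the short-lived OCI "start" CLI. Per the OCI spec it
        # would exit immediately here; we just return the daemon's PID.
        return pid
    # Child: the monitoring daemon.
    os.setsid()                        # detach into its own session
    child = subprocess.Popen(argv)     # daemon becomes the workload's parent
    write_state(state_path, "running", child.pid, None)
    code = child.wait()                # works: we ARE the parent
    write_state(state_path, "stopped", child.pid, code)
    os._exit(0)

def write_state(path, status, pid, exit_code):
    with open(path, "w") as f:
        json.dump({"status": status, "pid": pid, "exit_code": exit_code}, f)
```

If the order were reversed (spawn, then fork), the forked daemon would hold a child handle created by a different process and wait() would fail, which is exactly the bug described in the shim design document below.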
State Management
Process lifecycle state is stored in /run/reaper/<container-id>/state.json:
{
  "id": "abc123...",
  "bundle": "/run/containerd/io.containerd.runtime.v2.task/k8s.io/abc123...",
  "status": "stopped",
  "pid": 12345,
  "exit_code": 0
}
The shim polls this file to detect state changes and publishes containerd events (e.g., TaskExit).
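The polling side can be sketched as follows (Python sketch with a hypothetical wait_for_exit helper; the real shim is Rust and additionally publishes the TaskExit event when it observes the transition):

```python
import json
import time

def wait_for_exit(state_path, poll_interval=0.05, timeout=10.0):
    """Poll state.json until the monitoring daemon records status
    "stopped", then return the captured exit code."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with open(state_path) as f:
                state = json.load(f)
        except (FileNotFoundError, json.JSONDecodeError):
            state = None               # daemon may be mid-write; retry
        if state is not None and state.get("status") == "stopped":
            return state.get("exit_code")
        time.sleep(poll_interval)
    raise TimeoutError("no exit recorded in " + state_path)
```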
Further Reading
- Shim v2 Protocol — Full protocol implementation details
- Overlay Filesystem — How host filesystem protection works
Shim v2 Protocol
Shim v2 Implementation Design
Overview
This document outlines the implementation plan for containerd Runtime v2 API (shim protocol) support in Reaper, enabling Kubernetes integration for command execution on the host system.
Important Clarification: Reaper does not create traditional containers. Instead, it executes commands directly on the Kubernetes cluster nodes, providing a lightweight alternative to full containerization for specific use cases.
Background
What is the Shim v2 Protocol?
The containerd Runtime v2 API is the interface between:
- containerd (Kubernetes container runtime)
- Container runtime shim (our code)
- Command executor (reaper-runtime running commands on host)
Kubernetes → CRI → containerd → [Shim v2 API] → reaper-shim → host command execution
Why Do We Need It?
Without shim v2:
- ❌ Kubernetes can’t execute commands via reaper
- ❌ No process lifecycle management
- ❌ No command output streaming
With shim v2:
- ✅ Kubernetes can run/start/stop commands
- ✅ Stream command output and exec into running processes
- ✅ Monitor command execution status
- ✅ Full process lifecycle support
Architecture
Three-Tier Design (Implemented)
containerd-shim-reaper-v2 ← Shim binary (ttrpc server, long-lived)
↓ (subprocess calls)
reaper-runtime ← OCI runtime CLI (create/start/state/kill/delete)
↓ (fork)
monitoring daemon ← Spawns and monitors workload
↓ (spawn)
workload process ← The actual command being run
Key Points:
- Shim is long-lived (one per container, communicates with containerd via ttrpc)
- Runtime is short-lived CLI (called by shim for OCI operations)
- Monitoring daemon is forked by runtime to watch workload
- Workload is spawned BY the daemon (daemon is parent)
Why Fork-First Architecture?
The Problem:
- The OCI spec requires the runtime CLI to exit immediately after start
- Someone needs to wait() on the workload to capture the exit code
- Only a process's parent can call wait() on it
Previous Bug (FIXED):
We originally spawned the workload first, then forked. After fork(), the std::process::Child handle was invalid in the forked child because it was created by the parent process.
Solution: Fork FIRST, then spawn
- Runtime forks → creates monitoring daemon
- Parent (CLI) exits immediately
- Daemon spawns workload (daemon becomes parent)
- Daemon can now wait() on the workload
Shim v2 API Implementation
Task Service Methods
service Task {
  rpc Create(CreateTaskRequest) returns (CreateTaskResponse);
  rpc Start(StartTaskRequest) returns (StartTaskResponse);
  rpc Delete(DeleteTaskRequest) returns (DeleteTaskResponse);
  rpc Pids(PidsRequest) returns (PidsResponse);
  rpc Pause(PauseRequest) returns (google.protobuf.Empty);
  rpc Resume(ResumeRequest) returns (google.protobuf.Empty);
  rpc Checkpoint(CheckpointTaskRequest) returns (google.protobuf.Empty);
  rpc Kill(KillRequest) returns (google.protobuf.Empty);
  rpc Exec(ExecProcessRequest) returns (google.protobuf.Empty);
  rpc ResizePty(ResizePtyRequest) returns (google.protobuf.Empty);
  rpc CloseIO(CloseIORequest) returns (google.protobuf.Empty);
  rpc Update(UpdateTaskRequest) returns (google.protobuf.Empty);
  rpc Wait(WaitRequest) returns (WaitResponse);
  rpc Stats(StatsRequest) returns (StatsResponse);
  rpc Connect(ConnectRequest) returns (ConnectResponse);
  rpc Shutdown(ShutdownRequest) returns (google.protobuf.Empty);
}
Implementation Status
| Method | Status | Notes |
|---|---|---|
| Create | ✅ | Calls reaper-runtime create, handles sandbox detection |
| Start | ✅ | Calls reaper-runtime start, fork-first architecture |
| Delete | ✅ | Calls reaper-runtime delete, cleans up state |
| Kill | ✅ | Calls reaper-runtime kill, handles ESRCH gracefully |
| Wait | ✅ | Polls state file, publishes TaskExit event |
| State | ✅ | Calls reaper-runtime state, returns proper protobuf status |
| Pids | ✅ | Returns workload PID from state |
| Stats | ✅ | Basic implementation (no cgroup metrics) |
| Connect | ✅ | Returns shim and workload PIDs |
| Shutdown | ✅ | Triggers shim exit |
| Pause/Resume | ⚠️ | Returns OK but no-op (no cgroup freezer) |
| Checkpoint | ⚠️ | Not implemented (no CRIU) |
| Exec | ✅ | Implemented with PTY support |
| ResizePty | ✅ | Shim writes dimensions to resize file, runtime daemon applies via TIOCSWINSZ |
| CloseIO | ⚠️ | Not implemented |
| Update | ⚠️ | Not implemented (no cgroups) |
Implementation Milestones
✅ Milestone 1: Project Setup - COMPLETED
- Add dependencies: containerd-shim, containerd-shim-protos, tokio, async-trait
- Generate protobuf code from containerd definitions (via containerd-shim-protos)
- Create containerd-shim-reaper-v2 binary crate
- Set up basic TTRPC server with Shim and Task traits
✅ Milestone 2: Core Task API - COMPLETED
- Implement Create - parse bundle, call reaper-runtime create
- Implement Start - call reaper-runtime start, capture PID
- Implement Delete - call reaper-runtime delete, cleanup state
- Implement Kill - call reaper-runtime kill with signal
- Implement Wait - poll state file for completion
- Implement State - return container status with proper protobuf enums
- Implement Pids - list container processes
✅ Milestone 3: Process Monitoring - COMPLETED
- Fork-first architecture in reaper-runtime
- Monitoring daemon as parent of workload
- Real exit code capture via child.wait()
- State file updates from monitoring daemon
- Zombie process prevention (proper reaping)
- Shim polling of state file for completion detection
✅ Milestone 4: Containerd Integration - COMPLETED
- TaskExit event publishing with timestamps
- Proper exited_at timestamps in WaitResponse
- Proper exited_at timestamps in StateResponse
- ESRCH handling in kill (already-exited processes)
- Sandbox container detection and faking
- Timing delay for fast processes
✅ Milestone 5: Kubernetes Integration - COMPLETED
- RuntimeClass configuration
- End-to-end pod lifecycle testing
- Pod status transitions to “Completed”
- Exit code capture and reporting
- No zombie processes
- PTY support for interactive containers
- Exec implementation with PTY support
- File descriptor leak fix
- Overlay namespace improvements
Critical Bug Fixes (January 2026)
1. Fork Order Bug
File: src/bin/reaper-runtime/main.rs:188-311
Problem: std::process::Child handle invalid after fork
Fix: Fork first, then spawn workload in the forked child
match unsafe { fork() }? {
    ForkResult::Parent { .. } => {
        // CLI exits; the daemon will update state
        sleep(Duration::from_millis(100));
        exit(0);
    }
    ForkResult::Child => {
        setsid()?;                                      // detach
        let mut child = Command::new(program).spawn()?; // we're the parent!
        update_state("running", child.id());
        sleep(Duration::from_millis(500));              // let containerd observe "running"
        let status = child.wait()?;                     // this works!
        update_state("stopped", status.code());
        exit(0);
    }
}
2. Fast Process Timing
File: src/bin/reaper-runtime/main.rs:264-270
Problem: Fast commands (echo) completed before containerd observed “running” state
Fix: Added 500ms delay after setting “running” state
3. Kill ESRCH Error
File: src/bin/reaper-runtime/main.rs:347-365
Problem: containerd’s kill() failed with ESRCH for already-dead processes
Fix: Treat ESRCH as success (process not running = goal achieved)
4. TaskExit Event Publishing
File: src/bin/containerd-shim-reaper-v2/main.rs:162-199
Problem: containerd wasn’t recognizing container exits
Fix: Publish TaskExit event with proper exited_at timestamp
5. Response Timestamps
File: src/bin/containerd-shim-reaper-v2/main.rs:545-552, 615-625
Problem: Missing timestamps in WaitResponse and StateResponse
Fix: Include exited_at timestamp in all responses for stopped containers
Technical Details
ReaperShim Structure
#[derive(Clone)]
struct ReaperShim {
    exit: Arc<ExitSignal>,
    runtime_path: String,
    namespace: String,
}
ReaperTask Structure
#[derive(Clone)]
struct ReaperTask {
    runtime_path: String,
    sandbox_state: Arc<Mutex<HashMap<String, (bool, u32)>>>,
    publisher: Arc<RemotePublisher>,
    namespace: String,
}
State File Format
{
  "id": "abc123...",
  "bundle": "/run/containerd/io.containerd.runtime.v2.task/k8s.io/abc123...",
  "status": "stopped",
  "pid": 12345,
  "exit_code": 0
}
Sandbox Container Detection
Sandbox (pause) containers are detected by checking:
- Image name contains “pause”
- Command is /pause
- Process args contain "pause"
Sandboxes return fake responses immediately (no actual process).
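The detection heuristic amounts to a simple predicate (Python sketch; is_sandbox is a hypothetical helper name, not Reaper's actual function):

```python
def is_sandbox(image, args):
    """Return True if this container looks like a Kubernetes pause sandbox,
    using the three checks described above."""
    if "pause" in image:                  # image name contains "pause"
        return True
    if args and args[0] == "/pause":      # command is /pause
        return True
    return any("pause" in arg for arg in args)  # args mention pause
```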
Dependencies
Cargo Dependencies
[dependencies]
containerd-shim = { version = "0.10", features = ["async", "tracing"] }
containerd-shim-protos = { version = "0.10", features = ["async"] }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
tracing = "0.1"
tracing-subscriber = "0.3"
Testing
Run Integration Tests
./scripts/run-integration-tests.sh
This orchestrates all testing including Rust unit tests, Kubernetes infrastructure setup, and comprehensive integration tests (DNS, overlay, host protection, UID/GID switching, privilege dropping, zombies, exec, etc.).
For options and troubleshooting, see TESTING.md.
Security Features
UID/GID Switching and Privilege Dropping
Implemented: February 2026
The runtime supports OCI user specification for credential switching, allowing workloads to run as non-root users. This integrates with Kubernetes securityContext:
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
  containers:
    - name: app
      securityContext:
        runAsUser: 1001
Implementation
File: src/bin/reaper-runtime/main.rs
Privilege dropping follows the standard Unix sequence in pre_exec hooks:
// 1. Set supplementary groups (requires CAP_SETGID)
if !user.additional_gids.is_empty() {
    let gids: Vec<gid_t> = user.additional_gids.iter().copied().collect();
    safe_setgroups(&gids)?;
}

// 2. Set GID (requires CAP_SETGID)
if setgid(user.gid) != 0 {
    return Err(std::io::Error::last_os_error());
}

// 3. Set UID (irreversible privilege drop)
if setuid(user.uid) != 0 {
    return Err(std::io::Error::last_os_error());
}

// 4. Apply umask (if specified)
if let Some(mask) = user.umask {
    umask(mask as mode_t);
}
Platform Compatibility: The setgroups() syscall signature differs across platforms. We provide a platform-specific wrapper:
- Linux: size_t (usize) for the length parameter
- macOS/BSD: c_int (i32) for the length parameter
Execution Paths
User switching is implemented in all four execution paths:
- PTY mode (interactive containers): do_start() with terminal=true
- Non-PTY mode (batch containers): do_start() with terminal=false
- Exec with PTY (kubectl exec -it): do_exec() with terminal=true
- Exec without PTY (kubectl exec): do_exec() with terminal=false
Integration Tests
Unit Tests (tests/integration_user_management.rs):
- test_run_with_current_user - Validates UID/GID from config
- test_privilege_drop_root_to_user - Tests root → non-root transition
- test_non_root_cannot_switch_user - Permission denial for non-root
- test_supplementary_groups_validation - additionalGids support
- test_umask_affects_file_permissions - umask application
Kubernetes Integration Tests (scripts/run-integration-tests.sh):
- test_uid_gid_switching - securityContext UID/GID (runAsUser: 1000)
- test_privilege_drop - Unprivileged execution (runAsUser: 1001)
All tests validate actual runtime credentials (not just config parsing) via id -u and id -g commands in the container.
Resources
Document Version: 2.1
Last Updated: February 2026
Status: Core Implementation Complete with Exec and PTY Support
Overlay Filesystem
Overlay Filesystem Design
Overview
Reaper uses a shared mount namespace with an overlayfs to protect the host filesystem while allowing cross-deployment file sharing. All workloads on a node share a single writable overlay layer; the host root is the read-only lower layer.
How It Works
Host Root (/) ─── read-only lower layer
│
┌───────┴────────┐
│ OverlayFS │
│ merged view │
└───────┬────────┘
│
/run/reaper/overlay/upper ─── shared writable layer
- Reads fall through to the host root (lower layer)
- Writes go to the upper layer (/run/reaper/overlay/upper)
- All Reaper workloads see the same upper layer
- The host filesystem is never modified
Architecture
Namespace Creation (First Workload)
The first workload to start creates the shared namespace:
reaper-runtime do_start()
└─ fork() (daemon child)
└─ setsid()
└─ enter_overlay()
└─ acquire_lock(/run/reaper/overlay.lock)
└─ create_namespace()
└─ fork() (inner child - helper)
│ ├─ unshare(CLONE_NEWNS)
│ ├─ mount("", "/", MS_PRIVATE | MS_REC)
│ ├─ mount overlay on /run/reaper/merged
│ ├─ bind-mount /proc, /sys, /dev, /run
│ ├─ bind-mount /etc → /run/reaper/merged/etc
│ ├─ pivot_root(/run/reaper/merged, .../old_root)
│ ├─ umount(/old_root, MNT_DETACH)
│ └─ signal parent "ready", sleep forever (kept alive)
│
└─ inner parent (host ns):
├─ wait for "ready"
├─ bind-mount /proc/<child>/ns/mnt → /run/reaper/shared-mnt-ns
├─ keep child alive (helper persists namespace)
└─ setns(shared-mnt-ns) # join the namespace
Namespace Joining (Subsequent Workloads)
reaper-runtime do_start()
└─ enter_overlay()
└─ acquire_lock()
└─ namespace_exists(/run/reaper/shared-mnt-ns) → true
└─ join_namespace()
└─ setns(fd, CLONE_NEWNS)
Why Inner Fork?
The bind-mount of /proc/<pid>/ns/mnt to a host path must be done from
the HOST mount namespace. After unshare(CLONE_NEWNS), the process is in
the new namespace and bind-mounts don’t propagate to the host. The inner
parent stays in the host namespace to perform this operation.
Why Keep Helper Alive?
The helper process (inner child) is kept alive to persist the namespace.
While the bind-mount of /proc/<pid>/ns/mnt keeps the namespace reference,
keeping the helper alive ensures /etc files and other bind-mounts remain
accessible. The helper sleeps indefinitely until explicitly terminated.
Why pivot_root?
Mounting overlay directly on / hides all existing submounts (/proc,
/sys, /dev). With pivot_root, we mount overlay on a new point,
bind-mount special filesystems into it, then switch root. This preserves
real host /proc, /sys, and /dev.
Configuration
| Variable | Default | Description |
|---|---|---|
| REAPER_OVERLAY_BASE | /run/reaper/overlay | Base dir for upper/work layers |
Overlay is always enabled on Linux. There is no option to disable it — workloads must not modify the host filesystem.
Bind-Mounted Directories
Only kernel-backed special filesystems and /run are bind-mounted from
the host into the overlay:
- /proc — process information (kernel-backed)
- /sys — kernel/device information (kernel-backed)
- /dev — device nodes (kernel-backed)
- /run — runtime state (needed for daemon↔shim communication via state files)
/tmp is NOT bind-mounted — writes to /tmp go through the overlay
upper layer, protecting the host’s /tmp from modification.
Directory Structure
/run/reaper/
├── overlay/
│ ├── upper/ # shared writable layer
│ └── work/ # overlayfs internal
├── merged/ # pivot_root target (temporary during setup)
├── shared-mnt-ns # bind-mounted namespace reference
├── overlay.lock # file lock for namespace creation
└── <container-id>/ # per-container state (existing)
Lifecycle
- Boot: /run is tmpfs, starts empty (ephemeral by design)
- First workload: Creates overlay dirs, namespace, and overlay mount
- Subsequent workloads: Join existing namespace via setns()
- Reboot: Everything under /run is cleared; fresh start
Mandatory Isolation
Overlay is mandatory on Linux. If overlay setup fails (e.g., not running
as root, kernel lacks overlay support), the workload is refused — it
will not run on the host filesystem. The daemon exits with code 1 and
updates the container state to stopped.
Requirements
- Linux kernel with overlayfs support (standard since 3.18)
- CAP_SYS_ADMIN (required for unshare, setns, mount, pivot_root)
- Reaper runtime runs as root on the node (standard for container runtimes)
- Not available on macOS (code gated with #[cfg(target_os = "linux")])
Sensitive File Filtering
Reaper automatically filters sensitive host files to prevent workloads from accessing credentials, SSH keys, and other sensitive data. Filtering is implemented by bind-mounting empty placeholders over sensitive paths after pivot_root.
Default Filtered Paths
- /root/.ssh - root user SSH keys
- /etc/shadow, /etc/gshadow - password hashes
- /etc/ssh/ssh_host_*_key - SSH host private keys
- /etc/ssl/private - SSL/TLS private keys
- /etc/sudoers, /etc/sudoers.d - sudo configuration
- /var/lib/docker - Docker internal state
- /run/secrets - container secrets
Configuration
| Variable | Default | Description |
|---|---|---|
| REAPER_FILTER_ENABLED | true | Enable/disable filtering |
| REAPER_FILTER_PATHS | "" | Colon-separated custom paths |
| REAPER_FILTER_MODE | append | append or replace |
| REAPER_FILTER_ALLOWLIST | "" | Paths to exclude from filtering |
| REAPER_FILTER_DIR | /run/reaper/overlay-filters | Placeholder directory |
Example: Add custom paths while keeping defaults:
REAPER_FILTER_PATHS="/custom/secret:/home/user/.aws/credentials"
Example: Replace default list entirely:
REAPER_FILTER_MODE=replace
REAPER_FILTER_PATHS="/etc/shadow:/etc/gshadow"
Example: Disable a specific default filter:
REAPER_FILTER_ALLOWLIST="/etc/shadow"
Security Guarantees
- Filters are immutable (workloads cannot unmount them)
- Applied once during namespace creation
- Inherited by all workloads joining the namespace
- Non-existent paths are silently skipped
- Individual filter failures are logged but non-fatal
How It Works
After pivot_root completes in the shared namespace:
- Read filter configuration from environment variables
- Build filter list (defaults + custom, minus allowlist)
- Create empty placeholder files/directories in /run/reaper/overlay-filters/
- For each sensitive path:
- If path exists, create matching placeholder (file or directory)
- Bind-mount placeholder over the sensitive path
- Log success/failure
This makes sensitive files appear empty or missing to workloads, while the actual host files remain untouched.
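The filter-list computation (defaults plus custom in append mode, custom only in replace mode, minus the allowlist) can be sketched like this; build_filter_list is a hypothetical helper and the default list is abridged from the table above:

```python
# Abridged copy of the default filtered paths from this section.
DEFAULT_FILTERS = [
    "/root/.ssh", "/etc/shadow", "/etc/gshadow", "/etc/ssl/private",
    "/etc/sudoers", "/etc/sudoers.d", "/var/lib/docker", "/run/secrets",
]

def build_filter_list(mode="append", custom="", allowlist=""):
    """Compute the effective filter set, mirroring REAPER_FILTER_MODE,
    REAPER_FILTER_PATHS, and REAPER_FILTER_ALLOWLIST (colon-separated)."""
    custom_paths = [p for p in custom.split(":") if p]
    paths = custom_paths if mode == "replace" else DEFAULT_FILTERS + custom_paths
    excluded = {p for p in allowlist.split(":") if p}
    return [p for p in paths if p not in excluded]
```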
Namespace Isolation
By default (REAPER_OVERLAY_ISOLATION=namespace), each Kubernetes namespace
gets its own isolated overlay. This means workloads in production cannot
see writes from workloads in dev, matching Kubernetes’ namespace-as-trust-boundary
expectation.
How It Works
- The containerd shim reads io.kubernetes.pod.namespace from OCI annotations
- It passes --namespace <ns> to reaper-runtime create
- The runtime stores the namespace in ContainerState (state.json)
- On start and exec, the runtime reads the namespace from state and computes per-namespace paths for overlay dirs, mount namespace, and lock
Per-Namespace Path Layout
/run/reaper/
overlay/
default/upper/ # K8s "default" namespace
default/work/
kube-system/upper/ # K8s "kube-system" namespace
kube-system/work/
merged/
default/ # pivot_root target per namespace
kube-system/
ns/
default # persisted mount namespace bind-mount
kube-system
overlay-default.lock # per-namespace flock
overlay-kube-system.lock
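The layout above can be expressed as a small path-derivation function (Python sketch; overlay_paths is a hypothetical helper, and Reaper's actual code is Rust):

```python
import os

def overlay_paths(k8s_namespace, base="/run/reaper"):
    """Compute the per-K8s-namespace overlay paths shown in the layout."""
    return {
        "upper":  os.path.join(base, "overlay", k8s_namespace, "upper"),
        "work":   os.path.join(base, "overlay", k8s_namespace, "work"),
        "merged": os.path.join(base, "merged", k8s_namespace),
        "mnt_ns": os.path.join(base, "ns", k8s_namespace),
        "lock":   os.path.join(base, "overlay-" + k8s_namespace + ".lock"),
    }
```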
Legacy Node-Wide Mode
Set REAPER_OVERLAY_ISOLATION=node to use the old flat layout where all
workloads share a single overlay regardless of their K8s namespace. This
is useful for cross-deployment file sharing or backward compatibility.
Upgrade Path
Existing containers created before the upgrade have namespace: None in their
state files. With the default namespace isolation mode, their start will fail.
Drain nodes before upgrading to ensure no in-flight containers are affected.
Limitations
- /run is typically a small tmpfs; for write-heavy workloads, configure REAPER_OVERLAY_BASE to point to a larger filesystem
- Within a single K8s namespace, workloads still share the same overlay (no per-pod isolation)
- Overlay does not protect against processes that directly modify kernel state via /proc or /sys writes
- Sensitive file filtering does not support glob patterns (use explicit paths)
Node Configuration
Configuration
Reaper is configured through a combination of node-level configuration files, environment variables, and per-pod Kubernetes annotations.
Node Configuration
Reaper reads configuration from /etc/reaper/reaper.conf on each node. The Helm chart creates this file automatically via the node DaemonSet init container.
Config File Format
# /etc/reaper/reaper.conf (KEY=VALUE, one per line)
REAPER_DNS_MODE=kubernetes
REAPER_RUNTIME_LOG=/run/reaper/runtime.log
REAPER_OVERLAY_BASE=/run/reaper/overlay
REAPER_OVERLAY_ISOLATION=namespace
Load Order
1. Config file defaults (/etc/reaper/reaper.conf)
2. Environment variables override file values
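That load order can be sketched as a small loader (Python sketch; load_config is a hypothetical helper, and which keys the real runtime reads from the environment is an assumption here):

```python
import os

def load_config(conf_path, environ=None):
    """KEY=VALUE loader: file values first, then REAPER_* environment
    variables override them."""
    environ = os.environ if environ is None else environ
    config = {}
    try:
        with open(conf_path) as f:
            for raw in f:
                line = raw.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue               # skip blanks and comments
                key, _, value = line.partition("=")
                config[key.strip()] = value.strip()
    except FileNotFoundError:
        pass                               # file is optional; env still applies
    for key, value in environ.items():
        if key.startswith("REAPER_"):
            config[key] = value            # env always wins over the file
    return config
```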
Settings Reference
| Variable | Default | Description |
|---|---|---|
| REAPER_CONFIG | /etc/reaper/reaper.conf | Override config file path |
| REAPER_DNS_MODE | host | DNS resolution: host (node's resolv.conf) or kubernetes/k8s (CoreDNS) |
| REAPER_OVERLAY_ISOLATION | namespace | Overlay isolation: namespace (per-K8s-namespace) or node (shared) |
| REAPER_OVERLAY_BASE | /run/reaper/overlay | Base directory for overlay upper/work layers |
| REAPER_RUNTIME_LOG | (none) | Runtime log file path |
| REAPER_SHIM_LOG | (none) | Shim log file path |
| REAPER_ANNOTATIONS_ENABLED | true | Master switch for pod annotation overrides |
| REAPER_FILTER_ENABLED | true | Enable sensitive file filtering in overlay |
| REAPER_FILTER_PATHS | (none) | Additional colon-separated paths to filter |
| REAPER_FILTER_MODE | append | Filter mode: append (add to defaults) or replace |
| REAPER_FILTER_ALLOWLIST | (none) | Paths to exclude from filtering |
Pod Annotations
Users can override certain Reaper configuration parameters per-pod using Kubernetes annotations with the reaper.runtime/ prefix.
Supported Annotations
| Annotation | Values | Default | Description |
|---|---|---|---|
| reaper.runtime/dns-mode | host, kubernetes, k8s | Node config (REAPER_DNS_MODE) | DNS resolution mode for this pod |
| reaper.runtime/overlay-name | DNS label (e.g., pippo) | (none — uses namespace overlay) | Named overlay group within the namespace |
Example
apiVersion: v1
kind: Pod
metadata:
  name: my-task
  annotations:
    reaper.runtime/dns-mode: "kubernetes"
    reaper.runtime/overlay-name: "my-group"
spec:
  runtimeClassName: reaper-v2
  restartPolicy: Never
  containers:
    - name: task
      image: busybox
      command: ["/bin/sh", "-c", "nslookup kubernetes.default"]
Security Model
- Only annotations in the allowlist above are honored. Unknown annotation keys are silently ignored.
- Administrator-controlled parameters (overlay paths, filter settings, isolation mode) cannot be overridden via annotations.
- Administrators can disable all annotation processing: REAPER_ANNOTATIONS_ENABLED=false
How It Works
1. The shim extracts reaper.runtime/* annotations from the OCI config (populated by kubelet from pod metadata).
2. Annotations are stored in the container state during create.
3. During start, annotations are validated against the allowlist and applied. Invalid values are logged and ignored.
4. If no annotation is set, the node-level configuration is used as the default.
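The validation step can be sketched as follows (Python sketch; the data shapes and apply_annotations helper are assumptions, not Reaper's actual API):

```python
# Allowlist from the table above; None means any value is accepted.
ALLOWED_ANNOTATIONS = {
    "reaper.runtime/dns-mode": {"host", "kubernetes", "k8s"},
    "reaper.runtime/overlay-name": None,
}

def apply_annotations(node_config, annotations, enabled=True):
    """Apply the override rules: unknown keys are ignored, invalid values
    fall back to node-level config, and the master switch disables all."""
    effective = dict(node_config)
    if not enabled:                        # REAPER_ANNOTATIONS_ENABLED=false
        return effective
    for key, value in annotations.items():
        if key not in ALLOWED_ANNOTATIONS:
            continue                       # unknown keys silently ignored
        valid = ALLOWED_ANNOTATIONS[key]
        if valid is not None and value not in valid:
            continue                       # invalid values logged and ignored
        effective[key] = value
    return effective
```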
Helm Chart Values
The Helm chart (deploy/helm/reaper/) configures most settings automatically. Key values:
# Node configuration written to /etc/reaper/reaper.conf
config:
  dnsMode: kubernetes
  runtimeLog: /run/reaper/runtime.log

# Image settings (tag defaults to Chart.AppVersion)
node:
  image:
    repository: ghcr.io/miguelgila/reaper-node
    tag: ""
controller:
  image:
    repository: ghcr.io/miguelgila/reaper-controller
    tag: ""
agent:
  enabled: true
  image:
    repository: ghcr.io/miguelgila/reaper-agent
    tag: ""
See deploy/helm/reaper/values.yaml for the full reference.
Pod Field Compatibility
Reaper implements the Kubernetes Pod API but ignores or doesn’t support certain container-specific fields since it runs processes directly on the host without traditional container isolation.
Field Reference
| Pod Field | Behavior |
|---|---|
| `spec.containers[].image` | Ignored — kubelet pulls the image before handing off to the runtime, so a valid image is still required (use a lightweight one like `busybox`); Reaper itself does not use it. |
| `spec.containers[].resources.limits` | Ignored — no cgroup enforcement; processes use host resources. |
| `spec.containers[].resources.requests` | Ignored — scheduling hints not used. |
| `spec.containers[].volumeMounts` | Supported — bind mounts for ConfigMap, Secret, hostPath, emptyDir. |
| `spec.containers[].securityContext.capabilities` | Ignored — processes run with host-level capabilities. |
| `spec.containers[].livenessProbe` | Ignored — no health checking. |
| `spec.containers[].readinessProbe` | Ignored — no readiness checks. |
| `spec.containers[].command` | Supported — program path on the host (must exist). |
| `spec.containers[].args` | Supported — arguments to the command. |
| `spec.containers[].env` | Supported — environment variables. |
| `spec.containers[].workingDir` | Supported — working directory for the process. |
| `spec.runtimeClassName` | Required — must be set to `reaper-v2`. |
Best Practice
Use a small, valid image like busybox. Kubelet pulls the image before handing off to the runtime, so the image must exist in a registry. Reaper itself ignores the image entirely — it runs the command directly on the host.
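A minimal sketch of this pattern, assuming the host provides `/usr/bin/uptime` (the pod name and command here are illustrative; the image is only a scheduling placeholder):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: host-uptime            # hypothetical example name
spec:
  runtimeClassName: reaper-v2
  restartPolicy: Never
  containers:
  - name: task
    image: busybox             # placeholder only; kubelet pulls it, Reaper ignores it
    command: ["/usr/bin/uptime"]   # must exist on the host node
```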
Supported Features Summary
| Feature | Status |
|---|---|
| `command` / `args` | Supported |
| `env` / `envFrom` | Supported |
| `volumeMounts` (ConfigMap, Secret, hostPath, emptyDir) | Supported |
| `workingDir` | Supported |
| `securityContext.runAsUser` / `runAsGroup` | Supported |
| `restartPolicy` | Supported (by kubelet) |
| `runtimeClassName` | Required (`reaper-v2`) |
| Resource limits/requests | Ignored |
| Probes (liveness, readiness, startup) | Ignored |
| Capabilities | Ignored |
| Image pulling | Handled by kubelet, ignored by Reaper |
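For instance, the supported UID/GID switching can be requested with a standard `securityContext`. This is a sketch, not a shipped example: the pod name is hypothetical, and UID/GID 1500 assumes a user with those IDs already exists on the node (as in the `demo-svc` example elsewhere in these docs):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: run-as-demo            # hypothetical example name
spec:
  runtimeClassName: reaper-v2
  restartPolicy: Never
  containers:
  - name: task
    image: busybox             # placeholder; ignored by Reaper
    command: ["/usr/bin/id"]   # prints uid/gid to the pod log
    securityContext:
      runAsUser: 1500          # assumed UID; must exist on the node
      runAsGroup: 1500
```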
Development Guide
This document contains information for developers working on the Reaper project.
Table of Contents
- Development Setup
- Building
- Testing
- Code Quality
- Git Hooks
- Docker (Optional)
- VS Code Setup
- CI/CD
- Coverage
- Contributing
Development Setup
Prerequisites
- Rust toolchain (we pin `stable` via `rust-toolchain.toml`)
- Docker (optional, for Linux-specific testing on macOS)
- Ansible (for deploying to clusters)
Clone and Build
```shell
git clone https://github.com/miguelgila/reaper
cd reaper
cargo build
```
The repository includes rust-toolchain.toml which automatically pins the Rust toolchain version and enables rustfmt and clippy components.
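Such a file typically looks like the following (a sketch consistent with the description above; the repository's actual contents may differ):

```toml
[toolchain]
channel = "stable"
components = ["rustfmt", "clippy"]
```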
Building
Local Build (Debug)
cargo build
Binaries are output to target/debug/.
Release Build
cargo build --release
Binaries are output to target/release/.
Static Musl Build (for Kubernetes deployment)
For deployment to Kubernetes clusters, we build static musl binaries:
```shell
# Install musl target (one-time setup)
rustup target add x86_64-unknown-linux-musl

# Build static binary
docker run --rm \
  -v "$(pwd)":/work \
  -w /work \
  messense/rust-musl-cross:x86_64-musl \
  cargo build --release --target x86_64-unknown-linux-musl
```
This produces binaries at target/x86_64-unknown-linux-musl/release/ that work in containerized environments (like Kind nodes).
For aarch64:
```shell
rustup target add aarch64-unknown-linux-musl
docker run --rm \
  -v "$(pwd)":/work \
  -w /work \
  messense/rust-musl-cross:aarch64-musl \
  cargo build --release --target aarch64-unknown-linux-musl
```
Testing
See TESTING.md for comprehensive testing documentation.
Quick Reference
```shell
# Unit tests (fast, recommended for local development)
cargo test

# Full integration tests (Kubernetes + unit tests)
./scripts/run-integration-tests.sh

# Integration tests (K8s only, skip cargo tests)
./scripts/run-integration-tests.sh --skip-cargo

# Coverage report (requires Docker)
./scripts/docker-coverage.sh
```
Test Modules
- `tests/integration_basic_binary.rs` - Basic runtime functionality (create/start/state/delete)
- `tests/integration_user_management.rs` - User/group ID handling, umask
- `tests/integration_shim.rs` - Shim-specific tests
- `tests/integration_io.rs` - FIFO stdout/stderr redirection
- `tests/integration_exec.rs` - Exec into running containers
- `tests/integration_overlay.rs` - Overlay filesystem tests
Run a specific test suite:
cargo test --test integration_basic_binary
Code Quality
Formatting
Format all code before committing:
cargo fmt --all
Check formatting without making changes:
cargo fmt --all -- --check
Linting
Run clippy to catch common mistakes and improve code quality:
```shell
# Quick check
cargo clippy --all-targets --all-features

# Match CI exactly (treats warnings as errors)
cargo clippy -- -D warnings
```
CI runs clippy with -D warnings, so any warning is a hard failure. The pre-push hook runs this automatically if you’ve installed hooks via ./scripts/install-hooks.sh.
Linux Cross-Check (macOS only)
The overlay module (src/bin/reaper-runtime/overlay.rs) is gated by #[cfg(target_os = "linux")] and doesn’t compile on macOS. To catch compilation errors in Linux-only code:
```shell
# One-time setup
rustup target add x86_64-unknown-linux-gnu

# Check compilation for Linux target
cargo clippy --target x86_64-unknown-linux-gnu --all-targets --all-features
```
Git Hooks
We provide git hooks in .githooks/ to catch issues before they reach CI.
Enable Hooks
./scripts/install-hooks.sh
This sets core.hooksPath to .githooks/ and marks the hooks executable. Since the hooks are checked into the repo, every contributor gets the same setup.
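In essence the script does something like the following. This is a simplified sketch based on the description above, not the script's literal contents:

```shell
# Point git at the in-repo hooks directory and mark the hooks executable.
git config core.hooksPath .githooks
chmod +x .githooks/*
```

Because `core.hooksPath` lives in the repository's local git config, each contributor runs the installer once per clone.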
Available Hooks
| Hook | Runs | Purpose |
|---|---|---|
pre-commit | cargo fmt --all | Auto-formats code and stages changes before each commit |
pre-push | cargo clippy -- -D warnings | Catches lint issues before pushing (matches CI) |
The pre-push hook mirrors the exact clippy invocation used in CI, so pushes that pass locally will pass the CI clippy check too.
Customization
- pre-commit: To fail on unformatted code instead of auto-fixing, change `cargo fmt --all` to `cargo fmt --all -- --check` and remove the re-staging logic.
- pre-push: To skip clippy for a one-off push, use `git push --no-verify`.
Docker (Optional)
Docker is not required for local development on macOS. Prefer cargo test locally for speed.
Use Docker when you need:
- Code coverage via `cargo-tarpaulin` (Linux-first tool)
- CI failure reproduction specific to Linux
- Static musl binary builds for Kubernetes
Run Coverage in Docker
./scripts/docker-coverage.sh
This runs cargo-tarpaulin in a Linux container with appropriate capabilities.
VS Code Setup
Recommended Extensions
- rust-analyzer — Main Rust language support
- CodeLLDB (vadimcn.vscode-lldb) — Debug adapter for Rust
- Test Explorer UI — Unified test UI
Configure rust-analyzer to run clippy on save and enable CodeLens for inline run/debug buttons.
CI/CD
GitHub Actions workflows run on pushes and pull requests to main:
CI Workflow (ci.yml)
A single unified pipeline that runs:
- `cargo fmt -- --check` (formatting)
- `cargo clippy --workspace --all-targets -- -D warnings` (linting)
- `cargo audit` (dependency vulnerability scan)
- `cargo test --verbose` (unit tests)
- `cargo tarpaulin` → Codecov upload (coverage)
- Cross-compile static musl binaries (all binaries)
- Kind integration tests (`run-integration-tests.sh --skip-cargo`)
- Example validation (`test-examples.sh --skip-cluster`)
Coverage
Local Coverage (Linux)
If running on Linux, you can use tarpaulin directly:
cargo install cargo-tarpaulin
cargo tarpaulin --out Xml --timeout 600
Coverage via Docker (macOS/Windows)
Run the included Docker script:
./scripts/docker-coverage.sh
Configuration lives in tarpaulin.toml. Functions requiring root + Linux namespaces (tested by kind-integration) are excluded via #[cfg(not(tarpaulin_include))] so coverage reflects what unit tests can actually reach.
Contributing
Before Opening a PR
1. Format code: `cargo fmt --all`
2. Run linting: `cargo clippy --all-targets --all-features`
3. Run tests: `cargo test`
4. Optional: run integration tests: `./scripts/run-integration-tests.sh`
5. Install git hooks (auto-formats on commit, runs clippy before push): `./scripts/install-hooks.sh`
Development Workflow
For fast feedback during development:
```shell
# Quick iteration cycle
cargo test    # Unit tests (seconds)
cargo clippy  # Linting

# Before pushing
cargo fmt --all                      # Format code
cargo test                           # All unit tests
./scripts/run-integration-tests.sh   # Full validation
```
Integration Test Iteration
If you’re iterating on overlay or shim logic:
```shell
# First run (build cluster, binaries, tests)
./scripts/run-integration-tests.sh --no-cleanup

# Make code changes...

# Rebuild and test (skip cargo, reuse cluster)
cargo build --release --bin containerd-shim-reaper-v2 --bin reaper-runtime
./scripts/run-integration-tests.sh --skip-cargo --no-cleanup

# Repeat until satisfied...

# Final cleanup run
./scripts/run-integration-tests.sh --skip-cargo
```
Project Structure
```
reaper/
├── src/
│   ├── bin/
│   │   ├── containerd-shim-reaper-v2/  # Shim binary
│   │   │   └── main.rs                 # Shim implementation
│   │   └── reaper-runtime/             # Runtime binary
│   │       ├── main.rs                 # OCI runtime CLI
│   │       ├── state.rs                # State persistence
│   │       └── overlay.rs              # Overlay filesystem (Linux)
├── tests/                              # Integration tests
├── scripts/                            # Installation and testing scripts
├── deploy/
│   ├── ansible/                        # Ansible playbooks for deployment
│   └── kubernetes/                     # Kubernetes manifests
├── docs/                               # Documentation
└── .githooks/                          # Git hooks (pre-commit, pre-push)
```
Common Tasks
Add a New Binary
1. Create a directory under `src/bin/<binary-name>/`
2. Add `main.rs` in that directory
3. Add an entry to `Cargo.toml`:

```toml
[[bin]]
name = "binary-name"
path = "src/bin/binary-name/main.rs"
```
Add a New Test Suite
1. Create `tests/integration_<name>.rs`
2. Use `#[test]`, or `#[tokio::test]` for async tests
3. Run with `cargo test --test integration_<name>`
Update Dependencies
```shell
# Check for outdated dependencies
cargo outdated

# Update to latest compatible versions
cargo update

# Update Cargo.lock and check tests still pass
cargo test
```
Debug a Test
Use VS Code’s debug launch configurations or run with logging:
RUST_LOG=debug cargo test <test-name> -- --nocapture
Troubleshooting
Clippy Errors on macOS for Linux-only Code
Run clippy with Linux target:
cargo clippy --target x86_64-unknown-linux-gnu --all-targets
Tests Fail with “Permission Denied”
Some tests require root for namespace operations. Run:
sudo cargo test
Or use integration tests which run in Kind (isolated environment):
./scripts/run-integration-tests.sh
Docker Build Fails
Ensure Docker is running:
docker ps
If Docker daemon is not accessible, start Docker Desktop or the Docker daemon.
Integration Tests Timeout
Increase timeout or check cluster resources:
kubectl get nodes
kubectl describe pod <pod-name>
Additional Resources
Testing & Integration
Testing & Integration
This document consolidates all information about running tests, integration tests, and development workflows for the Reaper project.
Quick Reference
All common tasks are available via make. Run make help for the full list.
| Task | Command |
|---|---|
| Full CI check (recommended before push) | make ci |
| Unit tests | make test |
| Clippy (macOS) | make clippy |
| Clippy (Linux cross-check) | make check-linux |
| Coverage (Docker, CI-parity) | make coverage |
| Integration tests (full suite) | make integration |
| Integration tests (K8s only, skip cargo) | make integration-quick |
Unit Tests
Run Rust tests natively on your machine:
cargo test
Tests run in a few seconds and provide immediate feedback. Use this for development iteration.
Test Modules
- `integration_basic_binary` - Basic runtime functionality
- `integration_user_management` - User/group handling (UID/GID switching, privilege dropping, umask)
- `integration_shim` - Shim-specific tests
- `integration_io` - FIFO stdout/stderr redirection
- `integration_exec` - Exec into running containers
- `integration_overlay` - Overlay filesystem tests
Run a specific test:
cargo test --test integration_basic_binary
Integration Tests (Kubernetes)
The main integration test suite runs against a kind (Kubernetes in Docker) cluster. It validates:
- ✓ DNS resolution in container
- ✓ Basic command execution (echo)
- ✓ Overlay filesystem sharing across pods
- ✓ Host filesystem protection (no leakage to host)
- ✓ UID/GID switching with securityContext
- ✓ Privilege drop to non-root user
- ✓ Shim cleanup after pod deletion
- ✓ No defunct (zombie) processes
- ✓ `kubectl exec` support
Full Suite (Recommended)
Runs cargo tests, builds binaries, creates a kind cluster, and runs all integration tests:
./scripts/run-integration-tests.sh
Options:
- `--skip-cargo` — Skip Rust unit tests (useful for rapid K8s-only reruns)
- `--no-cleanup` — Keep the kind cluster running after tests (for debugging)
- `--verbose` — Also print debug output to stdout (in addition to the log file)
- `--agent-only` — Only run agent tests (skip cargo, integration, and controller tests)
- `--crd-only` — Only run CRD controller tests (skip cargo, integration, and agent tests)
- `--help` — Show usage
Examples
Rerun K8s tests against an existing cluster:
./scripts/run-integration-tests.sh --skip-cargo --no-cleanup
Run only CRD controller tests (fast iteration on ReaperPod CRD):
./scripts/run-integration-tests.sh --crd-only --no-cleanup
Run only agent tests:
./scripts/run-integration-tests.sh --agent-only --no-cleanup
Keep cluster for interactive debugging:
./scripts/run-integration-tests.sh --no-cleanup
Then interact with the cluster:
kubectl get pods
kubectl logs <pod-name>
kubectl describe pod <pod-name>
Test Output & Logs
- Console output: test results with pass/fail badges
- Log file: `/tmp/reaper-integration-logs/integration-test.log` (detailed diagnostics)
- GitHub Actions: results posted to the job summary when run in CI
How It Works
The test harness orchestrates:
- Phase 1: Rust cargo tests (`integration_*` tests)
- Phase 2: Kubernetes infrastructure setup
  - Create or reuse the kind cluster
  - Build static musl binaries (matching the node architecture)
  - Deploy shim and runtime binaries to the cluster node
  - Configure containerd with the Reaper runtime
- Phase 3: Kubernetes readiness
  - Wait for the API server and nodes
  - Create the RuntimeClass
  - Wait for the default ServiceAccount
- Phase 4: Integration tests
  - DNS, echo, overlay, host protection, UID/GID switching, privilege drop, exec, zombie check
- Phase 4b: Controller tests (ReaperPod CRD)
  - CRD installation, controller deployment, ReaperPod lifecycle, status mirroring, exit code propagation, annotations, custom printer columns, garbage collection
- Phase 5: Summary & reporting
Coverage
Generate code coverage report using Docker:
./scripts/docker-coverage.sh
This runs cargo-tarpaulin (Linux-first tool) in a container with appropriate capabilities.
Containerd Configuration
Configure a containerd instance to use the Reaper shim runtime:
./scripts/configure-containerd.sh <context> <node-id>
- `<context>`: `kind` or `minikube` (determines config locations)
- `<node-id>`: Docker container ID (e.g., from `docker ps`)
This script is automatically run by run-integration-tests.sh.
Development Workflow
Before Pushing (CI-parity on macOS)
Run the full CI-equivalent check locally:
make ci
This runs, in order: fmt check, clippy (macOS), clippy (Linux cross-check), cargo test, and coverage (Docker + tarpaulin). If this passes, CI will pass.
Quick Iteration
For fast feedback during development:
```shell
make test        # Unit tests only (seconds)
make clippy      # macOS clippy
make check-linux # Catches #[cfg(linux)] compilation issues
```
Linux Cross-Check
The overlay module (overlay.rs) is gated by #[cfg(target_os = "linux")] and doesn’t compile on macOS. make check-linux cross-checks clippy against the x86_64-unknown-linux-gnu target to catch compilation errors in Linux-only code without leaving macOS.
Requires the target (one-time setup):
rustup target add x86_64-unknown-linux-gnu
Coverage (CI-parity)
Coverage runs tarpaulin inside Docker to match CI exactly:
make coverage
Configuration lives in tarpaulin.toml. Functions requiring root + Linux namespaces (tested by kind-integration) are excluded via #[cfg(not(tarpaulin_include))] so coverage reflects what unit tests can actually reach.
Integration Test Iteration
If you’re iterating on overlay or shim logic:
```shell
# First run (build cluster, binaries, tests)
./scripts/run-integration-tests.sh --no-cleanup

# Make code changes...

# Rebuild and test (skip cargo, reuse cluster)
cargo build --release --bin containerd-shim-reaper-v2 --bin reaper-runtime --target x86_64-unknown-linux-musl
./scripts/run-integration-tests.sh --skip-cargo --no-cleanup

# Repeat until satisfied...

# Final cleanup run
./scripts/run-integration-tests.sh --skip-cargo
```
Troubleshooting
No kind cluster available
The test harness automatically creates one. If it fails, check:
- Docker is running: `docker ps`
- kind is installed: `kind --version`
- Sufficient disk space: `df -h`
Pod stuck in Pending
Check containerd logs on the node:
docker exec <node-id> journalctl -u containerd -n 50 --no-pager
Check Kubelet logs:
docker exec <node-id> journalctl -u kubelet -n 50 --no-pager
Test times out
Increase timeout in test function or check node resources:
docker exec <node-id> top -b -n 1
docker exec <node-id> df -h
RuntimeClass not found
Wait a few seconds after applying the RuntimeClass, as it takes time to propagate.
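Rather than a fixed sleep, a small poll loop handles this more robustly. The `retry` helper and the 30-attempt budget below are our own sketch, not part of Reaper's scripts:

```shell
# retry N cmd...: run cmd up to N times, one second apart, until it succeeds.
retry() {
  attempts=$1; shift
  i=1
  while ! "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep 1
  done
}

# Example: wait for the RuntimeClass to become visible (requires a cluster):
# retry 30 kubectl get runtimeclass reaper-v2
```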
Directory Structure
```
reaper/
├── scripts/
│   ├── run-integration-tests.sh    [MAIN] Orchestrates all integration tests
│   ├── install-reaper.sh           Ansible-based installation (DEPRECATED)
│   ├── build-node-image.sh         Build reaper-node installer image for Kind
│   ├── build-controller-image.sh   Build reaper-controller image for Kind
│   ├── install-node.sh             Init container script for node DaemonSet
│   ├── generate-kind-inventory.sh  Auto-generate Kind inventory for Ansible (DEPRECATED)
│   ├── configure-containerd.sh     Helper to configure containerd
│   ├── install-hooks.sh            Setup git hooks (optional)
│   └── docker-coverage.sh          Run coverage in Docker
├── tests/
│   ├── integration_basic_binary.rs
│   ├── integration_user_management.rs
│   ├── integration_shim.rs
│   ├── integration_io.rs
│   ├── integration_exec.rs
│   └── integration_overlay.rs
├── deploy/
│   └── kubernetes/                 [K8s cluster config examples]
├── examples/                       [Runnable Kind-based demos]
└── docs/
    └── TESTING.md                  [This file]
```
CI Integration
The CI pipeline (.github/workflows/ci.yml) runs automatically on:
- Push to `main` or `fix/**` branches
- Pull requests targeting `main`
Changes to documentation (*.md, docs/**), LICENSE*, and .gitignore are excluded from triggering runs.
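In GitHub Actions terms, the trigger filters described above correspond roughly to the following sketch (the real `ci.yml` may be organized differently):

```yaml
on:
  push:
    branches: [main, "fix/**"]
    paths-ignore: ["**.md", "docs/**", "LICENSE*", ".gitignore"]
  pull_request:
    branches: [main]
    paths-ignore: ["**.md", "docs/**", "LICENSE*", ".gitignore"]
```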
Jobs
The pipeline runs these jobs:
| Job | Description |
|---|---|
| Format | cargo fmt -- --check |
| Clippy | cargo clippy --workspace --all-targets -- -D warnings |
| Security Audit | cargo audit |
| Tests | cargo test --verbose |
| Coverage | cargo tarpaulin → Codecov upload |
| Build and Cache | Cross-compile static musl binaries (all 4 binaries) |
| kind-integration | Full integration test suite (run-integration-tests.sh --skip-cargo) |
| Example Validation | test-examples.sh --skip-cluster |
Results
Results are posted to the GitHub Actions job summary. If any test fails, the workflow reports the failure with diagnostics.
Archived / Deprecated Scripts
The following scripts have been consolidated into run-integration-tests.sh and are no longer maintained:
- `kind-integration.sh` — Replaced by `run-integration-tests.sh` (more features, better test reporting)
- `minikube-setup-runtime.sh` — Minikube support deprecated
- `minikube-test.sh` — Minikube support deprecated
- `test-k8s-integration.sh` — Replaced by `run-integration-tests.sh`
- `docker-test.sh` — Optional helper; use `cargo test` for speed or `docker-coverage.sh` for coverage
Next Steps
- Read the Architecture documentation for deeper understanding
- Check Overlay Design for filesystem isolation details
- See SHIMV2 Design for runtime internals
Contributing to Reaper
Thanks for your interest in contributing! Here are some guidelines:
Code Style
- Run `cargo fmt` before committing
- Run `cargo clippy` to check for common mistakes
- Write tests for new functionality
Pull Request Process
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes and commit them
4. Push to your fork
5. Open a Pull Request
Testing
Please ensure all tests pass:
cargo test
And check code quality:
cargo fmt
cargo clippy
License
By contributing, you agree that your contributions will be licensed under the MIT License.
Examples
Runnable examples demonstrating Reaper’s capabilities. Each example includes a setup.sh script that creates a Kind cluster with Reaper pre-installed.
Prerequisites
All examples require:
Note: Examples 01–08 use the legacy Ansible-based installer. Newer examples (09+) use the Helm-based `setup-playground.sh` pattern.
Run all scripts from the repository root.
Examples
01-scheduling/ — Node Scheduling Patterns
Demonstrates running workloads on all nodes vs. a labeled subset using DaemonSets with nodeSelector.
- 3-node cluster (1 control-plane + 2 workers)
- All-node DaemonSet (load/memory monitor on every node)
- Subset DaemonSet (login-node monitor only on `node-role=login` nodes)
./examples/01-scheduling/setup.sh
kubectl apply -f examples/01-scheduling/all-nodes-daemonset.yaml
kubectl apply -f examples/01-scheduling/subset-nodes-daemonset.yaml
02-client-server/ — TCP Client-Server Communication
Demonstrates cross-node networking with a socat TCP server on one node and clients connecting from other nodes over host networking.
- 4-node cluster (1 control-plane + 3 workers)
- Server on the `role=server` node, clients on `role=client` nodes
- Clients discover the server IP via a ConfigMap
./examples/02-client-server/setup.sh
kubectl apply -f examples/02-client-server/server-daemonset.yaml
kubectl apply -f examples/02-client-server/client-daemonset.yaml
kubectl logs -l app=demo-client --all-containers --prefix -f
03-client-server-runas/ — Client-Server with Non-Root User
Same as client-server, but all workloads run as a shared non-root user (demo-svc, UID 1500 / GID 1500), demonstrating Reaper’s securityContext.runAsUser / runAsGroup support. The setup script creates the user on every node with identical IDs, mimicking an LDAP environment.
- 4-node cluster (1 control-plane + 3 workers)
- Shared `demo-svc` user created on all nodes (UID 1500, GID 1500)
- All log output includes `uid=` to prove the privilege drop
./examples/03-client-server-runas/setup.sh
kubectl apply -f examples/03-client-server-runas/server-daemonset.yaml
kubectl apply -f examples/03-client-server-runas/client-daemonset.yaml
kubectl logs -l app=demo-client-runas --all-containers --prefix -f
04-volumes/ — Kubernetes Volume Mounts
Demonstrates Reaper’s volume mount support across four volume types: ConfigMap, Secret, hostPath, and emptyDir. Showcases package installation (nginx) inside the overlay namespace without modifying the host.
- 2-node cluster (1 control-plane + 1 worker)
- ConfigMap-configured nginx, read-only Secrets, hostPath file serving, emptyDir scratch workspace
- Software installed inside pod commands via overlay (host unmodified)
./examples/04-volumes/setup.sh
kubectl apply -f examples/04-volumes/configmap-nginx.yaml
kubectl logs configmap-nginx -f
05-kubemix/ — Kubernetes Workload Mix
Demonstrates running Jobs, DaemonSets, and Deployments simultaneously on a 10-node cluster. Each workload type targets a different set of labeled nodes, showcasing Reaper across diverse Kubernetes workload modes. All workloads read configuration from dedicated ConfigMap volumes.
- 10-node cluster (1 control-plane + 9 workers)
- Workers partitioned: 3 batch (Jobs), 3 daemon (DaemonSets), 3 service (Deployments)
- Each workload reads config from its own ConfigMap volume
./examples/05-kubemix/setup.sh
kubectl apply -f examples/05-kubemix/
kubectl get pods -o wide
06-ansible-jobs/ — Ansible Jobs
Demonstrates overlay persistence by running sequential Jobs: the first installs Ansible via apt, the second runs an Ansible playbook (from a ConfigMap) to install and verify nginx. Packages installed by Job 1 persist in the shared overlay for Job 2.
- 10-node cluster (1 control-plane + 9 workers)
- Job 1: installs Ansible on all workers (persists in overlay)
- Job 2: runs Ansible playbook from ConfigMap to install nginx
./examples/06-ansible-jobs/setup.sh
kubectl apply -f examples/06-ansible-jobs/install-ansible-job.yaml
kubectl wait --for=condition=Complete job/install-ansible --timeout=300s
kubectl apply -f examples/06-ansible-jobs/nginx-playbook-job.yaml
07-ansible-complex/ — Ansible Complex (Reboot-Resilient)
Fully reboot-resilient Ansible deployment using only DaemonSets. A bootstrap DaemonSet installs Ansible, then role-specific DaemonSets run playbooks (nginx on login nodes, htop on compute nodes). Init containers create implicit dependencies so a single kubectl apply -f deploys everything in the right order. All packages survive node reboots.
- 10-node cluster (1 control-plane + 9 workers: 2 login, 7 compute)
- 3 DaemonSets: Ansible bootstrap (all), nginx (login), htop (compute)
- Init container dependencies — no manual ordering needed
./examples/07-ansible-complex/setup.sh
kubectl apply -f examples/07-ansible-complex/
kubectl rollout status daemonset/nginx-login --timeout=300s
08-mix-container-runtime-engines/ — Mixed Runtime Engines
Demonstrates mixed runtime engines in the same cluster: a standard containerized OpenLDAP server (default containerd/runc) alongside Reaper workloads that configure SSSD on every node. Reaper pods consume the LDAP service via a fixed ClusterIP, enabling getent passwd to resolve LDAP users on the host.
- 4-node cluster (1 control-plane + 3 workers: 1 login, 2 compute)
- OpenLDAP Deployment (default runtime) with 5 posixAccount users
- Reaper DaemonSets: Ansible bootstrap + SSSD configuration (all workers)
- Init containers handle dependency ordering (Ansible + LDAP readiness)
./examples/08-mix-container-runtime-engines/setup.sh
kubectl apply -f examples/08-mix-container-runtime-engines/
kubectl rollout status daemonset/base-config --timeout=300s
09-reaperpod/ — ReaperPod CRD
Demonstrates the ReaperPod Custom Resource Definition — a simplified, Reaper-native way to run workloads without container boilerplate. A reaper-controller watches ReaperPod resources and creates real Pods with runtimeClassName: reaper-v2 pre-configured.
- No `image:` field needed (the busybox placeholder is handled automatically)
- Reaper-specific fields: `dnsMode`, `overlayName`, simplified volumes
- Status tracks phase, podName, nodeName, exitCode
```shell
# Prerequisites: install CRD and controller
kubectl create namespace reaper-system
kubectl apply -f deploy/kubernetes/crds/reaperpods.reaper.io.yaml
kubectl apply -f deploy/kubernetes/reaper-controller.yaml

# Run a simple task
kubectl apply -f examples/09-reaperpod/simple-task.yaml
kubectl get reaperpods
kubectl describe reaperpod hello-world

# With volumes (create ConfigMap first)
kubectl create configmap app-config --from-literal=greeting="Hello from ConfigMap"
kubectl apply -f examples/09-reaperpod/with-volumes.yaml

# With node selector (label a node first)
kubectl label node <name> workload-type=compute
kubectl apply -f examples/09-reaperpod/with-node-selector.yaml
```
10-slurm-hpc/ — Slurm HPC (Mixed Runtimes)
Demonstrates a Slurm HPC cluster using mixed Kubernetes runtimes: slurmctld (scheduler) runs as a standard container, while slurmd (worker daemons) run on compute nodes via Reaper with direct host access for CPU pinning and device management.
- 4-node cluster (1 control-plane + 1 slurmctld + 2 compute)
- slurmctld Deployment (default runtime) with munge authentication
- slurmd DaemonSet (Reaper) on compute nodes with shared overlay
./examples/10-slurm-hpc/setup.sh
kubectl apply -f examples/10-slurm-hpc/
kubectl rollout status daemonset/slurmd --timeout=300s
11-node-monitoring/ — Node Monitoring (Prometheus + Reaper)
Demonstrates host-level node monitoring: Prometheus node_exporter runs as a Reaper DaemonSet for accurate host metrics, while a containerized Prometheus server (default runtime) scrapes them.
- 3-node cluster (1 control-plane + 2 workers)
- node_exporter DaemonSet (Reaper) — downloads and runs on host
- Prometheus Deployment (default runtime) with Kubernetes service discovery
./examples/11-node-monitoring/setup.sh
kubectl apply -f examples/11-node-monitoring/
kubectl port-forward svc/prometheus 9090:9090
12-daemon-job/ — ReaperDaemonJob CRD (Node Configuration)
Demonstrates the ReaperDaemonJob Custom Resource Definition — a “DaemonSet for Jobs” that runs commands to completion on every matching node. Designed for node configuration tasks like Ansible playbooks that compose via shared overlays.
- Dependency ordering via the `after` field (second job waits for the first)
- Shared overlays via `overlayName` (composable node config)
- Per-node status tracking with retry support
```shell
# Prerequisites: Reaper + controller running (via Helm or setup-playground.sh)
kubectl apply -f examples/12-daemon-job/simple-daemon-job.yaml
kubectl get reaperdaemonjobs
kubectl describe reaperdaemonjob node-info

# Composable example with dependencies
kubectl apply -f examples/12-daemon-job/composable-node-config.yaml
kubectl get rdjob -w   # watch until both jobs complete
```
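A composable pair of jobs along these lines might look like the following sketch. The `spec` schema here is assumed from the field names in this section (`after`, `overlayName`) and from ReaperPod's group/version; the actual CRD may differ:

```yaml
apiVersion: reaper.io/v1alpha1        # assumed group/version, matching ReaperPod
kind: ReaperDaemonJob
metadata:
  name: base-packages
spec:
  command: ["/bin/sh", "-c", "apt-get update"]
  overlayName: node-config            # shared overlay, so later jobs see the result
---
apiVersion: reaper.io/v1alpha1
kind: ReaperDaemonJob
metadata:
  name: install-tools
spec:
  command: ["/bin/sh", "-c", "apt-get install -y htop"]
  overlayName: node-config
  after: ["base-packages"]            # assumed list form; waits for base-packages
```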
Cleanup
Examples with setup.sh scripts can be cleaned up independently:
./examples/01-scheduling/setup.sh --cleanup
./examples/02-client-server/setup.sh --cleanup
./examples/03-client-server-runas/setup.sh --cleanup
./examples/04-volumes/setup.sh --cleanup
./examples/05-kubemix/setup.sh --cleanup
./examples/06-ansible-jobs/setup.sh --cleanup
./examples/07-ansible-complex/setup.sh --cleanup
./examples/08-mix-container-runtime-engines/setup.sh --cleanup
./examples/10-slurm-hpc/setup.sh --cleanup
./examples/11-node-monitoring/setup.sh --cleanup
For CRD-based examples (09, 12), delete the resources directly:
kubectl delete reaperpod --all
kubectl delete reaperdaemonjob --all
Custom Resource Definitions
Reaper provides three CRDs for managing workloads, overlay filesystems, and node-wide configuration tasks.
ReaperPod
A simplified, Reaper-native way to run workloads without standard container boilerplate.
- Group: `reaper.io`
- Version: `v1alpha1`
- Kind: `ReaperPod`
- Short name: `rpod` (`kubectl get rpod`)
Spec
| Field | Type | Required | Description |
|---|---|---|---|
| `command` | string[] | Yes | Command to execute on the host |
| `args` | string[] | No | Arguments to the command |
| `env` | EnvVar[] | No | Environment variables (simplified format) |
| `volumes` | Volume[] | No | Volume mounts (simplified format) |
| `nodeSelector` | map[string]string | No | Node selection constraints |
| `dnsMode` | string | No | DNS resolution mode (`host` or `kubernetes`) |
| `overlayName` | string | No | Named overlay group (requires a matching ReaperOverlay) |
Status
| Field | Type | Description |
|---|---|---|
| `phase` | string | Current phase: `Pending`, `Running`, `Succeeded`, `Failed` |
| `podName` | string | Name of the backing Pod |
| `nodeName` | string | Node where the workload runs |
| `exitCode` | int | Process exit code (when completed) |
| `startTime` | string | When the workload started |
| `completionTime` | string | When the workload completed |
Simplified Volumes
ReaperPod volumes use a flat format instead of the nested Kubernetes volume spec:
volumes:
- name: config
mountPath: /etc/config
configMap: "my-configmap" # ConfigMap name (string)
readOnly: true
- name: secret
mountPath: /etc/secret
secret: "my-secret" # Secret name (string)
- name: host
mountPath: /data
hostPath: "/opt/data" # Host path (string)
- name: scratch
mountPath: /tmp/work
emptyDir: true # EmptyDir (bool)
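For comparison, the same ConfigMap mount expressed in a standard Kubernetes Pod spec requires the nested volume form that ReaperPod's flat format replaces:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-config-standard
spec:
  containers:
    - name: main
      image: busybox            # image boilerplate required by a standard Pod
      volumeMounts:
        - name: config
          mountPath: /etc/config
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: my-configmap      # nested object instead of a plain string
```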
Examples
Simple Task
apiVersion: reaper.io/v1alpha1
kind: ReaperPod
metadata:
name: hello-world
spec:
command: ["/bin/sh", "-c", "echo Hello from $(hostname) at $(date)"]
With Volumes
apiVersion: reaper.io/v1alpha1
kind: ReaperPod
metadata:
name: with-config
spec:
command: ["/bin/sh", "-c", "cat /config/greeting"]
volumes:
- name: config
mountPath: /config
configMap: "app-config"
readOnly: true
With Node Selector
apiVersion: reaper.io/v1alpha1
kind: ReaperPod
metadata:
name: compute-task
spec:
command: ["/bin/sh", "-c", "echo Running on $(hostname)"]
nodeSelector:
workload-type: compute
Controller
The reaper-controller watches ReaperPod resources and creates backing Pods with runtimeClassName: reaper-v2. It translates the simplified ReaperPod spec into a full Pod spec.
- Pod name matches ReaperPod name (1:1 mapping)
- Owner references enable automatic garbage collection
- Status is mirrored from the backing Pod
- If `overlayName` is set, the Pod stays Pending until a matching ReaperOverlay is Ready
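As a rough sketch of the translation, the `hello-world` ReaperPod above might be backed by a Pod like the following (the container name, image placeholder, and UID are illustrative assumptions, not the controller's exact output; only `runtimeClassName: reaper-v2`, the 1:1 name mapping, and the owner reference are documented behavior):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-world                  # 1:1 name mapping with the ReaperPod
  ownerReferences:                   # enables automatic garbage collection
    - apiVersion: reaper.io/v1alpha1
      kind: ReaperPod
      name: hello-world
      uid: <reaperpod-uid>           # filled in by the controller
spec:
  runtimeClassName: reaper-v2        # routes the Pod to the Reaper runtime
  containers:
    - name: main                     # illustrative name
      image: placeholder             # hypothetical; Reaper does not pull images
      command: ["/bin/sh", "-c", "echo Hello from $(hostname) at $(date)"]
```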
ReaperOverlay
A PVC-like resource that manages named overlay filesystem lifecycles independently from ReaperPod workloads. Enables Kubernetes-native overlay creation, reset, and deletion without requiring direct node access.
- Group: `reaper.io`
- Version: `v1alpha1`
- Kind: `ReaperOverlay`
- Short name: `rovl` (`kubectl get rovl`)
Spec
| Field | Type | Default | Description |
|---|---|---|---|
| `resetPolicy` | string | Manual | When to reset: `Manual`, `OnFailure`, `OnDelete` |
| `resetGeneration` | int | 0 | Increment to trigger a reset on all nodes |
Status
| Field | Type | Description |
|---|---|---|
| `phase` | string | Current phase: `Pending`, `Ready`, `Resetting`, `Failed` |
| `observedResetGeneration` | int | Last `resetGeneration` fully applied |
| `nodes[]` | array | Per-node overlay state |
| `nodes[].nodeName` | string | Node name |
| `nodes[].ready` | bool | Whether the overlay is available |
| `nodes[].lastResetTime` | string | ISO 8601 timestamp of last reset |
| `message` | string | Human-readable status message |
PVC-like Behavior
ReaperOverlay works like a PersistentVolumeClaim:
- Blocking: ReaperPods with `overlayName` stay Pending until the matching ReaperOverlay exists and is Ready
- Cleanup on delete: A finalizer ensures on-disk overlay data is cleaned up on all nodes when the ReaperOverlay is deleted
- Reset: Increment `spec.resetGeneration` to trigger overlay teardown and recreation on all nodes
Examples
Create an Overlay
apiVersion: reaper.io/v1alpha1
kind: ReaperOverlay
metadata:
name: slurm
spec:
resetPolicy: Manual
Use with a ReaperPod
apiVersion: reaper.io/v1alpha1
kind: ReaperPod
metadata:
name: install-slurm
spec:
overlayName: slurm
command: ["bash", "-c", "apt-get update && apt-get install -y slurm-wlm"]
Reset a Corrupt Overlay
kubectl patch rovl slurm --type merge -p '{"spec":{"resetGeneration":1}}'
kubectl get rovl slurm -w # watch until phase returns to Ready
Delete an Overlay
kubectl delete rovl slurm # finalizer cleans up on-disk data on all nodes
ReaperDaemonJob
A “DaemonSet for Jobs” that runs a command to completion on every matching node, with support for dependency ordering, retry policies, and shared overlays. Designed for node configuration tasks like Ansible playbooks that compose via shared overlays.
- Group: `reaper.io`
- Version: `v1alpha1`
- Kind: `ReaperDaemonJob`
- Short name: `rdjob` (`kubectl get rdjob`)
Spec
| Field | Type | Default | Description |
|---|---|---|---|
| `command` | string[] | (required) | Command to execute on each node |
| `args` | string[] | | Arguments to the command |
| `env` | EnvVar[] | | Environment variables (same format as ReaperPod) |
| `workingDir` | string | | Working directory for the command |
| `overlayName` | string | | Named overlay group for shared filesystem |
| `nodeSelector` | map[string]string | | Target specific nodes by labels (all nodes if empty) |
| `dnsMode` | string | | DNS resolution mode (`host` or `kubernetes`) |
| `runAsUser` | int | | UID for the process |
| `runAsGroup` | int | | GID for the process |
| `volumes` | Volume[] | | Volume mounts (same format as ReaperPod) |
| `tolerations` | Toleration[] | | Tolerations for the underlying Pods |
| `triggerOn` | string | NodeReady | Trigger events: `NodeReady` or `Manual` |
| `after` | string[] | | Dependency ordering: names of other ReaperDaemonJobs that must complete first |
| `retryLimit` | int | 0 | Maximum retries per node on failure |
| `concurrencyPolicy` | string | Skip | What to do on re-trigger while running: `Skip` or `Replace` |
Status
| Field | Type | Description |
|---|---|---|
| `phase` | string | Overall phase: `Pending`, `Running`, `Completed`, `PartiallyFailed` |
| `readyNodes` | int | Number of nodes that completed successfully |
| `totalNodes` | int | Total number of targeted nodes |
| `observedGeneration` | int | Last spec generation reconciled |
| `nodeStatuses[]` | array | Per-node execution status |
| `nodeStatuses[].nodeName` | string | Node name |
| `nodeStatuses[].phase` | string | Per-node phase: `Pending`, `Running`, `Succeeded`, `Failed` |
| `nodeStatuses[].reaperPodName` | string | Name of the ReaperPod created for this node |
| `nodeStatuses[].exitCode` | int | Exit code on this node |
| `nodeStatuses[].retryCount` | int | Number of retries so far |
| `message` | string | Human-readable status message |
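As a concrete illustration of these fields, here is a hypothetical status for a three-node job that failed on one node (all values, including the ReaperPod naming scheme and the message text, are assumptions for illustration only):

```yaml
status:
  phase: PartiallyFailed
  readyNodes: 2
  totalNodes: 3
  observedGeneration: 1
  nodeStatuses:
    - nodeName: worker-1
      phase: Succeeded
      reaperPodName: node-info-worker-1   # naming scheme is an assumption
      exitCode: 0
      retryCount: 0
    - nodeName: worker-2
      phase: Succeeded
      reaperPodName: node-info-worker-2
      exitCode: 0
      retryCount: 0
    - nodeName: worker-3
      phase: Failed
      reaperPodName: node-info-worker-3
      exitCode: 1
      retryCount: 0
  message: "1 of 3 nodes failed"          # illustrative message
```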
Controller Layering
ReaperDaemonJob → ReaperPod → Pod. The DaemonJob controller creates one ReaperPod per matching node, pinned via nodeName. The existing ReaperPod controller then creates the backing Pods. No changes to the runtime or shim.
Dependency Ordering
The after field lists other ReaperDaemonJobs that must reach Completed phase before this job starts on any node. This enables composable workflows where one job’s output is another’s input (via shared overlays).
Examples
Simple Node Info
apiVersion: reaper.io/v1alpha1
kind: ReaperDaemonJob
metadata:
name: node-info
spec:
command: ["/bin/sh", "-c"]
args:
- |
echo "Node: $(hostname)"
echo "Kernel: $(uname -r)"
Composable Node Config with Dependencies
apiVersion: reaper.io/v1alpha1
kind: ReaperDaemonJob
metadata:
name: mount-filesystems
spec:
command: ["/bin/sh", "-c"]
args: ["mkdir -p /mnt/shared && mount -t nfs server:/export /mnt/shared"]
overlayName: node-config
nodeSelector:
role: compute
---
apiVersion: reaper.io/v1alpha1
kind: ReaperDaemonJob
metadata:
name: install-packages
spec:
command: ["/bin/sh", "-c"]
args: ["apt-get update && apt-get install -y htop"]
overlayName: node-config
after:
- mount-filesystems
nodeSelector:
role: compute
retryLimit: 2
Helm Chart Reference
The Reaper Helm chart is located at deploy/helm/reaper/.
Installation
helm upgrade --install reaper deploy/helm/reaper/ \
--namespace reaper-system --create-namespace \
--wait --timeout 120s
Values
Node Installer DaemonSet
| Value | Default | Description |
|---|---|---|
| `node.image.repository` | ghcr.io/miguelgila/reaper-node | Node installer image |
| `node.image.tag` | `""` (uses appVersion) | Image tag |
| `node.image.pullPolicy` | IfNotPresent | Pull policy |
| `node.installPath` | /usr/local/bin | Binary install path on the host |
| `node.configureContainerd` | false | Whether to configure and restart containerd |
CRD Controller Deployment
| Value | Default | Description |
|---|---|---|
| `controller.image.repository` | ghcr.io/miguelgila/reaper-controller | Controller image |
| `controller.image.tag` | `""` (uses appVersion) | Image tag |
| `controller.image.pullPolicy` | IfNotPresent | Pull policy |
| `controller.replicas` | 1 | Number of controller replicas |
| `controller.resources.requests.cpu` | 10m | CPU request |
| `controller.resources.requests.memory` | 32Mi | Memory request |
| `controller.resources.limits.cpu` | 100m | CPU limit |
| `controller.resources.limits.memory` | 64Mi | Memory limit |
Agent DaemonSet
| Value | Default | Description |
|---|---|---|
| `agent.enabled` | true | Enable the agent DaemonSet |
| `agent.image.repository` | ghcr.io/miguelgila/reaper-agent | Agent image |
| `agent.image.tag` | `""` (uses appVersion) | Image tag |
| `agent.image.pullPolicy` | IfNotPresent | Pull policy |
| `agent.resources.requests.cpu` | 10m | CPU request |
| `agent.resources.requests.memory` | 32Mi | Memory request |
| `agent.resources.limits.cpu` | 100m | CPU limit |
| `agent.resources.limits.memory` | 64Mi | Memory limit |
RuntimeClass
| Value | Default | Description |
|---|---|---|
| `runtimeClass.name` | reaper-v2 | RuntimeClass name |
| `runtimeClass.handler` | reaper-v2 | Containerd handler name |
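With the defaults, the RuntimeClass object the chart renders should be equivalent to the following sketch (based on the standard `node.k8s.io/v1` API; the chart's actual template may add labels or annotations):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: reaper-v2    # runtimeClass.name
handler: reaper-v2   # runtimeClass.handler, matched against containerd's handler
```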
Reaper Configuration
| Value | Default | Description |
|---|---|---|
| `config.dnsMode` | kubernetes | DNS resolution mode |
| `config.runtimeLog` | /run/reaper/runtime.log | Runtime log path |
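To change defaults, collect overrides in a values file (a sketch using keys from the tables above; the file name is arbitrary):

```yaml
# my-values.yaml
node:
  configureContainerd: true   # let the chart configure and restart containerd
config:
  dnsMode: host               # switch from the default "kubernetes" mode
```

Apply it with `helm upgrade --install reaper deploy/helm/reaper/ --namespace reaper-system -f my-values.yaml`.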
What Gets Installed
The chart installs:
- CRDs (`deploy/helm/reaper/crds/`) — ReaperPod CRD definition
- Namespace — `reaper-system` (created by `--create-namespace`)
- Node DaemonSet — Init container copies shim + runtime binaries to host
- Controller Deployment — Watches ReaperPod CRDs, creates Pods
- Agent DaemonSet — Health monitoring and Prometheus metrics
- RuntimeClass — Registers `reaper-v2` with Kubernetes
- RBAC — ServiceAccount, ClusterRole, ClusterRoleBinding for controller and agent