FreeBSD The Power to Serve

Security Sandboxing Using ktrace(1)

Contact: Jake Freeland <jfree@FreeBSD.org>

Capsicumization With ktrace(1)

This report introduces an extension to ktrace(1) that logs capability violations for programs that have not been Capsicumized.

The first logical step in Capsicumization is determining where your program is raising capability violations. You could approach this issue by looking through the source and removing Capsicum-incompatible code, but this can be tedious and requires the developer to be familiar with everything that is not allowed in capability mode.

An alternative to finding violations manually is to use ktrace(1). The ktrace(1) utility logs kernel activity for a specified process. Capsicum violations occur inside of the kernel, so ktrace(1) can record and return extra information about your program’s violations with the -t p option.

Programs traditionally need to be put into capability mode before they will report violations. When a restricted system call is entered, it will fail and return with ECAPMODE: Not permitted in capability mode. If the developer is doing error checking, then it is likely that their program will terminate with that error. This behavior made violation tracing inconvenient because ktrace(1) would only report the first capability violation, and then the program would terminate.

Luckily, a new extension to ktrace(1) can record violations when a program is NOT in capability mode. This means that any developer can run capability violation tracing on their program with no modification to see where it is raising violations. Since the program is never actually put into capability mode, it will still acquire resources and execute normally.

Violation Tracing Examples

The cap_violate program, shown below, attempts to raise every type of violation that ktrace(1) can capture:

# ktrace -t p ./cap_violate
# kdump
1603 ktrace   CAP   system call not allowed: execve
1603 foo      CAP   system call not allowed: open
1603 foo      CAP   system call not allowed: open
1603 foo      CAP   system call not allowed: open
1603 foo      CAP   system call not allowed: open
1603 foo      CAP   system call not allowed: readlink
1603 foo      CAP   system call not allowed: open
1603 foo      CAP   cpuset_setaffinity: restricted cpuset operation
1603 foo      CAP   openat: restricted VFS lookup: AT_FDCWD
1603 foo      CAP   openat: restricted VFS lookup: /
1603 foo      CAP   system call not allowed: bind
1603 foo      CAP   sendto: restricted address lookup: struct sockaddr { AF_INET, 0.0.0.0:5000 }
1603 foo      CAP   socket: protocol not allowed: IPPROTO_ICMP
1603 foo      CAP   kill: signal delivery not allowed: SIGCONT
1603 foo      CAP   system call not allowed: chdir
1603 foo      CAP   system call not allowed: fcntl, cmd: F_KINFO
1603 foo      CAP   operation requires CAP_WRITE, descriptor holds CAP_READ
1603 foo      CAP   attempt to increase capabilities from CAP_READ to CAP_READ,CAP_WRITE

The first 7 system call not allowed entries did not explicitly originate from the cap_violate program code. Instead, they were raised by FreeBSD’s C runtime libraries. This becomes apparent when you trace namei translations alongside capability violations using the -t np option:

# ktrace -t np ./cap_violate
# kdump
1632 ktrace   CAP   system call not allowed: execve
1632 ktrace   NAMI  "./cap_violate"
1632 ktrace   NAMI  "/libexec/ld-elf.so.1"
1632 foo      CAP   system call not allowed: open
1632 foo      NAMI  "/etc/libmap.conf"
1632 foo      CAP   system call not allowed: open
1632 foo      NAMI  "/usr/local/etc/libmap.d"
1632 foo      CAP   system call not allowed: open
1632 foo      NAMI  "/var/run/ld-elf.so.hints"
1632 foo      CAP   system call not allowed: open
1632 foo      NAMI  "/lib/libc.so.7"
1632 foo      CAP   system call not allowed: readlink
1632 foo      NAMI  "/etc/malloc.conf"
1632 foo      CAP   system call not allowed: open
1632 foo      NAMI  "/dev/pvclock"
1632 foo      CAP   cpuset_setaffinity: restricted cpuset operation
1632 foo      NAMI  "ktrace.out"
1632 foo      CAP   openat: restricted VFS lookup: AT_FDCWD
1632 foo      NAMI  "/"
1632 foo      CAP   openat: restricted VFS lookup: /
1632 foo      CAP   system call not allowed: bind
1632 foo      CAP   sendto: restricted address lookup: struct sockaddr { AF_INET, 0.0.0.0:5000 }
1632 foo      CAP   socket: protocol not allowed: IPPROTO_ICMP
1632 foo      CAP   kill: signal delivery not allowed: SIGCONT
1632 foo      CAP   system call not allowed: chdir
1632 foo      NAMI  "."
1632 foo      CAP   system call not allowed: fcntl, cmd: F_KINFO
1632 foo      CAP   operation requires CAP_WRITE, descriptor holds CAP_READ
1632 foo      CAP   attempt to increase capabilities from CAP_READ to CAP_READ,CAP_WRITE

In practice, capability mode is always entered following the initialization of the C runtime libraries, so a program would never trigger those first 7 violations. We are only seeing them because ktrace(1) starts recording violations before the program starts.

This demonstration makes it clear that violation tracing is not always perfect. It is a helpful guide for detecting restricted system calls, but may not always parody your program’s actual behavior in capability mode. In capability mode, violations are equivalent to errors; they are an indication to stop execution. Violation tracing is ignoring this suggestion and continuing execution anyway, so invalid violations may be reported.

The next example traces violations from the unzip(1) utility (pre-Capsicumization):

# ktrace -t np unzip foo.zip
Archive:  foo.zip
creating: bar/
extracting: bar/bar.txt
creating: baz/
extracting: baz/baz.txt
# kdump
1926 ktrace   CAP   system call not allowed: execve
1926 ktrace   NAMI  "/usr/bin/unzip"
1926 ktrace   NAMI  "/libexec/ld-elf.so.1"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/etc/libmap.conf"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/usr/local/etc/libmap.d"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/var/run/ld-elf.so.hints"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/lib/libarchive.so.7"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/usr/lib/libarchive.so.7"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/lib/libc.so.7"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/lib/libz.so.6"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/lib/libbz2.so.4"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/usr/lib/libbz2.so.4"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/lib/liblzma.so.5"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/usr/lib/liblzma.so.5"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/lib/libbsdxml.so.4"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/lib/libprivatezstd.so.5"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/usr/lib/libprivatezstd.so.5"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/lib/libcrypto.so.111"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/lib/libmd.so.6"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/lib/libthr.so.3"
1926 unzip    CAP   system call not allowed: readlink
1926 unzip    NAMI  "/etc/malloc.conf"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/dev/pvclock"
1926 unzip    NAMI  "foo.zip"
1926 unzip    CAP   openat: restricted VFS lookup: AT_FDCWD
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/etc/localtime"
1926 unzip    NAMI  "bar"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    CAP   system call not allowed: mkdir
1926 unzip    NAMI  "bar"
1926 unzip    NAMI  "bar"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    NAMI  "bar/bar.txt"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    NAMI  "bar/bar.txt"
1926 unzip    CAP   openat: restricted VFS lookup: AT_FDCWD
1926 unzip    NAMI  "baz"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    CAP   system call not allowed: mkdir
1926 unzip    NAMI  "baz"
1926 unzip    NAMI  "baz"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    NAMI  "baz/baz.txt"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    NAMI  "baz/baz.txt"
1926 unzip    CAP   openat: restricted VFS lookup: AT_FDCWD

The violation tracing output for unzip(1) is more akin to what a developer would see when tracing their own program for the first time. Most programs link against libraries. In this case, unzip(1) is linking against libarchive(3), which is reflected here:

1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/lib/libarchive.so.7"
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/usr/lib/libarchive.so.7"

The violations for unzip(1) can be found below the C runtime violations:

1926 unzip    NAMI  "foo.zip"
1926 unzip    CAP   openat: restricted VFS lookup: AT_FDCWD
1926 unzip    CAP   system call not allowed: open
1926 unzip    NAMI  "/etc/localtime"
1926 unzip    NAMI  "bar"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    CAP   system call not allowed: mkdir
1926 unzip    NAMI  "bar"
1926 unzip    NAMI  "bar"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    NAMI  "bar/bar.txt"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    NAMI  "bar/bar.txt"
1926 unzip    CAP   openat: restricted VFS lookup: AT_FDCWD
1926 unzip    NAMI  "baz"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    CAP   system call not allowed: mkdir
1926 unzip    NAMI  "baz"
1926 unzip    NAMI  "baz"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    NAMI  "baz/baz.txt"
1926 unzip    CAP   fstatat: restricted VFS lookup: AT_FDCWD
1926 unzip    NAMI  "baz/baz.txt"
1926 unzip    CAP   openat: restricted VFS lookup: AT_FDCWD

In this instance, unzip(1) is recreating the file structure contained in the zip archive. Violations are being raised because the AT_FDCWD value cannot be used in capability mode. The bulk of these violations can be fixed by opening AT_FDCWD (the current directory) before entering capability mode and passing that descriptor into openat(2), fstatat(2), and mkdirat(2) as a relative reference.

Violation tracing may not automatically Capsicumize programs, but it is another tool in the developer’s toolbox. It only takes a few seconds to run a program under ktrace(1) and the result is almost always a decent starting point for sandboxing your program using Capsicum.

Sponsor: FreeBSD Foundation


Last modified on: July 25, 2023 by Lorenzo Salvadore