SIMD enhancements for amd64

Contact: Robert Clausecker <fuz@FreeBSD.org>

SIMD instruction set extensions such as SSE, AVX, and NEON are ubiquitous on modern computers and offer performance advantages for many applications. The goal of this project is to provide SIMD-enhanced versions of common libc functions (mostly those described in string(3)), speeding up most C programs.

For each function optimised, up to four implementations will be provided:

  • a scalar implementation optimised for amd64, but without any SIMD usage,

  • either a baseline implementation using SSE and SSE2, or an x86-64-v2 implementation using all SSE extensions up to SSE4.2,

  • an x86-64-v3 implementation using AVX and AVX2, and

  • an x86-64-v4 implementation using AVX-512F/BW/CD/DQ.
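
To give a feel for the difference between the scalar and baseline tiers, here is a minimal sketch of a memchr-style search that uses SSE2 where available and falls back to a scalar loop otherwise. It is not the code that landed in libc: the name `sketch_memchr` is hypothetical, `__builtin_ctz` is a GCC/Clang builtin, and the committed implementations are likely structured quite differently.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>	/* SSE2 intrinsics */
#endif

/*
 * Hypothetical illustration only: find the first byte equal to c
 * in buf[0..len), or return NULL if there is none.
 */
const void *
sketch_memchr(const void *buf, int c, size_t len)
{
	const unsigned char *p = buf;
	const unsigned char needle = (unsigned char)c;
	size_t i = 0;

#if defined(__SSE2__)
	/* Broadcast the needle into all 16 byte lanes. */
	const __m128i pattern = _mm_set1_epi8((char)needle);

	/* Compare 16 bytes per iteration using unaligned loads. */
	for (; len - i >= 16; i += 16) {
		__m128i chunk = _mm_loadu_si128((const __m128i *)(p + i));
		/* One mask bit per byte lane; set where the lane matched. */
		int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, pattern));

		if (mask != 0)
			return (p + i + __builtin_ctz((unsigned)mask));
	}
#endif

	/* Scalar tail, and the whole search when SSE2 is unavailable. */
	for (; i < len; i++) {
		if (p[i] == needle)
			return (p + i);
	}

	return (NULL);
}
```

The vector loop only ever reads inside the caller's buffer, which keeps the sketch simple at the cost of a scalar tail of up to 15 bytes.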

Users will be able to select which level of SIMD enhancements to use by setting the ARCHLEVEL environment variable.
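
As a rough illustration of how such a setting might be interpreted, the sketch below maps a requested level name to one of the tiers listed above and never selects a tier beyond what the CPU supports. Everything here is an assumption made for illustration: the enum, the function names, and the exact spellings of the accepted values are hypothetical, and the mechanism libc actually uses is not described in this report.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tiers, ordered from least to most capable. */
enum simd_level {
	LEVEL_SCALAR,
	LEVEL_BASELINE,
	LEVEL_V2,
	LEVEL_V3,
	LEVEL_V4,
	LEVEL_UNKNOWN
};

/* Assumed spellings; the values libc really accepts may differ. */
static const char *const level_names[] = {
	"scalar", "baseline", "x86-64-v2", "x86-64-v3", "x86-64-v4"
};

/* Translate a level name into an enum value. */
enum simd_level
parse_level(const char *name)
{
	size_t i;

	if (name == NULL)
		return (LEVEL_UNKNOWN);

	for (i = 0; i < sizeof(level_names) / sizeof(level_names[0]); i++) {
		if (strcmp(name, level_names[i]) == 0)
			return ((enum simd_level)i);
	}

	return (LEVEL_UNKNOWN);
}

/*
 * Pick the level to use: honour the request if it is recognised and
 * not above what the CPU supports, otherwise use the supported level.
 */
enum simd_level
pick_level(const char *requested_name, enum simd_level supported)
{
	enum simd_level requested = parse_level(requested_name);

	if (requested == LEVEL_UNKNOWN || requested > supported)
		return (supported);

	return (requested);
}
```

A caller would pass the result of `getenv("ARCHLEVEL")` as `requested_name`, along with whatever level its own CPU feature detection reports.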

While the current project only concerns amd64, the work may be expanded to other architectures like arm64 in the future.

During the last few months, significant progress has been made on this project. SIMD-enhanced versions of bcmp(3), index(3), memchr(3), memcmp(3), stpcpy(3), strchr(3), strchrnul(3), strcpy(3), strcspn(3), strlen(3), strnlen(3), and strspn(3) have landed. The functions memcpy(3), memmove(3), strcmp(3), timingsafe_bcmp(3) (see D41673), and timingsafe_memcmp(3) (see D41696) are work in progress. Unfortunately, the work did not make the cut for FreeBSD 14.0, but it is slated to be part of FreeBSD 14.1.

Sponsor: The FreeBSD Foundation


Last modified on: October 2, 2023 by Graham Perrin