PCTR(4)		    FreeBSD/amd64 Kernel Interfaces Manual	       PCTR(4)

     pctr -- driver for	CPU performance	counters

     pseudo-device pctr	1

     The pctr device provides access to	the performance	counters on AMD	and
     Intel brand processors, and to the	TSC on others.

     Intel processors have two 40-bit performance counters which can be	pro-
     grammed to	count events such as cache misses, branch target buffer	hits,
     TLB misses, dual-issues, interrupts, pipeline flushes, and	more.  While
     AMD processors have four 48-bit counters, their precision is decreased to
     40	bits.

     There is one ioctl	call to	read the status	of all counters, and one ioctl
     call to program the function of each counter.  All require the following
     include files:

	   #include <sys/types.h>
	   #include <machine/cpu.h>
	   #include <machine/pctr.h>

     The current state of all counters can be read with	the PCIOCRD ioctl,
     which takes an argument of	type struct pctrst:

	   #define PCTR_NUM	   4
	   struct pctrst {
		   u_int pctr_fn[PCTR_NUM];
		   pctrval pctr_tsc;
		   pctrval pctr_hwc[PCTR_NUM];
	   };

     In this structure, pctr_fn contains the functions of the counters, as
     previously set by the PCIOCS0, PCIOCS1, PCIOCS2 and PCIOCS3 ioctls (see
     below).  pctr_hwc contains the actual values of the hardware counters.
     pctr_tsc is a free-running, 64-bit cycle counter.

     The functions of the counters can be programmed with ioctls PCIOCS0,
     PCIOCS1, PCIOCS2 and PCIOCS3 which	require	a writeable file descriptor
     and take an argument of type unsigned int.	 The meaning of	this integer
     is	dependent on the particular CPU.

   Time	stamp counter
     The time stamp counter is available on most AMD and Intel CPUs.  It is
     set to zero at boot time, and then increments with each cycle.  Because
     the counter is 64 bits wide, it does not overflow in practice.
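
     As a rough check of why a 64-bit cycle counter effectively never wraps,
     the sketch below estimates the time to overflow at a given clock rate.
     The helper name tsc_wrap_years is hypothetical and is not part of the
     pctr interface; the clock rate is whatever the caller assumes.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of pctr): years until a 64-bit cycle
 * counter wraps if it increments cpu_hz times per second.
 */
static double
tsc_wrap_years(double cpu_hz)
{
	const double two_pow_64 = 18446744073709551616.0;	/* 2^64 */
	const double secs_per_year = 365.25 * 24.0 * 60.0 * 60.0;

	assert(cpu_hz > 0.0);
	return two_pow_64 / cpu_hz / secs_per_year;
}
```

     For an assumed 3 GHz clock, tsc_wrap_years(3e9) comes to a little under
     two centuries.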

     The value of the time stamp counter is returned by the PCIOCRD ioctl, so
     that one can get an exact timestamp on readings of the hardware event
     counters.

     The performance counters can be read directly from	user-mode without need
     to	invoke the kernel.  The	macro rdpmc(ctr) takes 0, 1, 2 or 3 as an ar-
     gument to specify a counter, and returns that counter's 40-bit value
     (which will be of type pctrval).  This is generally preferable to making
     a system call as it introduces less distortion in measurements.
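
     Because the counters are only 40 bits wide, a busy counter can wrap
     between two readings.  A minimal sketch of computing the difference
     between two readings follows.  It assumes that pctrval is an unsigned
     64-bit integer (uint64_t stands in for it here) and that at most one
     wraparound occurred; ctr_delta and the EX_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EX_CTR_BITS	40	/* counter width described in this page */
#define EX_CTR_MASK	((UINT64_C(1) << EX_CTR_BITS) - 1)

/*
 * Hypothetical helper: number of events between two readings of the
 * same 40-bit counter.  The subtraction is done modulo 2^64 and then
 * reduced modulo 2^40, so a single wraparound is handled correctly.
 */
static uint64_t
ctr_delta(uint64_t before, uint64_t after)
{
	assert(before <= EX_CTR_MASK && after <= EX_CTR_MASK);
	return (after - before) & EX_CTR_MASK;
}
```

     The two arguments would typically be the values returned by rdpmc()
     before and after the code being measured.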

     Counter functions supported by these CPUs contain several parts.  The
     most significant byte (an 8-bit integer shifted left by PCTR_CM_SHIFT)
     contains a	counter	mask.  If non-zero, this sets a	threshold for the num-
     ber of times an event must	occur in one cycle for the counter to be in-
     cremented.	 The counter mask can therefore	be used	to count cycles	in
     which an event occurs at least some number	of times.  The next byte con-
     tains several flags:

     PCTR_U   Enables counting of events that occur in user mode.

     PCTR_K   Enables counting of events that occur in kernel mode.  You must
	      set at least one of PCTR_K and PCTR_U to count anything.

     PCTR_E   Counts edges rather than cycles.	For some functions this	allows
	      you to get an estimate of	the number of events rather than the
	      number of	cycles occupied	by those events.

     PCTR_EN  Enables the counters.  This bit must be set in the function for
	      counter 0 in order for any of the counters to be enabled.
	      This bit should probably be set in the other counters as well.

     PCTR_I   Inverts the sense of the counter mask.  When this bit is set,
	      the counter only increments on cycles in which there are fewer
	      events than specified in the counter mask.

     The next byte (shifted left by the	PCTR_UM_SHIFT) contains	flags specific
     to	the event being	counted, also known as the unit	mask.

     For events	dealing	with the L2 cache, the following flags are valid on
     Intel brand processors:

     PCTR_UM_M	Count events involving modified	cache coherency	state lines.

     PCTR_UM_E	Count events involving exclusive cache coherency state lines.

     PCTR_UM_S	Count events involving shared cache coherency state lines.

     PCTR_UM_I	Count events involving invalid cache coherency state lines.

     To measure all L2 cache activity, all of these bits should be set.  They
     can be set with the macro PCTR_UM_MESI, which is the bitwise OR of all
     of the above.

     For event types dealing with bus transactions, there is another flag that
     can be set	in the unit mask:

     PCTR_UM_A	Count all appropriate bus events, not just those initiated by
		the processor.

     Events marked (MESI) require the PCTR_UM_[MESI] bits in the unit mask.
     Events marked (A) can take	the PCTR_UM_A bit.

     Finally, the least	significant byte of the	counter	function is the	event
     type to count.  A list of possible event functions can be obtained by
     running the pctr(1) command with the -l option.
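
     The layout described above can be sketched as a small helper that packs
     the four parts into one unsigned int.  The EX_* constants are
     illustrative stand-ins whose values are assumptions based on the byte
     layout given in this page and the usual x86 event-select bit positions;
     real programs should use the PCTR_* macros from <machine/pctr.h>
     instead.  make_ctr_fn is a hypothetical name.

```c
#include <assert.h>

/* Assumed values, for illustration only; see <machine/pctr.h>. */
#define EX_CM_SHIFT	24		/* counter mask: top byte */
#define EX_UM_SHIFT	8		/* unit mask: byte above event */
#define EX_U		(1u << 16)	/* count in user mode */
#define EX_K		(1u << 17)	/* count in kernel mode */
#define EX_E		(1u << 18)	/* count edges */
#define EX_EN		(1u << 22)	/* enable counters */
#define EX_I		(1u << 23)	/* invert counter mask */
#define EX_FLAGS_MASK	0x00ff0000u	/* the flags byte */

/*
 * Hypothetical helper: build a counter function from a counter mask,
 * a set of flags, a unit mask and an event type.
 */
static unsigned int
make_ctr_fn(unsigned char cmask, unsigned int flags,
    unsigned char umask, unsigned char event)
{
	/* Flags must stay inside their own byte. */
	assert((flags & ~EX_FLAGS_MASK) == 0);

	return ((unsigned int)cmask << EX_CM_SHIFT) | flags |
	    ((unsigned int)umask << EX_UM_SHIFT) | event;
}
```

     For example, make_ctr_fn(0, EX_U | EX_K | EX_EN, 0x0f, 0x2e) yields
     0x00430f2e, where 0x2e is an arbitrary placeholder event number.  The
     result is the kind of value that would be passed to one of the PCIOCSx
     ioctls.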


     [ENODEV]  An attempt was made to set the counter functions	on a CPU that
	       does not	support	counters.

     [EINVAL]  An invalid counter function was provided	as an argument to the
	       PCIOCSx ioctl.

     [EPERM]   An attempt was made to set the counter functions, but the de-
	       vice was	not open for writing.

     pctr(1), ioctl(2)

     A pctr device first appeared in OpenBSD 2.0.  Support for amd64 architec-
     ture appeared in OpenBSD 4.3.

     The pctr device was written by David Mazieres.  Support for amd64
     architecture was written by Mike Belopuhov.

     Not all counter functions are completely accurate.  Some of the functions
     may not make any sense at all.  Also, be aware that an interrupt may
     occur between invocations of rdpmc(), which can decrease the accuracy of
     measurements.

FreeBSD	13.0			October	5, 2019			  FreeBSD 13.0

