NAME
       lmbench - benchmarking toolbox

SYNOPSIS
       #include "lmbench.h"

       typedef u_long iter_t

       typedef void (*benchmp_f)(iter_t iterations, void* cookie)

       void benchmp(benchmp_f initialize, benchmp_f benchmark, benchmp_f
       cleanup, int enough, int parallel, int warmup, int repetitions, void*
       cookie)

       uint64	 get_n()

       void milli(char *s, uint64 n)

       void micro(char *s, uint64 n)

       void nano(char *s, uint64 n)

       void mb(uint64 bytes)

       void kb(uint64 bytes)

DESCRIPTION
       Creating benchmarks using the lmbench timing harness is easy.  Since it
       is so easy to measure performance using lmbench, it is possible to
       quickly	answer questions that arise during system design, development,
       or tuning.  For example,	image processing

       There are two attributes	that are critical for performance, latency and
       bandwidth,  and	lmbench's  timing harness makes	it easy	to measure and
       report results for both.	 Latency is usually important  for  frequently
       executed	 operations,  and  bandwidth  is usually important when	moving
       large chunks of data.
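
       Both attributes reduce to simple arithmetic over a timed interval.  The
       following is a minimal sketch of that arithmetic, not lmbench code: the
       helper names usecs_per_op and mbytes_per_sec are hypothetical, and 1 MB
       is assumed here to mean 1024 * 1024 bytes.

```c
#include <stdint.h>

/* Latency: elapsed microseconds divided by the number of operations.
 * Returns 0.0 when no operations were counted. */
double usecs_per_op(uint64_t elapsed_usecs, uint64_t ops)
{
    return ops ? (double)elapsed_usecs / (double)ops : 0.0;
}

/* Bandwidth: megabytes moved per second of elapsed time.
 * Assumption: 1 MB = 1024 * 1024 bytes. */
double mbytes_per_sec(uint64_t elapsed_usecs, uint64_t bytes)
{
    if (elapsed_usecs == 0)
        return 0.0;
    return ((double)bytes / (1024.0 * 1024.0))
         / ((double)elapsed_usecs / 1e6);
}
```

       For instance, 500 operations timed at 1000 microseconds give a latency
       of 2 microseconds per operation, and 8 MB copied in one second gives a
       bandwidth of 8 MB per second.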

       There are a number of factors to	consider when building benchmarks.

       The timing harness requires that	the benchmarked	operation  be  idempo-
       tent so that it can be repeated indefinitely.

       The timing subsystem, benchmp, is passed up to three function pointers.
       Some benchmarks may need as few as one function pointer (for benchmark).

       void benchmp(initialize,	 benchmark, cleanup, enough, parallel, warmup,
       repetitions, cookie)
              measures the performance of benchmark repeatedly and reports the
              median result.  benchmp creates parallel sub-processes which run
              benchmark in parallel.  This allows lmbench to measure the
              system's ability to scale as the number of client processes
              increases.  Before starting the benchmarking cycle, each
              sub-process calls initialize with iterations set to 0.  It then
              calls initialize, benchmark, and cleanup, with iterations set to
              the number of iterations in the timing loop, several times in
              order to collect repetitions results.  The calls to benchmark
              are surrounded by start and stop calls that time how long it
              takes to do the benchmarked operation iterations times.  After
              all the benchmark results have been collected, cleanup is called
              with iterations set to 0 to clean up any resources which may
              have been allocated by initialize or benchmark.  cookie is a
              void pointer to a hunk of memory that can be used to store any
              parameters or state needed by the benchmark.
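
       The calling sequence described above can be illustrated with a small
       stand-alone mock.  This is only a sketch of the order of the calls as
       this page describes it, not the lmbench implementation: it does no
       timing, creates no sub-processes, and computes no median.  The names
       sketch_call_order, append, demo_init, demo_bench, and demo_cleanup are
       hypothetical, and iter_t is redefined locally so that the example does
       not depend on lmbench.h.

```c
#include <string.h>

typedef unsigned long iter_t;
typedef void (*benchmp_f)(iter_t iterations, void *cookie);

/* Hypothetical stand-in for the harness: reproduces only the order in
 * which the three callbacks are invoked inside one sub-process. */
void sketch_call_order(benchmp_f initialize, benchmp_f benchmark,
                       benchmp_f cleanup, iter_t iterations,
                       int repetitions, void *cookie)
{
    int i;

    if (initialize)
        initialize(0, cookie);              /* once, before the cycle */
    for (i = 0; i < repetitions; i++) {
        if (initialize)
            initialize(iterations, cookie);
        /* a real harness would start its timer here */
        benchmark(iterations, cookie);
        /* ... and stop it here */
        if (cleanup)
            cleanup(iterations, cookie);
    }
    if (cleanup)
        cleanup(0, cookie);                 /* once, after all results */
}

/* Append one character to the log string that cookie points to. */
static void append(void *cookie, char c)
{
    char *log = (char *)cookie;
    size_t n = strlen(log);

    log[n] = c;
    log[n + 1] = '\0';
}

/* Demo callbacks: upper case marks a call with iterations == 0. */
void demo_init(iter_t n, void *cookie)    { append(cookie, n ? 'i' : 'I'); }
void demo_bench(iter_t n, void *cookie)   { (void)n; append(cookie, 'b'); }
void demo_cleanup(iter_t n, void *cookie) { append(cookie, n ? 'c' : 'C'); }
```

       Calling sketch_call_order(demo_init, demo_bench, demo_cleanup, 100, 2,
       log) with an empty character buffer log leaves the string "IibcibcC"
       in it: one initial setup call, two timed rounds, and one final cleanup.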

       void* benchmp_getstate()
	      returns a	void pointer to	the lmbench-internal state used	during
	      benchmarking.  The state is not to be used or accessed  directly
	      by clients, but rather would be passed into benchmp_interval.

       iter_t	 benchmp_interval(void*	state)
	      returns  the  number  of	times the benchmark should execute its
	      benchmark	loop during this timing	interval.  This	is  used  only
	      for  weird  benchmarks which cannot implement the	benchmark body
	      in a function which can return, such as the page fault  handler.
	      Please see lat_sig.c for sample usage.

       uint64	 get_n()
	      returns  the  number  of times loop_body was executed during the
	      timing interval.

       void milli(char *s, uint64 n)
              print the time per operation in milli-seconds.  n is the
              number of operations during the timing interval, which is passed
              as a parameter because each loop_body can contain several
              operations.

       void micro(char *s, uint64 n)
              print the time per operation in micro-seconds.

       void nano(char *s, uint64 n)
	      print the	time per operation in nano-seconds.

       void mb(uint64 bytes)
	      print the	bandwidth in megabytes per second.

       void kb(uint64 bytes)
	      print the	bandwidth in kilobytes per second.

USING lmbench
       Here  is	 an example of a simple	benchmark that measures	the latency of
       the random number generator lrand48():

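              #include "lmbench.h"

              /* Timing loop: call lrand48() iterations times. */
              void
              benchmark_lrand48(iter_t iterations, void* cookie) {
                   while(iterations-- > 0)
                        lrand48();
              }

              int
              main(int argc, char *argv[])
              {
                   /* No initialize, cleanup, or cookie is needed. */
                   benchmp(NULL, benchmark_lrand48, NULL, 0, 1, 0, TRIES, NULL);
                   micro("lrand48()", get_n());
                   exit(0);
              }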

       Here is a simple benchmark that measures and reports the bandwidth of
       bcopy:

              #include "lmbench.h"

              #define MB (1024 * 1024)
              #define SIZE (8 * MB)

              /* Parameters and buffers shared through the cookie. */
              struct _state {
                   int size;
                   char* a;
                   char* b;
              };

              void
              initialize_bcopy(iter_t iterations, void* cookie) {
                   struct _state* state = (struct _state*)cookie;

                   /* Do nothing on the call made with iterations == 0. */
                   if (!iterations) return;
                   state->a = malloc(state->size);
                   state->b = malloc(state->size);
                   if (state->a == NULL || state->b == NULL)
                        exit(1);
              }

              void
              benchmark_bcopy(iter_t iterations, void* cookie) {
                   struct _state* state = (struct _state*)cookie;

                   while(iterations-- > 0)
                        bcopy(state->a, state->b, state->size);
              }

              void
              cleanup_bcopy(iter_t iterations, void* cookie) {
                   struct _state* state = (struct _state*)cookie;

                   /* Do nothing on the call made with iterations == 0. */
                   if (!iterations) return;
                   free(state->a);
                   free(state->b);
              }

              int
              main(int argc, char *argv[])
              {
                   struct _state state;

                   state.size = SIZE;
                   benchmp(initialize_bcopy, benchmark_bcopy, cleanup_bcopy,
                        0, 1, 0, TRIES, &state);
                   /* Total bytes copied: one buffer per loop iteration. */
                   mb(get_n() * state.size);
                   exit(0);
              }

       A slightly more complex version of the bcopy  benchmark	might  measure
       bandwidth  as a function	of memory size and parallelism.	 The main pro-
       cedure in this case might look something	like this:

              int
              main(int argc, char *argv[])
              {
                   int  size, par;
                   struct _state state;

                   /* Sweep buffer size and degree of parallelism. */
                   for (size = 64; size <= SIZE; size <<= 1) {
                        for (par = 1; par < 32; par <<= 1) {
                             state.size = size;
                             benchmp(initialize_bcopy, benchmark_bcopy,
                                  cleanup_bcopy, 0, par, 0, TRIES, &state);
                             /* Label each result with size and parallelism. */
                             fprintf(stderr, "%d\t%d\t", size, par);
                             mb(par * get_n() * state.size);
                        }
                   }
                   exit(0);
              }

       There are three environment variables that can be used  to  modify  the
       lmbench timing subsystem: ENOUGH, TIMING_O, and LOOP_O.
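
       This page does not describe how these variables are interpreted.  As an
       illustration only, a program can read a numeric override from its
       environment along the following lines.  The helper env_override and its
       fallback behaviour are hypothetical and are not taken from the lmbench
       sources.

```c
#include <stdlib.h>

/* Return the integer value of the environment variable `name`, or
 * `fallback` when it is unset, empty, or not a plain decimal number. */
long env_override(const char *name, long fallback)
{
    const char *s = getenv(name);
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return fallback;
    v = strtol(s, &end, 10);
    return (*end == '\0') ? v : fallback;
}
```

       With ENOUGH=5000 in the environment, env_override("ENOUGH", 0) returns
       5000; with ENOUGH unset it returns the fallback.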

       Development of lmbench is continuing.

SEE ALSO
       lmbench(8), timing(3), reporting(3), results(3).

AUTHOR
       Carl Staelin and Larry McVoy

       Comments, suggestions, and bug reports are always welcome.

(c)1998-2000 Larry McVoy and Carl Staelin                          LMBENCH(3)

