RRDCREATE(1)			    rrdtool			  RRDCREATE(1)

NAME
       rrdcreate - Set up a new Round Robin Database

SYNOPSIS
       rrdtool create filename [--start|-b start time] [--step|-s step]
       [DS:ds-name:DST:dst arguments] [RRA:CF:cf arguments]

DESCRIPTION
       The create function of RRDtool lets you set up new Round Robin Database
       (RRD) files.  The file is created at its final, full size and filled
       with *UNKNOWN* data.

       filename
           The name of the RRD you want to create. RRD files should end with
           the extension .rrd. However, RRDtool will accept any filename.

       --start|-b start time (default: now - 10s)
           Specifies the time in seconds since 1970-01-01 UTC when the first
           value should be added to the RRD. RRDtool will not accept any data
           timed before or at the time specified.

           See also the AT-STYLE TIME SPECIFICATION section in the rrdfetch
           documentation for other ways to specify time.

       --step|-s step (default:	300 seconds)
	   Specifies the base interval in seconds with which data will be fed
	   into	the RRD.

       DS:ds-name:DST:dst arguments
	   A single RRD	can accept input from several data sources (DS), for
	   example incoming and	outgoing traffic on a specific communication
	   line. With the DS configuration option you must define some basic
	   properties of each data source you want to store in the RRD.

           ds-name is the name you will use to reference this particular data
           source from an RRD. A ds-name must be 1 to 19 characters long and
           may only use the characters [a-zA-Z0-9_].

	   DST defines the Data	Source Type. The remaining arguments of	a data
	   source entry	depend on the data source type.	For GAUGE, COUNTER,
	   DERIVE, and ABSOLUTE	the format for a data source entry is:

	   DS:ds-name:GAUGE | COUNTER |	DERIVE | ABSOLUTE:heartbeat:min:max

           For COMPUTE data sources, the format is:

           DS:ds-name:COMPUTE:rpn-expression

           In order to decide which data source type to use, review the
           definitions that follow. Also consult the section on "HOW TO
           MEASURE" for further insight.

           GAUGE
               is for things like temperatures or number of people in a room
               or the value of a RedHat share.

           COUNTER
               is for continuous incrementing counters like the ifInOctets
               counter in a router. The COUNTER data source assumes that the
               counter never decreases, except when a counter overflows.  The
               update function takes the overflow into account.  The counter
               is stored as a per-second rate. When the counter overflows,
               RRDtool checks if the overflow happened at the 32bit or 64bit
               border and acts accordingly by adding an appropriate value to
               the result.

           DERIVE
               will store the derivative of the line going from the last to
               the current value of the data source. This can be useful for
               gauges, for example, to measure the rate of people entering or
               leaving a room. Internally, DERIVE works exactly like COUNTER
               but without overflow checks. So if your counter does not reset
               at 32 or 64 bit you might want to use DERIVE and combine it
               with a MIN value of 0.

               NOTE on COUNTER vs DERIVE

               by Don Baarda <don.baarda@baesystems.com>

               If you cannot tolerate ever mistaking the occasional counter
               reset for a legitimate counter wrap, and would prefer
               "Unknowns" for all legitimate counter wraps and resets, always
               use DERIVE with min=0. Otherwise, using COUNTER with a suitable
               max will return correct values for all legitimate counter
               wraps, mark some counter resets as "Unknown", but can mistake
               some counter resets for a legitimate counter wrap.

               For a 5 minute step and 32-bit counter, the probability of
               mistaking a counter reset for a legitimate wrap is arguably
               about 0.8% per 1Mbps of maximum bandwidth. Note that this
               equates to 80% for 100Mbps interfaces, so for high bandwidth
               interfaces and a 32bit counter, DERIVE with min=0 is probably
               preferable. If you are using a 64bit counter, just about any
               max setting will eliminate the possibility of mistaking a reset
               for a counter wrap.

           ABSOLUTE
               is for counters which get reset upon reading. This is used for
               fast counters which tend to overflow. So instead of reading
               them normally you reset them after every read to make sure you
               have a maximum time available before the next overflow. Another
               usage is for things you count, like the number of messages
               since the last update.

           COMPUTE
               is for storing the result of a formula applied to other data
               sources in the RRD. This data source is not supplied a value on
               update, but rather its Primary Data Points (PDPs) are computed
               from the PDPs of the data sources according to the
               rpn-expression that defines the formula. Consolidation
               functions are then applied normally to the PDPs of the COMPUTE
               data source (that is, the rpn-expression is only applied to
               generate PDPs). In database software, such data sets are
               referred to as "virtual" or "computed" columns.
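
           The data source types above differ mainly in how a raw reading
           becomes the stored value. The following Python sketch restates
           those rules. The function name to_rate and its arguments are
           illustrative assumptions, not part of RRDtool, and the sketch
           leaves out the heartbeat and min/max checks.

```python
def to_rate(dst, previous, current, seconds):
    """Turn two consecutive raw readings into the value that gets stored.

    Illustrative only: `previous` and `current` are raw readings taken
    `seconds` apart; heartbeat and min/max checks are not modelled.
    """
    if dst == "GAUGE":
        # The reading itself is stored; no rate is derived.
        return float(current)
    if dst == "ABSOLUTE":
        # The counter was reset on the previous read, so the reading
        # already is the increase over the interval.
        return current / seconds
    if dst not in ("COUNTER", "DERIVE"):
        raise ValueError("unknown data source type: " + dst)
    delta = current - previous
    if dst == "COUNTER" and delta < 0:
        # A decrease is treated as an overflow: try the 32-bit border
        # first, then the 64-bit border.
        delta += 2 ** 32
        if delta < 0:
            delta += 2 ** 64 - 2 ** 32
    return delta / seconds


# A 32-bit counter that wrapped between two readings 300 seconds apart.
print(to_rate("COUNTER", 4294967000, 704, 300))
# The same readings as DERIVE give a large negative rate instead.
print(to_rate("DERIVE", 4294967000, 704, 300))
```

           With DERIVE, a counter reset or wrap produces a negative rate,
           which is why the note above pairs DERIVE with min=0.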

           heartbeat defines the maximum number of seconds that may pass
           between two updates of this data source before the value of the
           data source is assumed to be *UNKNOWN*.

           min and max define the expected range of values for data supplied
           by a data source. If min and/or max are specified, any value
           outside the defined range will be regarded as *UNKNOWN*. If you do
           not know or care about min and max, set them to U for unknown.
           Note that min and max always refer to the processed values of the
           DS. For a traffic-COUNTER type DS this would be the maximum and
           minimum data-rate expected from the device.

           If information on minimal/maximal expected values is available,
           always set the min and/or max properties. This will help RRDtool in
           doing a simple sanity check on the data supplied when running
           update.

           rpn-expression defines the formula used to compute the PDPs of a
           COMPUTE data source from other data sources in the same RRD. It
           is similar to defining a CDEF argument for the graph command.
           Please refer to that manual page for a list and description of the
           RPN operations supported. For COMPUTE data sources, the following
           RPN operations are not supported: COUNT, PREV, TIME, and LTIME. In
           addition, in defining the RPN expression, the COMPUTE data source
           may only refer to the names of data sources listed previously in
           the create command. This is similar to the restriction that CDEFs
           must refer only to DEFs and CDEFs previously defined in the same
           graph command.

       RRA:CF:cf arguments
	   The purpose of an RRD is to store data in the round robin archives
	   (RRA). An archive consists of a number of data values or statistics
	   for each of the defined data-sources	(DS) and is defined with an
	   RRA line.

	   When	data is	entered	into an	RRD, it	is first fit into time slots
	   of the length defined with the -s option, thus becoming a primary
	   data	point.

           The data is also processed with the consolidation function (CF) of
           the archive. There are several consolidation functions that
           consolidate primary data points via an aggregate function: AVERAGE,
           MIN, MAX, LAST. The format of an RRA line for these consolidation
           functions is:

           RRA:AVERAGE | MIN | MAX | LAST:xff:steps:rows

           xff, the xfiles factor, defines what part of a consolidation
           interval may be made up from *UNKNOWN* data while the consolidated
           value is still regarded as known. It is given as the ratio of
           allowed *UNKNOWN* PDPs to the number of PDPs in the interval. Thus,
           it ranges from 0 to 1 (exclusive).

           steps defines how many of these primary data points are used to
           build a consolidated data point which then goes into the archive.

           rows defines how many generations of data values are kept in an
           RRA.
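
           How xff and the consolidation function interact can be sketched
           in Python as follows. This is an illustration of the description
           above, not RRDtool's code: the function name consolidate is
           hypothetical, None stands for an *UNKNOWN* PDP, LAST is left out,
           and treating a ratio exactly equal to xff as still known is an
           assumption.

```python
def consolidate(pdps, cf, xff):
    """Consolidate one interval of `steps` PDPs into a single value.

    `pdps` holds the PDPs of one consolidation interval, with None
    marking an *UNKNOWN* PDP. Returns None if the result is unknown.
    """
    known = [p for p in pdps if p is not None]
    unknown_ratio = (len(pdps) - len(known)) / len(pdps)
    if not known or unknown_ratio > xff:
        # Too much of the interval is unknown.
        return None
    if cf == "AVERAGE":
        return sum(known) / len(known)
    if cf == "MIN":
        return min(known)
    if cf == "MAX":
        return max(known)
    raise ValueError("consolidation function not covered: " + cf)


# steps = 4, xff = 0.5: one unknown PDP out of four is tolerated ...
print(consolidate([1.0, 2.0, None, 3.0], "AVERAGE", 0.5))
# ... three out of four is not.
print(consolidate([1.0, None, None, None], "AVERAGE", 0.5))
```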

Aberrant Behavior Detection with Holt-Winters Forecasting
       In addition to the aggregate functions, there is a set of specialized
       functions that enable RRDtool to provide data smoothing (via the
       Holt-Winters forecasting algorithm), confidence bands, and the flagging
       of aberrant behavior in the data source time series:

       o   RRA:HWPREDICT:rows:alpha:beta:seasonal period[:rra-num]

       o   RRA:SEASONAL:seasonal period:gamma:rra-num

       o   RRA:DEVSEASONAL:seasonal period:gamma:rra-num

       o   RRA:DEVPREDICT:rows:rra-num

       o   RRA:FAILURES:rows:threshold:window length:rra-num

       These RRAs differ from the true consolidation functions in several
       ways.  First, each of the RRAs is updated once for every primary data
       point.  Second, these RRAs are interdependent. To generate real-time
       confidence bounds, a matched set of HWPREDICT, SEASONAL, DEVSEASONAL,
       and DEVPREDICT must exist. Generating smoothed values of the primary
       data points requires both a HWPREDICT RRA and SEASONAL RRA. Aberrant
       behavior detection requires FAILURES, HWPREDICT, DEVSEASONAL, and
       SEASONAL.

       The actual predicted, or smoothed, values are stored in the HWPREDICT
       RRA. The predicted deviations are stored in DEVPREDICT (think of a
       standard deviation which can be scaled to yield a confidence band). The
       FAILURES RRA stores binary indicators. A 1 marks the indexed
       observation as a failure; that is, the number of confidence bounds
       violations in the preceding window of observations met or exceeded a
       specified threshold. An example of using these RRAs to graph confidence
       bounds and failures appears in rrdgraph.

       The SEASONAL and DEVSEASONAL RRAs store the seasonal coefficients for
       the Holt-Winters forecasting algorithm and the seasonal deviations,
       respectively.  There is one entry per observation time point in the
       seasonal cycle. For example, if primary data points are generated every
       five minutes and the seasonal cycle is 1 day, both SEASONAL and
       DEVSEASONAL will have 288 rows.

       In order to simplify the creation for the novice user, in addition to
       supporting explicit creation of the HWPREDICT, SEASONAL, DEVPREDICT,
       DEVSEASONAL, and FAILURES RRAs, the RRDtool create command supports
       implicit creation of the other four when HWPREDICT is specified alone
       and the final argument rra-num is omitted.

       rows specifies the length of the RRA prior to wrap around. Remember
       that there is a one-to-one correspondence between primary data points
       and entries in these RRAs. For the HWPREDICT CF, rows should be larger
       than the seasonal period. If the DEVPREDICT RRA is implicitly created,
       the default number of rows is the same as the HWPREDICT rows argument.
       If the FAILURES RRA is implicitly created, rows will be set to the
       seasonal period argument of the HWPREDICT RRA. Of course, the RRDtool
       resize command is available if these defaults are not sufficient and
       the creator wishes to avoid explicit creation of the other specialized
       function RRAs.

       seasonal period specifies the number of primary data points in a
       seasonal cycle. If SEASONAL and DEVSEASONAL are implicitly created,
       this argument for those RRAs is set automatically to the value
       specified by HWPREDICT. If they are explicitly created, the creator
       should verify that all three seasonal period arguments agree.

       alpha is the adaption parameter of the intercept (or baseline)
       coefficient in the Holt-Winters forecasting algorithm. See rrdtool for
       a description of this algorithm. alpha must lie between 0 and 1. A
       value closer to 1 means that more recent observations carry greater
       weight in predicting the baseline component of the forecast. A value
       closer to 0 means that past history carries greater weight in
       predicting the baseline component.

       beta is the adaption parameter of the slope (or linear trend)
       coefficient in the Holt-Winters forecasting algorithm. beta must lie
       between 0 and 1 and plays the same role as alpha with respect to the
       predicted linear trend.

       gamma is the adaption parameter of the seasonal coefficients in the
       Holt-Winters forecasting algorithm (HWPREDICT) or the adaption
       parameter in the exponential smoothing update of the seasonal
       deviations. It must lie between 0 and 1. If the SEASONAL and
       DEVSEASONAL RRAs are created implicitly, they will both have the same
       value for gamma: the value specified for the HWPREDICT alpha argument.
       Note that because there is one seasonal coefficient (or deviation) for
       each time point during the seasonal cycle, the adaptation rate is much
       slower than the baseline. Each seasonal coefficient is only updated (or
       adapts) when the observed value occurs at the offset in the seasonal
       cycle corresponding to that coefficient.

       If SEASONAL and DEVSEASONAL RRAs are created explicitly, gamma need not
       be the same for both. Note that gamma can also be changed via the
       RRDtool tune command.

       rra-num provides the links between related RRAs. If HWPREDICT is
       specified alone and the other RRAs are created implicitly, then there
       is no need to worry about this argument. If RRAs are created
       explicitly, then pay careful attention to this argument. For each RRA
       which includes this argument, there is a dependency between that RRA
       and another RRA. The rra-num argument is the 1-based index in the order
       of RRA creation (that is, the order they appear in the create command).
       The dependent RRA for each RRA requiring the rra-num argument is listed
       here:

       o   HWPREDICT rra-num is	the index of the SEASONAL RRA.

       o   SEASONAL rra-num is the index of the	HWPREDICT RRA.

       o   DEVPREDICT rra-num is the index of the DEVSEASONAL RRA.

       o   DEVSEASONAL rra-num is the index of the HWPREDICT RRA.

       o   FAILURES rra-num is the index of the	DEVSEASONAL RRA.

       threshold is the minimum number of violations (observed values outside
       the confidence bounds) within a window that constitutes a failure. If
       the FAILURES RRA is implicitly created, the default value is 7.

       window length is the number of time points in the window. Specify an
       integer greater than or equal to the threshold and less than or equal
       to 28.  The time interval this window represents depends on the
       interval between primary data points. If the FAILURES RRA is implicitly
       created, the default value is 9.
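
       The relationship between threshold and window length can be sketched
       in Python. The function name failure_flags and the input format (one
       true/false violation flag per observation) are assumptions made for
       this illustration; it only mirrors the description of the FAILURES
       RRA above and is not RRDtool's implementation.

```python
from collections import deque


def failure_flags(violations, threshold=7, window_length=9):
    """Return one 0/1 indicator per observation.

    An observation is flagged 1 when at least `threshold` of the last
    `window_length` observations (including itself) violated the
    confidence bounds. Defaults are the implicit-creation defaults.
    """
    window = deque(maxlen=window_length)  # old entries drop out automatically
    flags = []
    for violated in violations:
        window.append(1 if violated else 0)
        flags.append(1 if sum(window) >= threshold else 0)
    return flags


# Six violations, one normal value, then three more violations.
observations = [True] * 6 + [False] + [True] * 3
print(failure_flags(observations))
```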

The HEARTBEAT and the STEP
       Here is an explanation by Don Baarda on the inner workings of RRDtool.
       It may help you to sort out why all this *UNKNOWN* data is popping up
       in your databases:

       RRDtool gets fed samples/updates at arbitrary times. From these it
       builds Primary Data Points (PDPs) on every "step" interval. The PDPs
       are then accumulated into the RRAs.

       The "heartbeat" defines the maximum acceptable interval between
       samples/updates. If the interval between samples is less than
       "heartbeat", then an average rate is calculated and applied for that
       interval. If the interval between samples is longer than "heartbeat",
       then that entire interval is considered "unknown". Note that there are
       other things that can make a sample interval "unknown", such as the
       rate exceeding limits, or a sample that was explicitly marked as
       unknown.

       The known rates during a PDP's "step" interval are used to calculate an
       average rate for that PDP. If the total "unknown" time accounts for
       more than half the "step", the entire PDP is marked as "unknown". This
       means that a mixture of known and "unknown" sample times in a single
       PDP "step" may or may not add up to enough "known" time to warrant a
       known PDP.

       The "heartbeat" can be short (unusual) or long (typical) relative to
       the "step" interval between PDPs. A short "heartbeat" means you require
       multiple samples per PDP, and if you don't get them, the PDP is marked
       unknown. A long heartbeat can span multiple "steps", which means it is
       acceptable to have multiple PDPs calculated from a single sample. An
       extreme example of this might be a "step" of 5 minutes and a
       "heartbeat" of one day, in which case a single sample every day will
       result in all the PDPs for that entire day period being set to the same
       average rate. -- Don Baarda <don.baarda@baesystems.com>

	      u|02|----* sample1, restart "hb"-timer
	      u|03|   /
	      u|04|  /
	      u|05| /
	      u|06|/	 "hbt" expired
	       |08|----* sample2, restart "hb"
	       |09|   /
	       |10|  /
	      u|11|----* sample3, restart "hb"
	      u|12|   /
	      u|13|  /
	step1_u|14| /
	      u|15|/	 "swt" expired
	       |17|----* sample4, restart "hb",	create "pdp" for step1 =
               |18|   /  = unknown due to 10 "u" labeled secs > 0.5 * step
	       |19|  /
	       |20| /
	       |21|----* sample5, restart "hb"
	       |22|   /
	       |23|  /
	       |24|----* sample6, restart "hb"
	       |25|   /
	       |26|  /
	       |27|----* sample7, restart "hb"
	step2__|28|   /
               |29|  /
               |30|----* sample8, restart "hb", create "pdp" for step2, create "cdp"
               |31|   /
               |32|  /

       graphics	by
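
       The heartbeat and step rules explained above can also be written as a
       short Python sketch. The function names are hypothetical, the sample
       times are assumed to be sorted and to cover the whole step, and
       treating a gap exactly equal to the heartbeat as known is an
       assumption. Other causes of unknown intervals (rates outside min/max,
       samples explicitly marked unknown) are not modelled.

```python
def unknown_seconds_in_step(sample_times, step_start, step, heartbeat):
    """Seconds of the step [step_start, step_start + step) that fall into
    a gap between two consecutive samples longer than the heartbeat."""
    step_end = step_start + step
    unknown = 0
    for earlier, later in zip(sample_times, sample_times[1:]):
        if later - earlier > heartbeat:
            # The whole gap is unknown; count the part inside this step.
            overlap = min(later, step_end) - max(earlier, step_start)
            unknown += max(0, overlap)
    return unknown


def pdp_is_unknown(sample_times, step_start, step, heartbeat):
    """A PDP is unknown if more than half of its step is unknown."""
    unknown = unknown_seconds_in_step(sample_times, step_start, step, heartbeat)
    return unknown > step / 2


samples = [0, 100, 500, 600]  # update times in seconds
# Short heartbeat: the 400 second gap makes most of the first step unknown.
print(pdp_is_unknown(samples, step_start=0, step=300, heartbeat=200))
# Long heartbeat: the same samples give a known PDP.
print(pdp_is_unknown(samples, step_start=0, step=300, heartbeat=600))
```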

HOW TO MEASURE
       Here are a few hints on how to measure:

       Temperature
           Usually you have some type of meter you can read to get the
           temperature.  The temperature is not really connected with a time.
           The only connection is that the temperature reading happened at a
           certain time. You can use the GAUGE data source type for this.
           RRDtool will then record your reading together with the time.

       Mail Messages
           Assume you have a method to count the number of messages
           transported by your mailserver in a certain amount of time, giving
           you data like '5 messages in the last 65 seconds'. If you treat the
           count of 5 as an ABSOLUTE data type you can simply update the RRD
           with the number 5 and the end time of your monitoring period.
           RRDtool will then record the number of messages per second. If at
           some later stage you want to know the number of messages
           transported in a day, you can get the average messages per second
           from RRDtool for the day in question and multiply this number by
           the number of seconds in a day. Because all math is run with
           Doubles, the precision should be acceptable.

       It's always a Rate
           RRDtool stores rates in amount/second for COUNTER, DERIVE and
           ABSOLUTE data.  When you plot the data, you will get on the y axis
           amount/second which you might be tempted to convert to an absolute
           amount by multiplying by the delta-time between the points. RRDtool
           plots continuous data, and as such is not appropriate for plotting
           absolute amounts as for example "total bytes" sent and received in
           a router. What you probably want is to plot rates that you can
           scale to bytes/hour, for example, or plot absolute amounts with
           another tool that draws bar-plots, where the delta-time is clear on
           the plot for each point (such that when you read the graph you see
           for example GB on the y axis, days on the x axis and one bar for
           each day).
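
       The arithmetic from the Mail Messages hint above, written out as a
       small Python sketch. The function names are illustrative only. The
       daily total is just the stored average rate scaled back up, so it is
       an estimate that assumes the average is taken over the whole day.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def absolute_rate(count, interval_seconds):
    """Per-second rate stored for an ABSOLUTE reading of `count`
    events observed over `interval_seconds`."""
    return count / interval_seconds


def total_for_day(average_rate_per_second):
    """Scale a day's average per-second rate back to a daily total."""
    return average_rate_per_second * SECONDS_PER_DAY


rate = absolute_rate(5, 65)  # '5 messages in the last 65 seconds'
print(rate)                  # messages per second
print(total_for_day(rate))   # messages per day if that rate held all day
```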

EXAMPLE
        rrdtool create temperature.rrd --step 300 \
         DS:temp:GAUGE:600:-273:5000 \
         RRA:AVERAGE:0.5:1:1200 \
         RRA:MIN:0.5:12:2400 \
         RRA:MAX:0.5:12:2400 \
         RRA:AVERAGE:0.5:12:2400

       This sets up an RRD called temperature.rrd which accepts one
       temperature value every 300 seconds. If no new data is supplied for
       more than 600 seconds, the temperature becomes *UNKNOWN*.  The minimum
       acceptable value is -273 and the maximum is 5'000.

       A few archive areas are also defined. The first stores the temperatures
       supplied for 100 hours (1'200 * 300 seconds = 100 hours). The second
       RRA stores the minimum temperature recorded over every hour (12 * 300
       seconds = 1 hour), for 100 days (2'400 hours). The third and the fourth
       RRAs do the same for the maximum and average temperature, respectively.
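
       The time spans quoted above follow from step * steps * rows. A short
       Python check, with an illustrative helper name that is not part of
       RRDtool:

```python
def rra_span_seconds(step, steps, rows):
    """Time covered by an RRA: each row consolidates `steps` primary
    data points, and each primary data point covers `step` seconds."""
    return step * steps * rows


STEP = 300  # seconds, from --step 300

first = rra_span_seconds(STEP, steps=1, rows=1200)
hourly = rra_span_seconds(STEP, steps=12, rows=2400)

print(first / 3600)           # span of the first RRA in hours
print(STEP * 12 / 3600)       # length of one consolidated row in hours
print(hourly / (24 * 3600))   # span of the 12-step RRAs in days
```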

EXAMPLE 2
        rrdtool create monitor.rrd --step 300        \
          DS:ifOutOctets:COUNTER:1800:0:4294967295   \
          RRA:AVERAGE:0.5:1:2016                     \
          RRA:HWPREDICT:1440:0.1:0.0035:288

       This example is a monitor of a router interface. The first RRA tracks
       the traffic flow in octets; the second RRA generates the specialized
       function RRAs for aberrant behavior detection. Note that the rra-num
       argument of HWPREDICT is missing, so the other RRAs will implicitly be
       created with default parameter values. In this example, the forecasting
       algorithm baseline adapts quickly; in fact the most recent one hour of
       observations (each at 5 minute intervals) accounts for 75% of the
       baseline prediction. The linear trend forecast adapts much more slowly.
       Observations made during the last day (at 288 observations per day)
       account for only 65% of the predicted linear trend. Note: these
       computations rely on an exponential smoothing formula described in the
       LISA 2000 paper.
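
       The percentages above can be approximated with the usual weighting of
       exponential smoothing, in which the most recent n observations
       together carry a share of 1 - (1 - a)^n for a smoothing parameter a.
       That this is the exact formula used in the paper is an assumption;
       the Python sketch below only shows that it gives roughly 72% and 64%
       for this example, close to the quoted figures.

```python
def weight_of_recent(smoothing_parameter, n):
    """Share of an exponentially smoothed estimate that comes from the
    n most recent observations (standard exponential smoothing)."""
    return 1 - (1 - smoothing_parameter) ** n


# alpha = 0.1, one hour = 12 observations at 5 minute intervals
baseline_share = weight_of_recent(0.1, 12)
# beta = 0.0035, one day = 288 observations
trend_share = weight_of_recent(0.0035, 288)

print(round(baseline_share * 100, 1))
print(round(trend_share * 100, 1))
```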

       The seasonal cycle is one day (288 data points at 300 second
       intervals), and the seasonal adaption parameter will be set to 0.1. The
       RRD file will store 5 days (1'440 data points) of forecasts and
       deviation predictions before wrap around. The file will store 1 day (a
       seasonal cycle) of 0-1 indicators in the FAILURES RRA.

       The same	RRD file and RRAs are created with the following command,
       which explicitly	creates	all specialized	function RRAs.

        rrdtool create monitor.rrd --step 300 \
          DS:ifOutOctets:COUNTER:1800:0:4294967295 \
          RRA:AVERAGE:0.5:1:2016 \
          RRA:HWPREDICT:1440:0.1:0.0035:288:3 \
          RRA:SEASONAL:288:0.1:2 \
          RRA:DEVPREDICT:1440:5 \
          RRA:DEVSEASONAL:288:0.1:2 \
          RRA:FAILURES:288:7:9:5

       Of course, explicit creation need not replicate implicit creation; a
       number of arguments could be changed.
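
       When RRAs are created explicitly, the rra-num links are easy to get
       wrong. The following Python sketch checks them against the dependency
       list given earlier in this page. The function check_rra_links is a
       hypothetical helper, not an RRDtool feature, and it assumes every
       specialized RRA carries an explicit rra-num as its last field. The
       FAILURES entry uses the documented implicit defaults (rows equal to
       the seasonal period, threshold 7, window length 9).

```python
# Which kind of RRA each rra-num must point at, per the list in this page.
EXPECTED_TARGET = {
    "HWPREDICT": "SEASONAL",
    "SEASONAL": "HWPREDICT",
    "DEVPREDICT": "DEVSEASONAL",
    "DEVSEASONAL": "HWPREDICT",
    "FAILURES": "DEVSEASONAL",
}


def check_rra_links(rra_specs):
    """Return a list of problems with the rra-num links (empty if none).

    `rra_specs` are the RRA arguments in the order given to create.
    """
    kinds = [spec.split(":")[1] for spec in rra_specs]
    problems = []
    for position, (spec, kind) in enumerate(zip(rra_specs, kinds), start=1):
        if kind not in EXPECTED_TARGET:
            continue  # AVERAGE, MIN, MAX, LAST carry no rra-num
        rra_num = int(spec.split(":")[-1])  # 1-based index, last field
        if not 1 <= rra_num <= len(kinds):
            problems.append(
                f"RRA {position} ({kind}): rra-num {rra_num} is out of range"
            )
        elif kinds[rra_num - 1] != EXPECTED_TARGET[kind]:
            problems.append(
                f"RRA {position} ({kind}): rra-num {rra_num} points at "
                f"{kinds[rra_num - 1]}, expected {EXPECTED_TARGET[kind]}"
            )
    return problems


EXPLICIT_EXAMPLE = [
    "RRA:AVERAGE:0.5:1:2016",
    "RRA:HWPREDICT:1440:0.1:0.0035:288:3",
    "RRA:SEASONAL:288:0.1:2",
    "RRA:DEVPREDICT:1440:5",
    "RRA:DEVSEASONAL:288:0.1:2",
    "RRA:FAILURES:288:7:9:5",
]

print(check_rra_links(EXPLICIT_EXAMPLE))
```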

EXAMPLE 3
        rrdtool create proxy.rrd --step 300 \
          DS:Requests:DERIVE:1800:0:U  \
          DS:Duration:DERIVE:1800:0:U  \
          DS:AvgReqDur:COMPUTE:Duration,Requests,0,EQ,1,Requests,IF,/ \
          RRA:AVERAGE:0.5:1:2016

       This example is monitoring the average request duration during each 300
       sec interval for requests processed by a web proxy during the interval.
       In this case, the proxy exposes two counters, the number of requests
       processed since boot and the total cumulative duration of all processed
       requests. Clearly these counters both have some rollover point, but
       using the DERIVE data source also handles the reset that occurs when
       the web proxy is stopped and restarted.

       In the RRD, the first data source stores the requests per second rate
       during the interval. The second data source stores the total duration
       of all requests processed during the interval divided by 300. The
       COMPUTE data source divides each PDP of the Duration data source by the
       corresponding PDP of the Requests data source and stores the average
       request duration. The remainder of the RPN expression handles the
       divide by zero case.
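
       To see how the RPN expression avoids the division by zero, it can be
       traced with a very small stack evaluator. The Python sketch below is
       an illustration written for this page: it supports only the three
       operators used in the expression (EQ, IF and /) and is not RRDtool's
       RPN engine.

```python
def eval_rpn(expression, values):
    """Evaluate a comma-separated RPN expression.

    `values` maps data source names to the PDP values to substitute.
    Only EQ, IF and / are supported in this sketch.
    """
    stack = []
    for token in expression.split(","):
        if token == "EQ":
            b, a = stack.pop(), stack.pop()
            stack.append(1.0 if a == b else 0.0)
        elif token == "IF":
            # A,B,C,IF yields B if A is non-zero, else C.
            if_false, if_true, condition = stack.pop(), stack.pop(), stack.pop()
            stack.append(if_true if condition != 0 else if_false)
        elif token == "/":
            b, a = stack.pop(), stack.pop()
            stack.append(a / b)
        elif token in values:
            stack.append(float(values[token]))
        else:
            stack.append(float(token))  # numeric literal
    return stack.pop()


EXPRESSION = "Duration,Requests,0,EQ,1,Requests,IF,/"

# 2 requests per second taking 0.6 seconds of duration per second in total.
print(eval_rpn(EXPRESSION, {"Duration": 0.6, "Requests": 2.0}))
# No requests at all: the divisor is replaced by 1, so the result is 0.
print(eval_rpn(EXPRESSION, {"Duration": 0.0, "Requests": 0.0}))
```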

AUTHOR
       Tobias Oetiker

1.2.30				  2009-01-19			  RRDCREATE(1)
