Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
rwcount(1)			SiLK Tool Suite			    rwcount(1)

NAME
       rwcount - Print traffic summary across time

SYNOPSIS
	 rwcount [--bin-size=SIZE] [--load-scheme=LOADSCHEME]
	       [--start-time=START_TIME] [--end-time=END_TIME]
	       [--skip-zeroes] [--bin-slots] [--epoch-slots]
	       [--timestamp-format=FORMAT] [--no-titles]
	       [--no-columns] [--column-separator=CHAR]
	       [--no-final-delimiter] [{--delimited | --delimited=CHAR}]
	       [--print-filenames] [--copy-input=PATH] [--output-path=PATH]
	       [--pager=PAGER_PROG] [--site-config-file=FILENAME]
	       [{--legacy-timestamps | --legacy-timestamps={1,0}}]
	       {[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}

	 rwcount --help

	 rwcount --version

DESCRIPTION
       rwcount summarizes SiLK flow records across time.  It counts the
       records in the input stream, and	groups their byte and packet totals
       into time bins.	rwcount	produces textual output	with one row for each
       bin.

       rwcount reads SiLK Flow records from the	files named on the command
       line or from the	standard input when no file names are specified	and
       --xargs is not present.	To read	the standard input in addition to the
       named files, use	"-" or "stdin" as a file name.	If an input file name
       ends in ".gz", the file is uncompressed as it is	read.  When the
       --xargs switch is provided, rwcount reads the names of the files	to
       process from the	named text file	or from	the standard input if no file
       name argument is	provided to the	switch.	 The input to --xargs must
       contain one file	name per line.

       rwcount splits each flow	record into bins whose size is determined by
       the argument to the --bin-size switch.  When that switch	is not
       provided, rwcount uses 30-second	bins by	default.

       By default, the first row of data rwcount prints	is the bin containing
       the starting time of the	earliest record	that appears in	the input.
       rwcount then prints a row for every bin until it	reaches	the bin
       containing the most recent ending time.	Rows whose counts are zero are
       printed unless the --skip-zero switch is	specified.

       The --start-time	and --end-time switches	tell rwcount to	use a specific
       time for	the first row and the final row.  The --start-time switch
       always sets the time stamp on the first bin to the specified time.
       With the	--end-time switch, rwcount computes a maximum end-time by
       setting any unspecified hour, minute, second, and millisecond field to
       its maximum value, and the final	bin is that which contains the maximum
       end-time.

       When --start-time and --end-time	are both specified, rwcount reserves
       the memory for the bins before it begins	processing the records.	 If
       the memory cannot be allocated, rwcount exits.  If this happens,	try
       reducing	the time span or increasing the	bin-size.

   Load	Scheme
       A router	or other flow generator	summarizes the traffic it sees into
       records.	 In addition to	the five-tuple (source port and	address,
       destination port	and address, and protocol), the	record has its start
       time, end time, total byte count, and total packet count.  There	is no
       way to know how the bytes and packets were distributed during the
       duration	of the record: their distribution could	be front-loaded, back-
       loaded, uniform,	et cetera.

       When the	start and end times of a individual flow record	put that
       record into a single bin, rwcount can simply add	that record's volume
       (byte and packet	counts)	to the bin.

       When the	duration of a flow record causes it to span multiple bins,
       rwcount must to told how	to allocate the	volume among the bins.	The
       --load-scheme switch determines this, and it has	supports the following
       allocation schemes:

       time-proportional
	   Divides the total volume of the flow	by the duration	of the flow,
	   and multiplies the quotient by the time spent in the	bin.  Thus,
	   the volume the flow contributes to a	bin is proportional to the
	   time	the flow spent in the bin.  This models	a flow where the
	   volume/second ratio is uniform.

       bin-uniform
	   Divides the volume of the flow by the number	of bins	the flow
	   spans, and adds the quotient	to each	of the bins.  In this scheme,
	   the volume/bin ratio	is uniform.

       start-spike
	   Adds	the total volume for the flow into the bin containing the
	   start time of the flow.  This models	a flow that is front-loaded to
	   the point where the entire volume is	a single spike occurring in
	   the initial millisecond of flow.

       middle-spike
	   Determines the time at the midpoint of the flow, and	adds the
	   entire volume for the flow into the bin containing that time.

       end-spike
	   Adds	the total volume for the flow into the bin containing the end
	   time	of the flow.  This models a flow that is back-loaded to	the
	   point where the entire volume is a single spike occurring in	final
	   millisecond of the flow.

       maximum-volume
	   Adds	the entire volume for the flow into every bin that contains
	   any part of the flow.  In theory, the distribution of the bytes in
	   the record could be a spike that occurs at any point	during the
	   flow's duration.  This scheme allows	one to determine, in
	   aggregate, the maximum possible volume that could have occurred
	   during this bin.  In	this scheme, the "Records" column gives	the
	   number of records that were active during the bin.

       minimum-volume
	   Acts	as though the volume for the flow occurred in some other bin.
	   It is possible that a record	that spans multiple bins did not
	   contribute any volume to the	current	bin.  This scheme allows one
	   to determine, in aggregate, the minimum possible volume that	may
	   have	occurred during	this bin.  The "Records" column	in this
	   scheme, as in the "maximum-volume" scheme, gives the	number of flow
	   records that	were active during the bin.

       Be aware	that the "spike" load-schemes allocate the entire flow to a
       single bin. This	can create the impression that there is	more traffic
       occurring during	a particular time window that the physical network
       supports.

       The "maximum-volume" and	"minimum-volume" schemes are used to compute
       the maximum and minimum volumes that could have been transferred	during
       any one bin.  "maximum-volume" intentionally over-counts	the flow
       volume and "minimum-volume" intentionally under-counts.

       To see the effect of the	various	load-schemes, suppose rwcount is using
       60-second bins and the input contains two records.  The first record
       begins at 12:03:50, ends	at 12:06:20, and contains 9,000	bytes (60
       bytes/second for	150 seconds).  This record may contribute to bins at
       12:03, 12:04, 12:05, and	12:06.	The second record begins at 12:04:05
       and lasts 15 seconds; this record's volume always contributes its 200
       bytes to	the 12:04 bin.	The --load-scheme option splits	the byte-
       counts of the records as	follows:

	BIN		    12:03:00	12:04:00    12:05:00	12:06:00

	time-proportional	 600	    3800	3600	    1200
	bin-uniform		2250	    2450	2250	    2250
	start-spike		9000	     200	   0	       0
	middle-spike		   0	     200	9000	       0
	end-spike		   0	     200	   0	    9000
	maximum-volume		9000	    9200	9000	    9000
	minimum-volume		   0	     200	   0	       0

       For the record that spans multiple bins:	the "time-proportional"	scheme
       assumes 60 bytes/second,	the "bin-uniform" scheme divides the volume
       evenly by the four bins,	the "middle-spike" scheme assumes all the
       volume occurs at	12:05:05, the "maximum-volume" scheme adds the volume
       to every	bin, and the "minimum-volume" scheme ignores the record.

OPTIONS
       Option names may	be abbreviated if the abbreviation is unique or	is an
       exact match for an option.  A parameter to an option may	be specified
       as --arg=param or --arg param, though the first form is required	for
       options that take optional parameters.

       --bin-size=SIZE
	   Denote the size of each time	bin, in	seconds; defaults to 30
	   seconds.  rwcount supports millisecond size bins; SIZE may be a
	   floating point value	equal to or greater than than 0.001.

       --load-scheme=LOADSCHEME
	   Specify how a flow record that spans	multiple bins allocates	its
	   bytes and packets among the bins.  The default scheme is
	   "time-proportional",	which assumes the volume/second	ratio of the
	   flow	record is constant.  See the "Load Scheme" section for
	   additional information on the load-scheme choices.  The LOADSCHEME
	   may be one of the following names or	numbers; names may be
	   abbreviated to the shortest prefix that is unique.

	   time-proportional,4
	       Allocate	the volume in proportion to the	amount of time the
	       flow spent in the bin.

	   bin-uniform,0
	       Allocate	the volume evenly across the bins that contain any
	       part of the flow's duration.

	   start-spike,1
	       Allocate	the entire volume to the bin containing	the start time
	       of the flow.

	   middle-spike,3
	       Allocate	the entire volume to the bin containing	the time at
	       the midpoint of the flow.

	   end-spike,2
	       Allocate	the entire volume to the bin containing	the end	time
	       of the flow.

	   maximum-volume,5
	       Allocate	the entire volume to all of the	bins containing	any
	       part of the flow.

	   minimum-volume,6
	       Allocate	the flow's volume to a bin only	if the flow is
	       completely contained within the bin; otherwise ignore the flow.

       --start-time=START_TIME
	   Set the time	of the first bin to START_TIME.	 When this switch is
	   not given, the first	bin is one that	holds the starting time	of the
	   earliest record.  The START_TIME may	be specified in	a format of
	   "yyyy/mm/dd[:HH[:MM[:SS[.sss]]]]" (or "T" may be used in place of
	   ":" to separate the day and hour).  The time	must be	specified to
	   at least day	precision, and unspecified hour, minute, second, and
	   millisecond values are set to zero.	Whether	the date strings
	   represent times in UTC or the local timezone	depend on how SiLK was
	   compiled, which can be determined from the "Timezone	support"
	   setting in the output from rwcount --version.  Alternatively, the
	   time	may be specified as seconds since the UNIX epoch, and an
	   unspecified milliseconds value is set to 0.

       --end-time=END_TIME
	   Set the time	of the final bin to END_TIME.  When this switch	is not
	   given, the final bin	is one that holds the ending time of the
	   latest record.  The format of END_TIME is the same as that for
	   START_TIME.	Unspecified hour, minute, second, and millisecond
	   values are set to 23, 59, 59, and 999 respectively.	When END_TIME
	   is specified	as seconds since the UNIX epoch, an unspecified
	   milliseconds	value is set to	999.  When both	--start-time and
	   --end-time are used,	the END_TIME is	adjusted so that the final bin
	   represents a	complete interval.

       --skip-zeroes
	   Disable printing of bins with no traffic.  By default, all bins are
	   printed.

       --bin-slots
	   Use the internal bin	index as the label for each bin	in the output;
	   the default is to label each	bin with the time in a human-readable
	   format.

       --epoch-slots
	   Use the UNIX	epoch time (number of seconds since midnight UTC on
	   1970-01-01) as the label for	each bin in the	output;	the default is
	   to label each bin with the time in a	human-readable format.	This
	   switch is equivalent	to --timestamp-format=epoch.  This switch is
	   deprecated as of SiLK 3.11.0, and it	will be	removed	in the SiLK
	   4.0 release.

       --timestamp-format=FORMAT
	   Specify the format and/or timezone to use when printing timestamps.
	   When	this switch is not specified, the SILK_TIMESTAMP_FORMAT
	   environment variable	is checked for a default format	and/or
	   timezone.  If it is empty or	contains invalid values, timestamps
	   are printed in the default format, and the timezone is UTC unless
	   SiLK	was compiled with local	timezone support.  FORMAT is a comma-
	   separated list of a format and/or a timezone.  The format is	one
	   of:

	   default
	       Print the timestamps as "YYYY/MM/DDThh:mm:ss".

	   iso Print the timestamps as "YYYY-MM-DD hh:mm:ss".

	   m/d/y
	       Print the timestamps as "MM/DD/YYYY hh:mm:ss".

	   epoch
	       Print the timestamps as the number of seconds since 00:00:00
	       UTC on 1970-01-01.

	   When	a timezone is specified, it is used regardless of the default
	   timezone support compiled into SiLK.	 The timezone is one of:

	   utc Use Coordinated Universal Time to print timestamps.

	   local
	       Use the TZ environment variable or the local timezone.

       --no-titles
	   Turn	off column titles.  By default,	titles are printed.

       --no-columns
	   Disable fixed-width columnar	output.

       --column-separator=C
	   Use specified character between columns and after the final column.
	   When	this switch is not specified, the default of '|' is used.

       --no-final-delimiter
	   Do not print	the column separator after the final column.  Normally
	   a delimiter is printed.

       --delimited
       --delimited=C
	   Run as if --no-columns --no-final-delimiter --column-sep=C had been
	   specified.  That is,	disable	fixed-width columnar output; if
	   character C is provided, it is used as the delimiter	between
	   columns instead of the default '|'.

       --print-filenames
	   Print to the	standard error the names of input files	as they	are
	   opened.

       --copy-input=PATH
	   Copy	all binary SiLK	Flow records read as input to the specified
	   file	or named pipe.	PATH may be "stdout" or	"-" to write flows to
	   the standard	output as long as the --output-path switch is
	   specified to	redirect rwcount's textual output to a different
	   location.

       --output-path=PATH
	   Write the textual output to PATH, where PATH	is a filename, a named
	   pipe, the keyword "stderr" to write the output to the standard
	   error, or the keyword "stdout" or "-" to write the output to	the
	   standard output (and	bypass the paging program).  If	PATH names an
	   existing file, rwcount exits	with an	error unless the SILK_CLOBBER
	   environment variable	is set,	in which case PATH is overwritten.  If
	   this	switch is not given, the output	is either sent to the pager or
	   written to the standard output.

       --pager=PAGER_PROG
	   When	output is to a terminal, invoke	the program PAGER_PROG to view
	   the output one screen full at a time.  This switch overrides	the
	   SILK_PAGER environment variable, which in turn overrides the	PAGER
	   variable.  If the --output-path switch is given or if the value of
	   the pager is	determined to be the empty string, no paging is
	   performed and all output is written to the terminal.

       --site-config-file=FILENAME
	   Read	the SiLK site configuration from the named file	FILENAME.
	   When	this switch is not provided, rwcount searches for the site
	   configuration file in the locations specified in the	"FILES"
	   section.

       --legacy-timestamps
       --legacy-timestamps=NUM
	   When	NUM is not specified or	is 1, this switch is equivalent	to
	   --timestamp-format=m/d/y.  Otherwise, the switch has	no effect.
	   This	switch is deprecated as	of SiLK	3.0.0, and it will be removed
	   in the SiLK 4.0 release.

       --xargs
       --xargs=FILENAME
	   Read	the names of the input files from FILENAME or from the
	   standard input if FILENAME is not provided.	The input is expected
	   to have one filename	per line.  rwcount opens each named file in
	   turn	and reads records from it as if	the filenames had been listed
	   on the command line.

       --help
	   Print the available options and exit.

       --version
	   Print the version number and	information about how SiLK was
	   configured, then exit the application.

       --start-epoch=START_TIME
	   Alias the --start-time switch.  This	switch is deprecated as	of
	   SiLK	3.8.0.

       --end-epoch=START_TIME
	   Alias the --end-time	switch.	 This switch is	deprecated as of SiLK
	   3.8.0.

EXAMPLES
       In the following	examples, the dollar sign ("$")	represents the shell
       prompt.	The text after the dollar sign represents the command line.
       Lines have been wrapped for improved readability, and the back slash
       ("\") is	used to	indicate a wrapped line.

       To count	all web	traffic	on Feb 12, 2009, into 1	hour bins:

	$ rwfilter --pass=stdout --start-date=2009/02/12:00	   \
	       --end-date=2009/02/12:23	--proto=6 --aport=80	   \
	  | rwcount --bin-size=3600
		       Date|	  Records|	    Bytes|	Packets|
	2009/02/12T00:00:00|	  1490.49|   578270918.16|    463951.55|
	2009/02/12T01:00:00|	  1459.33|   596455716.52|    457487.80|
	2009/02/12T02:00:00|	  1529.06|   562602842.44|    451456.41|
	2009/02/12T03:00:00|	  1503.89|   562683116.38|    455554.81|
	2009/02/12T04:00:00|	  1561.89|   590554569.78|    489273.81|
	....

       To bin the records according to their start times, use the
       --load-scheme switch:

	$ rwfilter ... --pass=stdout	   \
	  | rwcount --bin-size=3600 --load-scheme=1
		       Date|	  Records|	    Bytes|	Packets|
	2009/02/12T00:00:00|	  1494.00|   580350969.00|    464952.00|
	2009/02/12T01:00:00|	  1462.00|   596145212.00|    457871.00|
	2009/02/12T02:00:00|	  1526.00|   561629416.00|    451088.00|
	2009/02/12T03:00:00|	  1502.00|   563500618.00|    455262.00|
	2009/02/12T04:00:00|	  1562.00|   589265818.00|    489279.00|
	...

       To bin the records by their end times:
	$ rwfilter ... --pass=stdout	   \
	  | rwcount --bin-size=3600 --load-scheme=2
		       Date|	  Records|	    Bytes|	Packets|
	2009/02/12T00:00:00|	  1488.00|   577132372.00|    463393.00|
	2009/02/12T01:00:00|	  1458.00|   596956697.00|    457376.00|
	2009/02/12T02:00:00|	  1530.00|   562806395.00|    451551.00|
	2009/02/12T03:00:00|	  1506.00|   562101791.00|    455671.00|
	2009/02/12T04:00:00|	  1562.00|   591408602.00|    489371.00|
	...

       To force	the hourly bins	to run from 30 minutes past the	hour, use the
       --start-time switch:

	$ rwfilter ... --pass=stdout	   \
	  | rwcount --bin-size=3600 --start-time=2002/12/31:23:30
		       Date|	  Records|	    Bytes|	Packets|
	2009/02/12T00:30:00|	  1483.26|   581251364.04|    456554.40|
	2009/02/12T01:30:00|	  1494.00|   575037453.00|    449280.00|
	2009/02/12T02:30:00|	  1486.36|   559700466.61|    447700.15|
	2009/02/12T03:30:00|	  1555.23|   588882400.58|    480724.48|
	2009/02/12T04:30:00|	  1537.79|   564756248.52|    472003.45|
	...

ENVIRONMENT
       SILK_TIMESTAMP_FORMAT
	   This	environment variable is	used as	the value for
	   --timestamp-format when that	switch is not provided.	 Since SiLK
	   3.11.0.

       SILK_PAGER
	   When	set to a non-empty string, rwcount automatically invokes this
	   program to display its output a screen at a time.  If set to	an
	   empty string, rwcount does not automatically	page its output.

       PAGER
	   When	set and	SILK_PAGER is not set, rwcount automatically invokes
	   this	program	to display its output a	screen at a time.

       SILK_CLOBBER
	   The SiLK tools normally refuse to overwrite existing	files.
	   Setting SILK_CLOBBER	to a non-empty value removes this restriction.

       SILK_CONFIG_FILE
	   This	environment variable is	used as	the value for the
	   --site-config-file when that	switch is not provided.

       SILK_DATA_ROOTDIR
	   This	environment variable specifies the root	directory of data
	   repository.	As described in	the "FILES" section, rwcount may use
	   this	environment variable when searching for	the SiLK site
	   configuration file.

       SILK_PATH
	   This	environment variable gives the root of the install tree.  When
	   searching for configuration files, rwcount may use this environment
	   variable.  See the "FILES" section for details.

       TZ  When	the argument to	the --timestamp-format switch includes "local"
	   or when a SiLK installation is built	to use the local timezone, the
	   value of the	TZ environment variable	determines the timezone	in
	   which rwcount displays timestamps.  (If both	of those are false,
	   the TZ environment variable is ignored.)  If	the TZ environment
	   variable is not set,	the machine's default timezone is used.
	   Setting TZ to the empty string or 0 causes timestamps to be
	   displayed in	UTC.  For system information on	the TZ variable, see
	   tzset(3) or environ(7).  (To	determine if SiLK was built with
	   support for the local timezone, check the "Timezone support"	value
	   in the output of rwcount --version.)	 The TZ	environment variable
	   is also used	when rwcount parses the	timestamp specified in the
	   --start-time	or --end-time switches if SiLK is built	with local
	   timezone support.

FILES
       ${SILK_CONFIG_FILE}
       ${SILK_DATA_ROOTDIR}/silk.conf
       /data/silk.conf
       ${SILK_PATH}/share/silk/silk.conf
       ${SILK_PATH}/share/silk.conf
       /usr/local/share/silk/silk.conf
       /usr/local/share/silk.conf
	   Possible locations for the SiLK site	configuration file which are
	   checked when	the --site-config-file switch is not provided.

SEE ALSO
       rwfilter(1), rwuniq(1), silk(7),	tzset(3), environ(7)

BUGS
       Unlike rwuniq(1), rwcount does not support counting the number of
       distinct	IPs in a bin.  However,	using the --bin-time switch on rwuniq
       can provide time-based binning similar to what rwcount supports.	 Note
       that rwuniq always bins by the each record's start-time (similar	to
       rwcount --load-factor=1), and there is no support in rwuniq for
       dividing	a SiLK record among multiple time bins.

SiLK 3.15.0			  2017-07-02			    rwcount(1)

NAME | SYNOPSIS | DESCRIPTION | OPTIONS | EXAMPLES | ENVIRONMENT | FILES | SEE ALSO | BUGS

Want to link to this manual page? Use this URL:
<https://www.freebsd.org/cgi/man.cgi?query=rwcount&sektion=1&manpath=FreeBSD+12.1-RELEASE+and+Ports>

home | help