rwgroup(1)			SiLK Tool Suite			    rwgroup(1)

NAME
       rwgroup - Tag similar SiLK records with a common	next hop IP value

SYNOPSIS
	 rwgroup
	       {--id-fields=KEY	| --delta-field=FIELD --delta-value=DELTA}
	       [--objective] [--summarize] [--rec-threshold=THRESHOLD]
	       [--group-offset=IP]
	       [--note-add=TEXT] [--note-file-add=FILE]	[--output-path=PATH]
	       [--copy-input=PATH] [--compression-method=COMP_METHOD]
	       [--site-config-file=FILENAME]
	       [--plugin=PLUGIN	[--plugin=PLUGIN ...]]
	       [--python-file=PATH [--python-file=PATH ...]]
	       [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
	       [FILE]

	 rwgroup [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
	       [--plugin=PLUGIN	...] [--python-file=PATH ...] --help

	 rwgroup [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
	       [--plugin=PLUGIN	...] [--python-file=PATH ...] --help-fields

	 rwgroup --version

DESCRIPTION
       rwgroup reads sorted SiLK Flow records (cf. rwsort(1)) from the
       standard	input or from a	single file name listed	on the command line,
       marks records that form a group with an identifier in the Next Hop IP
       field, and prints the binary SiLK Flow records to the standard output.
       In some ways rwgroup is similar to rwuniq(1), but rwgroup writes	SiLK
       flow records instead of textual output.

       Two SiLK records are defined as being in the same group when the fields
       specified in the --id-fields switch match exactly and when the field
       named by the --delta-field switch differs by no more than the value
       given by the --delta-value switch.  Either --id-fields or --delta-field
       is required; both may be specified.  A --delta-value must be given when
       --delta-field is present.

       The first group of records gets the identifier 0, and rwgroup writes
       that value into each record's Next Hop IP field.	 The ID	for each
       subsequent group	is incremented by 1.  The --group-offset switch	may be
       used to set the identifier of the initial group.
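
The grouping rule and ID numbering described above can be modeled in a few
lines of Python.  This is only an illustrative sketch, not rwgroup's
implementation: records are plain dictionaries, and the names assign_groups,
id_fields, delta_field, and offset are hypothetical.

```python
def assign_groups(records, id_fields, delta_field=None, delta_value=0, offset=0):
    """Return one group ID per record; records must already be sorted."""
    ids = []
    group_id = offset - 1  # the first record always opens a new group
    prev = None
    for rec in records:
        # id-fields must match the previous record exactly
        same = prev is not None and all(rec[f] == prev[f] for f in id_fields)
        # the delta-field may differ by at most delta_value
        if same and delta_field is not None:
            same = abs(rec[delta_field] - prev[delta_field]) <= delta_value
        if not same:
            group_id += 1
        ids.append(group_id)
        prev = rec
    return ids


records = [
    {"dIP": "10.0.0.1", "sTime": 100},
    {"dIP": "10.0.0.1", "sTime": 102},
    {"dIP": "10.0.0.1", "sTime": 110},
    {"dIP": "10.0.0.2", "sTime": 111},
]
group_ids = assign_groups(records, ["dIP"], "sTime", 3)
print(group_ids)  # [0, 0, 1, 2]
```

The third record opens a new group because its sTime is more than 3 seconds
after the preceding record; the fourth opens another because its dIP differs.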

       The --rec-threshold switch may be used to only write groups that
       contain a certain number	of records.  The --summarize switch attempts
       to merge	records	in the same group to a single output record.

       rwgroup requires that the records are sorted on the fields listed in
       the --id-fields and --delta-field switches.  For example, a call using

         rwgroup --id-fields=2 --delta-field=9 --delta-value=3

       should read the output of

         rwsort --fields=2,9

       otherwise, the results are unpredictable.

OPTIONS
       Option names may	be abbreviated if the abbreviation is unique or	is an
       exact match for an option.  A parameter to an option may	be specified
       as --arg=param or --arg param, though the first form is required	for
       options that take optional parameters.

       At least one value for --id-fields or --delta-field must be provided;
       rwgroup terminates with an error	if no fields are specified.

       --id-fields=KEY
	   KEY contains	the list of flow attributes (a.k.a. fields or columns)
	   that	must match exactly for flows to	be considered part of the same
           group.  Each field may be specified only once.  KEY is a
           comma-separated list of field-names, field-integers, and ranges of
           field-integers; a range is specified by separating the start and
           end of the range with a hyphen (-).  Field-names are case
           insensitive.
	   Example:

	    --id-fields=stime,10,1-5

	   There is no default value for the --id-fields switch.
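
The KEY syntax described above (names, integers, and hyphenated integer
ranges, each field at most once) can be sketched as a small parser.  This is
an assumption-laden illustration rather than SiLK's own parser: parse_key and
FIELD_NUMBERS are hypothetical names, and the table covers only a subset of
the built-in fields.

```python
# Hypothetical lookup table: a subset of the built-in field names, lowercased.
FIELD_NUMBERS = {
    "sip": 1, "dip": 2, "sport": 3, "dport": 4, "protocol": 5,
    "packets": 6, "pkts": 6, "bytes": 7, "flags": 8, "stime": 9,
    "duration": 10, "etime": 11, "sensor": 12,
}


def parse_key(key):
    """Expand a KEY string into a list of field integers, in order."""
    fields = []
    for token in key.split(","):
        token = token.strip()
        parts = token.split("-", 1)
        if token.isdigit():
            numbers = [int(token)]
        elif len(parts) == 2 and all(p.isdigit() for p in parts):
            start, end = int(parts[0]), int(parts[1])
            numbers = list(range(start, end + 1))
        else:
            numbers = [FIELD_NUMBERS[token.lower()]]  # case insensitive
        for number in numbers:
            if number in fields:
                raise ValueError("field %d specified more than once" % number)
            fields.append(number)
    return fields


print(parse_key("stime,10,1-5"))  # [9, 10, 1, 2, 3, 4, 5]
```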

	   The complete	list of	built-in fields	that the SiLK tool suite
	   supports follows, though note that not all fields are present in
	   all SiLK file formats; when a field is not present, its value is 0.

	   sIP,1
	       source IP address

	   dIP,2
	       destination IP address

	   sPort,3
	       source port for TCP and UDP, or equivalent

	   dPort,4
	       destination port	for TCP	and UDP, or equivalent

	   protocol,5
	       IP protocol

	   packets,pkts,6
	       packet count

	   bytes,7
	       byte count

	   flags,8
	       bit-wise	OR of TCP flags	over all packets

	   sTime,9
	       starting	time of	flow (seconds resolution)

	   duration,10
	       duration	of flow	(seconds resolution)

	   eTime,11
	       end time	of flow	(seconds resolution)

	   sensor,12
	       name or ID of sensor at the collection point

	   class,20
	       class of	sensor at the collection point

	   type,21
	       type of sensor at the collection	point

	   iType
               the ICMP type value for ICMP or ICMPv6 flows and zero for
               non-ICMP flows.  Internally, SiLK stores the ICMP type and code
               in the "dPort" field, so there is no need to have both "dPort"
               and "iType" or "iCode" in the sort key.  This field was
               introduced in SiLK 3.8.1.

	   iCode
	       the ICMP	code value for ICMP or ICMPv6 flows and	zero for non-
	       ICMP flows.  See	note at	"iType".

	   icmpTypeCode,25
	       equivalent to "iType","iCode" in	--id-fields.  This field may
	       not be mixed with "iType" or "iCode", and this field is
	       deprecated as of	SiLK 3.8.1.  As	of SiLK	3.8.1, "icmpTypeCode"
	       may no longer be	used as	the argument to	--delta-field; the
	       "dPort" field will provide an equivalent	result as long as the
	       input is	limited	to ICMP	flow records.

	   Many	SiLK file formats do not store the following fields and	their
	   values will always be 0; they are listed here for completeness:

	   in,13
	       router SNMP input interface or vlanId if	packing	tools were
	       configured to capture it	(see sensor.conf(5))

	   out,14
	       router SNMP output interface or postVlanId

	   SiLK	can store flows	generated by enhanced collection software that
	   provides more information than NetFlow v5.  These flows may support
	   some	or all of these	additional fields; for flows without this
	   additional information, the field's value is	always 0.

	   initialFlags,26
	       TCP flags on first packet in the	flow

	   sessionFlags,27
	       bit-wise	OR of TCP flags	over all packets except	the first in
	       the flow

	   attributes,28
	       flow attributes set by the flow generator:

	       "S" all the packets in this flow	record are exactly the same
		   size

	       "F" flow	generator saw additional packets in this flow
		   following a packet with a FIN flag (excluding ACK packets)

	       "T" flow	generator prematurely created a	record for a long-
		   running connection due to a timeout.	 (When the flow
		   generator yaf(1) is run with	the --silk switch, it will
		   prematurely create a	flow and mark it with "T" if the byte
		   count of the	flow cannot be stored in a 32-bit value.)

               "C" flow generator created this flow as a continuation of a
                   long-running connection, where the previous flow for this
                   connection met a timeout (or a byte threshold in the case
                   of yaf).

               Consider a long-running ssh session that exceeds the flow
               generator's active timeout.  (This is the active timeout, since
               the flow generator creates a flow for a connection that still
               has activity.)  The flow generator will create multiple flow
               records for this ssh session, each spanning some portion of the
               total session.  The first flow record will be marked with a "T",
               indicating that it hit the timeout.  The second through
               next-to-last records will be marked with "TC", indicating that
               each flow both timed out and is a continuation of a flow that
               timed out.  The final flow will be marked with a "C", indicating
               that it was created as a continuation of an active flow.
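
The marking pattern in the ssh example can be expressed compactly.  The
following sketch only reproduces the "T"/"C" labels described above for a
session split into a given number of flow records; segment_attributes is a
hypothetical helper, not part of SiLK or yaf.

```python
def segment_attributes(num_records):
    """Timeout-related attribute letters for each record of one session."""
    labels = []
    for index in range(num_records):
        label = ""
        if index < num_records - 1:
            label += "T"  # every record but the last ended on a timeout
        if index > 0:
            label += "C"  # every record but the first continues a prior flow
        labels.append(label)
    return labels


print(segment_attributes(4))  # ['T', 'TC', 'TC', 'C']
```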

	   application,29
	       guess as	to the content of the flow.  Some software that
	       generates flow records from packet data,	such as	yaf, will
	       inspect the contents of the packets that	make up	a flow and use
	       traffic signatures to label the content of the flow.  SiLK
	       calls this label	the application; yaf refers to it as the
	       appLabel.  The application is the port number that is
	       traditionally used for that type	of traffic (see	the
	       /etc/services file on most UNIX systems).  For example, traffic
	       that the	flow generator recognizes as FTP will have a value of
	       21, even	if that	traffic	is being routed	through	the standard
	       HTTP/web	port (80).

	   The following fields	provide	a way to label the IPs or ports	on a
	   record.  These fields require external files	to provide the mapping
	   from	the IP or port to the label:

	   sType,16
	       categorize the source IP	address	as "non-routable", "internal",
	       or "external" and group based on	the category.  Uses the
	       mapping file specified by the SILK_ADDRESS_TYPES	environment
	       variable, or the	address_types.pmap mapping file, as described
	       in addrtype(3).

	   dType,17
	       as sType	for the	destination IP address

	   scc,18
	       the country code	of the source IP address.  Uses	the mapping
	       file specified by the SILK_COUNTRY_CODES	environment variable,
	       or the country_codes.pmap mapping file, as described in
	       ccfilter(3).

	   dcc,19
	       as scc for the destination IP

	   src-map-name
	       label contained in the prefix map file associated with map-
	       name.  If the prefix map	is for IP addresses, the label is that
	       associated with the source IP address.  If the prefix map is
	       for protocol/port pairs,	the label is that associated with the
	       protocol	and source port.  See also the description of the
	       --pmap-file switch below	and the	pmapfilter(3) manual page.

	   dst-map-name
	       as src-map-name for the destination IP address or the protocol
	       and destination port.

	   sval
	       as src-map-name when no map-name	is associated with the prefix
	       map file

	   dval
	       as dst-map-name when no map-name	is associated with the prefix
	       map file

	   Finally, the	list of	built-in fields	may be augmented by the	run-
	   time	loading	of PySiLK code or plug-ins written in C	(also called
	   shared object files or dynamic libraries), as described by the
	   --python-file and --plugin switches.

       --delta-field=FIELD
	   Specify a single field that can differ by a specified delta-value
	   among the SiLK records that make up a group.	 The FIELD identifiers
	   include most	of those specified for --id-fields.  The exceptions
	   are that plug-in fields are not supported, nor are fields that do
	   not have numeric values (e.g., class, type, flags).	The most
	   common value	for this switch	is "stime", which allows records that
	   are identical in the	id-fields but temporally far apart to be in
	   different groups.  The switch takes a single	argument; multiple
	   delta fields	cannot be specified.  When this	switch is specified,
	   the --delta-value switch is required.

       --delta-value=DELTA_VALUE
	   Specify the acceptable difference between the values	of the
	   --delta-field.  The --delta-value switch is required	when the
           --delta-field switch is provided.  For fields other than those
           holding IPs, when the values of two consecutive records differ by
           DELTA_VALUE or less, the records are considered members of the
           same group.  When the delta-field refers to an IP field,
           DELTA_VALUE is the number of least significant bits of the IPs to
           remove before comparing them.  For example, when --delta-field=sIP
           --delta-value=8 is specified, two records are in the same group if
           their source IPv4 addresses belong to the same /24 or if their
           source IPv6 addresses belong to the same /120.  The --objective
           switch affects the meaning of this switch.
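
The IP case can be illustrated with Python's standard ipaddress module.  This
is a sketch of the comparison only; same_ip_group is a hypothetical name, and
treating a mixed IPv4/IPv6 pair as non-matching is an assumption made here,
not something this page specifies.

```python
import ipaddress


def same_ip_group(a, b, delta_value):
    """Compare two addresses after dropping delta_value low-order bits."""
    ip_a = ipaddress.ip_address(a)
    ip_b = ipaddress.ip_address(b)
    if ip_a.version != ip_b.version:
        return False  # assumption: mixed address families are not modeled
    return (int(ip_a) >> delta_value) == (int(ip_b) >> delta_value)


print(same_ip_group("10.1.2.3", "10.1.2.200", 8))       # True  (same /24)
print(same_ip_group("10.1.2.3", "10.1.3.3", 8))         # False
print(same_ip_group("2001:db8::1", "2001:db8::ff", 8))  # True  (same /120)
```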

       --objective
	   Change the behavior of the --delta-value switch so that a record is
	   considered part of a	group if the value of its --delta-field	is
	   within the DELTA_VALUE of the first record in the group.  (When
	   this	switch is not specified, consecutive records are compared.)
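
The difference between the two comparison modes is easiest to see on a short
list of values.  The sketch below is a model of the behavior described above,
not rwgroup code; delta_groups and start_times are hypothetical names.

```python
def delta_groups(values, delta_value, objective=False):
    """Split sorted values into groups using the delta rule."""
    groups = []
    for value in values:
        if groups:
            # objective: compare with the first record of the group;
            # default: compare with the immediately preceding record
            reference = groups[-1][0] if objective else groups[-1][-1]
            if abs(value - reference) <= delta_value:
                groups[-1].append(value)
                continue
        groups.append([value])
    return groups


start_times = [0, 8, 16, 24]
print(delta_groups(start_times, 10))                  # [[0, 8, 16, 24]]
print(delta_groups(start_times, 10, objective=True))  # [[0, 8], [16, 24]]
```

Without --objective a chain of closely spaced records can extend a group
indefinitely; with it, the group is bounded relative to its first record.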

       --summarize
	   Cause rwgroup to print (typically) a	single record for each group.
           By default, all records in each group having at least
           --rec-threshold members are printed.  When --summarize is active,
	   the record that is written for the group is the first record	in the
	   group with the following modifications:

	   o   The packets and bytes values are	the sum	of the packets and
	       bytes values, respectively, for all records in the group.

	   o   The start-time value is the earliest start time for the records
	       in the group.

	   o   The end-time value is the latest	end time for the records in
	       the group.

	   o   The flags and session-flags values are the bitwise-OR of	all
	       flags and session-flags values, respectively, for the records
	       in the group.

	   Note	that multiple records for a group may be printed if the	bytes,
	   packets, or elapsed time values are too large to be stored in a
	   SiLK	flow record.
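
The merge rules listed above can be sketched as follows.  Records are modeled
as dictionaries with hypothetical key names, and the sketch deliberately
ignores the overflow case in which rwgroup emits more than one record for a
group.

```python
def summarize(group):
    """Merge the records of one group into a single summary record."""
    summary = dict(group[0])  # start from the first record in the group
    summary["packets"] = sum(rec["packets"] for rec in group)
    summary["bytes"] = sum(rec["bytes"] for rec in group)
    summary["sTime"] = min(rec["sTime"] for rec in group)
    summary["eTime"] = max(rec["eTime"] for rec in group)
    for name in ("flags", "sessionFlags"):
        combined = 0
        for rec in group:
            combined |= rec[name]  # bitwise OR across the group
        summary[name] = combined
    return summary


group = [
    {"sIP": "10.0.0.1", "packets": 3, "bytes": 180, "sTime": 100,
     "eTime": 105, "flags": 0x02, "sessionFlags": 0x00},
    {"sIP": "10.0.0.1", "packets": 5, "bytes": 400, "sTime": 103,
     "eTime": 120, "flags": 0x18, "sessionFlags": 0x10},
]
merged = summarize(group)
```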

       --plugin=PLUGIN
	   Augment the list of fields by using run-time	loading	of the plug-in
	   (shared object) whose path is PLUGIN.  The switch may be repeated
	   to load multiple plug-ins.  The creation of plug-ins	is described
	   in the silk-plugin(3) manual	page.  When PLUGIN does	not contain a
	   slash ("/"),	rwgroup	will attempt to	find a file named PLUGIN in
	   the directories listed in the "FILES" section.  If rwgroup finds
	   the file, it	uses that path.	 If PLUGIN contains a slash or if
	   rwgroup does	not find the file, rwgroup relies on your operating
	   system's dlopen(3) call to find the file.  When the
	   SILK_PLUGIN_DEBUG environment variable is non-empty,	rwgroup	prints
	   status messages to the standard error as it attempts	to find	and
	   open	each of	its plug-ins.
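
The lookup order described above can be approximated with the sketch below.
The directory list is taken from the "FILES" section; the function names, the
injectable exists parameter, and the assumption that directories are tried in
the listed order are illustrative only.

```python
import os


def plugin_search_dirs(silk_path=None):
    """Directories checked for plug-ins, per the FILES section."""
    subdirs = ("lib64/silk", "lib64", "lib/silk", "lib")
    dirs = []
    if silk_path:
        dirs.extend(silk_path.rstrip("/") + "/" + sub for sub in subdirs)
    dirs.extend("/usr/local/" + sub for sub in subdirs)
    return dirs


def resolve_plugin(plugin, silk_path=None, exists=os.path.isfile):
    """Return the path that would be handed to dlopen(3)."""
    if "/" not in plugin:
        for directory in plugin_search_dirs(silk_path):
            candidate = directory + "/" + plugin
            if exists(candidate):
                return candidate
    # contains a slash, or not found: leave the lookup to dlopen(3)
    return plugin
```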

       --rec-threshold=THRESHOLD
	   Specify the minimum number of SiLK records a	group must contain
	   before the records in the group are written to the output stream.
	   The default is 1; i.e., write all records.  The maximum threshold
	   is 65535.
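
As a rough model of the threshold, the sketch below drops every group with
fewer members than the threshold.  The (group_id, record) pairs and the name
apply_threshold are hypothetical.

```python
from itertools import groupby


def apply_threshold(tagged, threshold=1):
    """Keep only groups that contain at least `threshold` records."""
    if threshold > 65535:
        raise ValueError("the maximum threshold is 65535")
    kept = []
    # tagged holds (group_id, record) pairs in output order
    for _, members in groupby(tagged, key=lambda pair: pair[0]):
        members = list(members)
        if len(members) >= threshold:
            kept.extend(members)
    return kept


tagged = [(0, "a"), (0, "b"), (1, "c"), (2, "d"), (2, "e"), (2, "f")]
print(apply_threshold(tagged, 2))
# [(0, 'a'), (0, 'b'), (2, 'd'), (2, 'e'), (2, 'f')]
```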

       --group-offset=IP
	   Specify the value to	write into the Next Hop	IP for the records
	   that	comprise the first group.  The value IP	may be an integer, or
	   an IPv4 or IPv6 address in the canonical presentation form.	If not
	   specified, counting begins at 0.  The value for each	subsequent
	   group is incremented	by 1.
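
The following sketch shows how an offset given either as an integer or as an
address could map to successive group identifiers.  It uses Python's
ipaddress module for the conversion; next_hop_values is a hypothetical name
and the code is a model of the numbering, not of rwgroup's argument parsing.

```python
import ipaddress


def next_hop_values(offset, group_count):
    """Integer identifiers for the first group_count groups."""
    try:
        start = int(offset)  # plain integer such as "5"
    except ValueError:
        start = int(ipaddress.ip_address(offset))  # IPv4 or IPv6 text
    return [start + n for n in range(group_count)]


ids = next_hop_values("10.0.0.254", 3)
print([str(ipaddress.IPv4Address(v)) for v in ids])
# ['10.0.0.254', '10.0.0.255', '10.0.1.0']
```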

       --note-add=TEXT
	   Add the specified TEXT to the header	of the output file as an
	   annotation.	This switch may	be repeated to add multiple
	   annotations to a file.  To view the annotations, use	the
	   rwfileinfo(1) tool.

       --note-file-add=FILENAME
	   Open	FILENAME and add the contents of that file to the header of
	   the output file as an annotation.	This switch may	be repeated to
	   add multiple	annotations.  Currently	the application	makes no
	   effort to ensure that FILENAME contains text; be careful that you
	   do not attempt to add a SiLK	data file as an	annotation.

       --copy-input=PATH
	   Copy	all binary SiLK	Flow records read as input to the specified
	   file	or named pipe.	PATH may be "stdout" or	"-" to write flows to
	   the standard	output as long as the --output-path switch is
	   specified to	redirect rwgroup's output to a different location.

       --output-path=PATH
	   Write the binary SiLK Flow records to PATH, where PATH is a
	   filename, a named pipe, the keyword "stderr"	to write the output to
	   the standard	error, or the keyword "stdout" or "-" to write the
	   output to the standard output.  If PATH names an existing file,
	   rwgroup exits with an error unless the SILK_CLOBBER environment
	   variable is set, in which case PATH is overwritten.	If this	switch
	   is not given, the output is written to the standard output.
	   Attempting to write the binary output to a terminal causes rwgroup
	   to exit with	an error.

       --compression-method=COMP_METHOD
	   Specify the compression library to use when writing output files.
	   If this switch is not given,	the value in the
	   SILK_COMPRESSION_METHOD environment variable	is used	if the value
	   names an available compression method.  When	no compression method
	   is specified, output	to the standard	output or to named pipes is
	   not compressed, and output to files is compressed using the default
	   chosen when SiLK was	compiled.  The valid values for	COMP_METHOD
	   are determined by which external libraries were found when SiLK was
	   compiled.  To see the available compression methods and the default
	   method, use the --help or --version switch.	SiLK can support the
	   following COMP_METHOD values	when the required libraries are
	   available.

	   none
	       Do not compress the output using	an external library.

	   zlib
	       Use the zlib(3) library for compressing the output, and always
	       compress	the output regardless of the destination.  Using zlib
	       produces	the smallest output files at the cost of speed.

	   lzo1x
	       Use the lzo1x algorithm from the	LZO real time compression
	       library for compression,	and always compress the	output
	       regardless of the destination.  This compression	provides good
	       compression with	less memory and	CPU overhead.

	   snappy
	       Use the snappy library for compression, and always compress the
	       output regardless of the	destination.  This compression
	       provides	good compression with less memory and CPU overhead.
	       Since SiLK 3.13.0.

	   best
	       Use lzo1x if available, otherwise use snappy if available,
	       otherwise use zlib if available.	 Only compress the output when
	       writing to a file.

       --site-config-file=FILENAME
	   Read	the SiLK site configuration from the named file	FILENAME.
	   When	this switch is not provided, rwgroup searches for the site
	   configuration file in the locations specified in the	"FILES"
	   section.

       --help
	   Print the available options and exit.  Specifying switches that add
	   new fields or additional switches before --help will	allow the
	   output to include descriptions of those fields or switches.

       --help-fields
	   Print the description and alias(es) of each field and exit.
	   Specifying switches that add	new fields before --help-fields	will
	   allow the output to include descriptions of those fields.

       --version
	   Print the version number and	information about how SiLK was
	   configured, then exit the application.

       --pmap-file=PATH
       --pmap-file=MAPNAME:PATH
	   Load	the prefix map file located at PATH and	create fields named
	   src-map-name	and dst-map-name where map-name	is either the MAPNAME
	   part	of the argument	or the map-name	specified when the file	was
	   created (see	rwpmapbuild(1)).  If no	map-name is available, rwgroup
	   names the fields "sval" and "dval".	Specify	PATH as	"-" or "stdin"
	   to read from	the standard input.  The switch	may be repeated	to
	   load	multiple prefix	map files, but each prefix map must use	a
           unique map-name.  The --pmap-file switch(es) must precede the
           --id-fields switch.  See also pmapfilter(3).

       --python-file=PATH
	   When	the SiLK Python	plug-in	is used, rwgroup reads the Python code
	   from	the file PATH to define	additional fields that can be used as
	   part	of the group key.  This	file should call register_field() for
	   each	field it wishes	to define.  For	details	and examples, see the
	   silkpython(3) and pysilk(3) manual pages.

LIMITATIONS
       rwgroup requires sorted data.  The application works by comparing
       records in the order that the records are received (similar to the UNIX
       uniq(1) command), so unsorted input produces unexpected groupings.
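
The effect of unsorted input can be demonstrated with a uniq-style run
counter.  This is a simplified model with hypothetical names; it considers a
single exact-match key and no delta field.

```python
def count_groups(keys):
    """Count runs of equal adjacent keys, as uniq(1) would."""
    groups = 0
    previous = object()  # sentinel that equals no real key
    for key in keys:
        if key != previous:
            groups += 1
        previous = key
    return groups


unsorted_keys = ["10.0.0.2", "10.0.0.1", "10.0.0.2", "10.0.0.1"]
print(count_groups(unsorted_keys))          # 4
print(count_groups(sorted(unsorted_keys)))  # 2
```

Only two distinct keys are present, yet the unsorted input yields four
groups because each change of key starts a new one.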

EXAMPLES
       In the following example, the dollar sign ("$") represents the shell
       prompt.  The text after the dollar sign represents the command line.
       Lines have been wrapped for improved readability, and the backslash
       ("\") is used to indicate a wrapped line.

       As a rule of thumb, the --id-fields and --delta-field parameters	should
       match rwsort(1)'s call, with --delta-field being	the last parameter.  A
       call to group all web traffic by	queries	from the same addresses
       (field=2) within	10 seconds (field=9) of	the first query	from that
       address will be:

        $ rwfilter --proto=6 --dport=80 --pass=stdout                  \
          | rwsort --fields=2,9                                        \
          | rwgroup --id-fields=2 --delta-field=9 --delta-value=10     \
               --objective

ENVIRONMENT
       PYTHONPATH
	   This	environment variable is	used by	Python to locate modules.
	   When	--python-file is specified, rwgroup must load the Python files
	   that	comprise the PySiLK package, such as silk/__init__.py.	If
	   this	silk/ directory	is located outside Python's normal search path
	   (for	example, in the	SiLK installation tree), it may	be necessary
	   to set or modify the	PYTHONPATH environment variable	to include the
	   parent directory of silk/ so	that Python can	find the PySiLK
	   module.

       SILK_PYTHON_TRACEBACK
	   When	set, Python plug-ins will output traceback information on
	   Python errors to the	standard error.

       SILK_COUNTRY_CODES
	   This	environment variable allows the	user to	specify	the country
	   code	mapping	file that rwgroup uses when computing the scc and dcc
	   fields.  The	value may be a complete	path or	a file relative	to the
	   SILK_PATH.  See the "FILES" section for standard locations of this
	   file.

       SILK_ADDRESS_TYPES
	   This	environment variable allows the	user to	specify	the address
	   type	mapping	file that rwgroup uses when computing the sType	and
	   dType fields.  The value may	be a complete path or a	file relative
	   to the SILK_PATH.  See the "FILES" section for standard locations
	   of this file.

       SILK_CLOBBER
	   The SiLK tools normally refuse to overwrite existing	files.
	   Setting SILK_CLOBBER	to a non-empty value removes this restriction.

       SILK_COMPRESSION_METHOD
	   This	environment variable is	used as	the value for
	   --compression-method	when that switch is not	provided.  Since SiLK
	   3.13.0.

       SILK_CONFIG_FILE
	   This	environment variable is	used as	the value for the
	   --site-config-file when that	switch is not provided.

       SILK_DATA_ROOTDIR
           This environment variable specifies the root directory of the data
           repository.  As described in the "FILES" section, rwgroup may use
	   this	environment variable when searching for	the SiLK site
	   configuration file.

       SILK_PATH
	   This	environment variable gives the root of the install tree.  When
	   searching for configuration files and plug-ins, rwgroup may use
	   this	environment variable.  See the "FILES" section for details.

       SILK_PLUGIN_DEBUG
	   When	set to 1, rwgroup prints status	messages to the	standard error
	   as it attempts to find and open each	of its plug-ins.  In addition,
	   when	an attempt to register a field fails, rwgroup prints a message
	   specifying the additional function(s) that must be defined to
	   register the	field in rwgroup.  Be aware that the output can	be
	   rather verbose.

FILES
       ${SILK_ADDRESS_TYPES}
       ${SILK_PATH}/share/silk/address_types.pmap
       ${SILK_PATH}/share/address_types.pmap
       /usr/local/share/silk/address_types.pmap
       /usr/local/share/address_types.pmap
	   Possible locations for the address types mapping file required by
	   the sType and dType fields.

       ${SILK_CONFIG_FILE}
       ${SILK_DATA_ROOTDIR}/silk.conf
       /data/silk.conf
       ${SILK_PATH}/share/silk/silk.conf
       ${SILK_PATH}/share/silk.conf
       /usr/local/share/silk/silk.conf
       /usr/local/share/silk.conf
	   Possible locations for the SiLK site	configuration file which are
	   checked when	the --site-config-file switch is not provided.

       ${SILK_COUNTRY_CODES}
       ${SILK_PATH}/share/silk/country_codes.pmap
       ${SILK_PATH}/share/country_codes.pmap
       /usr/local/share/silk/country_codes.pmap
       /usr/local/share/country_codes.pmap
	   Possible locations for the country code mapping file	required by
	   the scc and dcc fields.

       ${SILK_PATH}/lib64/silk/
       ${SILK_PATH}/lib64/
       ${SILK_PATH}/lib/silk/
       ${SILK_PATH}/lib/
       /usr/local/lib64/silk/
       /usr/local/lib64/
       /usr/local/lib/silk/
       /usr/local/lib/
	   Directories that rwgroup checks when	attempting to load a plug-in.

SEE ALSO
       rwfilter(1), rwfileinfo(1), rwsort(1), rwuniq(1), rwpmapbuild(1),
       addrtype(3), ccfilter(3), pmapfilter(3),	pysilk(3), silkpython(3),
       silk-plugin(3), sensor.conf(5), uniq(1),	silk(7), yaf(1), dlopen(3),
       zlib(3)

SiLK 3.19.1			  2021-02-28			    rwgroup(1)
