binary(3)		   Erlang Module Definition		     binary(3)

NAME
       binary -	Library	for handling binary data.

DESCRIPTION
       This module contains functions for manipulating byte-oriented binaries.
       Although	the majority of	functions could	be provided using  bit-syntax,
       the  functions in this library are highly optimized and are expected to
       either execute faster or	consume	less memory, or	both, than a  counter-
       part written in pure Erlang.

       The  module  is provided	according to Erlang Enhancement	Proposal (EEP)
       31.

   Note:
       The library handles byte-oriented data. For bitstrings that are not
       binaries (that is, that do not contain whole octets of bits), a badarg
       exception is thrown by any of the functions in this module.

DATA TYPES
       cp()

	      Opaque data type representing a compiled search pattern. Guaran-
	      teed  to	be  a tuple() to allow programs	to distinguish it from
	      non-precompiled search patterns.

       part() =	{Start :: integer() >= 0, Length :: integer()}

	      A representation of a part (or range) in a binary. Start is a
	      zero-based  offset  into	a binary() and Length is the length of
	      that part. As input to functions in this module, a reverse  part
	      specification is allowed,	constructed with a negative Length, so
	      that the part of the binary begins at  Start  +  Length  and  is
	      -Length long. This is useful for referencing the last N bytes of
	      a	binary as {size(Binary), -N}. The functions in this module al-
	      ways return part()s with positive	Length.
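
	      The reverse part specification can be sketched as follows. The
	      module name part_demo and the function tail_as_list/2 are
	      hypothetical names chosen for this illustration; only
	      binary:bin_to_list/2 comes from this module.

```erlang
-module(part_demo).
-export([tail_as_list/2]).

%% Returns the last N bytes of Bin as a list, checking that the reverse
%% part {byte_size(Bin), -N} and the forward part {byte_size(Bin) - N, N}
%% denote the same range.
tail_as_list(Bin, N) when is_binary(Bin), is_integer(N), N >= 0 ->
    Size = byte_size(Bin),
    Reverse = binary:bin_to_list(Bin, {Size, -N}),
    Forward = binary:bin_to_list(Bin, {Size - N, N}),
    Reverse = Forward,  % match succeeds only if both lists are equal
    Reverse.
```

	      With this sketch, part_demo:tail_as_list(<<"erlang">>, 3)
	      returns "ang".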

EXPORTS
       at(Subject, Pos)	-> byte()

	      Types:

		 Subject = binary()
		 Pos = integer() >= 0

	      Returns  the byte	at position Pos	(zero-based) in	binary Subject
	      as an integer. If	Pos >= byte_size(Subject), a badarg  exception
	      is raised.

       bin_to_list(Subject) -> [byte()]

	      Types:

		 Subject = binary()

	      Same as bin_to_list(Subject, {0,byte_size(Subject)}).

       bin_to_list(Subject, PosLen) -> [byte()]

	      Types:

		 Subject = binary()
		 PosLen	= part()

	      Converts	Subject	 to  a	list of	byte()s, each representing the
	      value of one byte. part()	denotes	which part of the binary()  to
	      convert.

	      Example:

	      1> binary:bin_to_list(<<"erlang">>, {1,3}).
	      "rla"
	      %% or [114,108,97] in list notation.

	      If PosLen	in any way references outside the binary, a badarg ex-
	      ception is raised.

       bin_to_list(Subject, Pos, Len) -> [byte()]

	      Types:

		 Subject = binary()
		 Pos = integer() >= 0
		 Len = integer()

	      Same as bin_to_list(Subject, {Pos, Len}).

       compile_pattern(Pattern)	-> cp()

	      Types:

		 Pattern = binary() | [binary()]

	      Builds an	internal structure representing	 a  compilation	 of  a
	      search   pattern,	  later	 to  be	 used  in  functions  match/3,
	      matches/3, split/3, or replace/4.	The cp() returned  is  guaran-
	      teed  to	be  a tuple() to allow programs	to distinguish it from
	      non-precompiled search patterns.

	      When a list of binaries is specified, it denotes a set of
	      alternative binaries to search for. For example, if
	      [<<"functional">>,<<"programming">>] is specified as Pattern,
	      this means either <<"functional">> or <<"programming">>. The
	      pattern is a set of alternatives; when only a single binary is
	      specified, the set has only one element. The order of
	      alternatives in a pattern is not significant.

	      The list of binaries used	for search alternatives	must  be  flat
	      and proper.

	      If  Pattern  is  not  a binary or	a flat proper list of binaries
	      with length > 0, a badarg	exception is raised.
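
	      A minimal sketch of compiling a pattern once and reusing it
	      across several searches. The module name cp_demo and the
	      function count_keywords/1 are hypothetical; whether
	      precompilation pays off depends on how often the pattern is
	      reused.

```erlang
-module(cp_demo).
-export([count_keywords/1]).

%% Counts, for each binary in the list, how many times either keyword
%% occurs. The pattern is compiled once and reused for every search.
count_keywords(Binaries) when is_list(Binaries) ->
    CP = binary:compile_pattern([<<"functional">>, <<"programming">>]),
    true = is_tuple(CP),  % cp() is documented to be a tuple
    [length(binary:matches(B, CP)) || B <- Binaries].
```

	      For the input [<<"functional programming">>, <<"erlang">>] this
	      sketch returns [2,0].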

       copy(Subject) ->	binary()

	      Types:

		 Subject = binary()

	      Same as copy(Subject, 1).

       copy(Subject, N)	-> binary()

	      Types:

		 Subject = binary()
		 N = integer() >= 0

	      Creates a	binary with the	content	of Subject duplicated N	times.

	      This function always creates a new binary, even if N = 1.	By us-
	      ing copy/1 on a binary referencing a larger binary, one can free
	      up the larger binary for garbage collection.

	  Note:
	      By deliberately copying a	single binary to avoid	referencing  a
	      larger  binary, one can, instead of freeing up the larger	binary
	      for later	garbage	collection, create much	more binary data  than
	      needed.  Sharing	binary	data  is usually good. Only in special
	      cases, when small	parts reference	large binaries and  the	 large
	      binaries	are  no	longer used in any process, deliberate copying
	      can be a good idea.

	      If N < 0,	a badarg exception is raised.
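
	      One possible use of copy/2 is building padding. The following
	      is a sketch only; the module name copy_demo and the function
	      pad_right/3 are hypothetical and not part of this library.

```erlang
-module(copy_demo).
-export([pad_right/3]).

%% Pads Bin on the right with PadByte until it is Width bytes long.
%% Binaries that are already Width bytes or longer are returned as-is.
pad_right(Bin, Width, _PadByte) when byte_size(Bin) >= Width ->
    Bin;
pad_right(Bin, Width, PadByte) ->
    %% copy/2 duplicates the one-byte binary the required number of times.
    Padding = binary:copy(<<PadByte>>, Width - byte_size(Bin)),
    <<Bin/binary, Padding/binary>>.
```

	      With this sketch, copy_demo:pad_right(<<"ab">>, 4, $.) returns
	      <<"ab..">>.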

       decode_unsigned(Subject)	-> Unsigned

	      Types:

		 Subject = binary()
		 Unsigned = integer() >= 0

	      Same as decode_unsigned(Subject, big).

       decode_unsigned(Subject,	Endianness) -> Unsigned

	      Types:

		 Subject = binary()
		 Endianness = big | little
		 Unsigned = integer() >= 0

	      Converts the binary digit representation, in big endian or
	      little endian, of a non-negative integer in Subject to an Erlang
	      integer().

	      Example:

	      1> binary:decode_unsigned(<<169,138,199>>,big).
	      11111111

       encode_unsigned(Unsigned) -> binary()

	      Types:

		 Unsigned = integer() >= 0

	      Same as encode_unsigned(Unsigned,	big).

       encode_unsigned(Unsigned, Endianness) ->	binary()

	      Types:

		 Unsigned = integer() >= 0
		 Endianness = big | little

	      Converts a non-negative integer to the smallest possible
	      representation in a binary digit representation, either big
	      endian or little endian.

	      Example:

	      1> binary:encode_unsigned(11111111, big).
	      <<169,138,199>>
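
	      The two conversions are inverses of each other for a given
	      Endianness. A small sketch of that round trip follows; the
	      module name unsigned_demo and the function roundtrip/2 are
	      hypothetical.

```erlang
-module(unsigned_demo).
-export([roundtrip/2]).

%% Encodes N with the given endianness, checks that decoding the result
%% with the same endianness yields N again, and returns the encoding.
roundtrip(N, Endianness) when is_integer(N), N >= 0 ->
    Bin = binary:encode_unsigned(N, Endianness),
    N = binary:decode_unsigned(Bin, Endianness),
    Bin.
```

	      With this sketch, unsigned_demo:roundtrip(11111111, little)
	      returns <<199,138,169>>, the byte-reversed form of the big
	      endian result shown above.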

       first(Subject) -> byte()

	      Types:

		 Subject = binary()

	      Returns the first	byte of	binary Subject as an integer.  If  the
	      size of Subject is zero, a badarg	exception is raised.

       last(Subject) ->	byte()

	      Types:

		 Subject = binary()

	      Returns  the  last  byte of binary Subject as an integer.	If the
	      size of Subject is zero, a badarg	exception is raised.
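
	      Because first/1 and last/1 raise badarg on an empty binary,
	      callers may want to handle that case separately. A minimal
	      sketch, with the hypothetical module name ends_demo and
	      function ends/1:

```erlang
-module(ends_demo).
-export([ends/1]).

%% Returns {First, Last} for a non-empty binary, or the atom empty for <<>>.
ends(<<>>) ->
    empty;
ends(Bin) when is_binary(Bin) ->
    First = binary:first(Bin),
    Last = binary:last(Bin),
    %% at/2 at the boundary positions yields the same bytes.
    First = binary:at(Bin, 0),
    Last = binary:at(Bin, byte_size(Bin) - 1),
    {First, Last}.
```

	      With this sketch, ends_demo:ends(<<"erlang">>) returns
	      {101,103}, that is, {$e,$g}.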

       list_to_bin(ByteList) ->	binary()

	      Types:

		 ByteList = iodata()

	      Works exactly like erlang:list_to_binary/1. Added for
	      completeness.

       longest_common_prefix(Binaries) -> integer() >= 0

	      Types:

		 Binaries = [binary()]

	      Returns  the length of the longest common	prefix of the binaries
	      in list Binaries.

	      Example:

	      1> binary:longest_common_prefix([<<"erlang">>, <<"ergonomy">>]).
	      2
	      2> binary:longest_common_prefix([<<"erlang">>, <<"perl">>]).
	      0

	      If Binaries is not a flat	list of	binaries, a  badarg  exception
	      is raised.

       longest_common_suffix(Binaries) -> integer() >= 0

	      Types:

		 Binaries = [binary()]

	      Returns  the length of the longest common	suffix of the binaries
	      in list Binaries.

	      Example:

	      1> binary:longest_common_suffix([<<"erlang">>, <<"fang">>]).
	      3
	      2> binary:longest_common_suffix([<<"erlang">>, <<"perl">>]).
	      0

	      If Binaries is not a flat	list of	binaries, a  badarg  exception
	      is raised.

       match(Subject, Pattern) -> Found	| nomatch

	      Types:

		 Subject = binary()
		 Pattern = binary() | [binary()] | cp()
		 Found = part()

	      Same as match(Subject, Pattern, []).

       match(Subject, Pattern, Options)	-> Found | nomatch

	      Types:

		 Subject = binary()
		 Pattern = binary() | [binary()] | cp()
		 Found = part()
		 Options = [Option]
		 Option	= {scope, part()}
		 part()	= {Start :: integer() >= 0, Length :: integer()}

	      Searches	for the	first occurrence of Pattern in Subject and re-
	      turns the	position and length.

	      The function returns {Pos, Length} for the  binary  in  Pattern,
	      starting at the lowest position in Subject.

	      Example:

	      1> binary:match(<<"abcde">>, [<<"bcde">>,	<<"cd">>],[]).
	      {1,4}

	      Even though <<"cd">> ends before <<"bcde">>, <<"bcde">> begins
	      first and is therefore the first match. If two overlapping
	      matches begin at the same position, the longest is returned.

	      Summary of the options:

		{scope,	{Start,	Length}}:
		  Only	the  specified	part  is searched. Return values still
		  have offsets from  the  beginning  of	 Subject.  A  negative
		  Length is allowed as described in section Data Types in this
		  manual.

	      If none of the strings in	Pattern	is found, the atom nomatch  is
	      returned.

	      For a description	of Pattern, see	function compile_pattern/1.

	      If {scope, {Start,Length}} is specified in the options such that
	      Start > size of Subject, Start + Length <	0 or Start + Length  >
	      size of Subject, a badarg	exception is raised.
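
	      Option scope can be used to resume a search from a given
	      offset. The following is a sketch under that assumption; the
	      module name match_demo and the function find_from/3 are
	      hypothetical.

```erlang
-module(match_demo).
-export([find_from/3]).

%% Finds the first occurrence of Pattern at or after byte offset From.
%% The returned position is still relative to the start of Subject.
find_from(Subject, Pattern, From)
  when is_binary(Subject), is_integer(From),
       From >= 0, From =< byte_size(Subject) ->
    Scope = {From, byte_size(Subject) - From},
    binary:match(Subject, Pattern, [{scope, Scope}]).
```

	      With this sketch, match_demo:find_from(<<"banana">>, <<"an">>,
	      0) returns {1,2}, while the same call with offset 2 returns
	      {3,2} and with offset 4 returns nomatch.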

       matches(Subject,	Pattern) -> Found

	      Types:

		 Subject = binary()
		 Pattern = binary() | [binary()] | cp()
		 Found = [part()]

	      Same as matches(Subject, Pattern,	[]).

       matches(Subject,	Pattern, Options) -> Found

	      Types:

		 Subject = binary()
		 Pattern = binary() | [binary()] | cp()
		 Found = [part()]
		 Options = [Option]
		 Option	= {scope, part()}
		 part()	= {Start :: integer() >= 0, Length :: integer()}

	      As  match/2,  but	Subject	is searched until exhausted and	a list
	      of all non-overlapping parts matching Pattern  is	 returned  (in
	      order).

	      The first and longest match is preferred to a shorter one, as
	      illustrated by the following example:

	      1> binary:matches(<<"abcde">>,
				[<<"bcde">>,<<"bc">>,<<"de">>],[]).
	      [{1,4}]

	      The result shows that <<"bcde">> is selected instead of the
	      shorter match <<"bc">> (which would have given rise to one more
	      match, <<"de">>). This corresponds to the behavior of POSIX
	      regular expressions (and programs like awk), but is not
	      consistent with alternative matches in re (and Perl), where
	      lexical ordering in the search pattern instead selects which
	      string matches.

	      If  none	of the strings in a pattern is found, an empty list is
	      returned.

	      For a description	of Pattern, see	compile_pattern/1. For	a  de-
	      scription	of available options, see match/3.

	      If {scope, {Start,Length}} is specified in the options such that
	      Start > size of Subject, Start + Length < 0, or Start + Length >
	      size of Subject, a badarg exception is raised.
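
	      Since the returned parts are offsets into Subject, they can be
	      passed straight to part/2 to recover the matched bytes. A
	      minimal sketch, with the hypothetical module name matches_demo
	      and function found_parts/2:

```erlang
-module(matches_demo).
-export([found_parts/2]).

%% Returns the matched sub-binaries themselves, in order of occurrence,
%% by feeding each part() from matches/2 into part/2.
found_parts(Subject, Pattern) when is_binary(Subject) ->
    [binary:part(Subject, Part) || Part <- binary:matches(Subject, Pattern)].
```

	      With this sketch, matches_demo:found_parts(<<"abcde">>,
	      [<<"bcde">>,<<"bc">>,<<"de">>]) returns [<<"bcde">>].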

       part(Subject, PosLen) ->	binary()

	      Types:

		 Subject = binary()
		 PosLen	= part()

	      Extracts the part	of binary Subject described by PosLen.

	      A	 negative  length can be used to extract bytes at the end of a
	      binary:

	      1> Bin = <<1,2,3,4,5,6,7,8,9,10>>.
	      2> binary:part(Bin, {byte_size(Bin), -5}).
	      <<6,7,8,9,10>>

	  Note:
	      part/2 and part/3	are also available in the erlang module	 under
	      the  names  binary_part/2	 and binary_part/3. Those BIFs are al-
	      lowed in guard tests.

	      If PosLen	in any way references outside the binary, a badarg ex-
	      ception is raised.

       part(Subject, Pos, Len) -> binary()

	      Types:

		 Subject = binary()
		 Pos = integer() >= 0
		 Len = integer()

	      Same as part(Subject, {Pos, Len}).

       referenced_byte_size(Binary) -> integer() >= 0

	      Types:

		 Binary	= binary()

	      If a binary references a larger binary (often described as being
	      a	subbinary), it can be useful to	get the	size of	the referenced
	      binary.  This  function  can be used in a	program	to trigger the
	      use of copy/1. By	copying	a  binary,  one	 can  dereference  the
	      original,	possibly large,	binary that a smaller binary is	a ref-
	      erence to.

	      Example:

	      store(Binary, GBSet) ->
		NewBin =
		    case binary:referenced_byte_size(Binary) of
			Large when Large > 2 * byte_size(Binary) ->
			   binary:copy(Binary);
			_ ->
			   Binary
		    end,
		gb_sets:insert(NewBin,GBSet).

	      In this example, we chose to copy the binary content before
	      inserting it in gb_sets:set() if it references a binary more
	      than twice the data size we want to keep. Of course, different
	      programs call for different rules about when to copy.

	      Binary sharing occurs whenever binaries are taken apart. This is
	      the fundamental reason why binaries are fast: decomposition can
	      always be done with O(1) complexity. In rare circumstances this
	      data sharing is undesirable, which is why this function together
	      with copy/1 can be useful when optimizing for memory use.

	      Example of binary	sharing:

	      1> A = binary:copy(<<1>>,	100).
	      <<1,1,1,1,1 ...
	      2> byte_size(A).
	      100
	      3> binary:referenced_byte_size(A).
	      100
	      4> <<_:10/binary,B:10/binary,_/binary>> =	A.
	      <<1,1,1,1,1 ...
	      5> byte_size(B).
	      10
	      6> binary:referenced_byte_size(B).
	      100

	  Note:
	      Binary  data is shared among processes. If another process still
	      references the larger binary, copying the	part this process uses
	      only consumes more memory	and does not free up the larger	binary
	      for garbage collection. Use intrusive functions of this kind
	      with extreme care and only if a real problem is detected.

       replace(Subject,	Pattern, Replacement) -> Result

	      Types:

		 Subject = binary()
		 Pattern = binary() | [binary()] | cp()
		 Replacement = Result =	binary()

	      Same as replace(Subject, Pattern,	Replacement,[]).

       replace(Subject,	Pattern, Replacement, Options) -> Result

	      Types:

		 Subject = binary()
		 Pattern = binary() | [binary()] | cp()
		 Replacement = binary()
		 Options = [Option]
		 Option	= global | {scope, part()} | {insert_replaced, InsPos}
		 InsPos	= OnePos | [OnePos]
		 OnePos	= integer() >= 0
		   An integer()	=< byte_size(Replacement)
		 Result	= binary()

	      Constructs a new binary by replacing the parts in	Subject	match-
	      ing Pattern with the content of Replacement.

	      If the matching subpart of Subject giving rise to the
	      replacement is to be inserted in the result, option
	      {insert_replaced, InsPos} inserts the matching part into
	      Replacement at the specified position (or positions) before
	      inserting Replacement into Subject.

	      Example:

	      1> binary:replace(<<"abcde">>,<<"b">>,<<"[]">>, [{insert_replaced,1}]).
	      <<"a[b]cde">>
	      2> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,1}]).
	      <<"a[b]c[d]e">>
	      3> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[]">>,[global,{insert_replaced,[1,1]}]).
	      <<"a[bb]c[dd]e">>
	      4> binary:replace(<<"abcde">>,[<<"b">>,<<"d">>],<<"[-]">>,[global,{insert_replaced,[1,2]}]).
	      <<"a[b-b]c[d-d]e">>

	      If any position specified	in InsPos > size  of  the  replacement
	      binary, a	badarg exception is raised.

	      Options  global and {scope, part()} work as for split/3. The re-
	      turn type	is always a binary().

	      For a description	of Pattern, see	compile_pattern/1.
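
	      The effect of option global can be sketched by comparing
	      replace/3 with replace/4. The module name replace_demo and the
	      functions first_only/1 and everywhere/1 are hypothetical.

```erlang
-module(replace_demo).
-export([first_only/1, everywhere/1]).

%% Replaces only the first comma with a semicolon (no options).
first_only(Bin) when is_binary(Bin) ->
    binary:replace(Bin, <<",">>, <<";">>).

%% Replaces every comma with a semicolon (option global).
everywhere(Bin) when is_binary(Bin) ->
    binary:replace(Bin, <<",">>, <<";">>, [global]).
```

	      With this sketch, replace_demo:first_only(<<"a,b,c">>) returns
	      <<"a;b,c">> and replace_demo:everywhere(<<"a,b,c">>) returns
	      <<"a;b;c">>.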

       split(Subject, Pattern) -> Parts

	      Types:

		 Subject = binary()
		 Pattern = binary() | [binary()] | cp()
		 Parts = [binary()]

	      Same as split(Subject, Pattern, []).

       split(Subject, Pattern, Options)	-> Parts

	      Types:

		 Subject = binary()
		 Pattern = binary() | [binary()] | cp()
		 Options = [Option]
		 Option	= {scope, part()} | trim | global | trim_all
		 Parts = [binary()]

	      Splits Subject into a list of binaries based on Pattern. If  op-
	      tion  global is not specified, only the first occurrence of Pat-
	      tern in Subject gives rise to a split.

	      The parts	of Pattern found in Subject are	not  included  in  the
	      result.

	      Example:

	      1> binary:split(<<1,255,4,0,0,0,2,3>>, [<<0,0,0>>,<<2>>],[]).
	      [<<1,255,4>>, <<2,3>>]
	      2> binary:split(<<0,1,0,0,4,255,255,9>>, [<<0,0>>, <<255,255>>],[global]).
	      [<<0,1>>,<<4>>,<<9>>]

	      Summary of options:

		{scope,	part()}:
		  Works as in match/3 and matches/3. Notice that this only
		  defines the scope of the search for matching strings; it
		  does not cut the binary before splitting. The bytes before
		  and after the scope are kept in the result. See the example
		  below.

		trim:
		  Removes trailing empty parts of the result (as does trim in
		  re:split/3).

		trim_all:
		  Removes all empty parts of the result.

		global:
		  Repeats the split until Subject is  exhausted.  Conceptually
		  option  global makes split work on the positions returned by
		  matches/3, while it normally works on	the position  returned
		  by match/3.

	      Example  of the difference between a scope and taking the	binary
	      apart before splitting:

	      1> binary:split(<<"banana">>, [<<"a">>],[{scope,{2,3}}]).
	      [<<"ban">>,<<"na">>]
	      2> binary:split(binary:part(<<"banana">>,{2,3}), [<<"a">>],[]).
	      [<<"n">>,<<"n">>]

	      The return type is always	a list of binaries that	are all	refer-
	      encing  Subject.	This  means  that  the	data in	Subject	is not
	      copied to	new binaries, and that Subject cannot be garbage  col-
	      lected until the results of the split are	no longer referenced.
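
	      The difference between options trim and trim_all can be
	      sketched as follows. The module name split_demo and the
	      function fields/2 are hypothetical.

```erlang
-module(split_demo).
-export([fields/2]).

%% Splits Bin on every comma. Extra is a list of additional split/3
%% options, for example [], [trim], or [trim_all].
fields(Bin, Extra) when is_binary(Bin), is_list(Extra) ->
    binary:split(Bin, <<",">>, [global | Extra]).
```

	      With this sketch, split_demo:fields(<<",a,,b,,">>, []) returns
	      [<<>>,<<"a">>,<<>>,<<"b">>,<<>>,<<>>]. Passing [trim] drops only
	      the two trailing empty parts, giving [<<>>,<<"a">>,<<>>,<<"b">>],
	      and passing [trim_all] gives [<<"a">>,<<"b">>].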

	      For a description	of Pattern, see	compile_pattern/1.

Ericsson AB			  stdlib 3.8			     binary(3)
