SIMD-VITERBI(3)		   Library Functions Manual	       SIMD-VITERBI(3)

       create_viterbi27,      init_viterbi27,	  update_viterbi27,	chain-
       back_viterbi27, delete_viterbi27, create_viterbi29, init_viterbi29, up-
       date_viterbi29,	chainback_viterbi29,  delete_viterbi29 - IA32 SIMD-as-
       sisted Viterbi decoders

       #include	"viterbi27.h"
       void *create_viterbi27(int blocklen);
       int init_viterbi27(void *vp,int starting_state);
       int update_viterbi27(void *vp,unsigned char sym1,unsigned char sym2);
       int chainback_viterbi27(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi27(void *vp);
       void emms_viterbi27(void);
       extern char id_viterbi27[];

       #include	"viterbi29.h"
       void *create_viterbi29(int blocklen);
       int init_viterbi29(void *vp,int starting_state);
       int update_viterbi29(void *vp,unsigned char sym1,unsigned char sym2);
       int chainback_viterbi29(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi29(void *vp);
       void emms_viterbi29(void);
       extern char id_viterbi29[];

       These functions implement high performance  Viterbi  decoders  for  two
       convolutional  codes:  a	 rate  1/2  constraint	length	7  (k=7)  code
       ("viterbi27") and a rate	1/2 k=9	code ("viterbi29"). The	 decoders  use
       the Intel IA32 SIMD instruction sets, if available, to improve performance.

       There are three different IA32 SIMD instruction sets. The  most	common
       is MMX, first implemented on later Intel	Pentiums and then on the Intel
       Pentium II and most Intel clones	(AMD K6, Transmeta Crusoe, etc).   SSE
       was  introduced	on  the	 Pentium  III and later	implemented in the AMD
       Athlon 4	(AMD calls it "3D Now! Professional"). Most recently, SSE2 was
       introduced  in the Intel	Pentium	4. As of late 2001, there are no other
       known implementations of	SSE2.

       Four separate static libraries implement	the decoders for the four dif-
       ferent  instruction  sets. -lviterbi_port uses no SIMD instructions; it
       is intended for	pre-MMX	 IA32  machines	 and  for  non-IA32  machines.
       -lviterbi_mmx is for IA32 machines that support the MMX instructions;
       -lviterbi_sse  is  for  machines	 with  the   SSE   instructions,   and
       -lviterbi_sse2  is  for	machines with SSE2 support. The	function names
       and calling conventions are the same for all four versions, although
       the sizes of certain internal data structures differ.

       A shared library, -lviterbi, is also provided; it is assumed to refer to
       the correct version for the current machine.

       Two versions of each function are provided, one for the	k=7  code  and
       another for the k=9 code. In the	following discussion the k=7 code will
       be assumed. To use the  k=9  code,  simply  change  all	references  to
       "viterbi27" to "viterbi29".

       Before  Viterbi	decoding  can begin, an	instance must first be created
       with create_viterbi27().	 This function creates and returns  a  pointer
       to  an  internal	 control structure containing the path metrics and the
       branch decisions. create_viterbi27() takes one argument that gives  the
       length  of  the	data  block  in	bits. You must not attempt to decode a
       block longer than the length given to create_viterbi27().

       After a decoder instance	is created, and	before decoding	a  new	frame,
       init_viterbi27()	must be	called to reset	the decoder state.  It accepts
       the instance pointer returned by	 create_viterbi27()  and  the  initial
       starting	state of the convolutional encoder (usually 0).	If the initial
       starting	state is unknown or incorrect, the decoder will	still function
       but the decoded data may	be incorrect at	the start of the block.

       Each  pair  of  received	 symbols  is  processed	 with  a  call	to up-
       date_viterbi27().  Each symbol is expected to range from	0 through  15,
       with  0 corresponding to	a "strong 0" and 15 corresponding to a "strong
       1". The caller is responsible for determining the proper	pairing	of in-
       put symbols (commonly known as decoder symbol phasing).
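
       As a rough sketch of how a demodulator output might be mapped onto
       the 0 through 15 range described above, the helper below assumes a
       soft value normalized to [-1.0, +1.0], with -1.0 meaning a confident
       0 and +1.0 a confident 1. The name quantize_soft4() and the
       normalization are illustrative assumptions, not part of the library.

```c
/* Hypothetical helper, not part of the library: map a soft demodulator
 * output in [-1.0, +1.0] to the 0..15 symbol range, where 0 is a
 * "strong 0" and 15 is a "strong 1". Out-of-range inputs are clipped. */
unsigned char quantize_soft4(double soft)
{
    double scaled = (soft + 1.0) * 7.5;   /* -1.0 -> 0.0, +1.0 -> 15.0 */

    if (scaled < 0.0)
        scaled = 0.0;
    if (scaled > 15.0)
        scaled = 15.0;
    return (unsigned char)(scaled + 0.5); /* round to nearest, result stays <= 15 */
}
```

       Two consecutive quantized values would then form the sym1 and sym2
       arguments of one update_viterbi27() call, once the symbol phasing
       has been established.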

       At  the	end  of	the block, the data is recovered with a	call to	chain-
       back_viterbi27(). The arguments are the	pointer	 to  the  decoder  in-
       stance, a pointer to a user-supplied buffer into	which the decoded data
       is to be	written, the number of data bits (not bytes) that  are	to  be
       decoded,	and the	terminal state of the convolutional encoder at the end
       of the frame (usually 0). If the	terminal state	is  incorrect  or  un-
       known, the decoded data bits at the end of the frame may	be unreliable.
       The decoded data	is written in big-endian order,	i.e., the first	bit in
       the  frame  is written into the high order bit of the first byte	in the
       buffer. If the frame is not an integral number of bytes long,  the  low
       order bits of the last byte in the frame	will be	unused.
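
       The bit ordering described above can be illustrated with two small
       helpers. decoded_bytes() and decoded_bit() are hypothetical names
       introduced here for illustration; they are not provided by the
       library.

```c
#include <stddef.h>

/* Bytes needed to hold nbits decoded bits. When nbits is not a multiple
 * of 8, the low order bits of the last byte are unused. */
size_t decoded_bytes(unsigned int nbits)
{
    return ((size_t)nbits + 7) / 8;
}

/* Return bit number `index` of the decoded frame (0 = first bit).
 * The first bit of the frame is the high order bit of data[0]. */
int decoded_bit(const unsigned char *data, unsigned int index)
{
    return (data[index / 8] >> (7 - index % 8)) & 1;
}
```

       A buffer of decoded_bytes(nbits) bytes is large enough for the data
       written when nbits is passed to chainback_viterbi27().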

       Note that the decoders assume the use of	a tail,	i.e., the encoding and
       transmission of a sufficient number of padding bits beyond the  end  of
       the  user data to force the convolutional encoder into the known	termi-
       nal state given to chainback_viterbi27(). The k=7 code uses 6 tail bits
       (12 tail	symbols) and the k=9 code uses 8 tail bits (16 tail symbols).
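
       To show why k-1 zero tail bits return the encoder to state 0, here
       is a minimal sketch of a generic rate 1/2 convolutional encoder.
       This page does not state which generator polynomials the decoders
       expect, so polya and polyb are caller-supplied parameters, and
       encode_with_tail() is a hypothetical name, not a library function.
       The output is hard symbols (0 or 1), not the 0 through 15 soft
       symbols the decoder takes.

```c
#include <assert.h>
#include <stddef.h>

/* Parity (XOR of all bits) of x. */
static int parity(unsigned int x)
{
    int p = 0;

    while (x) {
        p ^= 1;
        x &= x - 1;   /* clear the lowest set bit */
    }
    return p;
}

/* Encode nbits data bits (one bit per element of bits[]) followed by
 * k-1 zero tail bits. Writes 2*(nbits + k - 1) hard symbols to syms[]
 * and returns the final encoder state, which the tail forces to 0. */
unsigned int encode_with_tail(const unsigned char *bits, size_t nbits,
                              int k, unsigned int polya, unsigned int polyb,
                              unsigned char *syms)
{
    unsigned int mask, sr = 0;
    size_t i, total;

    assert(k >= 2 && k <= 16);
    mask = (1u << k) - 1u;              /* k-bit shift register, newest bit in LSB */
    total = nbits + (size_t)(k - 1);
    for (i = 0; i < total; i++) {
        unsigned int bit = (i < nbits) ? (bits[i] & 1u) : 0u;  /* tail bits are zero */

        sr = ((sr << 1) | bit) & mask;
        syms[2 * i]     = (unsigned char)parity(sr & polya);
        syms[2 * i + 1] = (unsigned char)parity(sr & polyb);
    }
    return sr & (mask >> 1);            /* state = the k-1 most recent input bits */
}
```

       With k=7 the loop appends 6 tail bits and 12 tail symbols; with k=9
       it appends 8 tail bits and 16 tail symbols, matching the counts
       given above.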

       The  tail  bits	are  not  included  in	the  length  arguments to cre-
       ate_viterbi27() and chainback_viterbi27(). For example,	if  the	 block
       contains	 1000 user bits, then this would be the	length parameter given
       to create_viterbi27() and chainback_viterbi27(),	and update_viterbi27()
       would  be called	a total	of 1006	times -	the last 6 with	the 12 encoded
       symbols representing the	tail bits.
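
       The arithmetic in this example can be written out as two helpers.
       update_calls() and frame_symbols() are hypothetical names used only
       for illustration.

```c
/* Number of update calls for a frame of nbits user bits and constraint
 * length k: one call per encoded bit, including the k-1 tail bits. */
unsigned int update_calls(unsigned int nbits, unsigned int k)
{
    return nbits + k - 1;
}

/* Number of received symbols consumed for the same frame: the code is
 * rate 1/2, so each update call takes two symbols. */
unsigned int frame_symbols(unsigned int nbits, unsigned int k)
{
    return 2 * update_calls(nbits, k);
}
```

       For 1000 user bits and k=7 this gives 1006 calls and 2012 symbols,
       while the length passed to create_viterbi27() and
       chainback_viterbi27() remains 1000.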

       After the call to chainback_viterbi27(),	the decoder may	be reset  with
       a  call to init_viterbi27() and another block can be decoded.  Alterna-
       tively, delete_viterbi27() can be called	to free	all resources used  by
       the Viterbi decoder.
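
       The call sequence described above can be sketched as follows. So
       that the sketch compiles without the library, the entry points are
       reached through a hypothetical table, struct viterbi_ops; in a real
       program its members would be set to create_viterbi27,
       init_viterbi27, update_viterbi27, chainback_viterbi27 and
       delete_viterbi27 (or the viterbi29 equivalents). Treating a NULL
       return from the create function as a failure is also an assumption;
       this page does not document error returns.

```c
#include <stddef.h>

/* Hypothetical table of entry points; not part of the library. */
struct viterbi_ops {
    void *(*create)(int blocklen);
    int (*init)(void *vp, int starting_state);
    int (*update)(void *vp, unsigned char sym1, unsigned char sym2);
    int (*chainback)(void *vp, unsigned char *data,
                     unsigned int nbits, unsigned int endstate);
    void (*destroy)(void *vp);
    unsigned int tail_bits;             /* 6 for k=7, 8 for k=9 */
};

/* Decode one frame of nbits user bits. syms[] holds
 * 2*(nbits + tail_bits) symbols in the range 0..15, and out[] must
 * hold (nbits + 7)/8 bytes. Returns 0 on success, -1 on bad arguments
 * or if the create function returns NULL (assumed to mean failure). */
int decode_frame(const struct viterbi_ops *ops, const unsigned char *syms,
                 unsigned int nbits, unsigned char *out)
{
    void *vp;
    unsigned int i;

    if (ops == NULL || syms == NULL || out == NULL)
        return -1;
    vp = ops->create((int)nbits);       /* length excludes the tail */
    if (vp == NULL)
        return -1;
    ops->init(vp, 0);                   /* encoder assumed to start in state 0 */
    for (i = 0; i < nbits + ops->tail_bits; i++)
        ops->update(vp, syms[2 * i], syms[2 * i + 1]);
    ops->chainback(vp, out, nbits, 0);  /* tail assumed to force end state 0 */
    ops->destroy(vp);
    return 0;
}
```

       A program that decodes many frames of the same length could instead
       create the instance once and call only the init, update and
       chainback steps for each frame, deleting the instance at the end.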

       The  MMX	and SSE	versions of the	decoder	use registers aliased onto the
       Intel  floating	point  registers,  so  you  must   insert   calls   to
       emms_viterbi27()	between	calls to update_viterbi27() and	any subsequent
       floating	point computations in your program. You	need not do this after
       every call to update_viterbi27()	if you perform floating	point only af-
       ter the end of the frame.  In this case	you  may  defer	 the  call  to
       emms_viterbi27()	until after chainback_viterbi27() has been called.

       emms_viterbi27()	 is  a	no-op in the portable and SSE2 versions	of the
       decoder,	so you can safely call it regardless of	library	version.  (The
       SSE2  version  uses  the	XMM registers, which do	not interfere with the
       X87 floating point stack. Hence emms calls are not necessary with this
       version.)

       The  global character string id_viterbi27[] identifies the decoder ver-
       sion in use.

       create_viterbi27() returns a pointer to the  structure  containing  the
       decoder	state.	update_viterbi27() returns the amount by which the de-
       coder path metrics were normalized in the current step. Only the	porta-
       ble  C,	SSE  and  SSE2 versions	perform	normalization; the MMX version
       uses modulo arithmetic.

       Phil Karn, KA9Q (


