Genezzo::Block::RDBlock(3)  User Contributed Perl Documentation  Genezzo::Block::RDBlock(3)

NAME
       Genezzo::Block::RDBlock.pm - Row	Directory Block	tied hash class.  A
       class that lets you treat the contents of a block (byte buffer) as a
       hash.

       Note: This implementation is almost, but not quite, a pushhash.  The
       pushhash implementation is Genezzo::Row::RSBlock.  RDBlock also forms
       the basis of a tied array in Genezzo::Block::RDBArray.

SYNOPSIS
	use Genezzo::Block::RDBlock;
	use Genezzo::Block::Std;

	local $Genezzo::Block::Std::DEFBLOCKSIZE = 500;

	my $buff = "\0"	x 500; # construct an empty byte buffer

	my %tied_hash =	();

	my $tie_val =
	    tie	%tied_hash, 'Genezzo::Block::RDBlock', (refbufstr => \$buff);

	# pushhash style
	# (note	that the "PUSH"	pseudo key is not supported)...
	my $newkey = $tie_val->HPush("this is a	test");

	# or array style, your choice
	my $pushcount =	$tie_val->PUSH(qw(push lots of data));

	$tied_hash{$newkey} = "update this entry";

	# a hash that supports array style FETCHSIZE
	my $getcount = $tie_val->FETCHSIZE(); #	Note: not HCount

DESCRIPTION
       RDBlock is the basis for	persistent tied	hashes,	pushhashes, and	tied
       arrays.	After the hash is tied to the byte buffer, the buffer can be
       written to persistent storage.  The storage is designed such that
       inserts/appends/pushes are fairly efficient, and	deletes	are
       inexpensive.  The pctfree/pctused parameters allow some tuning to
       reserve space in	the buffer for updates that "grow" existing values.
       Updates that do not change the packed size of data are about as
       efficient as insert/appends -- just the cost to copy your bytes into
       the buffer -- but updates that do change	the size of stored values can
       require a large amount of byte shifting to open up storage space.
       Also, the buffer does not grow to accommodate large values.  Wrapper
       classes are necessary to	specify	mechanisms for packing complex data
       structures and techniques to split objects across multiple buffers.

ARGUMENTS
       refbufstr (Required)
           A reference to the byte buffer used for storage.

       blocksize (Optional)
           The size of the supplied byte buffer.  Default is
           $Genezzo::Block::Std::DEFBLOCKSIZE.

       pctfree (Optional)
           The percentage of space kept free for future updates.  Default is
           30 (percent).

       pctused (Optional)
           After the block is full, the percentage of space that must be open
           before inserts are re-enabled.  Default is 50 (percent).
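
       The interaction between pctfree and pctused can be pictured with a
       small stand-alone model.  This is only a sketch of one reading of the
       two parameter descriptions, not Genezzo's code: new_block, try_insert,
       release, and the "accepting" flag are hypothetical names, and the
       exact thresholds RDBlock applies internally may differ.

```perl
use strict;
use warnings;

# Toy model of the pctfree/pctused thresholds (hypothetical, not Genezzo code).
sub new_block {
    my (%args) = @_;
    return {
        blocksize => $args{blocksize},
        pctfree   => $args{pctfree} // 30,   # documented default
        pctused   => $args{pctused} // 50,   # documented default
        used      => 0,                      # bytes currently occupied
        accepting => 1,                      # are inserts currently enabled?
    };
}

# Percentage of the block that is currently open.
sub pct_open {
    my ($blk) = @_;
    return 100 * ($blk->{blocksize} - $blk->{used}) / $blk->{blocksize};
}

# Try to insert $len bytes; returns 1 on success, 0 if refused.
sub try_insert {
    my ($blk, $len) = @_;
    return 0 unless $blk->{accepting};
    my $free_after = $blk->{blocksize} - $blk->{used} - $len;
    if (100 * $free_after / $blk->{blocksize} < $blk->{pctfree}) {
        $blk->{accepting} = 0;   # the block now counts as full
        return 0;
    }
    $blk->{used} += $len;
    return 1;
}

# Give back $len bytes; inserts resume only once pctused percent is open.
sub release {
    my ($blk, $len) = @_;
    $blk->{used} -= $len;
    $blk->{accepting} = 1 if pct_open($blk) >= $blk->{pctused};
    return;
}

my $blk     = new_block(blocksize => 500);
my @results = map { try_insert($blk, 100) } 1 .. 4;   # (1, 1, 1, 0)
```

       In this model the fourth insert is refused because it would leave
       only 20 percent free, and inserts stay disabled until at least 50
       percent of the block is open again.  The gap between the two
       thresholds keeps a nearly full block from flipping between "full"
       and "open" on every small delete.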

CONCEPTS
       The structure and techniques for	the Row	Directory Block	are described
       in Chapter 14, "The Tuple-Oriented File System",	of "Transaction
       Processing: Concepts and	Techniques" by Jim Gray	and Andreas Reuter,
       1993.

       A tuple is a collection of values -- in the standard vernacular you
       would call it a "row" in	a database.  The refbufstr argument to the
       hash constructor	is a "block", a	fixed-size contiguous buffer of	bytes.
       When you	write ("STORE")	a value	into the RDBlock hash, it writes an
       entry into the block as a byte string, and reads	("FETCH") work in an
       analogous fashion.

       The RDBlock data	structures refer to stored values as "rows", but the
       basic "STORE" and "FETCH" only understand how to	store and retrieve
       individual byte strings.  Wrapper classes for RDBlock must
       marshal/unmarshal (Freeze/Thaw) between simple strings and more
       complex data structures.
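
       As an illustration of the marshalling step, the sketch below uses the
       core Storable module to turn a nested structure into a single byte
       string and back.  Storable is an assumption made for the example; the
       text above does not say which serializer the Genezzo wrapper classes
       actually use.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# A row as the application sees it: a nested data structure.
my $row = { id => 42, name => 'alice', tags => [ 'a', 'b' ] };

# Marshal to one byte string, the only kind of value STORE understands.
my $bytes = freeze($row);

# ... $bytes is what a wrapper class would push into the tied hash ...

# Unmarshal the byte string that FETCH hands back.
my $copy = thaw($bytes);
```

       Because the buffer does not grow, a wrapper would also need to check
       that the frozen string fits in the block before storing it.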

       The block has some header and footer information, plus a	row directory,
       a data structure	that records the offsets, extents, and status
       information of the stored row data.  While the physical location	of row
       data in a block may change as other rows	are added, deleted or
       modified, the row keeps the same	hash key.
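
       The idea of a stable key over movable data can be sketched with a toy
       in-memory model.  Everything here is hypothetical (push_row,
       fetch_row, update_row, and the two-field directory entry); the real
       block also carries a header, a footer, and per-row status, and its
       layout is not reproduced.  The model assumes rows are stored in key
       order.

```perl
use strict;
use warnings;

# Toy model: row bytes live in $data; the directory maps key -> [offset, length].
my $data = '';
my @dir;

# Append a value and return its generated key.
sub push_row {
    my ($value) = @_;
    push @dir, [ length($data), length($value) ];
    $data .= $value;
    return $#dir;
}

# Look up a row through the directory, never by a remembered offset.
sub fetch_row {
    my ($key) = @_;
    my ($off, $len) = @{ $dir[$key] };
    return substr($data, $off, $len);
}

# Replace a row in place; later rows shift and their offsets are adjusted.
sub update_row {
    my ($key, $value) = @_;
    my ($off, $len) = @{ $dir[$key] };
    substr($data, $off, $len, $value);          # shifts all following bytes
    $dir[$key][1] = length($value);
    my $delta = length($value) - $len;
    $dir[$_][0] += $delta for $key + 1 .. $#dir;
    return;
}

my $k0 = push_row('alpha');
my $k1 = push_row('beta');
my $offset_before = $dir[$k1][0];               # 5
update_row($k0, 'alphabet soup');
my $offset_after  = $dir[$k1][0];               # 13
```

       The second row moves from offset 5 to offset 13 when the first row
       grows, yet fetch_row($k1) still returns "beta" because callers hold
       the key rather than the offset.  The same model shows why an update
       that changes a value's size costs a byte shift, while a same-size
       update does not.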

       Each row	has an associated "status" bitfield, which is some combination
       of the following	values:

       deleted
           set if row is deleted.  Deleted rows are simply marked as deleted,
           but the physical storage is not immediately reclaimed.

       data
	   set for data	rows, unset for	metadata.  All information stored via
	   the standard	public interfaces is data.  You	can manipulate the
	   private interfaces to store "metadata", additional rows that
	   describe, for example, block	contents, transaction information, or
	   data	relationships, but are invisible to the	public interfaces.  By
	   convention, row 0 is	always a metadata row.

       lock
	   set if row is locked.  Not used in this base	class -- provided for
	   subclasses that must	supply and maintain the	appropriate metadata
	   to identify locker and transaction information.

       head
	   set if the stored value is the very first part of a row.

       tail
	   set if the stored value is the very last part of a row.  If "STORE"
	   writes a complete value it sets both	head and tail to true.	The
	   base	class only writes rows that fit	in a single block, so both
	   head	and tail are always set.

	   These flags are useful if you wish to write subclasses with rows
	   that	span multiple blocks.  Neither head nor	tail is	set if only
	   the middle section of a multi-part row is stored.

       isnull
	   If you supply "STORE" with a	value of undef,	it writes a marker for
	   a zero-length string	and sets this flag.  "FETCH" will correctly
	   return an undef.

	   Note: When packing more complex data	structures, make sure to use
	   an encoding that distinguishes between undefs and zero-length
	   strings.  A simple scheme for packing an array of strings is	to
	   prefix the packed array with	a bitstring that specifies which
	   entries are null.
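
       The null-bitstring scheme mentioned in the note can be sketched as
       follows.  The layout (an entry count, a bitmap with one bit per
       entry, then each non-null entry with a length prefix) and the names
       pack_row and unpack_row are assumptions made for this example, not
       the encoding used by any Genezzo wrapper class.  It works on byte
       strings only.

```perl
use strict;
use warnings;

# Pack a list of strings, some possibly undef, into one byte string.
# Layout assumed for this sketch:
#   N      number of entries
#   b*     null bitmap, one bit per entry, padded to whole bytes
#   N/a*   each non-null entry as a length-prefixed byte string
sub pack_row {
    my @vals = @_;
    my $bits = join '', map { defined $_ ? '0' : '1' } @vals;
    my $out  = pack('N', scalar @vals) . pack('b*', $bits);
    $out .= pack('N/a*', $_) for grep { defined } @vals;
    return $out;
}

# Reverse of pack_row: returns the original list, undefs included.
sub unpack_row {
    my ($bytes) = @_;
    my $count  = unpack('N', $bytes);
    my $maplen = int(($count + 7) / 8);            # bitmap size in bytes
    my @isnull = split //, unpack("x4 b$count", $bytes);
    my $pos    = 4 + $maplen;                      # first length prefix
    my @vals;
    for my $i (0 .. $count - 1) {
        if ($isnull[$i]) {
            push @vals, undef;                     # null entries store no bytes
            next;
        }
        my $len = unpack("x$pos N", $bytes);
        push @vals, substr($bytes, $pos + 4, $len);
        $pos += 4 + $len;
    }
    return @vals;
}

my @round_trip = unpack_row(pack_row('a', undef, '', 'xyz'));
```

       After the round trip, the second entry is still undef and the third
       is still a defined, zero-length string, which is the distinction the
       note asks an encoding to preserve.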

FUNCTIONS
       RDBlock supports all standard hash operations, with the exception that
       you cannot create or insert a user key -- you must push new entries and
       use the generated key or	basic iteration	to retrieve your data.

       It also supports	three additional public	methods: an array style	"PUSH"
       and "FETCHSIZE",	plus a PushHash	style HPush.  Note that	these methods
       are associated with the tie value (i.e. the blessed ref for the RDBlock
       class), not the tied hash.  Finally, it has five "private" methods that
       may be of use in constructing subclasses: push_one, packdeleted,
       offset2hkey, lastkey, and prevkey.

       PUSH this, LIST
	   PUSH	appends	the list to the	end of the hash	and returns the	number
	   of items it pushed.

       FETCHSIZE
	   Returns the total number of valid, undeleted	data items in the
	   hash.

       HPush
           HPush appends a single value and returns its new key.  It accepts
           only a single argument, not a list.

	   my $newkey =	$tie_val->HPush("this is a test");

           Note that there is no corresponding "pop" operation.

   EXPORT
       DATAROW,	RowStat

LIMITATIONS
       The storage mechanism uses network longs (32-bit, big-endian integers)
       to describe the lengths of rows and offsets within the block.  That
       seems larger than necessary: shorts would restrict blocksize and row
       piece length to 64K.  Alternatively, init could take an optional
       module name for block type, which would allow the row directory,
       header, and footer sizing to vary.
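
       The size trade-off can be checked with Perl's pack templates.  The
       sketch assumes that "network long" and "short" correspond to the "N"
       and "n" templates, and that a directory entry holds one offset and
       one length; neither detail is confirmed by the text above.

```perl
use strict;
use warnings;

# Bytes per field and largest representable value for each template.
my %field = (
    N => {
        bytes => length(pack('N', 0)),
        max   => unpack('N', pack('N', 0xFFFFFFFF)),
    },
    n => {
        bytes => length(pack('n', 0)),
        max   => unpack('n', pack('n', 0xFFFF)),
    },
);

# Directory overhead for 100 rows, assuming two fields (offset, length)
# per entry.  The two-field entry is an illustration only.
my $rows     = 100;
my %overhead = map { $_ => $rows * 2 * $field{$_}{bytes} } keys %field;
```

       Under these assumptions, 100 entries cost 800 bytes with "N" and 400
       bytes with "n", and "n" caps any offset or length at 65535.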

TODO
       use row directory rowlen	vs len/value for row storage
       meta row	- should binary	search for meta	id
       unicode support

AUTHOR
       Jeffrey I. Cohen, jcohen@genezzo.com

SEE ALSO
       perl(1).

       Copyright (c) 2003-2007 Jeffrey I Cohen.	 All rights reserved.

	   This	program	is free	software; you can redistribute it and/or modify
	   it under the	terms of the GNU General Public	License	as published by
	   the Free Software Foundation; either	version	2 of the License, or
	   any later version.

	   This	program	is distributed in the hope that	it will	be useful,
	   but WITHOUT ANY WARRANTY; without even the implied warranty of
	   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the
	   GNU General Public License for more details.

	   You should have received a copy of the GNU General Public License
	   along with this program; if not, write to the Free Software
	   Foundation, Inc., 51	Franklin St, Fifth Floor, Boston, MA  02110-1301  USA

       Address bug reports and comments	to: jcohen@genezzo.com

       For more	information, please visit the Genezzo homepage at
       <http://www.genezzo.com>

perl v5.32.0			  2007-11-18	    Genezzo::Block::RDBlock(3)
