INDEXING(1)	      User Contributed Perl Documentation	   INDEXING(1)

NAME
       PDL::Indexing - Introduction to indexing	and slicing piddles.

OVERVIEW
       This man	page should serve as a first tutorial on the indexing and
       threading features of PDL.

       Like all	vectorized languages, PDL automates looping over multi-
       dimensional data	structures ("piddles") using a variant of mathematical
       vector notation.	 The automatic looping is called "threading", in part
       because ultimately PDL will implement parallel processing to speed up
       the loops.

       A lot of	the flexibility	and power of PDL relies	on the indexing	and
       threading features of the Perl extension.  Indexing allows access to
       the data	of a piddle in a very flexible way.  Threading provides
       efficient vectorization of simple operations.

       The values of a piddle are stored compactly as typed values in a	single
       block of	memory,	not (as	in a normal Perl list-of-lists)	as individual
       Perl scalars.

       In the sections that follow many	"methods" are called out -- these are
       Perl operators that apply to piddles.  From the perldl (or pdl2)	shell,
       you can find out	more about each	method by typing "?" followed by the
       method name.

   Dimension lists
       A piddle	(PDL variable),	in general, is an N-dimensional	array where N
       can be 0	(for a scalar),	1 (e.g.	for a sound sample), or	higher values
       for images and more complex structures.	Each dimension of the piddle
       has a positive integer size.  The "perl"	interpreter treats each	piddle
       as a special type of Perl scalar	(a blessed Perl	object,	actually --
       but you don't have to know that to use them) that can be	used anywhere
       you can put a normal scalar.

       You can access the dimensions of	a piddle as a Perl list	and otherwise
       determine the size of a piddle with several methods.  The important
       ones are:

       nelem - the total number	of elements in a piddle
       ndims - returns the number of dimensions	in a piddle
       dims - returns the dimension list of a piddle as	a Perl list
       dim - returns the size of a particular dimension	of a piddle
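
       For example, in the perldl shell (the shape here is purely
       illustrative):

        pdl> $x = sequence(3,4,5)
        pdl> p $x->nelem           # 60
        pdl> p $x->ndims           # 3
        pdl> p join ',', $x->dims  # 3,4,5
        pdl> p $x->dim(1)          # 4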

   Indexing and	Dataflow
       PDL maintains a notion of "dataflow" between a piddle and indexed
       subfields of that piddle.  When you produce an indexed subfield or
       single element of a parent piddle, the child and	parent remain attached
       until you manually disconnect them.  This lets you represent the	same
       data different ways within your code -- for example, you	can consider
       an RGB image simultaneously as a	collection of (R,G,B) values in	a 3 x
       1000 x 1000 image, and as three separate	1000 x 1000 color planes
       stored in different variables.  Modifying any of	the variables changes
       the underlying memory, and the changes are reflected in all
       representations of the data.
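
       For example (a minimal sketch using "slice" and the propagated
       assignment operator ".=", both described in detail below):

        pdl> $rgb = zeroes(byte, 3, 1000, 1000)  # an RGB image
        pdl> $red = $rgb->slice('(0),:,:')        # 1000x1000 view of the red plane
        pdl> $red .= 255                          # flows back into $rgb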

       There are two important methods that let	you control dataflow
       connections between a child and parent piddle:

       copy - forces an	explicit copy of a piddle
       sever - breaks the dataflow connection between a	piddle and its parents
       (if any)
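
       A minimal sketch of the difference (using "slice", described below):

        pdl> $a = sequence(10)
        pdl> $b = $a->slice('0:4')->copy   # explicit, detached copy
        pdl> $b .= 0                       # $a is unchanged
        pdl> $c = $a->slice('0:4')         # still connected to $a
        pdl> $c->sever                     # break the dataflow connection
        pdl> $c .= 0                       # now $a is again unchanged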

   Threading and Dimension Order
       Most PDL	operations act on the first few	dimensions of their piddle
       arguments.  For example,	"sumover" sums all elements along the first
       dimension in the	list (dimension	0).  If	you feed in a three-
       dimensional piddle, then	the first dimension is considered the "active"
       dimension and the later dimensions are "thread" dimensions because they
       are simply looped over.	There are several ways to transpose or re-
       order the dimension list	of a piddle.  Those techniques are very	fast
       since they don't	touch the underlying data, only	change the way that
       PDL accesses the	data.  The main	dimension ordering functions are:

       mv - moves a particular dimension somewhere else	in the dimension list
       xchg - exchanges	two dimensions in the dimension	list, leaving the rest
       alone
       reorder - allows	wholesale mixing of the	dimensions
       clump - clumps together two or more small dimensions into one larger
       one
       squeeze - eliminates any	dimensions of size 1
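
       A quick sketch of the effect of each on a piddle's dimension list
       (sizes chosen so every slot is distinguishable):

        pdl> $a = zeroes(2,3,4,1)
        pdl> p join ',', $a->mv(2,0)->dims          # 4,2,3,1
        pdl> p join ',', $a->xchg(0,1)->dims        # 3,2,4,1
        pdl> p join ',', $a->reorder(2,0,1,3)->dims # 4,2,3,1
        pdl> p join ',', $a->clump(2)->dims         # 6,4,1
        pdl> p join ',', $a->squeeze->dims          # 2,3,4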

   Physical and Dummy Dimensions
       [ This section is still to be written; planned topics: ]

       o    document Perl level threading

       o    threadids

       o    update and correct description of slice

       o    new	functions in slice.pd (affine, lag, splitdim)

       o    reworking of paragraph on explicit threading

Indexing and threading with PDL
       A lot of	the flexibility	and power of PDL relies	on the indexing	and
       looping features	of the Perl extension. Indexing	allows access to the
       data of a piddle	in a very flexible way.	Threading provides efficient
       implicit	looping	functionality (since the loops are implemented as
       optimized C code).

       Piddles are Perl	objects	that represent multidimensional	arrays and
       operations on those. In contrast	to simple Perl @x style	lists the
       array data is compactly stored in a single block	of memory thus taking
       up a lot	less memory and	enabling use of	fast C code to implement
       operations (e.g.	addition, etc) on piddles.

   piddles can have children
       Central to many of the indexing capabilities of PDL are the relation of
       "parent"	and "child" between piddles. Many of the indexing commands
       create a	new piddle from	an existing piddle. The	new piddle is the
       "child" and the old one is the "parent".	The data of the	new piddle is
       defined by a transformation that	specifies how to generate (compute)
       its data from the parent's data. The relation between the child piddle
       and its parent is often bidirectional, meaning that changes in the
       child's data are propagated back to the parent. (Note: this
       terminology already points towards the dataflow features of PDL.  The
       kind of dataflow that is used by the indexing commands (about which
       you will learn in a minute) is always in operation, not only when you
       have explicitly switched on dataflow in your piddle by saying
       "$a->doflow". For further information about dataflow check the
       dataflow man page.)

       Another way to interpret the piddles created by our indexing commands
       is to view them as a kind of intelligent pointer that points back to
       some portion or all of its parent's data. Therefore, it is not
       surprising that the parent's data (or a portion of it) changes when
       manipulated through this "pointer". After these introductory remarks,
       which hopefully prepared you for what is coming (rather than
       confusing you too much), we are going to dive right in and start with
       a description of the indexing commands and some typical examples of
       how they might be used in PDL programs. We will further illustrate
       the pointer/dataflow analogies in the context of some of the examples
       later on.

       There are two different implementations of this ``smart pointer''
       relationship: the first one, which is a little slower but works for
       any transformation, is simply to do the transformation forwards and
       backwards as necessary. The other is to consider the child piddle a
       ``virtual'' piddle, which only stores a pointer to the parent and
       access information so that routines which use the child piddle
       actually directly access the data in the parent.  If the virtual
       piddle is given to a routine which cannot use it, PDL transparently
       physicalizes the virtual piddle before letting the routine use it.

       Currently (1.94_01), all transformations which are ``affine'' -- i.e.
       where the indices of the data item in the parent piddle are
       determined by a linear transformation (+ constant) from the indices
       of the child piddle -- result in virtual piddles. All other indexing
       routines (e.g.  "->index(...)") result in physical piddles.  All
       routines compiled by PP can accept affine piddles (except those
       routines that pass pointers to external library functions).

       Note that whether something is affine or	not does not affect the
       semantics of what you do	in any way: both

	$a->index(...) .= 5;
	$a->slice(...) .= 5;

       change the data in $a. The affinity does, however, have a significant
       impact on memory	usage and performance.
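
       A rough sketch of that difference (the sizes are arbitrary; "slice"
       and "index" are described below):

        pdl> $a = sequence(1000)
        pdl> $s = $a->slice('0:9')         # affine: a cheap virtual piddle
        pdl> $i = $a->index(sequence(10))  # non-affine: a physical piddle
        pdl> $s .= 0                       # both nevertheless update $a
        pdl> $i .= 1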

   Slicing piddles
       Probably	the most important application of the concept of parent/child
       piddles is the representation of	rectangular slices of a	physical
       piddle by a virtual piddle. Having talked long enough about concepts
       let's get more specific.	Suppose	we are working with a 2D piddle
       representing a 5x5 image (it's unusually small so that we can print
       it without filling several screens full of digits ;).

	pdl> $im = sequence(5,5)
	pdl> p $im

	[
	 [ 0  1	 2  3  4]
	 [ 5  6	 7  8  9]
	 [10 11	12 13 14]
	 [15 16	17 18 19]
	 [20 21	22 23 24]
	]

	pdl> help vars
	PDL variables in package main::

	Name	     Type   Dimension	    Flow  State		 Mem
	----------------------------------------------------------------
	$im	     Double D [5,5]		   P		0.20Kb

       (The "help vars" command, used above, prints information about the
       piddles currently defined in the interactive "perldl" or "pdl2" shell
       that comes with PDL.)

       Now suppose we want to create a 1-D piddle that just references one
       line of the image, say line 2; or a piddle that represents all even
       lines of	the image (imagine we have to deal with	even and odd frames of
       an interlaced image due to some peculiar	behaviour of our frame
       grabber). As another frequent application of slices we might want to
       create a	piddle that represents a rectangular region of the image with
       top and bottom reversed.	All these effects (and many more) can be
       easily achieved with the	powerful slice function:

	pdl> $line = $im->slice(':,(2)')
	pdl> $even = $im->slice(':,1:-1:2')
	pdl> $area = $im->slice('3:4,3:1')
	pdl> help vars	# or just PDL->vars
	PDL variables in package main::

	Name	     Type   Dimension	    Flow  State		 Mem
	----------------------------------------------------------------
	$even	     Double D [5,2]		   -C		0.00Kb
	$im	     Double D [5,5]		   P		0.20Kb
	$line	     Double D [5]		   -C		0.00Kb
	$area	     Double D [2,3]		   -C		0.00Kb

       All three "child" piddles are children of $im or	in the other (largely
       equivalent) interpretation pointers to data of $im.  Operations on
       those virtual piddles access only those portions	of the data as
       specified by the	argument to slice. So we can just print	line 2:

	pdl> p $line
	[10 11 12 13 14]

       Also note the difference	in the "Flow State" of $area above and below:

	pdl> p $area
	pdl> help $area
	This variable is Double	D [2,3]		       VC	    0.00Kb

       The following demonstrates that $im and $line really behave as you
       would expect from a pointer-like	object (or in the dataflow picture:
       the changes in $line's data are propagated back to $im):

	pdl> $im++
	pdl> p $line
	[11 12 13 14 15]
	pdl> $line += 2
	pdl> p $im

	[
	 [ 1  2	 3  4  5]
	 [ 6  7	 8  9 10]
	 [13 14	15 16 17]
	 [16 17	18 19 20]
	 [21 22	23 24 25]
	]

       Note how assignment operations on the child virtual piddles change
       the parent physical piddle and vice versa (the basic "=" assignment
       doesn't, however; use ".=" to obtain that effect -- see below for the
       reasons).  The virtual child piddles are something like "live links"
       to the "original" parent piddle. As previously said, they can be
       thought of as working similarly to a C pointer. But in contrast to a
       C pointer they carry a lot more information. Firstly, they specify
       the structure of the data they represent (the dimensionality of the
       new piddle) and secondly, they specify how to create this structure
       from the parent's data (the way this works is buried in the internals
       of PDL and not important for you to know anyway, unless you want to
       hack the core in the future or would like to become a PDL guru in
       general; for a definition of this strange creature see
       PDL::Internals).

       The previous examples have demonstrated typical usage of	the slice
       function. Since the slicing functionality is so important here is an
       explanation of the syntax for the string	argument to slice:

	$vpdl =	$a->slice('ind0,ind1...')

       where "ind0" specifies what to do with index No 0 of the	piddle $a,
       etc. Each element of the	comma separated	list can have one of the
       following forms:

       ':'   Use the whole dimension

       'n'   Use only index "n". The dimension of this index in	the resulting
	     virtual piddle is 1. An example involving those first two index
	     formats:

	      pdl> $column = $im->slice('2,:')
	      pdl> $row	= $im->slice(':,0')
	      pdl> p $column

	      [
	       [ 3]
	       [ 8]
	       [15]
	       [18]
	       [23]
	      ]

	      pdl> p $row

	      [
	       [1 2 3 4	5]
	      ]

	      pdl> help	$column
	      This variable is Double D	[1,5]		     VC		  0.00Kb

	      pdl> help	$row
	      This variable is Double D	[5,1]		     VC		  0.00Kb

       '(n)' Use only index "n". This dimension	is removed from	the resulting
	     piddle (relying on	the fact that a	dimension of size 1 can	always
	     be	removed). The distinction between this case and	the previous
	     one becomes important in assignments where	left and right hand
	     side have to have appropriate dimensions.

	      pdl> $line = $im->slice(':,(0)')
	      pdl> help	$line
	      This variable is Double D	[5]		     -C		  0.00Kb

	      pdl> p $line
	      [1 2 3 4 5]

	     Spot the difference from the previous example?

       'n1:n2' or 'n1:n2:n3'
	     Take the range of indices from "n1" to "n2" or (second form) take
	     the range of indices from "n1" to "n2" with step "n3". An example
	     for the use of this format	is the previous	definition of the sub-
	     image composed of even lines.

	      pdl> $even = $im->slice(':,1:-1:2')

	     This example also demonstrates that negative indices work like
	     they do for normal Perl style arrays by counting backwards from
	     the end of the dimension. If "n2" is smaller than "n1" (in the
	     example -1 is equivalent to index 4) the elements in the virtual
	     piddle are effectively reversed with respect to its parent.
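
	     For instance, reversing a 1-D piddle:

	      pdl> p sequence(5)->slice('-1:0')
	      [4 3 2 1 0]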

       '*[n]'
	     Add a dummy dimension. The	size of	this dimension will be 1 by
	     default or	equal to "n" if	the optional numerical argument	is
	     given.

	     Now, this is really something a bit strange at first sight. What
	     is a dummy dimension? A dummy dimension inserts a dimension
	     where there wasn't one before. How is that done? Well, in the
	     case of the new dimension having size 1 it can be easily
	     explained by the way in which you can identify a vector (with
	     "m" elements) with a "(1,m)" or "(m,1)" matrix. The same holds
	     obviously for higher dimensional objects. More interesting is
	     the case of a dummy dimension of size greater than one (e.g.
	     "slice('*5,:')"). This works in the same way as a call to the
	     dummy function creates a new dummy dimension.  So read on and
	     check its explanation below.
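
	     A quick check of the resulting shape:

	      pdl> p join ',', sequence(5)->slice('*3,:')->dims
	      3,5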

       '([n1:n2[:n3]]=i)'
	     [Not yet implemented?]  With an argument like this you make
	     generalised diagonals. The diagonal will be dimension no. "i" of
	     the new output piddle and (if the optional part in brackets is
	     specified) will extend along the specified range of indices of
	     the respective parent piddle's dimension. In general an argument
	     like this only makes sense if there are other arguments like
	     this in the same call to slice. The part in brackets is optional
	     for this type of argument. All arguments of this type that
	     specify the same target dimension "i" have to relate to the same
	     number of indices in their parent dimension. The best way to
	     explain it is probably to give an example; here we make a piddle
	     that refers to the elements along the space diagonal of its
	     parent piddle (a cube):

	      $cube = zeroes(5,5,5);
	      $sdiag = $cube->slice('(=0),(=0),(=0)');

	     The above command creates a virtual piddle that represents the
	     diagonal along the parent's dimensions no. 0, 1 and 2 and makes
	     it dimension 0 (the only dimension) of the result. You use the
	     extended syntax if the parent dimensions you want to build the
	     diagonal from have different sizes or if you want to reverse the
	     sequence of elements in the diagonal, e.g.

	      $rect = zeroes(12,3,5,6,2);
	      $vpdl = $rect->slice('2:7,(0:1=1),(4),(5:4=1),(=1)');

	     So the elements of $vpdl will then be related to those of its
	     parent in a way we can express as:

	       vpdl(i,j) = rect(i+2,j,4,5-j,j)	     0<=i<6, 0<=j<2

       [ work in the new index function: "$b = $a->index($c);" ???? ]

   There are different kinds of	assignments in PDL
       The previous examples have already shown	that virtual piddles can be
       used to operate on or access portions of	data of	a parent piddle. They
       can also	be used	as lvalues in assignments (as the use of "++" in some
       of the examples above has already demonstrated).	For explicit
       assignments to the data represented by a	virtual	piddle you have	to use
       the overloaded ".=" operator (which in this context we call propagated
       assignment). Why	can't you use the normal assignment operator "="?

       Well, you definitely still can use the '=' operator, but it wouldn't
       do what you want. This is due to the fact that the '=' operator
       cannot be overloaded in the same way as other assignment operators.
       If we tried to use '=' to assign data to a portion of a physical
       piddle through a virtual piddle we wouldn't achieve the desired
       effect.  Instead, the variable representing the virtual piddle (a
       reference to a blessed thingy) would after the assignment just
       contain the reference to another blessed thingy, which would behave
       to future assignments as a "physical" copy of the original rvalue
       [this is actually not yet clear and the subject of discussions on the
       PDL developers mailing list]. In that sense it would break the
       connection of the piddle to the parent [ isn't this behaviour in a
       sense the opposite of what happens in dataflow, where ".=" breaks the
       connection to the parent? ].

       E.g.

	pdl> $line = $im->slice(':,(2)')
	pdl> $line = zeroes(5);
	pdl> $line++;
	pdl> p $im

	[
	 [ 1  2	 3  4  5]
	 [ 6  7	 8  9 10]
	 [13 14	15 16 17]
	 [16 17	18 19 20]
	 [21 22	23 24 25]
	]

	pdl> p $line
	[1 1 1 1 1]

       But using ".="

	pdl> $line = $im->slice(':,(2)')
	pdl> $line .= zeroes(5)
	pdl> $line++
	pdl> p $im

	[
	 [ 1  2	 3  4  5]
	 [ 6  7	 8  9 10]
	 [ 1  1	 1  1  1]
	 [16 17	18 19 20]
	 [21 22	23 24 25]
	]

	pdl> print $line
	[1 1 1 1 1]

       Also, you can substitute

	pdl> $line .= 0;

       for the assignment above (the zero is converted to a scalar piddle
       with no dimensions, so it can be assigned to any piddle).

       A nice feature in recent perl versions is lvalue subroutines (i.e.,
       versions 5.6.x and higher, including all perls currently supported by
       PDL).  This allows you to use the slicing syntax on both sides of the
       assignment:

	pdl> $im->slice(':,(2)') .= zeroes(5)->xvals->float

       Related to the lvalue sub assignment feature is a little	trap for the
       unwary: recent perls introduced a "feature" which breaks	PDL's use of
       lvalue subs for slice assignments when running under the	perl debugger,
       "perl -d".  Under the debugger, the above usage gives an	error like: "
       Can't return a temporary	from lvalue subroutine... " So you must	use
       syntax like this:

	pdl> ($pdl = $im->slice(':,(2)')) .= zeroes(5)->xvals->float

       which works both	with and without the debugger but is arguably clumsy
       and awkward to read.

       Note that there can be a	problem	with assignments like this when	lvalue
       and rvalue piddles refer	to overlapping portions	of data	in the parent
       piddle:

	# reverse the elements of line 1 of $a
	($tmp = $a->slice(':,(1)')) .= $a->slice('-1:0,(1)');

       Currently, the parent data on the right side of the assignments is not
       copied before the (internal) assignment loop proceeds. Therefore, the
       outcome of this assignment will depend on the sequence in which
       elements are assigned, and will almost certainly not do what you
       wanted.  The semantics are therefore currently undefined and liable
       to change at any time.  To obtain the desired behaviour, use

	($tmp =	$a->slice(':,(1)')) .= $a->slice('-1:0,(1)')->copy;

       which makes a physical copy of the slice	or

	($tmp =	$a->slice(':,(1)')) .= $a->slice('-1:0,(1)')->sever;

       which returns the same slice but	severs the connection of the slice to
       its parent.

   Other functions that	manipulate dimensions
       Having talked extensively about the slice function it should be noted
       that this is not	the only PDL indexing function.	There are additional
       indexing	functions which	are also useful	(especially in the context of
       threading which we will talk about later). Here is a list and some
       examples of how to use them.

       "dummy"
	   inserts a dummy dimension of	the size you specify (default 1) at
	   the chosen location.	You can't wait to hear how that	is achieved?
	   Well, all elements with index "(X,x,Y)" ("0<=x<size_of_dummy_dim")
	   just	map to the element with	index "(X,Y)" of the parent piddle
	   (where "X" and "Y" refer to the group of indices before and after
	   the location	where the dummy	dimension was inserted.)
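
	   A tiny example of this mapping:

	    pdl> $a = pdl(1,2,3)
	    pdl> p $a->dummy(0,2)
	    [
	     [1 1]
	     [2 2]
	     [3 3]
	    ]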

	   This	example	calculates the x coordinate of the centroid of an
	   image (later	we will	learn that we didn't actually need the dummy
	   dimension thanks to the magic of implicit threading;	but using
	   dummy dimensions the	code would also	work in	a thread-less world;
	   though once you have	worked with PDL	threads	you wouldn't want to
	   live	without	them again).

	    # centroid
	    ($xd,$yd) =	$im->dims;
	    $xc	= sum($im*xvals(zeroes($xd))->dummy(1,$yd))/sum($im);

	   Let's explain how that works	in a little more detail. First,	the
	   product:

	    $xvs = xvals(zeroes($xd));
	    print $xvs->dummy(1,$yd);	   # repeat the	line $yd times
	    $prod = $im*xvs->dummy(1,$yd); # form the pixel-wise product with
					   # the repeated line of x-values

	   The rest is then summing the	results	of the pixel-wise product
	   together and	normalizing with the sum of all	pixel values in	the
	   original image thereby calculating the x-coordinate of the "center
	   of mass" of the image (interpreting pixel values as local mass)
	   which is known as the centroid of an	image.

	   Next is a (from the point of view of memory consumption) very
	   cheap conversion from grey-scale to RGB, i.e. every pixel now
	   holds a triple of values instead of a scalar. The three values in
	   the triple are, fortunately, all the same for a grey image, so
	   that our trick works well in that it maps all three members of the
	   triple to the same source element:

	    # a	cheap grey-scale to RGB	conversion
	    $rgb = $grey->dummy(0,3)

	   Unfortunately this trick cannot be used to convert your old B/W
	   photos to color ones	in the way you'd like. :(

	   Note	that the memory	usage of piddles with dummy dimensions is
	   especially sensitive	to the internal	representation.	If the piddle
	   can be represented as a virtual affine (``vaffine'')	piddle,	only
	   the control structures are stored. But if $b	in

	    $a = zeroes(10000);
	    $b = $a->dummy(1,10000);

	   is made physical by some routine, you will find that the memory
	   usage of your program has suddenly grown by 800MB (10000 x 10000
	   doubles at 8 bytes each).

       "diagonal"
	   replaces two	dimensions (which have to be of	equal size) by one
	   dimension that references all the elements along the	"diagonal"
	   along those two dimensions. Here, we	have two examples which	should
	   appear familiar to anyone who has ever done some linear algebra.
	   Firstly, make an identity matrix:

	    # identity matrix
	    $e = zeroes(float, 3, 3); #	make everything	zero
	    ($tmp = $e->diagonal(0,1)) .= 1; # set the elements	along the diagonal to 1
	    print $e;

	   Or the other	diagonal:

	    ($tmp = $e->slice(':,-1:0')->diagonal(0,1)) .= 2;
	    print $e;

	   (Did you notice how we used the slice function to reverse the
	   order of lines before setting the diagonal of the new child,
	   thereby setting the cross diagonal of the parent?)  Or a mapping
	   from the space of diagonal matrices to the field over which the
	   matrices are defined, the trace of a matrix:

	    # trace of a matrix
	    $trace = sum($mat->diagonal(0,1));	# sum all the diagonal elements

       "xchg" and "mv"
	   xchg	exchanges or "transposes" the two  specified dimensions.  A
	   straightforward example:

	    # transpose	a matrix (without explicitly reshuffling data and
	    # making a copy)
	    $prod = $a x $a->xchg(0,1);

	   $prod should	now be pretty close to the unity matrix	if $a is an
	   orthogonal matrix. Often "xchg" will	be used	in the context of
	   threading but more about that later.

	   mv works in a similar fashion. It moves a dimension (specified by
	   its number in the parent) to	a new position in the new child
	   piddle:

	    $b = $a->mv(4,0);  # make the 5th dimension	of $a the first	in the
			       # new child $b

	   The difference between "xchg" and "mv" is that "xchg" only swaps
	   the positions of two dimensions with each other, whereas "mv"
	   moves the specified dimension to its new position and shifts the
	   other dimensions around accordingly.
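
	   For example, with concrete sizes (a minimal sketch):

	    $a = zeroes(2,3,4,5);
	    print join(',', $a->xchg(1,3)->dims); # 2,5,4,3 (dims 1,3 swapped)
	    print join(',', $a->mv(3,1)->dims);   # 2,5,3,4 (dim 3 now in slot 1)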

       "clump"
	   collapses several dimensions	into one. Its only argument specifies
	   how many dimensions of the source piddle should be collapsed
	   (starting from the first). An (admittedly unrealistic) example is a
	   3D piddle which holds data from a stack of image files that you
	   have	just read in. However, the data	from each image	really
	   represents a	1D time	series and has only been arranged that way
	   because it was digitized with a frame grabber. So to	have it	again
	   as an array of time sequences you say

	    pdl> $seqs = $stack->clump(2)
	    pdl> help vars
	    PDL	variables in package main::

	    Name	 Type	Dimension	Flow  State	     Mem
	    ----------------------------------------------------------------
	    $seqs	 Double	D [8000,50]	       -C	    0.00Kb
	    $stack	 Double	D [100,80,50]	       P	    3.05Mb

	   Unrealistic as it may seem, our confocal microscope software	writes
	   data	(sometimes) this way. But more often you use clump to achieve
	   a certain effect when using implicit	or explicit threading.

   Calls to indexing functions can be chained
       As you might have noticed in some of the	examples above calls to	the
       indexing	functions can be nicely	chained	since all of these functions
       return a	newly created child object. However, when doing	extensive
       index manipulations in a	chain be sure to keep track of what you	are
       doing, e.g.

	$a->xchg(0,1)->mv(0,4)

       moves dimension 1 of $a to position 4, since when the second command
       is executed the original dimension 1 has been moved to position 0 of
       the new child on which the "mv" function is called. I think you get
       the idea (in spite of my convoluted explanations).
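
       A quick check of the chained example (sizes chosen so that every slot
       is distinguishable):

        pdl> $a = zeroes(2,3,4,5,6)
        pdl> p join ',', $a->xchg(0,1)->mv(0,4)->dims
        2,4,5,6,3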

   Propagated assignments ('.=') and dummy dimensions
       A subtlety related to indexing is the assignment	to piddles containing
       dummy dimensions	of size	greater	than 1.	These assignments (using ".=")
       are forbidden, since several elements of the lvalue piddle point to
       the same element of the parent. As a consequence the values of those
       parent elements would be potentially ambiguous and would depend on
       the sequence in which the implementation makes the assignments to
       elements. Therefore, an assignment like this:

	$a = pdl [1,2,3];
	$b = $a->dummy(1,4);
	$b .= yvals(zeroes(3,4));

       can produce unexpected results, and the results are explicitly left
       undefined by PDL: once PDL gets parallel computing features, the
       current result may well change.

       From the	point of view of dataflow the introduction of greater-size-
       than-one	dummy dimensions is regarded as	an irreversible	transformation
       (similar	to the terminology in thermodynamics) which precludes backward
       propagation of assignment to a parent (which you	had explicitly
       requested using the ".="	assignment). A similar problem to watch	out
       for occurs in the context of threading where sometimes dummy dimensions
       are created implicitly during the thread	loop (see below).

   Reasons for the parent/child	(or "pointer") concept
       [ this will have	to wait	a bit ]

	XXXXX being memory efficient
	XXXXX in the context of	threading
	XXXXX very flexible and	powerful way of	accessing portions of piddle data
	      (in much more general way	than sec, etc allow)
	XXXXX efficient	implementation
	XXXXX difference to section/at,	etc.

   How to make things physical again
       [ XXXXX fill in later when everything has settled a bit more ]

	** When	needed (xsub routine interfacing C lib function)
	** How achieved	(->physical)
	** How to test (isphysical (explain how	it works currently))
	** ->copy and ->sever

Threading
       In the previous paragraph on indexing we have already mentioned the
       term occasionally, but now it's really time to talk explicitly about
       "threading" with piddles. The term threading has many different
       meanings in different fields of computing. Within the framework of
       PDL it could probably be loosely defined as an implicit looping
       facility. It is implicit because you don't specify anything like
       enclosing for-loops; rather, the loops are automatically (or
       'magically') generated by PDL based on the dimensions of the piddles
       involved. This should give you a first idea why the index/dimension
       manipulating functions you have met in the previous paragraphs are
       especially important and useful in the context of threading.  The
       other ingredient for threading (apart from the piddles involved) is a
       function that is threading aware (generally, these are PDL::PP
       compiled functions) and that the piddles are "threaded" over.  So
       much for the terminology; now let's try to shed some light on what it
       all means.

   Implicit threading -	a first	example
       There are two slightly different	variants of threading. We start	with
       what we call "implicit threading". Let's	pick a practical example that
       involves	looping	of a function over many	elements of a piddle. Suppose
       we have an RGB image that we want to convert to grey-scale. The RGB
       image is	represented by a 3-dim piddle "im(3,x,y)" where	the first
       dimension contains the three color components of	each pixel and "x" and
       "y" are width and height	of the image, respectively. Next we need to
       specify how to convert a	color-triple at	a given	pixel into a grey-
       value (to be a realistic	example	it should represent the	relative
       intensity with which our	color insensitive eye cells would detect that
       color to	achieve	what we	would call a natural conversion	from color to
       grey-scale). An approximation that works	quite well is to compute the
       grey intensity from each	RGB triplet (r,g,b) as a weighted sum

	grey-value = 77/256*r +	150/256*g + 29/256*b =
	    inner([77,150,29]/256, [r,g,b])

       where the last form indicates that we can write this as an inner
       product of the 3-vector comprising the weights for red, green and blue
       components with the 3-vector containing the color components.
       Traditionally, we might have written a function like the	following to
       process the whole image:

	my @dims=$im->dims;
	# here normally check that first dim has correct size (3), etc
	$grey=zeroes(@dims[1,2]);   # make the piddle for the resulting grey image
	$w = pdl(77,150,29) / 256;  # the vector of weights
	for ($j=0;$j<$dims[2];$j++) {
	   for ($i=0;$i<$dims[1];$i++) {
	       # compute the pixel value
	       $tmp = inner($w,$im->slice(":,($i),($j)"));
	       set($grey,$i,$j,$tmp); # and set it in the grey-scale image
	   }
	}

       Now we write the	same using threading (noting that "inner" is a
       threading aware function	defined	in the PDL::Primitive package)

	$grey = inner($im, pdl(77,150,29)/256);

       We have ended up	with a one-liner that automatically creates the	piddle
       $grey with the right number and size of dimensions and performs the
       loops automatically (these loops	are implemented	as fast	C code in the
       internals of PDL).  Well, we still owe you an explanation how this
       'magic' is achieved.

   How does the example work?
       The first thing to note is that every function that is threading	aware
       (these are without exception functions compiled from concise
       descriptions by PDL::PP,	later just called PP-functions)	expects	a
       defined (minimum) number	of dimensions (we call them core dimensions)
       from each of its	piddle arguments. The inner function expects two one-
       dimensional (input) parameters from which it calculates a zero-
       dimensional (output) parameter. We write	that symbolically as
       "inner((n),(n),[o]())" and call it "inner"'s signature, where n
       represents the size of that dimension. n	being equal in the first and
       second parameter	means that those dimensions have to be of equal	size
       in any call. As a different example take	the outer product which	takes
       two 1D vectors to generate a 2D matrix, symbolically written as
       "outer((n),(m),[o](n,m))". The "[o]" in both examples indicates that
       this (here third) argument is an	output argument. In the	latter example
       the dimensions of first and second argument don't have to agree but you
       see how they determine the size of the two dimensions of	the output
       piddle.
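
       Both signatures are easy to check interactively (a minimal sketch):

        pdl> $a = sequence(3); $b = sequence(4)
        pdl> p inner($a,$a)                  # (n),(n),[o]() -> 0D: 5
        pdl> p join ',', outer($a,$b)->dims  # (n),(m),[o](n,m) -> 3,4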

       Here is the point when threading	finally	enters the game. If you	call
       PP-functions with piddles that have more	than the required core
       dimensions the first dimensions of the piddle arguments are used	as the
       core dimensions and the additional extra	dimensions are threaded	over.
       Let us demonstrate this first with our example above

	$grey =	inner($im,$w); # w is the weight vector	from above

       In this case $w is 1D and so supplies just the core dimension; $im is
       3D, more specifically "(3,x,y)". The first dimension (of size 3) is
       the required core dimension that matches (as required by inner) the
       first (and only) dimension of $w. The second dimension is the first
       thread dimension (of size "x") and the third is here the second
       thread dimension (of size "y"). The output piddle is automatically
       created (the functional form used above is equivalent to setting
       $grey to "null" prior to invocation, as explained below). The output
       dimensions are obtained by appending the loop dimensions (here
       "(x,y)") to the core output dimensions (here 0D) to yield the final
       dimensions of the auto-created piddle (here "0D+2D=2D" to yield a 2D
       output of size "(x,y)").

       So the above command calls the core functionality that computes the
       inner product of	two 1D vectors "x*y" times with	$w and all 1D slices
       of the form "(':,(i),(j)')" of $im and sets the respective elements of
       the output piddle "$grey(i,j)" to the result of each computation. We
       could write that	symbolically as

	$grey(0,0) = f($w,$im(:,(0),(0)))
	$grey(1,0) = f($w,$im(:,(1),(0)))
	    .
	    .
	    .
	$grey(x-2,y-1) = f($w,$im(:,(x-2),(y-1)))
	$grey(x-1,y-1) = f($w,$im(:,(x-1),(y-1)))

       But this	is done	automatically by PDL without writing any explicit Perl
       loops.  We see that the command really creates an output	piddle with
       the right dimensions and	sets the elements indeed to the	result of the
       computation for each pixel of the input image.

       When even more piddles and extra	dimensions are involved	things get a
       bit more	complicated. We	will first give	the general rules how the
       thread dimensions depend	on the dimensions of input piddles enabling
       you to figure out the dimensionality of an auto-created output piddle
       (for any	given set of input piddles and core dimensions of the PP-
       function	in question). The general rules	will most likely appear	a bit
       confusing on first sight	so that	we'll set out to illustrate the	usage
       with a set of further examples (which will hopefully also demonstrate
       that there are indeed many practical situations where threading comes
       in extremely handy).

   A call for coding discipline
       Before we point out the other technical details of threading, please
       note this call for programming discipline when using threading:

       In order	to preserve human readability, PLEASE comment any nontrivial
       expression in your code involving threading.  Most importantly, for any
       subroutine, include information at the beginning	about what you expect
       the dimensions to represent (or ranges of dimensions).

       As a warning, look at this undocumented function	and try	to guess what
       might be	going on:

	sub lookup {
	  my ($im,$palette) = @_;
	  my $res;
	  index($palette->xchg(0,1),
		     $im->long->dummy(0,($palette->dims)[0]),
		     ($res=null));
	  return $res;
	}

       Would you agree that it might be difficult to figure out the expected
       dimensions, the purpose of the routine, etc.?  (If you want to find
       out what this piece of code does, see below.)

   How to figure out the loop dimensions
       There are a couple of rules that allow you to figure out the number
       and size of loop dimensions (and whether the sizes of your input
       piddles comply with the threading rules). Dimensions of any piddle
       argument are broken down into two groups in the following: core
       dimensions (as defined by the PP-function, see Appendix B for a list
       of PDL primitives) and extra dimensions, which comprise all remaining
       dimensions of that piddle. For example, calling a function "func"
       with the signature "func((n,m),[o](n))" with a piddle "a(2,4,7,1,3)"
       as "func($a,($o = null))" results in the semantic splitting of a's
       dimensions into: core dimensions "(2,4)" and extra dimensions
       "(7,1,3)".

       R0    Core dimensions are identified with the first N dimensions	of the
	     respective	piddle argument	(and are required). Any	further
	     dimensions	are extra dimensions and used to determine the loop
	     dimensions.

       R1    The number	of (implicit) loop dimensions is equal to the maximal
	     number of extra dimensions	taken over the set of piddle
	     arguments.

       R2    The size of each of the loop dimensions is	derived	from the size
	     of	the respective dimensions of the piddle	arguments. The size of
	     a loop dimension is given by the maximal size found in any	of the
	     piddles having this extra dimension.

       R3    For all piddles that have a given extra dimension the size must
	     be equal to the size of the loop dimension (as determined by the
	     previous rule) or 1; otherwise a runtime exception is raised. If
	     the size of the extra dimension in a piddle is one it is
	     implicitly treated as a dummy dimension of size equal to that
	     loop dim size when performing the thread loop.

       R4    If	a piddle doesn't have a	loop dimension,	in the thread loop
	     this piddle is treated as if having a dummy dimension of size
	     equal to the size of that loop dimension.

       R5    If	output auto-creation is	used (by setting the relevant piddle
	     to	"PDL->null" before invocation) the number of dimensions	of the
	     created piddle is equal to	the sum	of the number of core output
	     dimensions	+ number of loop dimensions. The size of the core
	     output dimensions is derived from the relevant dimension of input
	     piddles (as specified in the function definition) and the sizes
	     of	the other dimensions are equal to the size of the loop
	     dimension it is derived from. The automatically created piddle
	     will be physical (unless dataflow is in operation).

       In this context,	note that you can run into the problem with assignment
       to piddles containing greater-than-one dummy dimensions (see above).
       Although	your output piddle(s) didn't contain any dummy dimensions in
       the first place they may	end up with implicitly created dummy
       dimensions according to R4.
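
       A minimal illustration of R0, R1 and R5 using sumover (signature
       "sumover((n),[o]())"):

        pdl> $a = sequence(5,3,2)      # core dim (5), extra dims (3,2)
        pdl> sumover($a,($s=null))
        pdl> p join ',', $s->dims      # 3,2: 0 core output dims + 2 loop dims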

       As an example, suppose we have a	(here unspecified) PP-function with
       the signature:

	func((m,n),(m,n,o),(m),[o](m,o))

       and you call it with 3 piddles "a(5,3,10,11)", "b(5,3,2,10,1,12)", and
       "c(5,1,11,12)" as

	func($a,$b,$c,($d=null))

       then the number of loop dimensions is 3 (by "R0+R1" from $b and $c)
       with sizes "(10,11,12)" (by R2); the two output core dimensions are
       "(5,2)" (from the signature of func), resulting in a 5-dimensional
       output piddle $d of size "(5,2,10,11,12)" (see R5). The
       (automatically created) $d is derived from "($a,$b,$c)" in a way that
       can be expressed in pdl pseudo-code as

	$d(:,:,i,j,k) .= func($a(:,:,i,j),$b(:,:,:,i,0,k),$c(:,0,j,k))
	   with 0<=i<10, 0<=j<11, 0<=k<12

       If we analyze the color to grey-scale conversion again with these
       rules in mind we note another great advantage of implicit threading.
       We can call the conversion with a piddle representing a pixel
       ("im(3)"), a line of RGB pixels ("im(3,x)"), a proper color image
       ("im(3,x,y)") or a whole stack of RGB images ("im(3,x,y,z)"). As long
       as $im is of the form "(3,...)" the automatically created output
       piddle will contain the right number of dimensions and contain the
       intensity data as we expect it, since the loops have been implicitly
       performed thanks to implicit threading. You can easily convince
       yourself that calling with a color pixel yields a 0D $grey, with a
       line a 1D "grey(x)", with an image "grey(x,y)", and with a stack a
       converted image stack "grey(x,y,z)".

       Let's fill these	general	rules with some	more life by going through a
       couple of further examples. The reader may try to figure	out equivalent
       formulations with explicit for-looping and compare the flexibility of
       those routines using implicit threading to the explicit formulation.
       Furthermore, especially when using several thread dimensions it is a
       useful exercise to check	the relative speed by doing some benchmark
       tests (which we still have to do).

       First in	the row	is a slightly reworked centroid	example, now coded
       with threading in mind.

	# threaded mult	to calculate centroid coords, works for	stacks as well
	$xc = sumover(($im*xvals(($im->dims)[0]))->clump(2)) /
	      sumover($im->clump(2));

       Let's analyze what's going on step by step. First the product:

	$prod =	$im*xvals(zeroes(($im->dims)[0]))

       This will actually work for $im being one, two, three, and higher
       dimensional. If $im is one-dimensional it's just	an ordinary product
       (in the sense that every	element	of $im is multiplied with the
       respective element of "xvals(...)"), if $im has more dimensions further
       threading is done by adding appropriate dummy dimensions	to
       "xvals(...)"  according to R4.  More importantly, the two sumover
       operations show a first example of how to make use of the dimension
       manipulating commands. A	quick look at sumover's	signature will remind
       you that	it will	only "gobble up" the first dimension of	a given	input
       piddle. But what	if we want to really compute the sum over all elements
       of the first two	dimensions? Well, nothing keeps	us from	passing	a
       virtual piddle into sumover which in this case is formed	by clumping
       the first two dimensions	of the "parent piddle" into one. From the
       point of	view of	the parent piddle the sum is now computed over the
       first two dimensions, just as we wanted, though sumover has just done
       the job as specified by its signature. Got it?

       Another little finesse of writing the code like that: we intentionally
       used "sumover($pdl->clump(2))" instead of "sum($pdl)" so that we can
       either pass just an image "(x,y)" or a stack of images "(x,y,t)" into
       this routine and get either just one x-coordinate or a vector of
       x-coordinates (of size t) in return.

       Another set of common operations are what one could call "projection
       operations". These operations take an N-D piddle as input and return
       a (N-1)-D "projected" piddle. These operations are often performed
       with functions like sumover, prodover, minimum and maximum.  Using
       images as examples again, we might want to calculate the maximum
       pixel value for each line of an image or image stack. We know how to
       do that:

	# maxima of lines (as function of line number and time)
	maximum($stack,($ret=null));

       But what if you want to calculate maxima per column when implicit
       threading always applies the core functionality to the first
       dimension and threads over all others? How can we arrange for the
       core functionality to be applied to the second dimension instead,
       with threading done over the others? Can you guess it? Yes, we make a
       virtual piddle that has the second dimension of the "parent piddle"
       as its first dimension, using the "mv" command.

	# maxima of columns (as	function of column number and time)
	maximum($stack->mv(1,0),($ret=null));

       and calculating all the sums of sub-slices over the third dimension is
       now almost too easy

	# sums of pixels in time (assuming time	is the third dim)
	sumover($stack->mv(2,0),($ret=null));

       Finally,	if you want to apply the operation to all elements (like max
       over all	elements or sum	over all elements) regardless of the
       dimensions of the piddle	in question "clump" comes in handy. As an
       example look at the definition of "sum" (as defined in "Ufunc.pm"):

	sub sum {
	  my ($x) = @_;  # the piddle whose elements we sum
	  PDL::Ufunc::sumover($x->clump(-1),(my $tmp=null));
	  return $tmp->at(); # return a Perl number, not a 0D piddle
	}
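
       For example, the shape of the argument no longer matters:

        pdl> p sum(sequence(3,4))   # 66, same as for sequence(12)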

       We have already mentioned that all basic	operations support threading
       and assignment is no exception. So here are a couple of threaded
       assignments

	pdl> $im = zeroes(byte,	10,20)
	pdl> $line = exp(-rvals(10)**2/9)
	# threaded assignment
	pdl> $im .= $line      # set every line	of $im to $line
	pdl> $im2 .= 5	       # set every element of $im2 to 5

       By now you probably see how it works and	what it	does, don't you?

       To finish the examples in this paragraph here is a function to create
       an RGB image from what is called a palette image. The palette image
       consists of two parts: an image of indices into a color lookup table
       and the color lookup table itself (each index selects, for its pixel,
       one entry of the lookup table). We are going to use a PP-function we
       haven't encountered yet in the previous examples. It is the aptly
       named index function, signature "((n),(),[o]())" (see Appendix B)
       with the core functionality that "index(pdl (0,2,4,5),2,($ret=null))"
       will return the element with index 2 of the first input piddle. In
       this case, $ret will contain the value 4.  So here is the example:

	# a threaded index lookup to generate an RGB, or RGBA or YMCK image
	# from a palette image (represented by a lookup table $palette and
	# a color-index image $im)
	# you can say just dummy(0) since the rules of threading make it fit
	pdl> index($palette->xchg(0,1),
		      $im->long->dummy(0,($palette->dims)[0]),
		      ($res=null));

       Let's go through it and explain the steps involved. Assuming we are
       dealing with an RGB lookup-table, $palette is of size "(3,x)". First
       we exchange the dimensions of the palette so that looping is done
       over the first dimension of $palette (of size 3, representing the r,
       g, and b components). Now looking at $im, we add a dummy dimension of
       size equal to the number of components (in the case we are discussing
       here we could have just used the number 3 since we have 3 color
       components). We can use a dummy dimension since for the red, green
       and blue color components we use the same index from the original
       image; e.g. assuming a certain pixel of $im had the value 4, then the
       lookup should produce the triple

	[palette(0,4),palette(1,4),palette(2,4)]

       for the new red,	green and blue components of the output	image.
       Hopefully by now	you have some sort of idea what	the above piece	of
       code is supposed	to do (it is often actually quite complicated to
       describe	in detail how a	piece of threading code	works; just go ahead
       and experiment a	bit to get a better feeling for	it).

       If you have read	the threading rules carefully, then you	might have
       noticed that we didn't have to explicitly state the size	of the dummy
       dimension that we created for $im; when we create it with size 1	(the
       default)	the rules of threading make it automatically fit to the
       desired size (by	rule R3, in our	example	the size would be 3 assuming a
       palette of size "(3,x)"). Since situations like this occur often in
       practice, this is actually why rule R3 has been introduced (the part
       that makes dimensions of size 1 fit to the thread loop dim size). So
       we can just say

	pdl> index($palette->xchg(0,1),$im->long->dummy(0),($res=null));

       Again, you can convince yourself	that this routine will create the
       right output if called with a pixel ($im	is 0D),	a line ($im is 1D), an
       image ($im is 2D), ..., an RGB lookup table (palette is "(3,x)")	and
       RGBA lookup table (palette is "(4,x)", see e.g. OpenGL).	This
       flexibility is achieved by the rules of threading which are made	to do
       the right thing in most situations.

       To wrap it all up once again, the general idea is as follows. If	you
       want to achieve looping over certain dimensions and have	the core
       functionality applied to	another	specified set of dimensions you	use
       the dimension manipulating commands to create a (or several) virtual
       piddle(s) so that from the point of view of the parent piddle(s) you
       get what you want (always having the signature of the function in
       question and R1-R5 in mind!). Easy, isn't it?

   Output auto-creation	and PP-function	calling	conventions
       At this point we	have to	divert to some technical detail	that has to do
       with the	general	calling	conventions of PP-functions and	the automatic
       creation	of output arguments.  Basically, there are two ways of
       invoking	PDL routines, namely

	$result	= func($a,$b);

       and

	func($a,$b,$result);

       If you are only using implicit threading	then the output	variable can
       be automatically	created	by PDL.	You flag that to the PP-function by
       setting the output argument to a	special	kind of	piddle that is
       returned	from a call to the function "PDL->null"	that returns an
       essentially "empty" piddle (for those interested	in details there is a
       flag in the C pdl structure for this). The dimensions of	the created
       piddle are determined by	the rules of implicit threading: the first
       dimensions are the core output dimensions to which the threading
       dimensions are appended (which are in turn determined by	the dimensions
       of the input piddles as described above).  So you can say

	func($a,$b,($result=PDL->null));

       or

	$result	= func($a,$b)

       which are exactly equivalent.

       Be warned that you can not use output auto-creation when	using explicit
       threading (for reasons explained	in the following section on explicit
       threading, the second variant of	threading).

       In "tight" loops	you probably want to avoid the implicit	creation of a
       temporary piddle	in each	step of	the loop that comes along with the
       "functional" style but rather say

	# create output piddle of appropriate size only at first invocation
	$result = null;
	for (0..$n) {
	     func($a,$b,$result);  # in all but the first invocation $result
				   # is defined and has the right size to
				   # take the output (provided $b's dims
				   # don't change)
	     func2($b);
	     twiddle($result,$a);  # do something with $result and $a
	}

       The take-home message of	this section once more:	be aware of the
       limitation on output creation when using	explicit threading.

   Explicit threading
       Having so far only talked about the first flavour of threading it is
       now about time to introduce the second variant. Instead of shuffling
       around dimensions all the time and relying on the rules of implicit
       threading to get	it all right you sometimes might want to specify in a
       more explicit way how to	perform	the thread loop. It is probably	not
       too surprising that this	variant	of the game is called explicit
       threading.  Now,	before we create the wrong impression: it is not
       either implicit or explicit; the	two flavours do	mix. But more about
       that later.

       The two most used functions with	explicit threading are thread and
       unthread.  We start with	an example that	illustrates typical usage of
       the former:

	[ # ** this is the worst possible example to start with	]
	#  but can be used to show that	$mat +=	$line is different from
	#				$mat->thread(0)	+= $line
	# explicit threading to	add a vector to	each column of a matrix
	pdl> $mat  = zeroes(4,3)
	pdl> $line = pdl (3.1416,2,-2)
	pdl> ($tmp = $mat->thread(0)) += $line

       In this example, "$mat->thread(0)" tells PDL that you want dimension
       0 of this piddle to be threaded over first, leading to a thread loop
       that can be expressed as

	for (j=0; j<3; j++) {
	   for (i=0; i<4; i++) {
	       mat(i,j)	+= src(j);
	   }
	}

       "thread"	takes a	list of	numbers	as arguments which explicitly specify
       which dimensions	to thread over first. With the introduction of
       explicit	threading the dimensions of a piddle are conceptually split
       into three different groups the latter two of which we have already
       encountered: thread dimensions, core dimensions and extra dimensions.

       Conceptually, it	is best	to think of those dimensions of	a piddle that
       have been specified in a	call to	"thread" as being taken	away from the
       set of normal dimensions	and put	on a separate stack. So	assuming we
       have a piddle "a(4,7,2,8)" saying

	$b = $a->thread(2,1)

       creates a new virtual piddle of dimension "b(4,8)" (which we call the
       remaining dims) that also has 2 thread dimensions of size "(2,7)". For
       the purposes of this document we	write that symbolically	as
       "b(4,8){2,7}". An important difference to the previous examples where
       only implicit threading was used	is the fact that the core dimensions
       are matched against the remaining dimensions which are not necessarily
       the first dimensions of the piddle. We will now specify how the
       presence	of thread dimensions changes the rules R1-R5 for thread	loops
       (which apply to the special case	where none of the piddle arguments has
       any thread dimensions).

       T0  Core dimensions are matched against the first n remaining
	   dimensions of the piddle argument (note the difference from R0).
	   Any further remaining dimensions are extra dimensions and are
	   used to determine the implicit loop dimensions.

       T1a The number of implicit loop dimensions is equal to the maximal
	   number of extra dimensions taken over the set of piddle arguments.

       T1b The number of explicit loop dimensions is equal to the maximal
	   number of thread dimensions taken over the set of piddle arguments.

       T1c The total number of loop dimensions is equal	to the sum of explicit
	   loop	dimensions and implicit	loop dimensions. In the	thread loop,
	   explicit loop dimensions are	threaded over first followed by
	   implicit loop dimensions.

       T2  The size of each of the loop	dimensions is derived from the size of
	   the respective dimensions of	the piddle arguments. It is given by
	   the maximal size found in any piddles having	this thread dimension
	   (for	explicit loop dimensions) or extra dimension (for implicit
	   loop	dimensions).

       T3  This	rule applies to	any explicit loop dimension as well as any
	   implicit loop dimension. For	all piddles that have a	given
	   thread/extra	dimension, the size must be equal to the size of the
	   respective explicit/implicit	loop dimension or 1; otherwise a
	   runtime exception is	raised.	If the size of a thread/extra
	   dimension of	a piddle is one, it is implicitly treated as a dummy
	   dimension of	size equal to that of the explicit/implicit loop
	   dimension.

       T4  If a	piddle doesn't have a thread/extra dimension that corresponds
	   to an explicit/implicit loop	dimension, in the thread loop this
	   piddle is treated as	if it had a dummy dimension of size equal to
	   the size of that loop dimension.

       T4a All piddles that do have thread dimensions must have	the same
	   number of thread dimensions.

       T5  Output auto-creation	cannot be used if any of the piddle arguments
	   has any thread dimensions. Otherwise	R5 applies.

       The same	restrictions apply with	regard to implicit dummy dimensions
       (created	by application of T4) as already mentioned in the section on
       implicit	threading: if any of the output	piddles	has an (explicit or
       implicitly created) dummy dimension of size greater than	one, a
       runtime exception will be raised.

       Let us demonstrate these	rules at work in a generic case.  Suppose we
       have a (here unspecified) PP-function with the signature:

	func((m,n),(m),(),[o](m))

       and you call it with 3 piddles "a(5,3,10,11)", "b(3,5,10,1,12)",
       "c(10)" and an output piddle "d(3,11,5,10,12)" (which here cannot be
       automatically created) as

	func($a->thread(1,3),$b->thread(0,3),$c,$d->thread(0,1))

       From the	signature of func and the above	call the piddles split into
       the following groups of core, extra and thread dimensions (written in
       the form	"pdl(core dims){thread dims}[extra dims]"):

	a(5,10){3,11}[]	b(5){3,1}[10,12] c(){}[10] d(5){3,11}[10,12]

       With this to help us along (it is in general helpful to	write the
       arguments down like this	when you start playing with threading and
       want to keep track of what is going on),	we further deduce that the
       number of explicit loop dimensions is 2 (by T1b,	from $a	and $b)	with
       sizes "(3,11)" (by T2); that there are 2	implicit loop dimensions (by
       T1a, from $b and	$d) of size "(10,12)" (by T2); and that	the elements
       of $d are computed from the input piddles in a way that can be
       expressed in pdl	pseudo-code as

	for (l=0;l<12;l++)
	 for (k=0;k<10;k++)
	  for (j=0;j<11;j++)
	   for (i=0;i<3;i++)
	      d(i,j,:,k,l) = func(a(:,i,:,j),b(i,:,k,0,l),c(k))
	      # the constant index 0 in	b is the effect	of treating its
	      # size-1 thread dim as a dummy dimension (the loop index j
	      # would otherwise	appear there)

       Admittedly, this	example	was not	easy in	terms of bookkeeping. It
       serves mostly as	an example of how to figure out	what's going on	when
       you encounter a complicated-looking expression. But now it is really
       time to show that threading is useful by	giving some more of our
       so-called "practical" examples.

       [ The following examples	will need some additional explanations in the
       future. For the moment please try to live with the comments in the code
       fragments. ]

       Example 1:

	# inverse of a matrix represented by eigvecs and eigvals:
	# given	a symmetrical matrix M = A^T x diag(lambda_i) x	A
	#    =>	 inverse M^-1 =	A^T x diag(1/lambda_i) x A
	# first	$tmp = diag(1/lambda_i)*A,
	# then	A^T * $tmp by threaded inner product
	# (index handling so that matrices print correctly under pdl)
	$inv .=	$evecs*0;  # just copy to get an appropriately sized output
	$tmp .=	$evecs;	   # initialise, no back-propagation
	($tmp2 = $tmp->thread(0)) /= $evals;	#  threaded division
	# and now a matrix multiplication in disguise
	PDL::Primitive::inner($evecs->xchg(0,1)->thread(-1,1),
			      $tmp->thread(0,-1),
			      $inv->thread(0,1));
	# alternative for the matrix mult using	implicit threading;
	# the first xchg is only for the transpose
	PDL::Primitive::inner($evecs->xchg(0,1)->dummy(1),
			      $tmp->xchg(0,1)->dummy(2),
			      ($inv=null));
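
       If you want to check that $inv really is	the inverse, a quick test in
       the shell might look like this (a sketch, assuming $M holds the
       original	symmetric matrix and using PDL's overloaded "x" operator for
       matrix multiplication):

	pdl> print $M x	$inv	# should be close to the identity matrix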

       Example 2:

	# outer	product	by threaded multiplication;
	# note that we need an explicit	call to	my_biop1 when using
	# explicit threading
	$res = zeroes(($a->dims)[0],($b->dims)[0]);
	my_biop1($a->thread(0,-1),$b->thread(-1,0),$res->thread(0,1),"*");
	# the same thing by implicit threading with an auto-created piddle
	$res = $a->dummy(1) * $b->dummy(0);
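
       For this	particular operation there is of course	a dedicated
       primitive: "outer" (whose signature is listed in	Appendix B) computes
       the same	product	directly:

	$res = outer($a, $b);	# same outer product, implicitly threaded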

       Example 3:

	# use thread and unthread to shuffle a number of dimensions in
	# one go, without lots of calls	to ->xchg and ->mv;
	# just try it out and compare the child	piddle with its	parent
	$trans = $a->thread(4,1,0,3,2)->unthread;
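
       The same	shuffle	can also be written with the "reorder" method (a
       sketch; "reorder" is available in recent	PDL versions as	a convenience
       for exactly this	kind of	dimension permutation):

	$trans = $a->reorder(4,1,0,3,2);  # assumes $a is 5-dimensional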

       Example 4:

	# calculate a couple of	bounding boxes;
	# $bb will hold	the BBs	as [xmin,xmax],[ymin,ymax],[zmin,zmax]
	# we again use thread and unthread to shuffle dimensions around
	pdl> $bb = zeroes(double, 2,3 );
	pdl> minimum($vertices->thread(0)->clump->unthread(1), $bb->slice('(0),:'));
	pdl> maximum($vertices->thread(0)->clump->unthread(1), $bb->slice('(1),:'));
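
       To see why this works, here is a	shape walk-through (a sketch,
       assuming	$vertices has shape (3,n), or (3,m,n,...) for several
       objects,	which is what the clump	takes care of):

	# $vertices(3,n)		 coordinate dim	comes first
	#   ->thread(0)	   (n){3}	 coordinate dim	on the thread stack
	#   ->clump	   (n){3}	 remaining dims	collapsed into one
	#   ->unthread(1)  (n,3)	 thread	dim back as dimension 1
	# minimum/maximum then consume the n-dim, leaving a (3)	result
	# that matches $bb->slice('(0),:') and $bb->slice('(1),:')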

       Example 5:

	# calculate a self-ratioed (i.e. self-normalized) sequence of images;
	# uses explicit	threading and an implicitly threaded division
	$stack = read_image_stack();
	# calculate the	(per pixel) average of the first $n+1 images
	$aver =	zeroes(($stack->dims)[0,1]);  #	make the output	piddle
	sumover($stack->slice(":,:,0:$n")->thread(0,1),$aver);
	$aver /= ($n+1);
	$stack /= $aver;  # normalize the stack	by a threaded division
	# alternatively, calculate $aver with implicit threading and
	# output auto-creation
	sumover($stack->slice(":,:,0:$n")->mv(2,0),($aver=null));
	$aver /= ($n+1);
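
       Yet another way to obtain $aver (a sketch, assuming the "average"
       function	from PDL::Ufunc	is available) is to let	it do the summing
       and the division	in one go:

	$aver =	$stack->slice(":,:,0:$n")->mv(2,0)->average;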

   Implicit versus explicit threading
       In this section we are going to illustrate when explicit	threading is
       preferable to implicit threading	and vice versa.	But then again,	this
       is probably not the best	way of putting the case	since you already
       know: the two flavours do mix. So it's more about how to	get the	best
       of both worlds and, anyway, in the best of Perl traditions: TIMTOWTDI!

       [ Sorry,	this still has to be filled in in a later release; either
       refer to	above examples or choose some new ones ]

       Finally,	this may be a good place to justify all	the technical detail
       we have been going on about for a couple	of pages: why threading?

       Well, code that uses threading should be	(considerably) faster than
       code that uses explicit for-loops (or similar Perl constructs) to
       achieve the same	functionality. Especially on supercomputers (with
       vector computing	facilities/parallel processing), PDL threading	will
       be implemented in a way that takes advantage of the additional
       facilities of these machines. Furthermore, it is	a conceptually simple
       construct (though the technical details might get involved at times)
       and can greatly reduce the syntactical complexity of PDL	code (but
       keep the	admonition about documentation in mind). Once you are
       comfortable with	the threading way of thinking (and coding), it
       shouldn't be too	difficult to understand	code that somebody else	has
       written (provided they gave you an idea of the expected input
       dimensions, etc.). As a general tip to increase the performance of
       your code: if you have to introduce a loop into your code, try to
       reformulate the problem so that you can use threading to	perform	the
       loop (as	with anything there are	exceptions to this rule	of thumb; but
       the authors of this document tend to think that these are rare cases
       ;).
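
       As an illustration of that tip, compare the following two ways of
       computing the sums along	the first dimension of a 2D piddle (a sketch;
       $mat is assumed to have shape (nx,ny)):

	# explicit Perl	loop: one slice	and one	sum per	position
	# along	dimension 1 (slow)
	$sums =	zeroes($mat->dim(1));
	for $i (0 .. $mat->dim(1)-1) {
	    $sums->set($i, $mat->slice(":,($i)")->sum);
	}
	# the threaded equivalent: sumover eats	dimension 0 and
	# threads over dimension 1, looping in compiled	code
	sumover($mat, ($sums = null));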

PDL::PP
   An easy way to define functions that	are aware of indexing and threading
       (and the	universe and everything)
       PDL::PP is part of the PDL distribution.	It is used to generate
       functions that are aware	of indexing and	threading rules	from very
       concise descriptions. It	can be useful for you if you want to write
       your own	functions or if	you want to interface functions	from an
       external	library	so that	they support indexing and threading (and
       maybe dataflow as well; see PDL::Dataflow). For further details,	check
       PDL::PP.
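
       To give a flavour of how	concise	these descriptions are,	here is	a
       minimal sketch of a PP definition, modelled on the classic sumover
       example (the module and function	names are made up; see PDL::PP for
       the actual build	setup and calling conventions):

	use PDL::PP qw(PDL::MySums PDL::MySums mysums);
	pp_def('my_sumover',
	       Pars => 'a(n); [o]b();',	 # the signature; threading is free
	       Code => 'double tmp = 0;
			loop(n)	%{ tmp += $a();	%}
			$b() = tmp;'
	);
	pp_done();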

Appendix A
   Affine transformations - a special class of simple and powerful
       transformations
       [ This is also something	to be added in future releases.	Do we already
       have the	general	make_affine routine in PDL? It is possible that	we
       will reference another appropriate man page from	here ]

Appendix B
   signatures of standard PDL::PP compiled functions
       A selection of signatures of PDL	primitives to show how many dimensions
       PP compiled functions gobble up (and therefore what will	be threaded
       over). Most of these functions are the basic ones defined in
       "primitive.pd".

	# functions in primitive.pd
	#
	sumover	       ((n),[o]())
	prodover       ((n),[o]())
	axisvalues     ((n))				       inplace
	inner	       ((n),(n),[o]())
	outer	       ((n),(m),[o](n,m))
	innerwt	       ((n),(n),(n),[o]())
	inner2	       ((m),(m,n),(n),[o]())
	inner2t	       ((j,n),(n,m),(m,k),[o]())
	index	       ((n),(),[o]())
	minimum	       ((n),[o]())
	maximum	       ((n),[o]())
	wstat	       ((n),(n),(),[o](),())
	assgn	       ((),())

	# basic	operations
	binary operations ((),(),[o]())
	unary operations  ((),[o]())
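
       As a quick exercise in reading these signatures (a sketch): "inner
       ((n),(n),[o]())"	consumes one dimension from each argument, so any
       further dimensions are threaded over:

	$a = sequence(3,5);   #	core n=3 plus one extra	dim [5]
	$b = ones(3);	      #	core n=3, no extra dims
	$c = inner($a, $b);   #	c(5): one inner	product	per column of $a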

AUTHOR & COPYRIGHT
       Copyright (C) 1997 Christian Soeller (c.soeller@auckland.ac.nz) &
       Tuomas J. Lukka (lukka@fas.harvard.edu).	All rights reserved. Although
       destined	for release as a man page with the standard PDL	distribution,
       it is not public	domain.	Permission is granted to freely	distribute
       verbatim	copies of this document	provided that no modifications outside
       of formatting be	made, and that this notice remain intact.  You are
       permitted and encouraged	to use its code	and derivatives	thereof	in
       your own	source code for	fun or for profit as you see fit.

perl v5.32.0			  2018-05-05			   INDEXING(1)
