Prima::Drawable::Glyphs(3)  User Contributed Perl Documentation  Prima::Drawable::Glyphs(3)

NAME
       Prima::Drawable::Glyphs - helper routines for bi-directional text input
       and complex scripts output

SYNOPSIS
	  use Prima qw(Application);   # creates $::application
	  $::application-> begin_paint;
	  $::application-> text_shape_out('xxxx!123', 0,0);
	  $::application-> end_paint;


DESCRIPTION
       The class implements an abstraction over a set of glyphs that can be
       rendered to represent text strings. Objects of the class are created
       and returned by "Prima::Drawable::text_shape" calls; see "text_shape"
       in Prima::Drawable for details. An object is a blessed array reference
       that can contain either two or four packed arrays of 16-bit integers,
       representing, respectively, a set of glyph indexes, a set of character
       indexes, a set of glyph advances, and a set of glyph position offsets.
       Additionally, the class implements several sets of helper routines
       that address common tasks when displaying glyph-based strings.

   Structure
       Each array is an instance of "Prima::array", an efficient plain memory
       structure that provides the standard Perl array interface over a
       string scalar filled with fixed-width integers.
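
       The storage idea can be illustrated with core Perl alone. The sketch
       below does not use "Prima::array"; it only mimics the layout with
       "pack" and "unpack", assuming native unsigned 16-bit elements, and
       the glyph numbers in it are made up.

```perl
use strict;
use warnings;

# Pack three made-up glyph numbers into a string scalar of native
# unsigned 16-bit integers.
my @glyphs = (36, 70, 0);
my $packed = pack('S*', @glyphs);

# Two bytes per element, so the element count follows from the byte length.
my $count = length($packed) / 2;

# Read the second element back without unpacking the whole buffer.
my $second = unpack('S', substr($packed, 2 * 1, 2));

print "count=$count second=$second\n";   # prints count=3 second=70
```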

       The following methods provide read-only access to these arrays:

       glyphs
	   Contains a set of unsigned 16-bit integers where each is a glyph
	   number corresponding to the font that was used when shaping the
	   text. These glyph numbers are only applicable to that font. Zero is
	   usually treated as a default glyph in vector fonts, when shaping
	   cannot map a character; in bitmap fonts this number is usually a

	   This array is recognized as a special case when it is passed to
	   "text_out" or "get_text_width", which can process it without the
	   other arrays. In this case, however, no special advances or glyph
	   positions are taken into account.

	   A glyph is not necessarily mapped to a single character, and quite
	   often it is not, even in English left-to-right texts. For example,
	   character combinations like "ff", "fi", "fl" can be mapped to
	   single ligature glyphs. When right-to-left (RTL) text direction is
	   taken into account, the glyph positions may change, too. See
	   "indexes" below, which addresses the mapping of glyphs to
	   characters.

       indexes
	   Contains a set of unsigned 16-bit integers where each is an offset
	   into the text that was used in shaping. Each glyph position thus
	   points to the first character in the text that maps to the glyph.

	   There can be more than one character per glyph, as in the above
	   example with the "ff" ligature. There can also be cases with more
	   than one character per more than one glyph, as is the case in
	   Indic scripts. In these cases it is easier to operate neither by
	   character offsets nor by glyph offsets, but rather by clusters,
	   where each is an individual syntax unit that contains one or more
	   characters per one or more glyphs.

	   In addition to the text offset, each index value can be flagged
	   with the "to::RTL" bit, signifying that the character in question
	   has RTL direction. It is not only Semitic characters from RTL
	   languages that have this attribute set; spaces in these languages
	   are normally assigned the RTL bit too, and sometimes numbers as
	   well. Use of explicit direction control characters from the U+20XX
	   block can result in any character being assigned or not assigned
	   the RTL bit.

	   The array has an extra item added to its end, the length of the
	   text that was used in the shaping. This makes it easy to calculate
	   the cluster length in characters, especially for the last cluster,
	   because the difference between indexes is, basically, the cluster
	   length.
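
	   A minimal sketch of that calculation in plain Perl, without Prima.
	   It assumes that the RTL flag is the high bit of each 16-bit value
	   (real code should use the "to::RTL" constant), and the "fluff"
	   shaping result in it is hypothetical.

```perl
use strict;
use warnings;

# Assumption: the RTL flag is the high bit of each 16-bit index value.
# Real code should use the to::RTL constant provided by Prima instead.
use constant RTL_FLAG => 0x8000;

# Hypothetical indexes for the LTR text "fluff" shaped as three glyphs
# ("fl", "u", "ff"), followed by the extra last item: the text length.
my @indexes = (0, 2, 3, 5);

my $text_length = $indexes[-1];
my @offsets = map { $_ & ~RTL_FLAG } @indexes[0 .. $#indexes - 1];

# Sort the distinct cluster starts; each cluster ends where the next one
# begins, and the last one ends at the text length.
my %seen;
my @starts = sort { $a <=> $b } grep { !$seen{$_}++ } @offsets;
my %length;
for my $i (0 .. $#starts) {
    my $next = $i < $#starts ? $starts[$i + 1] : $text_length;
    $length{ $starts[$i] } = $next - $starts[$i];
}

print join(',', map { $length{$_} } @offsets), "\n";   # prints 2,1,2
```

	   Sorting the offsets first keeps the subtraction valid for RTL runs,
	   where text offsets decrease along the glyph order.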

	   The array is	not used for text drawing or calculation, but only for
	   conversion between character, glyph,	and cluster coordinates	(see
	   "Coordinates" below).

       advances
	   Contains a set of unsigned 16-bit integers where each is the pixel
	   distance that the glyph occupies. When the advances array is not
	   present, or is filled because of the "advances" option in
	   "text_shape", the advance is basically the sum of the a, b, and c
	   widths of a glyph. However, depending on the shaping input, these
	   values can differ.

	   One such case is combining graphemes, where text consisting of
	   two characters, "A" and the combining grave accent U+0300, should
	   be drawn as a single "À" symbol, but the font has no such single
	   glyph, only two individual glyphs "A" and "`". The grave glyph has
	   its own advance for standalone usage, but in this case it should
	   be ignored, which is achieved by setting the advance of "`" to
	   zero.

	   The array content is respected by "text_out" and "get_text_width",
	   and it can be changed at will to produce gaps in the text quite
	   easily. For example, "Prima::Edit" uses that to display tab
	   characters as spaces with an 8x advance.

       positions
	   Contains a set of pairs of signed 16-bit integers where each pair
	   is the X and Y pixel offset of a glyph. As in the previous example
	   with the "À" symbol, the grave glyph "`" may be positioned
	   differently on the vertical axis, for example in the "À" and "à"
	   graphemes.

	   The array is	respected by "text_out"	(but not by "get_text_width").

       fonts
	   Contains a set of unsigned 16-bit integers where each is an index
	   in the font substitution list (see "fontMapperPalette" in
	   Prima::Drawable). Zero means the current font.

	   The font substitution is applied by "text_shape" when the
	   "polyfont" option is set (it is by default), and when the shaper
	   cannot match all fonts. If the current font contains all needed
	   glyphs, this entry is not present at all.

	   The array is	respected by "text_out"	and "get_text_width".

   Coordinates
       In addition to natural character coordinates, where each index is an
       offset that can be directly used in the Perl "substr" function, this
       class offers two additional coordinate systems that help abstract the
       object data for display and navigation.

       The glyph coordinate system is a rather straightforward copy of the
       character coordinate system, where each number is an offset in the
       "glyphs" array. Similarly, these offsets can be used to address
       individual glyphs, indexes, advances, and positions. However, these
       are not easy to use when one needs, for example, to select a grapheme
       with a mouse, or to break a set of glyphs in such a way that a
       grapheme is not broken. These tasks are easier to manage in the
       cluster coordinate system.

       The cluster coordinates are a virtually superimposed set of offsets
       where each corresponds to a set of one or more characters displayed
       by one or more glyphs. The most useful functions below operate in this
       system.

   Selection
       Practically, the most useful coordinates for implementing selection
       are either character or cluster, but not glyph. Character-based
       selection makes extraction or replacement of the selected text
       trivial, while cluster-based selection makes it easier to manipulate
       the selection itself (for example, with Shift-arrow keys).

       The class supports both, by operating on selection maps or selection
       chunks, where each represents the same information but in a different
       way. For example, consider an embedded number in a bidi text. For the
       sake of clarity I'll use Latin characters here. Let's have a text
       scalar containing these characters:

	  ABC123

       where ABC is right-to-left text, and which, when rendered on screen,
       should be displayed as

	  123CBA

       (and the index array is (3,4,5,2,1,0) ).

       Next, the user clicks the mouse between A and B (at text offset 1),
       then drags the mouse to the left, and finally stops between characters
       2 and 3 (text offset 4). The resulting selection then should not be,
       as one might naively expect, this:

	  123CBA
	  __^^^_

       but this instead:

	  123CBA
	  ^^_^^_

       because the next character after C is 1, and the range of the selected
       sub-text is from characters 1 to 4.
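
       The reasoning can be checked with a few lines of plain Perl. This is
       only a sketch that follows the prose above, not the class's own
       implementation: it takes the index array (3,4,5,2,1,0) and treats the
       range of characters 1 to 4 as inclusive.

```perl
use strict;
use warnings;

# Index array from the text: visual glyph position -> text offset.
my @indexes = (3, 4, 5, 2, 1, 0);

# The selected sub-text covers characters 1 to 4, inclusive.
my ($start, $end) = (1, 4);

# A glyph is shown as selected when its text offset is inside the range.
my @map = map { ($_ >= $start && $_ <= $end) ? 1 : 0 } @indexes;

print join(',', @map), "\n";   # prints 1,1,0,1,1,0
```

       The selected glyphs do not form one contiguous visual run, because the
       range is contiguous in text order rather than in display order.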

       The class offers to encode such information in a map, i.e. an array of
       integers "1,1,0,1,1,0", where each entry is either 1 or 0 depending on
       whether the cluster is or is not selected. Alternatively, the same
       information can be encoded in chunks, or RLE sets, as the array
       "0,2,1,2,1", where the first integer signifies the number of
       non-selected clusters to display, the second the number of selected
       clusters, the third the non-selected again, etc. If the first
       character belongs to the selected chunk, the first integer in the
       result is set to 0.
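
       The relation between the two encodings can be sketched as follows.
       The helper "map_to_chunks" is hypothetical and not part of the class;
       it only run-length encodes a map so that the first count always
       refers to non-selected entries.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Prima: run-length encode a 0/1 selection
# map so that the first count always refers to non-selected entries.
sub map_to_chunks {
    my @map     = @_;
    my @chunks  = (0);   # count of the current run; starts as non-selected
    my $current = 0;     # state of the current run: 0 plain, 1 selected
    for my $flag (@map) {
        if (($flag ? 1 : 0) != $current) {
            push @chunks, 0;            # the state flipped: start a new run
            $current = 1 - $current;
        }
        $chunks[-1]++;
    }
    return @chunks;
}

print join(',', map_to_chunks(1, 1, 0, 1, 1, 0)), "\n";   # prints 0,2,1,2,1
```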

   Bidi	input
       When sending input to a widget in order to type in text,	the otherwise
       trivial case of figuring	out at which position the text should be
       inserted	(or removed, for that matter), becomes interesting when	there
       are characters with mixed direction.

       For example, it is indeed trivial, when the Latin text is "AB" and the
       cursor is positioned between "A" and "B", to figure out that whenever
       the user types "C", the result should become "ACB". Likewise, when the
       text is RTL and both the text and the input are Arabic, the result is
       the same. However, when, for example, the text is "A1", which is
       displayed as "1A" because of RTL shaping, and the cursor is positioned
       between "1" (LTR) and "A" (RTL), it is not clear whether the new input
       should be appended after "1" and become "A1C", or after "A" and
       become, correspondingly, "AC1".

       There is no easy solution for this problem, and different programs
       approach it differently; some go as far as to provide two cursors,
       one for each direction. The class offers its own solution that uses
       some primitive heuristics to detect whether the cursor belongs to the
       left or to the right glyph. This is an area that can be enhanced, and
       any help from native users of RTL languages would be greatly
       appreciated.

API
       abc $CANVAS, $INDEX
	   Returns the a, b, and c metrics of the glyph $INDEX.

       advances
	   Read-only accessor to the advances array, see Structure above.

       clone
	   Clones the object.

       cluster2glyph $FROM, $LENGTH
	   Maps a range of clusters starting at $FROM with size $LENGTH into
	   the corresponding range of glyphs. If $LENGTH is undefined, the
	   range extends from $FROM to the end of the object.

       cluster2index $CLUSTER
	   Returns the character offset of the first character in the cluster
	   $CLUSTER.

	   Note: the result may contain the "to::RTL" flag.

       cluster2range $CLUSTER
	   Returns character offset of the first character in cluster $CLUSTER
	   and how many	characters are there in	the cluster.

       clusters
	   Returns an array of integers where each is the offset of the first
	   character of a cluster.

       cursor2offset $AT_CLUSTER, $PREFERRED_RTL
	   Given a cursor positioned next to the cluster $AT_CLUSTER, runs
	   simple heuristics to	see what character offset it corresponds to.
	   $PREFERRED_RTL is used when object data are not enough.

	   See "Bidi input" above.

       def $CANVAS, $INDEX
	   Returns d, e, f metrics from	the glyph $INDEX

       fonts
	   Read-only accessor to the font indexes, see Structure above.

       get_box $CANVAS
	   Return box metrics of the glyph object.

	   See "get_text_box" in Prima::Drawable.

       get_sub $FROM, $LENGTH
	   Extracts and clones a new object that contains data from cluster
	   offset $FROM, with cluster length $LENGTH.

       get_sub_box $CANVAS, $FROM, $LENGTH
	   Calculate box metrics of a glyph string from	the cluster $FROM with
	   size	$LENGTH.

       get_sub_width $CANVAS, $FROM, $LENGTH
	   Calculate pixel width of a glyph string from	the cluster $FROM with
	   size	$LENGTH.

       get_width $CANVAS, $WITH_OVERHANGS
	   Return width	of the glyph objects, with overhangs if	requested.

       glyph2cluster $GLYPH
	   Return the cluster that contains $GLYPH.

       glyphs
	   Read-only accessor to the glyph indexes, see Structure above.

       glyph_lengths
	   Returns an array where each glyph position is set to a number
	   showing how many glyphs the cluster occupies at this position.

       index2cluster $INDEX
	   Returns the cluster that contains the character offset $INDEX.

       indexes
	   Read-only accessor to the indexes, see Structure above.

       index_lengths
	   Returns an array where each glyph position is set to a number
	   showing how many characters the cluster occupies at this position.

       left_overhang
	   First integer from the "overhangs" result.

       log2vis
	   Returns a map of integers where each character position
	   corresponds to a glyph position. The name is a remnant of pure
	   fribidi shaping, where "log2vis" and "vis2log" were mapper
	   functions with the same functionality.

       n_clusters
	   Calculates how many clusters the object contains.

       new @ARRAYS
	   Create new object. Not used directly, but rather from inside
	   "text_shape"	calls.

       new_array NAME
	   Creates an array suitable for direct insertion into the object, if
	   manual construction of the object is needed. For example, one may
	   set a missing "fonts" array like this:

	      $obj->[ Prima::Drawable::Glyphs::FONTS() ] = $obj->new_array('fonts');
	      $obj->fonts->[0] = 1;

	   The newly created array is filled with zeros.

       new_empty
	   Creates a new empty object.

       overhangs
	   Calculates two pixel widths for overhangs at the beginning and at
	   the end of the glyph string. This is used in emulation of a
	   "get_text_width" call with the "to::AddOverhangs" flag.

       positions
	   Read-only accessor to the positions array, see Structure above.

       reorder_text TEXT
	   Returns a visual representation of "TEXT" assuming it was the input
	   of the "text_shape" call that created the object.

       reverse
	   Creates a new object that has all arrays reversed. Used for
	   calculation of pixel offset from the right end of a glyph string.

       right_overhang
	   Second integer from the "overhangs" result.

       selection2range $CLUSTER_START, $CLUSTER_END
	   Converts a cluster selection range into a text selection range.

       selection_chunks_clusters, selection_chunks_glyphs $START, $END
	   Calculates a set of chunks that, given a text selection from
	   positions $START to $END, each represent a run of either selected
	   or non-selected clusters/glyphs.

       selection_diff $OLD, $NEW
	   Given set of	two chunk lists, in format as returned by
	   "selection_chunks_clusters" or "selection_chunks_glyphs",
	   calculates the list of chunks affected by the selection change. Can
	   be used for efficient repaints when the user	interactively changes
	   text	selection, to redraw only the changed regions.

       selection_map_clusters, selection_map_glyphs $START, $END
	   Same	as "selection_chunks_XXX", but instead of RLE chunks returns
	   full	array for each cluster/glyph, where each entry is a boolean
	   value corresponding to whether that cluster/glyph is	to be
	   displayed as	selected, or not.

       selection_walk $CHUNKS, $FROM, $TO = length, $SUB
	   Walks the selection chunks array, as returned by
	   "selection_chunks", between the $FROM and $TO clusters/glyphs, and
	   for each chunk calls the provided "$SUB->($offset, $length,
	   $selected)", where each call receives two integers, the chunk
	   offset and length, and a boolean flag telling whether the chunk is
	   selected or not.

	   Can be also used on a result	of "selection_diff", in	which case
	   $selected flag is irrelevant.
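
	   The callback protocol can be illustrated without Prima. The
	   function "walk_chunks" below is a hypothetical stand-in, not the
	   method itself: it omits the $FROM and $TO arguments, and skipping
	   zero-length chunks is a choice of this sketch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for selection_walk: visits RLE chunks and reports
# each chunk's offset, length, and selected flag to the callback.
sub walk_chunks {
    my ($chunks, $sub) = @_;
    my ($offset, $selected) = (0, 0);   # the first chunk is non-selected
    for my $length (@$chunks) {
        $sub->($offset, $length, $selected) if $length > 0;
        $offset  += $length;
        $selected = $selected ? 0 : 1;  # chunks alternate plain/selected
    }
}

my @seen;
walk_chunks([0, 2, 1, 2, 1], sub {
    my ($offset, $length, $selected) = @_;
    push @seen, "$offset+$length:" . ($selected ? 'sel' : 'plain');
});

print join(' ', @seen), "\n";   # prints 0+2:sel 2+1:plain 3+2:sel 5+1:plain
```

	   A repaint routine would typically draw each reported chunk with
	   the normal or the highlighted colors depending on the flag.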

       sub_text_out $CANVAS, $FROM, $LENGTH, $X, $Y
	   Optimized version of	"$CANVAS->text_out( $self->get_sub($FROM,
	   $LENGTH), $X, $Y )".

       sub_text_wrap $CANVAS, $FROM, $LENGTH, $WIDTH, $OPT, $TABS
	   Optimized version of	"$CANVAS->text_wrap( $self->get_sub($FROM,
	   $LENGTH), $WIDTH, $OPT, $TABS )".  The result is also converted to

       text_length
	   Returns the length of the text that was shaped and that produced
	   the object.

       x2cluster $CANVAS, $X, $FROM, $LENGTH
	   Given sub-cluster from $FROM	with size $LENGTH, calculates how many
	   clusters would fit in width $X.

EXAMPLES
       This section is only there to test proper rendering.

	   Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed	do
	   eiusmod tempor incididunt ut	labore et dolore magna aliqua.

	      Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

       Sample strings for Latin with combining marks, Cyrillic, Hebrew,
       Arabic, Hindi, Chinese, and Thai, and for the largest well-known
       grapheme cluster in Unicode, are not legible in this rendering.


AUTHOR
       Dmitry Karasik.


perl v5.32.0			  2020-05-26	    Prima::Drawable::Glyphs(3)

