PerlIO::via::Unidecode(3)  User Contributed Perl Documentation  PerlIO::via::Unidecode(3)

NAME
       PerlIO::via::Unidecode -	a perlio layer for Unidecode

SYNOPSIS
	 # An example program using the	perlio layer:

	 % cat utf8translit
	 #!/usr/bin/perl
	 use strict;
	 use PerlIO::via::Unidecode;
	 foreach my $fs	(@ARGV)	{
	   open( my $IN,
	     '<:encoding(utf8):via(Unidecode)',	# the layers
	     $fs
	    ) or die "$fs -> $!\n";
	   print while <$IN>;
	   close($IN);
	 }
	 __END__

	 # We're feeding it this file, which is	the Chinese
	 # characters for Beijing (in UTF8)

	 % od -x home_city.txt
	 000000:  E5 8C	97 E4 BA B0 0D 0A

	 So:

	 % utf8translit	home_city.txt
	 Bei Jing

DESCRIPTION
       PerlIO::via::Unidecode implements a PerlIO::via layer that applies
       Unidecode (Text::Unidecode) to data passed through it.

       You can use PerlIO::via::Unidecode on already-Unicode data, as in the
       example in the SYNOPSIS;	or you can combine it with other layers, as in
       this little program that	converts KOI8R text into Unicode and then
       feeds it	to Unidecode, which then outputs an ASCII transliteration:

	 % cat transkoi8r
	 #!/usr/bin/perl
	 use strict;
	 use PerlIO::via::Unidecode;
	 foreach my $filespec (@ARGV) {
	   open(	  # Three-argument open	is always great
	     my	$IN,
	     '<:encoding(koi8-r):via(Unidecode)',  # the layers
	     $filespec ) or die	$!;

	   print while <$IN>;
	   close($IN);
	 }
	 __END__

	 % cat fet_koi8r.txt

	 ëÏÇÄÁ ÞÉÔÁÌÁ ÔÙ ÍÕÞÉÔÅÌØÎÙÅ ÓÔÒÏËÉ,
	 çÄÅ ÓÅÒÄÃÁ Ú×ÕÞÎÙÊ ÐÙÌ ÓÉÑÎØÅ ÌØÅÔ ËÒÕÇÏÍ
	 é ÓÔÒÁÓÔÉ ÒÏËÏ×ÏÊ ×ÚÄÙÍÁÀÔÓÑ ÐÏÔÏËÉ,-
	       îÅ ×ÓÐÏÍÎÉÌÁ ÌØ Ï ÞÅÍ?

	 % transkoi8r fet_koi8r.txt

	 Koghda	chitala	ty muchitiel'nyie stroki,
	 Gdie sierdtsa zvuchnyi	pyl siian'ie l'iet krughom
	 I strasti rokovoi vzdymaiutsia	potoki,-
	     Nie vspomnila l' o	chiem?

       Of course, you could do all this by manually calling Text::Unidecode's
       "unidecode(...)" function on every line you fetch, but that's just what
       ":via(...)" layers do for you automatically.
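
       For comparison, the manual approach might look like the sketch below.
       It assumes Text::Unidecode is installed and uses its exported
       "unidecode(...)" function; the helper name "translit_koi8r" is made up
       for this illustration and is not part of either module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Unidecode;    # exports unidecode()

# translit_koi8r() is a hypothetical helper used only in this sketch.
# It decodes a KOI8-R file to Unicode with a plain :encoding layer,
# then transliterates each line by hand.
sub translit_koi8r {
  my ($filespec) = @_;
  open(my $IN, '<:encoding(koi8-r)', $filespec)
    or die "$filespec -> $!\n";
  my $out = '';
  while (defined(my $line = <$IN>)) {
    $out .= unidecode($line);    # the step the layer would do for you
  }
  close($IN);
  return $out;
}

print translit_koi8r($_) foreach @ARGV;
```

       With the layer, the explicit "unidecode(...)" call disappears and the
       loop body shrinks back to a plain "print".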

       Note that you can use ":via(Unidecode)" as an output layer too.  In
       that case, add a dummy ":utf8" after it, as below, just to silence
       some "wide character in print" warnings that you might otherwise see.

	 % cat writebei.pl
	 use PerlIO::via::Unidecode;
	 open(
	   my $OUT,
	   ">:via(Unidecode):utf8",  # the layers
	   "roman_bei.txt"
	  ) or die $!;
	 print $OUT "\x{5317}\x{4EB0}\n";
	   # those are the Chinese characters for Beijing
	 close($OUT);

	 % perl	writebei.pl

	 % cat roman_bei.txt
	 Bei Jing

FUNCTIONS AND METHODS
       This module provides no public functions or methods; everything is
       done through the "via" interface.  If you want a function, see
       Text::Unidecode.
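
       To give a rough idea of what that interface involves, here is a
       minimal sketch of a ":via(...)" layer class.  It is not the source of
       PerlIO::via::Unidecode: the package name "PerlIO::via::UpcaseSketch"
       is hypothetical, and upper-casing stands in for the call to Unidecode
       so that the example needs only core Perl.  "PUSHED", "FILL", and
       "WRITE" are the method names documented in PerlIO::via.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in layer, NOT the real PerlIO::via::Unidecode code.
package PerlIO::via::UpcaseSketch;

# Called when the layer is pushed onto a handle; returns the layer object.
sub PUSHED {
  my ($class, $mode, $fh) = @_;
  return bless {}, $class;
}

# Called when the reader needs more data; $fh is the layer below.
# Returns a transformed chunk, or undef at end of file.
sub FILL {
  my ($self, $fh) = @_;
  my $line = <$fh>;
  return defined $line ? uc $line : undef;
}

# Called when data is printed to the handle; passes the transformed
# data down and returns the number of bytes consumed, or -1 on failure.
sub WRITE {
  my ($self, $buf, $fh) = @_;
  print {$fh} uc $buf or return -1;
  return length $buf;
}

package main;

# Write a small ASCII sample file, then read it back through the layer.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "bei jing\n";
close($tmp) or die $!;

open(my $IN, '<:via(UpcaseSketch)', $path) or die "$path -> $!\n";
my @lines = <$IN>;
close($IN);

print @lines;    # prints "BEI JING"
```

       The layer name given in ":via(...)" is looked up as a package that is
       already loaded, so the class has to be defined or "use"d before the
       "open" call.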

TIPS
       Don't forget the	"use PerlIO::via::Unidecode;" line, and	be sure	to get
       the case	right.

       Don't type "Unicode" when you mean "Unidecode", nor vice	versa.

       Handy layer-modes to remember:

	 <:encoding(utf8):via(Unidecode)
	 <:encoding(some-other-encoding):via(Unidecode)
	 >:via(Unidecode):utf8

SEE ALSO
       Text::Unidecode

       PerlIO::via

       Encode and Encode::Supported (even though the modes they implement are
       invoked as ":encoding(...)").

       PerlIO::via::PinyinConvert

       perlunitut and perlunicode

       <https://en.wikipedia.org/wiki/Afanasy_Fet>

NOTES
       Note that if Unidecode's transliteration of something changes, so will
       the output of ":via(Unidecode)".  For example, the first word of the
       Russian text above is "Koghda" under one particular version of
       Unidecode, and "Kogda" under another.
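
       One way to see which behaviour you have is to print the installed
       Text::Unidecode version next to a sample transliteration, as in this
       small sketch.  It assumes Text::Unidecode is installed; the spelling
       printed depends on that version, so no expected output is shown.

```perl
use strict;
use warnings;
use Text::Unidecode;    # exports unidecode()

# The Cyrillic word that opens the poem, written with escapes so that
# this source file itself stays plain ASCII.
my $word = "\x{41A}\x{43E}\x{433}\x{434}\x{430}";

my $version = Text::Unidecode->VERSION;
my $ascii   = unidecode($word);    # scalar context returns a new string

print "Text::Unidecode $version gives: $ascii\n";
```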

       Thanks to Jarkko Hietaniemi for help with this module and many other
       things besides.

THE POEM
       In the first release of this module, I forgot to	give the source	of the
       above Russian text!  So here it is:

       The Russian text is from a poem by Afanasy Afanasevich Fet (1822-1892).
       Above I have shown only its first stanza ("Koghda chitala..."), first
       in raw KOI8R, then passed through Unidecode.  But here it is, in its
       entirety:

	 Когда читала ты мучительные строки,
	 Где сердца звучный пыл сиянье льёт кругом
	 И страсти роковой вздымаются потоки,-
	   Не вспомнила ль о чём?

	 Я верить не хочу! Когда в степи, как диво,
	 В полночной темноте безвременно горя,
	 Вдали перед тобой прозрачно и красиво
	   Вставала вдруг заря.

	 И в эту красоту невольно взор тянуло,
	 В тот величавый блеск за тёмный весь предел,-
	 Ужель ничто тебе в то время не шепнуло:
	   «Там человек сгорел!»

	    -- Афанасий Афанасьевич Фет, 15 февраля 1887

       Its conventional English title is a translation of the first line,
       "When you were reading those tormented lines", which I found rather apt
       for a poem about mangled encodings.

COPYRIGHT AND DISCLAIMER
       With the	exception of the text of the poem, this	is copyright 2003,
       2014, Sean M. Burke sburke@cpan.org, all	rights reserved. This program
       is free software; you can redistribute it and/or	modify it under	the
       same terms as Perl itself.

       The programs and	documentation in this dist are distributed in the hope
       that they will be useful, but without any warranty; without even	the
       implied warranty	of merchantability or fitness for a particular
       purpose.

AUTHOR
       Sean M. Burke  sburke@cpan.org

perl v5.32.0			  2014-07-27	     PerlIO::via::Unidecode(3)
