Jcode(3)	      User Contributed Perl Documentation	      Jcode(3)

NAME
       Jcode - Japanese	Charset	Handler

SYNOPSIS
	use Jcode;
	#
	# traditional
	Jcode::convert(\$str, $ocode, $icode, "z");
	# or OOP!
	print Jcode->new($str)->h2z->tr($from, $to)->utf8;

DESCRIPTION
       Japanese documentation is available as Jcode::Nihongo.

       Jcode.pm supports both an object-oriented and a traditional approach.
       With the object-oriented approach, you can write:

	 $iso_2022_jp =	Jcode->new($str)->h2z->jis;

       Which is	more elegant than:

	 $iso_2022_jp =	$str;
	 &jcode::convert(\$iso_2022_jp,	'jis', &jcode::getcode(\$str), "z");

       For those unfamiliar with objects, Jcode.pm still supports "getcode()"
       and "convert()."

       If the perl version is 5.8.1 or later, Jcode acts as a wrapper to
       Encode, the standard charset handler module for Perl 5.8 or later.

Methods
       The methods described here all return the Jcode object unless
       otherwise noted.

   Constructors
       $j = Jcode->new($str [, $icode])
	 Creates a Jcode object $j from $str.  The input code is detected
	 automatically unless you explicitly set $icode.  For the available
	 charsets, see getcode below.

	 For perl 5.8.1	or better, $icode can be any encoding name that	Encode
	 understands.

	   $j =	Jcode->new($european, 'iso-latin1');

	 When the object is stringified, it returns the EUC-converted string,
	 so you can write "print $j" instead of "print $j->euc".

	 Passing Reference
	   Instead of a scalar value, you can pass a reference:

	   Jcode->new(\$str);

	   This saves a little time, at the cost of the value of $str itself
	   being converted.  (In a way, $str is now "tied" to the Jcode object.)
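
	 The constructor and the stringification behavior described above can
	 be sketched as follows.  The sample bytes are an assumption made for
	 illustration: three kanji (U+65E5 U+672C U+8A9E) as UTF-8, written
	 with escapes so the source stays ASCII.

```perl
use strict;
use warnings;
use Jcode;

# Illustrative input: three kanji (U+65E5 U+672C U+8A9E) as UTF-8 bytes.
my $utf8_bytes = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

# Name the input encoding explicitly instead of relying on auto-detection.
my $j = Jcode->new($utf8_bytes, 'utf8');

# Each encoding method returns the string in that encoding.
my $euc  = $j->euc;
my $sjis = $j->sjis;

# Stringifying the object yields the EUC-JP form, the same as $j->euc.
my $stringified = "$j";

# Show the encoded bytes as hex so the output is terminal-safe.
printf "euc=%s sjis=%s\n", unpack('H*', $euc), unpack('H*', $sjis);
```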

       $j->set($str [, $icode])
	 Sets $j's internal string to $str.  Handy when you use a Jcode object
	 repeatedly (it saves the time and memory of creating a new object).

	  # converts mailbox to	SJIS format
	  my $jconv = new Jcode;
	  $/ = 00;
	  while(<>){
	      print $jconv->set(\$_)->mime_decode->sjis;
	  }

       $j->append($str [, $icode]);
	 Appends $str to $j's internal string.

       $j = jcode($str [, $icode]);
	 Shortcut for Jcode->new(), so you can write, for example,
	 jcode($str)->sjis.

   Encoded Strings
       In general, you can retrieve the encoded string as $j->encoded, where
       "encoded" is the name of the target encoding.

       $sjis = jcode($str)->sjis
       $euc = $j->euc
       $jis = $j->jis
       $sjis = $j->sjis
       $ucs2 = $j->ucs2
       $utf8 = $j->utf8
	 What you code is what you get :)

       $iso_2022_jp = $j->iso_2022_jp
	 Same as "$j->h2z->jis".  Hankaku Kanas	are forcibly converted to
	 Zenkaku.

	 For perl 5.8.1	and better, you	can also use any encoding names	and
	 aliases that Encode supports.	For example:

	   $european = $j->iso_latin1; # replace '-' with '_' for names.

	 FYI: Encode::Encoder uses similar trick.

	 $j->fallback($fallback)
	   For perl 5.8.1 or better, Jcode stores the internal string in
	   UTF-8.  Any character that does not map to ->encoding is replaced
	   with a '?', which is the Encode standard.

	     my	$unistr	= "\x{262f}"; #	YIN YANG
	     my	$j = jcode($unistr);  #	$j->euc	is '?'

	   You can change this behavior by specifying a fallback, as in Encode.
	   The values are the same as in Encode.  "Jcode::FB_PERLQQ",
	   "Jcode::FB_XMLCREF", and "Jcode::FB_HTMLCREF" are aliased to those
	   of Encode for convenience.

	     print $j->fallback(Jcode::FB_PERLQQ)->euc;	  # '\x{262f}'
	     print $j->fallback(Jcode::FB_XMLCREF)->euc;  # '&#x262f;'
	     print $j->fallback(Jcode::FB_HTMLCREF)->euc; # '&#9775;'

	   The global variable $Jcode::FALLBACK	stores the default fallback so
	   you can override that by assigning the value.

	     $Jcode::FALLBACK =	Jcode::FB_PERLQQ; # set	default	fallback scheme

       [@lines =] $jcode->jfold([$width, $newline_str, $kref])
	 Folds lines in the Jcode string every $width columns (default: 72),
	 where $width is counted in "halfwidth" characters.  Fullwidth
	 characters are counted as two.

	 Folding uses the newline string specified by $newline_str (default:
	 "\n").

	 Rudimentary kinsoku support is available for Perl 5.8.1 and better.

       $length = $jcode->jlength();
	 returns character length properly, rather than	byte length.
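
	 As a small sketch of the difference between character length and
	 byte length (the sample bytes are an assumption for illustration:
	 three kanji encoded as UTF-8):

```perl
use strict;
use warnings;
use Jcode;

# Illustrative input: three kanji as UTF-8 bytes (9 bytes in total).
my $utf8_bytes = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

# jlength counts characters; Perl's length on the raw string counts bytes.
my $chars = jcode($utf8_bytes, 'utf8')->jlength;
my $bytes = length $utf8_bytes;

print "$chars characters in $bytes bytes\n";
```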

   Methods that	use MIME::Base64
       To use the methods below, you need MIME::Base64.  To install it, run:

	  perl -MCPAN -e 'CPAN::Shell->install("MIME::Base64")'

       If your perl is 5.6 or better, there is no need since MIME::Base64 is
       bundled.

       $mime_header = $j->mime_encode([$lf, $bpl])
	 Converts $str to a MIME header as documented in RFC1522.  When $lf is
	 specified, it uses $lf to fold lines (default: \n).  When $bpl is
	 specified, it uses $bpl as the number of bytes per line (default: 76;
	 this number must not exceed 76).

	 For Perl 5.8.1	or better, you can also	encode MIME Header as:

	   $mime_header	= $j->MIME_Header;

	 In this case the resulting $mime_header is MIME-B-encoded UTF-8,
	 whereas "$j->mime_encode()" returns MIME-B-encoded ISO-2022-JP.
	 Most modern MUAs support both.

       $j->mime_decode;
	 Decodes MIME-Header in	Jcode object.  For perl	5.8.1 or better, you
	 can also do the same as:

	   Jcode->new($str, 'MIME-Header')

   Hankaku vs. Zenkaku
       $j->h2z([$keep_dakuten])
	 Converts X201 kana (Hankaku) to X208 kana (Zenkaku).  When
	 $keep_dakuten is set, it leaves dakuten as is (that is, "ka +
	 dakuten" is left as is instead of being converted to "ga").

	 You can retrieve the number of	matches	via $j->nmatch;

       $j->z2h
	 Converts X208 kana (Zenkaku) to X201 kana (Hankaku).

	 You can retrieve the number of	matches	via $j->nmatch;
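
	 A minimal sketch of both directions follows.  The input is an
	 assumption made for illustration: halfwidth "ka" (U+FF76) followed by
	 the halfwidth dakuten mark (U+FF9E), given as UTF-8 bytes.

```perl
use strict;
use warnings;
use Jcode;

# Illustrative input: halfwidth KA (U+FF76) + halfwidth dakuten (U+FF9E).
my $hankaku = "\xEF\xBD\xB6\xEF\xBE\x9E";

# Default h2z merges "ka + dakuten" into the single fullwidth GA (U+30AC).
my $zenkaku = jcode($hankaku, 'utf8')->h2z->utf8;

# With $keep_dakuten set, the dakuten is converted on its own, not merged.
my $kept = jcode($hankaku, 'utf8')->h2z(1)->utf8;

# z2h goes the other way, back to halfwidth kana.
my $back = jcode($zenkaku, 'utf8')->z2h->utf8;

# Print each result as hex so the output is terminal-safe.
print unpack('H*', $_), "\n" for $zenkaku, $kept, $back;
```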

   Regexp emulators
       To use "->m()" and "->s()", you need perl 5.8.1 or better.

       $j->tr($from, $to, $opt);
	 Applies "tr/$from/$to/" on Jcode object where $from and $to are EUC-
	 JP strings.  On perl 5.8.1 or better, $from and $to can also be
	 flagged UTF-8 strings.

	 If $opt is set, "tr/$from/$to/$opt" is	applied.  $opt must be 'c',
	 'd' or	the combination	thereof.

	 You can retrieve the number of	matches	via $j->nmatch;

	 The following methods are available only for perl 5.8.1 or better.

       $j->s($pattern, $replace, $opt);
	 Applies "s/$pattern/$replace/$opt".  $pattern and $replace must be in
	 EUC-JP or flagged UTF-8.  $opt takes the same values as the regexp
	 options.  See perlre for regexp options.

	 Like "$j->tr()", "$j->s()" returns the object itself, so you can
	 chain the operations as follows:

	   $j->tr("a-z", "A-Z")->s("foo", "bar");

       [@match = ] $j->m($pattern, $opt);
	 Applies "m/$pattern/$opt".  Note that this method DOES NOT RETURN AN
	 OBJECT, so you can't chain it the way you can "$j->s()".
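
	 The three emulators can be sketched together as follows.  ASCII-only
	 sample text is used here as an assumption, to keep the example
	 independent of any particular Japanese encoding; this requires perl
	 5.8.1 or better, as noted above.

```perl
use strict;
use warnings;
use Jcode;

# ASCII-only sample text keeps the sketch encoding-neutral.
my $j = jcode("foo baz");

# s() and tr() both return the object, so they can be chained.
$j->s("baz", "bar")->tr("a-z", "A-Z");
my $result = $j->euc;       # the converted text
my $nmatch = $j->nmatch;    # match count recorded by the last tr()

# m() returns the list of matches rather than the object,
# so it can only come at the end of a chain.
my @captures = $j->m("(FOO) (BAR)");

print "$result\n";
print "captured: @captures\n";
```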

   Instance Variables
       If you need to access instance variables	of Jcode object, use access
       methods below instead of	directly accessing them	(That's	what OOP is
       all about)

       FYI, Jcode uses a ref to	array instead of ref to	hash (common way) to
       optimize	speed (Actually	you don't have to know as long as you use
       access methods instead;	Once again, that's OOP)

       $j->r_str
	 Reference to the EUC-coded String.

       $j->icode
	 Input character code detected or given in the most recent operation.

       $j->nmatch
	 Number	of matches (Used in $j->tr, etc.)

Subroutines
       ($code, [$nmatch]) = getcode($str)
	 Returns the character code of $str.  The return codes are as follows:

	  ascii	  Ascii	(Contains no Japanese Code)
	  binary  Binary (Not Text File)
	  euc	  EUC-JP
	  sjis	  SHIFT_JIS
	  jis	  JIS (ISO-2022-JP)
	  ucs2	  UCS2 (Raw Unicode)
	  utf8	  UTF8

	 When called in list context instead of scalar context, it also
	 returns how many character codes were found.  As mentioned above,
	 $str can be \$str instead.

	 jcode.pl Users:  This function is 100% upward-compatible with
	 jcode::getcode() -- well, almost;

	  * When its return value is an array, the order is the opposite;
	    jcode::getcode() returns $nmatch first.

	  * jcode::getcode() returns 'undef' when the number of EUC characters
	    is equal to that of SJIS.  Jcode::getcode() returns EUC.  For
	    Jcode.pm there are no in-betweens.
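
	 A minimal sketch of getcode() in scalar and list context follows.
	 The sample strings are assumptions made for illustration, written as
	 byte escapes; detection is heuristic, so very short or ambiguous
	 input may be classified differently.

```perl
use strict;
use warnings;
use Jcode;

# Illustrative samples, written as byte escapes.
my $ascii = "plain text";
my $euc   = "\xC6\xFC\xCB\xDC\xB8\xEC";   # three kanji in EUC-JP
my $jis   = "\e\$BF|K\\8l\e(B";           # the same kanji in ISO-2022-JP

# Scalar context: only the code name is returned.
my $ascii_code = Jcode::getcode($ascii);
my $jis_code   = Jcode::getcode($jis);

# List context: the code name comes first, then the match count.
my ($euc_code, $nmatch) = Jcode::getcode($euc);

# A reference is accepted in place of the string itself.
my $by_ref = Jcode::getcode(\$euc);

print "$ascii_code $jis_code $euc_code\n";
```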

       Jcode::convert($str, [$ocode, $icode, $opt])
	 Converts $str to char code specified by $ocode.  When $icode is
	 specified also, it assumes $icode for input string instead of the one
	 checked by getcode(). As mentioned above, $str	can be \$str instead.

	 jcode.pl Users:  This function is 100% upward-compatible with
	 jcode::convert()!
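
	 The traditional interface can be sketched as follows, once with an
	 explicit $icode and once relying on getcode().  The sample bytes are
	 an assumption made for illustration (three kanji in EUC-JP).

```perl
use strict;
use warnings;
use Jcode;

# Illustrative input: three kanji in EUC-JP, as byte escapes.
my $euc = "\xC6\xFC\xCB\xDC\xB8\xEC";

# Explicit input code: convert EUC-JP to Shift_JIS.
my $buf1 = $euc;
my $sjis = Jcode::convert(\$buf1, 'sjis', 'euc');

# Input code omitted: getcode() is used to guess it.
my $buf2 = $euc;
my $auto = Jcode::convert(\$buf2, 'sjis');

# Show the converted bytes as hex so the output is terminal-safe.
print unpack('H*', $sjis), "\n";
```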

BUGS
       For perl 5.8.1 or later, Jcode acts as a wrapper to Encode, which
       means Jcode is subject to any bugs therein.

ACKNOWLEDGEMENTS
       This package owes a lot in motivation, design, and code,	to the
       jcode.pl	for Perl4 by Kazumasa Utashiro <utashiro@iij.ad.jp>.

       Hiroki Ohzaki <ohzaki@iod.ricoh.co.jp> has helped me polish regexp from
       the very	first stage of development.

       JEncode by makamaka@donzoko.net has inspired me to integrate Encode to
       Jcode.  He has also contributed Japanese	POD.

       And folks at Jcode Mailing list <jcode5@ring.gr.jp>.  Without them, I
       couldn't	have coded this	far.

SEE ALSO
       Encode

       Jcode::Nihongo

       <http://www.iana.org/assignments/character-sets>

COPYRIGHT
       Copyright 1999-2005 Dan Kogai <dankogai@dan.co.jp>

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.24.1			  2008-05-10			      Jcode(3)
