Jcode(3)	      User Contributed Perl Documentation	      Jcode(3)

NAME
       Jcode - Japanese Charset Handler

SYNOPSIS
        use Jcode;
	# traditional
	Jcode::convert(\$str, $ocode, $icode, "z");
	# or OOP!
	print Jcode->new($str)->h2z->tr($from, $to)->utf8;

DESCRIPTION
       Japanese documentation is now available as Jcode::Nihongo.

       Jcode supports both the object-oriented and the traditional approach.
       With the object approach, you can write:

	 $iso_2022_jp =	Jcode->new($str)->h2z->jis;

       Which is	more elegant than:

	 $iso_2022_jp =	$str;
	 &jcode::convert(\$iso_2022_jp,	'jis', &jcode::getcode(\$str), "z");

       For those unfamiliar with objects, Jcode still supports "getcode()"
       and "convert()".

       If the perl version is 5.8.1 or later, Jcode acts as a wrapper to
       Encode, the standard charset handler module for Perl 5.8 or later.

       All methods mentioned here return a Jcode object unless otherwise
       noted.

       $j = Jcode->new($str [, $icode])
         Creates a Jcode object $j from $str.  The input encoding is detected
         automatically unless you set $icode explicitly.  For the available
         charsets, see getcode below.

         For perl 5.8.1 or better, $icode can be any encoding name that Encode
         supports.

	   $j =	Jcode->new($european, 'iso-latin1');

         When the object is stringified, it returns the EUC-converted string,
         so you can write "print $j" instead of "print $j->euc".

         Passing Reference
           Instead of a scalar value, you can pass a reference:

             Jcode->new(\$str);

           This saves a little time, at the cost of $str itself being
           converted.  (In a way, $str is now "tied" to the Jcode object.)

       $j->set($str [, $icode])
         Sets $j's internal string to $str.  Handy when you use a Jcode object
         repeatedly (it saves the time and memory of creating a new object).

          # converts mailbox to SJIS format
          my $jconv = Jcode->new;
          $/ = "";    # paragraph mode, as with "perl -00"
          while (<>) {
              print $jconv->set(\$_)->mime_decode->sjis;
          }

       $j->append($str [, $icode]);
	 Appends $str to $j's internal string.

       $j = jcode($str [, $icode]);
         Shortcut for Jcode->new(), so you can write "jcode($str)->sjis".

   Encoded Strings
       In general, you can retrieve the string in a given encoding as
       $j->encoded, where encoded is the name of the encoding.

       $sjis = jcode($str)->sjis
       $euc = $j->euc
       $jis = $j->jis
       $sjis = $j->sjis
       $ucs2 = $j->ucs2
       $utf8 = $j->utf8
	 What you code is what you get :)

       $iso_2022_jp = $j->iso_2022_jp
         Same as "$j->h2z->jis".  Hankaku Kanas are forcibly converted to
         Zenkaku.

         For perl 5.8.1 or better, you can also use any encoding names and
         aliases that Encode supports.  For example:

	   $european = $j->iso_latin1; # replace '-' with '_' for names.

         FYI: Encode::Encoder uses a similar trick.
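
         The method-name trick can be pictured with AUTOLOAD.  The sketch
         below is an assumption about the general technique, not Jcode's
         actual code: the class My::Encoder and its new() are hypothetical,
         and only Encode::find_encoding and the encoding object's encode()
         method are real Encode calls.

```perl
use strict;
use warnings;

package My::Encoder;    # hypothetical class, for illustration only
use Encode ();

our $AUTOLOAD;

sub new {
    my ($class, $str) = @_;
    return bless { str => $str }, $class;
}

# Any unknown method name is treated as an encoding name,
# with '_' mapped back to '-' (iso_8859_1 becomes iso-8859-1).
sub AUTOLOAD {
    my $self = shift;
    (my $name = $AUTOLOAD) =~ s/.*:://;
    return if $name eq 'DESTROY';
    $name =~ tr/_/-/;
    my $enc = Encode::find_encoding($name)
        or die "Unknown encoding: $name\n";
    my $copy = $self->{str};    # encode a copy, leave the object untouched
    return $enc->encode($copy);
}

package main;

my $obj    = My::Encoder->new("caf\x{e9}");
my $latin1 = $obj->iso_8859_1;    # one byte for e-acute
my $utf8   = $obj->utf8;          # two bytes for e-acute
printf "latin1: %d bytes, utf8: %d bytes\n", length($latin1), length($utf8);
```

         Dispatching through AUTOLOAD means no method has to be written per
         encoding; the cost is that a misspelled method name only fails at
         run time.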

         $j->fallback($fallback)
           For perl 5.8.1 or better, Jcode stores the internal string in
           UTF-8.  Any character that does not map to ->encoding is replaced
           with a '?', which is the Encode standard.

	     my	$unistr	= "\x{262f}"; #	YIN YANG
	     my	$j = jcode($unistr);  #	$j->euc	is '?'

           You can change this behavior by specifying a fallback, as with
           Encode.  The values are the same as Encode's.  "Jcode::FB_PERLQQ",
           "Jcode::FB_XMLCREF", and "Jcode::FB_HTMLCREF" are aliased to those
           of Encode for convenience.

	     print $j->fallback(Jcode::FB_PERLQQ)->euc;	  # '\x{262f}'
	     print $j->fallback(Jcode::FB_XMLCREF)->euc;  # '&#x262f;'
	     print $j->fallback(Jcode::FB_HTMLCREF)->euc; # '&#9775;'

	   The global variable $Jcode::FALLBACK	stores the default fallback so
	   you can override that by assigning the value.

	     $Jcode::FALLBACK =	Jcode::FB_PERLQQ; # set	default	fallback scheme
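
           Since the fallback values are Encode's own, the same three
           schemes can be tried with Encode directly.  This is a minimal
           sketch that bypasses Jcode altogether; the helper name
           euc_with_fallback is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $unistr = "\x{262f}";    # YIN YANG, which EUC-JP cannot represent

# Encode a private copy so the caller's string is never touched.
sub euc_with_fallback {
    my ($string, $check) = @_;
    return encode('euc-jp', $string, $check);
}

my $perlqq   = euc_with_fallback($unistr, Encode::FB_PERLQQ());
my $xmlcref  = euc_with_fallback($unistr, Encode::FB_XMLCREF());
my $htmlcref = euc_with_fallback($unistr, Encode::FB_HTMLCREF());

print "$perlqq $xmlcref $htmlcref\n";    # \x{262f} &#x262f; &#9775;
```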

       [@lines =] $jcode->jfold([$width, $newline_str, $kref])
         Folds lines in the Jcode string every $width columns (default: 72),
         where $width is counted in "halfwidth" characters.  Fullwidth
         characters count as two.

         The folded lines are separated by the newline string specified by
         $newline_str (default: "\n").

         Rudimentary kinsoku support is now available for Perl 5.8.1 and
         better.
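
         The column counting described above can be sketched as follows.
         This is not Jcode's implementation: fold_by_width is a hypothetical
         helper that works on a decoded character string, ignores kinsoku,
         and assumes every non-ASCII character is fullwidth.

```perl
use strict;
use warnings;

# Hypothetical helper: fold a decoded (character) string so that no line
# exceeds $width columns. ASCII counts as one column and everything else
# as two, which is a simplification (halfwidth kana would be over-counted).
sub fold_by_width {
    my ($text, $width) = @_;
    $width ||= 72;
    my @lines;
    my ($line, $used) = ('', 0);
    for my $ch (split //, $text) {
        my $cols = ord($ch) < 0x80 ? 1 : 2;
        if ($used > 0 && $used + $cols > $width) {
            push @lines, $line;
            ($line, $used) = ('', 0);
        }
        $line .= $ch;
        $used += $cols;
    }
    push @lines, $line if length $line;
    return @lines;
}

# "ab" + two kanji + "cd", folded at 4 columns
my @lines = fold_by_width("ab\x{65e5}\x{672c}cd", 4);
print scalar(@lines), " lines\n";    # 2 lines
```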

       $length = $jcode->jlength();
         Returns the length in characters, rather than in bytes.
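
         The difference between the two lengths is easy to see with Encode
         alone.  This sketch does not call jlength(); it only shows why
         length() on an encoded byte string overstates the character count.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $chars = "\x{65e5}\x{672c}\x{8a9e}";    # three kanji
my $euc   = encode('euc-jp', $chars);      # byte string

my $byte_length = length($euc);                       # counts bytes
my $char_length = length(decode('euc-jp', $euc));     # counts characters
print "$byte_length bytes, $char_length characters\n";    # 6 bytes, 3 characters
```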

   Methods that	use MIME::Base64
       To use the methods below, you need MIME::Base64.  To install it, simply
       run:

          perl -MCPAN -e 'CPAN::Shell->install("MIME::Base64")'

       If your perl is 5.6 or better, there is no need, since MIME::Base64 is
       bundled.

       $mime_header = $j->mime_encode([$lf, $bpl])
         Converts $str to a MIME header as documented in RFC1522.  When $lf is
         specified, it is used to fold lines (default: \n).  When $bpl is
         specified, it is used as the number of bytes per line (default: 76;
         this number must not exceed 76).

	 For Perl 5.8.1	or better, you can also	encode MIME Header as:

	   $mime_header	= $j->MIME_Header;

         In this case the resulting $mime_header is MIME-B-encoded UTF-8,
         whereas "$j->mime_encode()" returns MIME-B-encoded ISO-2022-JP.
         Most modern MUAs support both.
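
         To make the two flavours concrete, here is a sketch that builds
         both with Encode and MIME::Base64 rather than with Jcode.  The
         ISO-2022-JP header is assembled by hand for a short string, with no
         line folding, so it only approximates what mime_encode() does.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use MIME::Base64 qw(encode_base64 decode_base64);

my $subject = "\x{65e5}\x{672c}\x{8a9e}";    # three kanji

# UTF-8 flavour, using Encode's MIME-Header encoding
my $utf8_header = encode('MIME-Header', $subject);

# ISO-2022-JP flavour, built by hand (short input, no folding)
my $jis_header = '=?ISO-2022-JP?B?'
    . encode_base64(encode('iso-2022-jp', $subject), '')
    . '?=';

print "$utf8_header\n$jis_header\n";
```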

       $j->mime_decode;
         Decodes the MIME header in the Jcode object.  For perl 5.8.1 or
         better, you can also do the same with:

	   Jcode->new($str, 'MIME-Header')

   Hankaku vs. Zenkaku
       $j->h2z([$keep_dakuten])
         Converts X201 kana (Hankaku) to X208 kana (Zenkaku).  When
         $keep_dakuten is set, the dakuten is left as is (that is, "ka +
         dakuten" stays as two characters instead of being converted to "ga").

	 You can retrieve the number of	matches	via $j->nmatch;
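
         The effect of $keep_dakuten can be illustrated on decoded Unicode
         strings.  The sketch below is hypothetical and covers only "ka" and
         the dakuten; h2z_sketch and its tables are not part of Jcode, and
         the fullwidth dakuten chosen here (U+309B) is an assumption.

```perl
use strict;
use warnings;

# Tiny, hypothetical mapping tables; a real converter needs the full kana set.
my %plain = (
    "\x{ff76}" => "\x{30ab}",    # halfwidth KA      -> fullwidth KA
    "\x{ff9e}" => "\x{309b}",    # halfwidth dakuten -> fullwidth dakuten
);
my %voiced = ("\x{ff76}\x{ff9e}" => "\x{30ac}");    # KA + dakuten -> GA

sub h2z_sketch {
    my ($str, $keep_dakuten) = @_;
    my $nmatch = 0;
    # Merge base kana + dakuten into one voiced kana unless asked not to.
    $nmatch += ($str =~ s/(\x{ff76}\x{ff9e})/$voiced{$1}/g) || 0
        unless $keep_dakuten;
    $nmatch += ($str =~ s/([\x{ff76}\x{ff9e}])/$plain{$1}/g) || 0;
    return ($str, $nmatch);
}

my ($merged)    = h2z_sketch("\x{ff76}\x{ff9e}");       # GA (U+30AC)
my ($separated) = h2z_sketch("\x{ff76}\x{ff9e}", 1);    # KA, then dakuten
printf "%d vs %d characters\n", length($merged), length($separated);    # 1 vs 2 characters
```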

       $j->z2h
         Converts X208 kana (Zenkaku) to X201 kana (Hankaku).

	 You can retrieve the number of	matches	via $j->nmatch;

   Regexp emulators
       To use "->m()" and "->s()", you need perl 5.8.1 or better.

       $j->tr($from, $to, $opt);
	 Applies "tr/$from/$to/" on Jcode object where $from and $to are EUC-
	 JP strings.  On perl 5.8.1 or better, $from and $to can also be
	 flagged UTF-8 strings.

	 If $opt is set, "tr/$from/$to/$opt" is	applied.  $opt must be 'c',
	 'd' or	the combination	thereof.

	 You can retrieve the number of	matches	via $j->nmatch;
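
         Perl's tr/// does not interpolate variables, so applying it to
         run-time $from and $to values needs a string eval.  The sketch below
         shows that idea on plain ASCII; tr_sketch is a hypothetical helper,
         not Jcode's code, it escapes only '/', and it should never be given
         untrusted input.

```perl
use strict;
use warnings;

# Hypothetical helper: apply tr/$from/$to/$opt to a copy of $str and
# return the result together with the number of matched characters.
sub tr_sketch {
    my ($str, $from, $to, $opt) = @_;
    $to  = '' unless defined $to;
    $opt = '' unless defined $opt;
    die "bad option: $opt\n" unless $opt =~ /\A[cd]*\z/;
    s{/}{\\/}g for $from, $to;    # keep '/' from ending the tr/// early
    my $nmatch = eval "\$str =~ tr/$from/$to/$opt";
    die $@ if $@;
    return ($str, $nmatch);
}

my ($upper, $count) = tr_sketch("hello, world", "a-z", "A-Z");
print "$upper ($count matches)\n";    # HELLO, WORLD (10 matches)
```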

	 The following methods are available only for perl 5.8.1 or better.

       $j->s($pattern, $replace, $opt);
         Applies "s/$pattern/$replace/$opt".  $pattern and $replace must be in
         EUC-JP or flagged UTF-8.  $opt takes the same values as regexp
         options.  See perlre for regexp options.

         Like "$j->tr()", "$j->s()" returns the object itself, so you can chain
         the operations as follows:

	   $j->tr("a-z", "A-Z")->s("foo", "bar");

       [@match = ] $j->m($pattern, $opt);
         Applies "m/$pattern/$opt".  Note that this method DOES NOT RETURN AN
         OBJECT, so you can't chain it the way you can "$j->s()".

   Instance Variables
       If you need to access the instance variables of a Jcode object, use the
       accessor methods below instead of accessing them directly (that's what
       OOP is all about).

       FYI, Jcode uses a reference to an array instead of the more common
       reference to a hash, to optimize speed.  (You don't actually need to
       know this as long as you use the accessor methods; once again, that's
       OOP.)

       $j->r_str
         Reference to the EUC-coded string.

       $j->icode
         Input character code of the most recent operation.

       $j->nmatch
         Number of matches (used in $j->tr, etc.).
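
         An array-based object with accessor methods and overloaded
         stringification can be sketched as below.  The class My::Text is
         hypothetical, and its slot layout and accessor names are only
         modeled on the three values described above, not taken from
         Jcode's source.

```perl
use strict;
use warnings;

package My::Text;    # hypothetical class, not part of Jcode

# Stringification returns the stored string, so "print $obj" just works.
use overload '""' => sub { ${ $_[0]->[0] } }, fallback => 1;

# Slots: [0] reference to the string, [1] input code, [2] match count
sub new {
    my ($class, $str, $icode) = @_;
    return bless [ \$str, $icode, 0 ], $class;
}

# Accessor methods hide the array layout from callers.
sub r_str  { $_[0]->[0] }
sub icode  { $_[0]->[1] }
sub nmatch { $_[0]->[2] }

package main;

my $text = My::Text->new("abc", "ascii");
print "$text\n";    # abc
```

         Callers that stick to the accessors keep working even if the
         slots are later reordered or moved into a hash.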

Subroutines
       ($code, [$nmatch]) = getcode($str)
         Returns the character code of $str.  The return codes are as follows:

	  ascii	  Ascii	(Contains no Japanese Code)
	  binary  Binary (Not Text File)
	  euc	  EUC-JP
	  sjis	  SHIFT_JIS
	  jis	  JIS (ISO-2022-JP)
	  ucs2	  UCS2 (Raw Unicode)
	  utf8	  UTF8

         In list context, it also returns how many character codes were
         found.  As mentioned above, $str can be \$str instead.

         jcode.pl users: this function is 100% upward-compatible with
         jcode::getcode() -- well, almost:

	  * When its return value is an	array, the order is the	opposite;
	    jcode::getcode() returns $nmatch first.

          * jcode::getcode() returns 'undef' when the number of EUC characters
            is equal to that of SJIS.  Jcode::getcode() returns EUC; for Jcode
            there are no in-betweens.

       Jcode::convert($str, [$ocode, $icode, $opt])
         Converts $str to the character code specified by $ocode.  When $icode
         is also specified, it is assumed as the input encoding instead of the
         one detected by getcode().  As mentioned above, $str can be \$str
         instead.

         jcode.pl users: this function is 100% upward-compatible with
         jcode::convert()!
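
         An in-place conversion through a reference can be sketched with
         Encode's from_to().  convert_in_place is a hypothetical stand-in
         that mirrors the argument order above; it does no automatic
         detection, so $icode is required here.

```perl
use strict;
use warnings;
use Encode qw(encode from_to);

# Hypothetical stand-in for an in-place conversion: takes a reference,
# rewrites the referenced byte string, and returns the new byte length.
sub convert_in_place {
    my ($r_str, $ocode, $icode) = @_;
    return from_to($$r_str, $icode, $ocode);
}

my $str = encode('euc-jp', "\x{65e5}\x{672c}\x{8a9e}");    # 6 bytes of EUC-JP
my $len = convert_in_place(\$str, 'iso-2022-jp', 'euc-jp');
print "$len bytes after conversion\n";    # 12 bytes after conversion
```

         The extra six bytes are the escape sequences that switch
         ISO-2022-JP into JIS X 0208 and back to ASCII.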

BUGS
       For perl 5.8.1 or later, Jcode acts as a wrapper to Encode, meaning
       Jcode is subject to the bugs therein.

ACKNOWLEDGEMENTS
       This package owes a lot in motivation, design, and code to jcode.pl
       for Perl4 by Kazumasa Utashiro.

       Hiroki Ohzaki has helped me polish the regexps from the very first
       stage of development.

       JEncode has inspired me to integrate Encode into Jcode.  Its author
       has also contributed the Japanese POD.

       And the folks at the Jcode mailing list.  Without them, I couldn't
       have coded this far.




COPYRIGHT
       Copyright 1999-2005 Dan Kogai

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.32.0			  2008-05-10			      Jcode(3)

