Lingua::Han::Utils(3) User Contributed Perl Documentation Lingua::Han::Utils(3)

NAME
       Lingua::Han::Utils - Utility tools for Chinese characters (HanZi)

SYNOPSIS
	   use Lingua::Han::Utils qw/Unihan_value csplit cdecode csubstr clength/;

	   # cdecode
	   # the same as decode('cp936', $word) in ASCII editing mode
	   #         and decode('utf8', $word) in Unicode editing mode
	   my $decoded = cdecode($word);

	   # Unihan_value
	   # return the first field of Unihan.txt on unicode.org
	   my $word = "我";
	   my $unihan = Unihan_value($word); # return '6211'
	   my $words = "爱你";
	   my @unihan = Unihan_value($words); # return (7231, 4F60)
	   my $unihan = Unihan_value($words); # return 72314F60

	   # csplit
	   # split the Chinese characters into an array
	   my $words = "我爱你";
	   my @words = csplit($words); # return ("我", "爱", "你")

	   # csubstr
	   # treat the Chinese characters as one
	   # so	it's the same as splice(csplit($words),	$offset, $length)
	   my $words = "我爱你啊";
	   my @words = csubstr($words, 1, 2); # return ("爱", "你")
	   my @words = csubstr($words, 1); # return ("爱", "你", "啊")
	   my $words = csubstr($words, 1, 2); # return "爱你"

	   # clength
	   # treat the Chinese character as one
	   my $words = "我爱你";
	   print clength($words); # 3

EXPORT
       Nothing is exported by default.

EXPORT_OK
       cdecode
	   Uses Encode::Guess to decode the string. It behaves like
	   decode('cp936', $word) under ASCII editing mode and
	   decode('utf8', $word) under Unicode editing mode.

       Unihan_value
	   The first field of Unihan.txt is the Unicode scalar value, written
	   as U+[x]xxxx. This function returns the [x]xxxx part.
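
	   The value described above is the character's code point in
	   uppercase hexadecimal. A minimal sketch of that idea in core Perl
	   follows; code_points_hex is a hypothetical helper written for
	   illustration and is not part of Lingua::Han::Utils, which may
	   obtain the value differently.

```perl
use strict;
use warnings;
use utf8;    # the string literals in this file are written in UTF-8

# Hypothetical helper (an assumption, not the module's code):
# format each character's code point as uppercase hex,
# i.e. the "[x]xxxx" part of "U+[x]xxxx".
sub code_points_hex {
    my ($text) = @_;
    return map { sprintf '%X', ord($_) } split //, $text;
}

my @hex = code_points_hex("爱你");
print join(' ', @hex), "\n";    # prints "7231 4F60"
```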

       csplit
	   Splits the Chinese characters into an array. English words can be
	   mixed in.

       csubstr(WORD, OFFSET, LENGTH)
	   Treats each Chinese character as one word and takes a substring.

	   (BE CAREFUL! It is NOT an lvalue, so we can't use
	   csubstr($word, 2, 3) = $REPLACEMENT.)

	   If no LENGTH is specified, it takes the substring from OFFSET to
	   the end.
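
	   The SYNOPSIS shows csubstr returning a list in list context and a
	   joined string in scalar context. A minimal sketch of those
	   semantics in core Perl, operating on an already decoded string,
	   could look like this; substr_chars is a hypothetical name and not
	   the module's implementation.

```perl
use strict;
use warnings;
use utf8;    # the string literals in this file are written in UTF-8

# Hypothetical helper (an assumption, not the module's code):
# pick characters by position; return a list in list context
# and the joined string in scalar context.
sub substr_chars {
    my ($text, $offset, $length) = @_;
    my @chars  = split //, $text;
    my @picked = defined $length
        ? splice(@chars, $offset, $length)
        : splice(@chars, $offset);          # no LENGTH: up to the end
    return wantarray ? @picked : join('', @picked);
}

my @list   = substr_chars("我爱你啊", 1, 2);    # ("爱", "你")
my @rest   = substr_chars("我爱你啊", 1);       # ("爱", "你", "啊")
my $string = substr_chars("我爱你啊", 1, 2);    # "爱你"
```

	   Because the result is a copy, a replacement has to be done by
	   rebuilding the string from its pieces rather than by assigning to
	   the call.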

       clength
	   Treats each Chinese character as one word (length 1).
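
	   The difference matters because Perl's built-in length counts
	   bytes until a string has been decoded. The sketch below uses only
	   the core Encode module, not Lingua::Han::Utils, to show the two
	   counts for the same three characters.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 bytes of the three characters U+6211, U+7231, U+4F60.
my $bytes = "\xE6\x88\x91\xE7\x88\xB1\xE4\xBD\xA0";

# Decoding turns the byte string into a character string.
my $chars = decode('UTF-8', $bytes);

print length($bytes), "\n";    # prints 9 (bytes)
print length($chars), "\n";    # prints 3 (characters)
```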

DOCUMENT
       A Chinese version of this document can be found at
       <http://www.fayland.org/journal/Lingua-Han-Utils.html>

AUTHOR
       Fayland Lam, "<fayland at gmail.com>"

BUGS
       Please report any bugs or feature requests to "bug-lingua-han-utils at
       rt.cpan.org", or	through	the web	interface at
       <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Lingua-Han-Utils>.  I
       will be notified, and then you'll automatically be notified of progress
       on your bug as I	make changes.

SUPPORT
       You can find documentation for this module with the perldoc command.

	   perldoc Lingua::Han::Utils

       You can also look for information at:

       o   AnnoCPAN: Annotated CPAN documentation

	   <http://annocpan.org/dist/Lingua-Han-Utils>

       o   CPAN	Ratings

	   <http://cpanratings.perl.org/d/Lingua-Han-Utils>

       o   RT: CPAN's request tracker

	   <http://rt.cpan.org/NoAuth/Bugs.html?Dist=Lingua-Han-Utils>

       o   Search CPAN

	   <http://search.cpan.org/dist/Lingua-Han-Utils>

ACKNOWLEDGEMENTS
       the wonderful Encode::Guess

COPYRIGHT & LICENSE
       Copyright 2005 Fayland Lam, all rights reserved.

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.32.1			  2014-09-16		 Lingua::Han::Utils(3)
