MARC::Charset(3)      User Contributed Perl Documentation     MARC::Charset(3)

NAME
       MARC::Charset - convert MARC-8 encoded strings to UTF-8

SYNOPSIS
	   # import the marc8_to_utf8 function
	   use MARC::Charset 'marc8_to_utf8';

	   # prepare STDOUT for UTF-8 output
	   binmode(STDOUT, ':encoding(UTF-8)');

	   # print out some MARC-8 as UTF-8
	   print marc8_to_utf8($marc8_string);

DESCRIPTION
       MARC::Charset allows you to turn MARC-8 encoded strings into UTF-8
       strings. MARC-8 is a single-byte character encoding that predates
       Unicode, and allows you to put non-Roman scripts in MARC bibliographic
       records. The encoding is specified by the Library of Congress:

	   http://www.loc.gov/marc/specifications/spechome.html

EXPORTS
   ignore_errors()
       Tells MARC::Charset whether or not to ignore all encoding errors, and
       returns the current setting.  This is helpful if you have records that
       contain both MARC-8 and Unicode characters.

	   my $ignore = MARC::Charset->ignore_errors();

	   MARC::Charset->ignore_errors(1); # ignore errors
	   MARC::Charset->ignore_errors(0); # DO NOT ignore errors

   assume_unicode()
       Tells MARC::Charset whether or not to assume Unicode when an error is
       encountered in ignore_errors mode, and returns the current setting.
       This is helpful if you have records that contain both MARC-8 and
       Unicode characters.

	   my $setting = MARC::Charset->assume_unicode();

	   MARC::Charset->assume_unicode(1); # assume characters are unicode (utf-8)
	   MARC::Charset->assume_unicode(0); # DO NOT assume characters are unicode

   assume_encoding()
       Tells MARC::Charset whether or not to assume a specific encoding when
       an error is encountered in ignore_errors mode, and returns the current
       setting.  This is helpful if you have records that contain both MARC-8
       and characters in another encoding.

	   my $setting = MARC::Charset->assume_encoding();

	   MARC::Charset->assume_encoding('cp850'); # assume characters are cp850
	   MARC::Charset->assume_encoding(''); # DO NOT assume any encoding
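
       The three settings above are set on the class rather than per call,
       so code that changes them may want to put them back afterwards.  A
       minimal sketch, assuming the getter form of each method returns a
       value that can be handed back to the setter; the sample string and
       the variable names are illustrative and not part of the module:

```perl
use strict;
use warnings;
use MARC::Charset 'marc8_to_utf8';

# Illustrative input: plain ASCII, which needs no character set switching.
my $marc8 = 'Plain ASCII title';

# Remember the current settings (getter form, as documented above).
my $old_ignore   = MARC::Charset->ignore_errors();
my $old_encoding = MARC::Charset->assume_encoding();

# Per the documentation, the assumed encoding only applies in
# ignore_errors mode, so both are switched on together.
MARC::Charset->ignore_errors(1);
MARC::Charset->assume_encoding('cp850');

my $utf8 = marc8_to_utf8($marc8);

# Put the earlier settings back; '' means "assume no encoding".
MARC::Charset->ignore_errors($old_ignore ? 1 : 0);
MARC::Charset->assume_encoding(defined $old_encoding ? $old_encoding : '');

print "$utf8\n";
```

       The fallback is not exercised by this ASCII input; the sketch only
       shows the save, set, convert, restore pattern.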

   marc8_to_utf8()
       Converts a MARC-8 encoded string to UTF-8.

	   my $utf8 = marc8_to_utf8($marc8);

       If you'd like to ignore errors, pass in a true value as the second
       parameter or call MARC::Charset->ignore_errors() with a true value:

	   my $utf8 = marc8_to_utf8($marc8, 'ignore-errors');

	 or

	   MARC::Charset->ignore_errors(1);
	   my $utf8 = marc8_to_utf8($marc8);

   utf8_to_marc8()
       Attempts to translate a UTF-8 string into MARC-8.

	   my $marc8 = utf8_to_marc8($utf8);

       If you'd like to ignore errors, or characters that can't be converted
       to MARC-8, pass in a true value as the second parameter or call
       MARC::Charset->ignore_errors() with a true value:

	   my $marc8 = utf8_to_marc8($utf8, 'ignore-errors');

	 or

	   MARC::Charset->ignore_errors(1);
	   my $marc8 = utf8_to_marc8($utf8);
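
       The two conversion functions can be combined into a quick round-trip
       check.  A minimal sketch, assuming a plain ASCII sample string (the
       string and variable names are illustrative); strings with diacritics
       or non-Roman scripts may not compare byte for byte after a round
       trip, so the sketch avoids them:

```perl
use strict;
use warnings;
use MARC::Charset qw(marc8_to_utf8 utf8_to_marc8);

# Illustrative input: plain ASCII only.
my $original = 'Moby Dick';

# MARC-8 to UTF-8, then back to MARC-8.
my $utf8  = marc8_to_utf8($original);
my $marc8 = utf8_to_marc8($utf8);

# Guard against an undefined result before comparing.
if (defined $marc8 && $marc8 eq $original) {
    print "round trip ok\n";
}
else {
    print "round trip changed the string\n";
}
```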

DEFAULT CHARACTER SETS
       If you need to alter the default character sets you can set the
       $MARC::Charset::DEFAULT_G0 and $MARC::Charset::DEFAULT_G1 variables to
       the appropriate character set code:

	   use MARC::Charset::Constants qw(:all);
	   $MARC::Charset::DEFAULT_G0 = BASIC_ARABIC;
	   $MARC::Charset::DEFAULT_G1 = EXTENDED_ARABIC;
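
       Because these are ordinary package variables, Perl's local can limit
       a change to a single block.  A minimal sketch, assuming the defaults
       are read each time marc8_to_utf8() is called; the helper name
       arabic_marc8_to_utf8 and the sample bytes are hypothetical:

```perl
use strict;
use warnings;
use MARC::Charset 'marc8_to_utf8';
use MARC::Charset::Constants qw(:all);

# Hypothetical helper: convert one string with Arabic default sets.
sub arabic_marc8_to_utf8 {
    my ($marc8) = @_;

    # 'local' puts the previous values back when this sub returns.
    local $MARC::Charset::DEFAULT_G0 = BASIC_ARABIC;
    local $MARC::Charset::DEFAULT_G1 = EXTENDED_ARABIC;

    return marc8_to_utf8($marc8);
}

my $g0_before = $MARC::Charset::DEFAULT_G0;

# Illustrative bytes; with Basic Arabic as G0 they are not read as ASCII.
my $utf8 = arabic_marc8_to_utf8("\x47\x48");

my $g0_after = $MARC::Charset::DEFAULT_G0;

# Compare the default before and after, allowing for undefined values.
my $restored = (!defined $g0_before && !defined $g0_after)
    || (defined $g0_before && defined $g0_after && $g0_before eq $g0_after);

print "defaults restored\n" if $restored;
```

       Code outside the helper keeps whatever defaults were in effect
       before the call.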

SEE ALSO
       o   MARC::Charset::Constants

       o   MARC::Charset::Table

       o   MARC::Charset::Code

       o   MARC::Charset::Compiler

       o   MARC::Record

       o   MARC::XML

AUTHOR
       Ed Summers (ehs@pobox.com)

perl v5.32.1			  2013-08-14		      MARC::Charset(3)
