Audio::Scan(3)	      User Contributed Perl Documentation	Audio::Scan(3)

NAME
       Audio::Scan - Fast C metadata and tag reader for	all common audio file
       formats

SYNOPSIS
	   use Audio::Scan;

	   my $data = Audio::Scan->scan('/path/to/file.mp3');

	   # Just file info
	   my $info = Audio::Scan->scan_info('/path/to/file.mp3');

	   # Just tags
	   my $tags = Audio::Scan->scan_tags('/path/to/file.mp3');

	   # Scan without reading (possibly large) artwork into memory.
	   # The size of the artwork is returned in place of the binary artwork data.
	   {
	       local $ENV{AUDIO_SCAN_NO_ARTWORK} = 1;
	       my $data	= Audio::Scan->scan('/path/to/file.mp3');
	   }

	   # Scan a filehandle
	   open my $fh, '<', 'my.mp3' or die "Cannot open my.mp3: $!";
	   binmode $fh;
	   my $data = Audio::Scan->scan_fh( mp3 => $fh );
	   close $fh;

	   # Scan and compute an audio MD5 checksum
	   my $data = Audio::Scan->scan( '/path/to/file.mp3', {	md5_size => 100	* 1024 } );
	   my $md5 = $data->{info}->{audio_md5};

DESCRIPTION
       Audio::Scan is a C-based scanner for audio file metadata and tag
       information. It currently supports MP3, MP4, AAC (ADTS), Ogg Vorbis,
       FLAC, ASF, WAV, AIFF, Musepack, Monkey's Audio, WavPack, DSF, and
       DSDIFF.

       See below for specific details about each file format.

METHODS
   scan( $path,	[ \%OPTIONS ] )
       Scans $path for both metadata and tag information.  The type of scan
       performed is determined by the file's extension.	 Supported extensions
       are:

	   MP3:	 mp3, mp2
	   MP4:	 mp4, m4a, m4b,	m4p, m4v, m4r, k3g, skm, 3gp, 3g2, mov
	   AAC (ADTS): aac
	   Ogg:	 ogg, oga
	   FLAC: flc, flac, fla
	   ASF:	 wma, wmv, asf
	   Musepack:  mpc, mpp,	mp+
	   Monkey's Audio:  ape, apl
	   WAV:	wav
	   AIFF: aiff, aif
	   WavPack: wv

       This method returns a hashref containing two other hashrefs: info and
       tags.  The contents of the info and tags hashes vary depending on file
       format; see below for details.
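
       As a minimal sketch of walking that structure, the helper below lists
       the keys found under info and tags.  The name summarize_scan is
       hypothetical and not part of Audio::Scan; the sketch assumes only the
       two-hashref layout described above.  The call to scan() is guarded so
       the block still loads when no path is given or Audio::Scan is not
       installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Audio::Scan): return one "section.key"
# string per entry in the hashref that scan() returns.
sub summarize_scan {
    my ($data) = @_;
    my @lines;
    for my $section (qw(info tags)) {
        my $hash = $data->{$section} or next;
        push @lines, map { "$section.$_" } sort keys %$hash;
    }
    return @lines;
}

# Only runs when a path is given and Audio::Scan is installed.
if ( @ARGV && eval { require Audio::Scan; 1 } ) {
    print "$_\n" for summarize_scan( Audio::Scan->scan( $ARGV[0] ) );
}
```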

       An optional hashref may be provided with	the following values:

	   md5_size => $audio_bytes_to_checksum

       An MD5 is computed over the first N audio bytes.  Tags in the file are
       skipped automatically, so this is a useful way to tell whether two
       files have the same audio content even if their tags differ.  The hex
       MD5 value is returned in the $info->{audio_md5} key.  This option
       reduces performance, so choose the smallest size that works for you;
       you should probably avoid using more than 64K.

       For FLAC	files that already contain an MD5 checksum, this value will be
       used instead of calculating a new one.

	   md5_offset => $offset

       Begin computing the audio_md5 value starting at $offset.	 If this value
       is not specified, $offset defaults to a point in	the middle of the
       file.
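
       The comparison this option enables could be sketched as follows.  The
       helper same_audio is hypothetical; it only compares the audio_md5
       values found in two info hashes.  Because the checksum covers a
       limited number of bytes, a match suggests, but does not prove, that
       the two files carry identical audio.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Audio::Scan): true when both info
# hashes carry the same audio_md5 value.
sub same_audio {
    my ( $info_a, $info_b ) = @_;
    my ( $md5_a, $md5_b ) = ( $info_a->{audio_md5}, $info_b->{audio_md5} );
    return 0 unless defined($md5_a) && defined($md5_b);
    return lc($md5_a) eq lc($md5_b) ? 1 : 0;
}

# Only runs when two paths are given and Audio::Scan is installed.
if ( @ARGV == 2 && eval { require Audio::Scan; 1 } ) {
    # 64K stays within the size suggested above.
    my @info = map { Audio::Scan->scan( $_, { md5_size => 64 * 1024 } )->{info} } @ARGV;
    print same_audio(@info) ? "Same audio\n" : "Different audio\n";
}
```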

   scan_info( $path, [ \%OPTIONS ] )
       If you only need	file metadata and don't	care about tags, you can use
       this method.

   scan_tags( $path, [ \%OPTIONS ] )
       If you only need	the tags and don't care	about the metadata, use	this
       method.

   scan_fh( $type => $fh, [ \%OPTIONS ]	)
       Scans a filehandle. $type is the type of file to scan as, e.g. "mp3" or
       "ogg".  Note that FLAC does not support reading from a filehandle.

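       A minimal sketch of scanning an already opened file might look like
       this.  The helper suffix_of is hypothetical, and passing the file's
       lower-cased extension as $type is an assumption based on the "mp3" and
       "ogg" examples above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Audio::Scan): lower-cased extension of
# a path, or undef when the file name has none.
sub suffix_of {
    my ($path) = @_;
    return $path =~ /\.([^.\/\\]+)\z/ ? lc($1) : undef;
}

# Only runs when a path is given and Audio::Scan is installed.
if ( @ARGV && eval { require Audio::Scan; 1 } ) {
    my $path = $ARGV[0];
    my $type = suffix_of($path);
    die "No extension on $path\n" unless defined $type;

    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    binmode $fh;    # audio files are binary data
    my $data = Audio::Scan->scan_fh( $type => $fh );
    close $fh;

    printf "%d info keys, %d tags\n",
        scalar keys %{ $data->{info} || {} },
        scalar keys %{ $data->{tags} || {} };
}
```
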
   find_frame( $path, $timestamp_in_ms )
       Returns the byte	offset to the first audio frame	starting from the
       given timestamp (in milliseconds).

       MP3, Ogg, FLAC, ASF, MP4
	   The byte offset to the data packet containing this timestamp will
	   be returned. For file formats that don't provide timestamp
	   information, such as MP3, the best estimate for the location of the
	   timestamp will be returned.  The estimate is more accurate if the
	   file has a Xing header or is CBR.

       WAV, AIFF, Musepack, Monkey's Audio, WavPack
	   Not yet supported by	find_frame.
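
       Because find_frame takes its position in milliseconds, a caller
       holding a clock-style position has to convert it first.  The sketch
       below shows one way to do that; clock_to_ms is a hypothetical helper,
       not part of Audio::Scan.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Audio::Scan): convert "ss", "mm:ss" or
# "hh:mm:ss" to milliseconds.
sub clock_to_ms {
    my ($clock) = @_;
    my $seconds = 0;
    $seconds = $seconds * 60 + $_ for split /:/, $clock;
    return $seconds * 1000;
}

# Only runs when a path and a position are given and Audio::Scan is installed.
if ( @ARGV == 2 && eval { require Audio::Scan; 1 } ) {
    my ( $path, $clock ) = @ARGV;
    my $offset = Audio::Scan->find_frame( $path, clock_to_ms($clock) );
    print "find_frame returned byte offset $offset for $clock\n";
}
```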

   find_frame_return_info( $mp4_path, $timestamp_in_ms )
       The header of an	MP4 file contains various metadata that	refers to the
       structure of the	audio data, making seeking more	difficult to perform.
       This method will	return the usual $info hash with 2 additional keys:

	   seek_offset - The seek offset in bytes
	   seek_header - A rewritten MP4 header	that can be prepended to the audio data
			 found at seek_offset to construct a valid bitstream. Specifically,
			 the following boxes are rewritten: stts, stsc,	stsz, stco

       For example, to seek 30 seconds into a file and write out a new MP4
       file that starts at this point:

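	   use Fcntl qw(SEEK_SET);

	   my $info = Audio::Scan->find_frame_return_info( $file, 30000 );

	   open my $f, '<', $file or die "Cannot open $file: $!";
	   binmode $f;
	   sysseek $f, $info->{seek_offset}, SEEK_SET;

	   open my $fh, '>', 'seeked.m4a' or die "Cannot write seeked.m4a: $!";
	   binmode $fh;
	   print $fh $info->{seek_header};

	   # Copy the audio data from seek_offset to the end of the file
	   while ( sysread( $f, my $buf, 65536 ) ) {
	       print $fh $buf;
	   }

	   close $f;
	   close $fh;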

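   find_frame_fh( $type => $fh, $offset )
       Same as "find_frame", but with a filehandle.  $offset is the
       timestamp in milliseconds, as for "find_frame".

   find_frame_fh_return_info( $type => $fh, $offset )
       Same as "find_frame_return_info", but with a filehandle.  $offset is
       the timestamp in milliseconds.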

   has_flac()
       Deprecated.  Always returns 1 now that FLAC is always enabled.

   is_supported( $path )
       Returns 1 if the	given path can be scanned by Audio::Scan, or 0 if not.

   get_types()
       Returns an array	of strings of the file types supported by Audio::Scan.

   extensions_for( $type )
       Returns an array	of strings of the file extensions that are considered
       to be the file type $type.

   type_for( $extension	)
       Returns file type for a given extension.	Returns	undef for unsupported
       extensions.
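
       A typical use of is_supported is to filter a list of candidate paths
       before scanning.  In this sketch the hypothetical helper
       supported_paths takes the check as a code ref, so the same logic can
       be exercised without Audio::Scan installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Audio::Scan): keep only the paths for
# which $is_supported, a code ref taking a path, returns true.
sub supported_paths {
    my ( $is_supported, @paths ) = @_;
    return grep { $is_supported->($_) } @paths;
}

# Only runs when paths are given and Audio::Scan is installed.
if ( @ARGV && eval { require Audio::Scan; 1 } ) {
    my @ok = supported_paths( sub { Audio::Scan->is_supported( $_[0] ) }, @ARGV );
    print "$_\n" for @ok;
}
```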

SKIPPING ARTWORK
       To save memory while reading tags, you can opt to skip potentially
       large embedded artwork.	To do this, set	the environment	variable
       AUDIO_SCAN_NO_ARTWORK:

	   local $ENV{AUDIO_SCAN_NO_ARTWORK} = 1;
	   my $tags = Audio::Scan->scan_tags($file);

       This will return	the length of the embedded artwork instead of the
       actual image data.  In some cases it will also return a byte offset to
       the image data, which can be used to extract the	image using more
       efficient means.  The offset is not always returned, so check that it
       is present before using it.  If the offset is not present, the only way
       to get the image data is to perform a normal tag scan without the
       environment variable set.

       One limitation that currently exists is that memory for embedded	images
       is still	allocated for ASF and Ogg Vorbis files.

       This information	is returned in different ways depending	on the format:

       ID3 (MP3, AAC, WAV, AIFF):

	   $tags->{APIC}->[3]: image length
	   $tags->{APIC}->[4]: image offset (unless APIC would need unsynchronization)

       MP4:

	   $tags->{COVR}: image	length
	   $tags->{COVR_offset}: image offset (always available)

       Ogg Vorbis:

	   $tags->{ALLPICTURES}->[0]->{image_data}: image length
	   Image offset	is not supported with Vorbis because the data is always	base64-encoded.

       FLAC:

	   $tags->{ALLPICTURES}->[0]->{image_data}: image length
	   $tags->{ALLPICTURES}->[0]->{offset}:	image offset (always available)

       ASF:

	   $tags->{'WM/Picture'}->{image}: image length
	   $tags->{'WM/Picture'}->{offset}: image offset (always available)

       APE, Musepack, WavPack, MP3 with	APEv2:

	   $tags->{'COVER ART (FRONT)'}: image length
	   $tags->{'COVER ART (FRONT)_offset'}:	image offset (always available)
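
       When both a length and an offset are available, the image can be read
       straight from the file.  The sketch below uses the MP4 keys listed
       above and assumes a single cover image; read_bytes_at is a
       hypothetical helper, not part of Audio::Scan.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Audio::Scan): read exactly $length
# bytes starting at byte $offset of $path.
sub read_bytes_at {
    my ( $path, $offset, $length ) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    binmode $fh;
    seek $fh, $offset, 0 or die "Cannot seek in $path: $!\n";
    my $got = read $fh, my $buf, $length;
    close $fh;
    die "Short read from $path\n" unless defined($got) && $got == $length;
    return $buf;
}

# Only runs when a path is given and Audio::Scan is installed.
if ( @ARGV && eval { require Audio::Scan; 1 } ) {
    my $path = $ARGV[0];
    my $tags = do {
        local $ENV{AUDIO_SCAN_NO_ARTWORK} = 1;
        Audio::Scan->scan($path)->{tags};
    };

    # MP4 layout from the list above: COVR holds the length and
    # COVR_offset the byte offset.
    if ( defined $tags->{COVR_offset} ) {
        my $image = read_bytes_at( $path, $tags->{COVR_offset}, $tags->{COVR} );
        printf "Read %d bytes of cover art\n", length $image;
    }
}
```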

MP3
   INFO
       The following metadata about a file may be returned:

	   id3_version (e.g. "ID3v2.4.0")
	   song_length_ms (duration in milliseconds)
	   layer (e.g. 3)
	   stereo
	   samples_per_frame
	   padding
	   audio_size (size of all audio frames)
	   audio_offset	(byte offset to	first audio frame)
	   bitrate (in bps, determined using Xing/LAME/VBRI if possible, or average in the worst case)
	   samplerate (in Hz)
	   vbr (1 if file is VBR)
	   dlna_profile	(if file is compliant)

	   If a	Xing header is found:
	   xing_frames
	   xing_bytes
	   xing_quality

	   If a	VBRI header is found:
	   vbri_delay
	   vbri_frames
	   vbri_bytes
	   vbri_quality

	   If a	LAME header is found:
	   lame_encoder_version
	   lame_tag_revision
	   lame_vbr_method
	   lame_lowpass
	   lame_replay_gain_radio
	   lame_replay_gain_audiophile
	   lame_encoder_delay
	   lame_encoder_padding
	   lame_noise_shaping
	   lame_stereo_mode
	   lame_unwise_settings
	   lame_source_freq
	   lame_surround
	   lame_preset
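
       As an illustration of how these values might be presented, the sketch
       below formats song_length_ms, bitrate and vbr into a short summary.
       describe_mp3_info is a hypothetical helper, not part of Audio::Scan.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Audio::Scan): format duration and
# bitrate from an MP3 info hash as "m:ss, N kbps" plus a VBR marker.
sub describe_mp3_info {
    my ($info) = @_;
    my $secs = int( $info->{song_length_ms} / 1000 );
    return sprintf '%d:%02d, %d kbps%s',
        int( $secs / 60 ), $secs % 60,
        $info->{bitrate} / 1000,
        $info->{vbr} ? ' VBR' : '';
}

# Only runs when a path is given and Audio::Scan is installed.
if ( @ARGV && eval { require Audio::Scan; 1 } ) {
    print describe_mp3_info( Audio::Scan->scan( $ARGV[0] )->{info} ), "\n";
}
```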

   TAGS
       Tag values are returned as found, but older tags such as ID3v1 and
       ID3v2.2/v2.3 are converted to ID3v2.4 tag names.  Multiple instances of
       a tag in a file are returned as arrays.  Complex tags such as APIC and
       COMM are also returned as arrays.  All tag names are converted to
       upper-case.  All text is converted to UTF-8.

       Sample tag data:

	   tags	=> {
		 ALBUMARTISTSORT => "Solar Fields",
		 APIC => [ "image/jpeg", 3, "",	<binary	data snipped> ],
		 CATALOGNUMBER => "INRE	017",
		 COMM => ["eng", "", "Amazon.com Song ID: 202981429"],
		 "MUSICBRAINZ ALBUM ARTIST ID" => "a2af1f31-c9eb-4fff-990c-c4f547a11b75",
		 "MUSICBRAINZ ALBUM ID"	=> "282143c9-6191-474d-a31a-1117b8c88cc0",
		 "MUSICBRAINZ ALBUM RELEASE COUNTRY" =>	"FR",
		 "MUSICBRAINZ ALBUM STATUS" => "official",
		 "MUSICBRAINZ ALBUM TYPE" => "album",
		 "MUSICBRAINZ ARTIST ID" => "a2af1f31-c9eb-4fff-990c-c4f547a11b75",
		 "REPLAYGAIN_ALBUM_GAIN" => "-2.96 dB",
		 "REPLAYGAIN_ALBUM_PEAK" => "1.045736",
		 "REPLAYGAIN_TRACK_GAIN" => "+3.60 dB",
		 "REPLAYGAIN_TRACK_PEAK" => "0.892606",
		 TALB => "Leaving Home",
		 TCOM => "Magnus Birgersson",
		 TCON => "Ambient",
		 TCOP => "2005 ULTIMAE RECORDS",
		 TDRC => "2004-10",
		 TIT2 => "Home",
		 TPE1 => "Solar	Fields",
		 TPE2 => "Solar	Fields",
		 TPOS => "1/1",
		 TPUB => "Ultimae Records",
		 TRCK => "1/11",
		 TSOP => "Solar	Fields",
		 UFID => [
		       "http://musicbrainz.org",
		       "1084278a-2254-4613-a03c-9fed7a8937ca",
		 ],
	   },
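
       Since a tag may hold either a single array or several instances,
       callers usually need to normalise it before use.  The sketch below
       does this for COMM.  It assumes that multiple COMM frames arrive as an
       array of arrays, which the text above implies but does not spell out;
       comment_texts is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Audio::Scan): return the text of every
# COMM frame in a tags hash.
sub comment_texts {
    my ($tags) = @_;
    my $comm = $tags->{COMM} or return;

    # One frame is a flat array (language, description, text); several
    # frames are assumed to be an array of such arrays.
    my @frames = ref( $comm->[0] ) eq 'ARRAY' ? @$comm : ($comm);
    return map { $_->[2] } @frames;
}

# Only runs when a path is given and Audio::Scan is installed.
if ( @ARGV && eval { require Audio::Scan; 1 } ) {
    print "$_\n" for comment_texts( Audio::Scan->scan( $ARGV[0] )->{tags} );
}
```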

MP4
   INFO
       The following metadata about a file may be returned:

	   audio_offset	(byte offset to	start of mdat)
	   audio_size
	   compatible_brands
	   file_size
	   leading_mdat	(if file has mdat before moov)
	   major_brand
	   minor_version
	   song_length_ms
	   timescale
	   dlna_profile	(if file is compliant)
	   tracks (array of tracks in the file)
	       Each track may contain:

	       audio_type
	       avg_bitrate
	       bits_per_sample
	       channels
	       duration
	       encoding
	       handler_name
	       handler_type
	       id
	       max_bitrate
	       samplerate

   TAGS
       Tags are	returned in a hash with	all keys converted to upper-case.
       Keys starting with 0xA9 (copyright symbol) will have this character
       stripped	out.  Sample tag data:

	   tags	=> {
	      AART		=> "Album Artist",
	      ALB		=> "Album",
	      ART		=> "Artist",
	      CMT		=> "Comments",
	      COVR		=> <binary data	snipped>,
	      CPIL		=> 1,
	      DAY		=> 2009,
	      DESC		=> "Video Description",
	      DISK		=> "1/2",
	      "ENCODING	PARAMS"	=> "vers\0\0\0\1acbf\0\0\0\2brat\0\1w\0cdcv\0\1\6\5",
	      GNRE		=> "Jazz",
	      GRP		=> "Grouping",
	      ITUNNORM		=> " 00000000 00000000 00000000	00000000 00000000 00000000 00000000 00000000 00000000 00000000",
	      ITUNSMPB		=> " 00000000 00000840 000001E4	00000000000001DC 00000000 00000000 00000000 00000000 00000000 00000000 00000000	00000000",
	      LYR		=> "Lyrics",
	      NAM		=> "Name",
	      PGAP		=> 1,
	      SOAA		=> "Sort Album Artist",
	      SOAL		=> "Sort Album",
	      SOAR		=> "Sort Artist",
	      SOCO		=> "Sort Composer",
	      SONM		=> "Sort Name",
	      SOSN		=> "Sort Show",
	      TMPO		=> 120,
	      TOO		=> "iTunes 8.1.1, QuickTime 7.6",
	      TRKN		=> "1/10",
	      TVEN		=> "Episode ID",
	      TVES		=> 12,
	      TVSH		=> "Show",
	      TVSN		=> 12,
	      WRT		=> "Composer",
	   },
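
       Values such as TRKN and DISK in the sample above combine a position
       and a total in one "N/M" string.  A minimal sketch of splitting them
       follows; split_position is a hypothetical helper, and the handling of
       a bare number with no total is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Audio::Scan): split "N/M" into
# (N, M); a bare "N" yields (N, undef). Returns an empty list otherwise.
sub split_position {
    my ($value) = @_;
    return unless defined($value) && $value =~ m{\A(\d+)(?:/(\d+))?\z};
    return ( $1 + 0, defined($2) ? $2 + 0 : undef );
}

# Only runs when a path is given and Audio::Scan is installed.
if ( @ARGV && eval { require Audio::Scan; 1 } ) {
    my $tags = Audio::Scan->scan( $ARGV[0] )->{tags};
    my ( $track, $total ) = split_position( $tags->{TRKN} );
    print "Track $track", ( defined $total ? " of $total" : '' ), "\n"
        if defined $track;
}
```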

AAC (ADTS)
   INFO
       The following metadata about a file is returned:

	   audio_offset
	   audio_size
	   bitrate (in bps)
	   channels
	   file_size
	   profile (Main, LC, or SSR)
	   samplerate (in Hz)
	   song_length_ms (duration in milliseconds)
	   dlna_profile	(if file is compliant)

OGG VORBIS
   INFO
       The following metadata about a file is returned:

	   version
	   channels
	   stereo
	   samplerate (in Hz)
	   bitrate_average (in bps)
	   bitrate_upper
	   bitrate_nominal
	   bitrate_lower
	   blocksize_0
	   blocksize_1
	   audio_offset	(byte offset to	audio)
	   audio_size
	   song_length_ms (duration in milliseconds)

   TAGS
       Raw Vorbis comments are returned.  All comment keys are converted to
       upper-case.

FLAC
   INFO
       The following metadata about a file is returned:

	   channels
	   samplerate (in Hz)
	   bitrate (in bps)
	   file_size
	   audio_offset	(byte offset to	first audio frame)
	   audio_size
	   song_length_ms (duration in milliseconds)
	   bits_per_sample
	   frames
	   minimum_blocksize
	   maximum_blocksize
	   minimum_framesize
	   maximum_framesize
	   audio_md5
	   total_samples

   TAGS
       Raw FLAC comments are returned.  All comment keys are converted to
       upper-case.  Some data returned is special:

       APPLICATION

	   Each	application block is returned in the APPLICATION tag keyed by application ID.

       CUESHEET_BLOCK

	   The CUESHEET_BLOCK tag is an	array containing each line of the cue sheet.

       ALLPICTURES

	   Embedded pictures are returned in an	ALLPICTURES array.  Each picture has the following metadata:

	       mime_type
	       description
	       width
	       height
	       depth
	       color_index
	       image_data
	       picture_type
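
       The ALLPICTURES array can be walked like any other array of hashes.
       The sketch below builds a one-line summary per picture from the keys
       listed above; picture_summaries is a hypothetical helper, not part of
       Audio::Scan.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Audio::Scan): one summary string per
# entry in the ALLPICTURES array of a tags hash.
sub picture_summaries {
    my ($tags) = @_;
    my @summaries;
    for my $pic ( @{ $tags->{ALLPICTURES} || [] } ) {
        push @summaries, sprintf '%s %dx%d (type %d)',
            $pic->{mime_type}, $pic->{width}, $pic->{height}, $pic->{picture_type};
    }
    return @summaries;
}

# Only runs when a path is given and Audio::Scan is installed.
if ( @ARGV && eval { require Audio::Scan; 1 } ) {
    print "$_\n" for picture_summaries( Audio::Scan->scan( $ARGV[0] )->{tags} );
}
```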

ASF (Windows Media Audio/Video)
   INFO
       The following metadata about a file may be returned.  Reading the ASF
       spec is encouraged if you want to find out more about any of these
       values.

	   audio_offset	(byte offset to	first data packet)
	   audio_size
	   broadcast (boolean, whether the file	is a live broadcast or not)
	   codec_list (array of	information about codecs used in the file)
	   creation_date (UNIX timestamp when file was created)
	   data_packets
	   drm_key
	   drm_license_url
	   drm_protection_type
	   drm_data
	   file_id (unique file	ID)
	   file_size
	   index_blocks
	   index_entry_interval	(in milliseconds)
	   index_offsets (byte offsets for each	second of audio, per stream. Useful for	seeking)
	   index_specifiers (indicates which stream a given index_offset points	to)
	   language_list (array	of languages referenced	by the file's metadata)
	   lossless (boolean)
	   max_bitrate
	   max_packet_size
	   min_packet_size
	   mutex_list (mutually	exclusive stream information)
	   play_duration_ms
	   preroll
	   script_commands
	   script_types
	   seekable (boolean, whether the file is seekable or not)
	   send_duration_ms
	   song_length_ms (the actual length of	the audio, in milliseconds)
	   dlna_profile	(if file is compliant)

       STREAMS

       The streams array contains metadata related to an individual stream
       within the file.  The following metadata may be returned:

	   DeviceConformanceTemplate
	   IsVBR
	   alt_bitrate
	   alt_buffer_fullness
	   alt_buffer_size
	   avg_bitrate (most accurate bitrate for this stream)
	   avg_bytes_per_sec (audio only)
	   bitrate
	   bits_per_sample (audio only)
	   block_alignment (audio only)
	   bpp (video only)
	   buffer_fullness
	   buffer_size
	   channels (audio only)
	   codec_id (audio only)
	   compression_id (video only)
	   encode_options
	   encrypted (boolean)
	   error_correction_type
	   flag_seekable (boolean)
	   height (video only)
	   index_type
	   language_index (offset into language_list array)
	   max_object_size
	   samplerate (in Hz) (audio only)
	   samples_per_block
	   stream_number
	   stream_type
	   super_block_align
	   time_offset
	   width (video	only)

   TAGS
       Raw tags	are returned.  Tags that occur more than once are returned as
       arrays.	In contrast to the other formats, tag keys are NOT
       capitalized. There is one special key:

       WM/Picture

       Pictures	are returned as	a hash with the	following keys:

	   image_type (numeric type, same as ID3v2 APIC)
	   mime_type
	   description
	   image

WAV
   INFO
       The following metadata about a file may be returned.

	   audio_offset
	   audio_size
	   bitrate (in bps)
	   bits_per_sample
	   block_align
	   channels
	   dlna_profile	(if file is compliant)
	   file_size
	   format (WAV format code, 1 == PCM)
	   id3_version (if an ID3v2 tag	is found)
	   samplerate (in Hz)
	   song_length_ms

   TAGS
       WAV files can contain several different types of	tags.  "Native"	WAV
       tags found in a LIST block may include these and	others:

	   IARL	- Archival Location
	   IART	- Artist
	   ICMS	- Commissioned
	   ICMT	- Comment
	   ICOP	- Copyright
	   ICRD	- Creation Date
	   ICRP	- Cropped
	   IENG	- Engineer
	   IGNR	- Genre
	   IKEY	- Keywords
	   IMED	- Medium
	   INAM	- Name (Title)
	   IPRD	- Product (Album)
	   ISBJ	- Subject
	   ISFT	- Software
	   ISRC	- Source
	   ISRF	- Source Form
	   TORG	- Label
	   LOCA	- Location
	   TVER	- Version
	   TURL	- URL
	   TLEN	- Length
	   ITCH	- Technician
	   TRCK	- Track
	   ITRK	- Track

       ID3v2 tags can also be embedded within WAV files.  These	are returned
       exactly as for MP3 files.
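
       Native WAV tag codes are terse, so a display layer may want to map
       them to the descriptions listed above.  The sketch below covers only a
       few of the codes and leaves unknown keys, such as ID3v2 frame names,
       untouched; label_wav_tags and %WAV_LABEL are hypothetical names.

```perl
use strict;
use warnings;

# A few of the native LIST codes from the table above, with their
# descriptions. Extend as needed.
my %WAV_LABEL = (
    IART => 'Artist',
    ICMT => 'Comment',
    ICRD => 'Creation Date',
    IGNR => 'Genre',
    INAM => 'Name (Title)',
    IPRD => 'Product (Album)',
);

# Hypothetical helper (not part of Audio::Scan): return a new hashref in
# which known native codes are replaced by their descriptions.
sub label_wav_tags {
    my ($tags) = @_;
    my %labelled;
    for my $key ( keys %$tags ) {
        my $label = exists $WAV_LABEL{$key} ? $WAV_LABEL{$key} : $key;
        $labelled{$label} = $tags->{$key};
    }
    return \%labelled;
}

# Only runs when a path is given and Audio::Scan is installed.
if ( @ARGV && eval { require Audio::Scan; 1 } ) {
    my $labelled = label_wav_tags( Audio::Scan->scan( $ARGV[0] )->{tags} );
    print "$_: $labelled->{$_}\n" for sort keys %$labelled;
}
```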

AIFF
   INFO
       The following metadata about a file may be returned.

	   audio_offset
	   audio_size
	   bitrate (in bps)
	   bits_per_sample
	   block_align
	   channels
	   compression_name (if	AIFC)
	   compression_type (if	AIFC)
	   dlna_profile	(if file is compliant)
	   file_size
	   id3_version (if an ID3v2 tag	is found)
	   samplerate (in Hz)
	   song_length_ms

   TAGS
       ID3v2 tags can be embedded within AIFF files.  These are	returned
       exactly as for MP3 files.

MONKEY'S AUDIO (APE)
   INFO
       The following metadata about a file may be returned.

	   audio_offset
	   audio_size
	   bitrate (in bps)
	   channels
	   compression
	   file_size
	   samplerate (in Hz)
	   song_length_ms
	   version

   TAGS
       APEv2 tags are returned as a hash of key/value pairs.

MUSEPACK
   INFO
       The following metadata about a file may be returned.

	   audio_offset
	   audio_size
	   bitrate (in bps)
	   channels
	   encoder
	   file_size
	   profile
	   samplerate (in Hz)
	   song_length_ms

   TAGS
       Musepack	uses APEv2 tags.  They are returned as a hash of key/value
       pairs.

WAVPACK
   INFO
       The following metadata about a file may be returned.

	   audio_offset
	   audio_size
	   bitrate (in bps)
	   bits_per_sample
	   channels
	   encoder_version
	   file_size
	   hybrid (1 if	file is	lossy) (v4 only)
	   lossless (1 if file is lossless) (v4	only)
	   samplerate
	   song_length_ms
	   total_samples

   TAGS
       WavPack uses APEv2 tags.	 They are returned as a	hash of	key/value
       pairs.

DSF
   INFO
       The following metadata about a file may be returned.

	   audio_offset
	   audio_size
	   bits_per_sample
	   channels
	   song_length_ms
	   samplerate
	   block_size_per_channel

   TAGS
       ID3v2 tags can be embedded within DSF files.  These are returned
       exactly as for MP3 files.

DSDIFF (DFF)
   INFO
       The following metadata about a file may be returned.

	   audio_offset
	   audio_size
	   bits_per_sample
	   channels
	   song_length_ms
	   samplerate
	   tag_diti_title
	   tag_diar_artist

   TAGS
       No separate tags	are supported by the DSDIFF format.

THANKS
       Logitech	& Slim Devices,	for letting us release so much of our code to
       the world.  Long	live Squeezebox!

       Kimmo Taskinen, Adrian Smith, Clive Messer, and Jurgen Kramer for
       DSF/DSDIFF support and various other fixes.

       Some code from the Rockbox project was very helpful in implementing ASF
       and MP4 seeking.

       Some of the file	format parsing code was	derived	from the mt-daapd
       project,	and adapted by Netgear.	 It has	been heavily rewritten to fix
       bugs and	add more features.

       The source to the original Netgear C scanner for	SqueezeCenter is
       located at
       <http://svn.slimdevices.com/repos/slim/7.3/trunk/platforms/readynas/contrib/scanner>

       The audio MD5 feature uses an MD5 implementation	by L. Peter Deutsch,
       <ghost@aladdin.com>.

SEE ALSO
       ASF Spec
       <http://www.microsoft.com/windows/windowsmedia/forpros/format/asfspec.aspx>

       MP4 Info:
       <http://standards.iso.org/ittf/PubliclyAvailableStandards/c051533_ISO_IEC_14496-12_2008.zip>
       <http://www.geocities.com/xhelmboyx/quicktime/formats/mp4-layout.txt>

AUTHORS
       Andy Grundman, <andy@hybridized.org>

       Dan Sully, <daniel@cpan.org>

COPYRIGHT AND LICENSE
       Copyright (C) 2010-2011 Logitech, Inc.

       This program is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation; either	version	2 of the License, or (at your
       option) any later version.

perl v5.24.1			  2017-04-28			Audio::Scan(3)
