Sequin(3)	      User Contributed Perl Documentation	     Sequin(3)

NAME
       URI::Sequin - Extract information from the URLs of Search-Engines

SYNOPSIS
	       use URI::Sequin qw/se_extract key_extract log_extract %log_types/;

	       $url = &log_extract($line_from_log_file,	'NCSA');

	       $log_types{'MyLogType'} = '^(.+?) -> .+$';
	       $url = &log_extract($line_from_log_file,	'MyLogType');

	       $keyword_string = &key_extract($url);

	       ($search_engine_name, $search_engine_url) = @{&se_extract($url)};

DESCRIPTION
       This module provides three tools	to aid people trying to	analyse
       Search-Engine URLs. It's meant mainly for those who want to analyse
       referrer	logs and pick out key information about	site visitors, such as
       which Search-Engine and keywords	they used to find the site.

       The functions and globals provided (and exported	by default) from this
       module are:

       log_extract($log_line, 'Type')
	   This picks out the referring URL from a line of a logfile. The
	   'Type' can be one of the built-in types or a user-created one.
	   For more information, see %log_types below. This subroutine
	   accepts a scalar, and returns a scalar.

       key_extract($url)
	   This will try to determine the keywords used in $url. It accepts a
	   scalar and returns a scalar. Should nothing be found, it returns an
	   undefined value.

       se_extract($url)
	   This will try to determine the name of the Search-Engine used and
	   its URL. It accepts a scalar, and returns a reference to an array
	   containing firstly the Search-Engine's name and secondly the
	   Search-Engine's URL. Should the URL appear not to be from a Search
	   Query, it returns a reference to an empty array.

       %log_types
	   There are five built-in logfile types already in this hash. They
	   are:

	   o   IIS1 - Microsoft	IIS 3.0	and 2.0

	   o   IIS2 - Microsoft	IIS4.0 (W3SVC format)

	   o   NCSA - For APACHE, NETSCAPE and any other NCSA format logs

	   o   ORW - O'Reilly WebSite format

	   o   General - A generalised one that	will work with most logfiles

	   It's easy to add another one. Simply add a key to the hash, with a
	   value that is a regex. Parenthesise the part that is the referring
	   URL, as the script uses $1 to obtain the URL (see the example in
	   the Synopsis section).
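
	   The capture rule can be tried out in plain Perl before touching the
	   module. The sketch below applies the Synopsis regex to a log line;
	   the "referrer -> page" log format, the %my_log_types hash and the
	   sample line are assumptions made up for illustration, and the match
	   is done by hand rather than through log_extract.

```perl
use strict;
use warnings;

# Hypothetical log format: "<referring URL> -> <requested page>".
# The regex is the one from the Synopsis; the parenthesised part
# is what ends up in $1.
my %my_log_types = ( MyLogType => '^(.+?) -> .+$' );

# Sample line, invented for this example.
my $line = 'http://www.google.com/search?q=perl+modules -> /index.html';

my $url;
if ( $line =~ /$my_log_types{MyLogType}/ ) {
    $url = $1;    # the first capture group holds the referring URL
}

print defined $url ? "$url\n" : "no match\n";
```

	   If the regex captures the wrong part of the line here, it will do
	   the same once it is stored in %log_types.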

	   I have only one request for people who use this module. *Please*
	   tell	me where and how you've	used it, and if	you have any thoughts
	   or suggestions on it, tell me!
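
       The three functions are typically chained. The following is a minimal
       sketch based only on the return values described above; the sample
       log line is invented, and whether log_extract returns an undefined
       value on a non-matching line is an assumption, so the code checks for
       it defensively.

```perl
use strict;
use warnings;
use URI::Sequin qw/se_extract key_extract log_extract/;

# Sample NCSA-style line, invented for this example.
my $line = '127.0.0.1 - - [01/Sep/2003:12:00:00 +0000] "GET / HTTP/1.0" 200 2326 '
         . '"http://www.google.com/search?q=perl+modules" "Mozilla/4.0"';

my $url = log_extract( $line, 'NCSA' );

if ( defined $url && length $url ) {
    # se_extract returns an array reference; it is empty when the
    # URL does not look like a search query.
    my ( $name, $engine_url ) = @{ se_extract($url) };

    # key_extract returns an undefined value when nothing is found.
    my $keywords = key_extract($url);

    if ( defined $name ) {
        print "Engine:   $name\n";
        print "Site:     $engine_url\n" if defined $engine_url;
    }
    else {
        print "Not a search-engine referrer\n";
    }
    print "Keywords: $keywords\n" if defined $keywords;
}
else {
    print "No referring URL found\n";
}
```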

BUGS
       Doesn't like the Amnesi Search Engine. But then, neither do I. Also,
       the 'General' log type needs to be used with discretion: be sure
       that none of the URLs contain a literal double quote (") if you use it.

AUTHOR
       Peter Sergeant <pete@grou.ch>

COPYRIGHT
       Copyright 2001 Peter Sergeant.

       This program is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.32.0			  2003-09-01			     Sequin(3)
