Gungho::Component::Throttle(3)  User Contributed Perl Documentation  Gungho::Component::Throttle(3)

NAME
       Gungho::Component::Throttle - Base Class To Throttle Requests

SYNOPSIS
	 package Gungho::Component::Throttle::Domain;
	 use base qw(Gungho::Component::Throttle);

DESCRIPTION
       If you create a serious enough crawler, throttling will become a major
       issue.  After all, you want to *crawl* the sites, not overwhelm them
       with requests.

       While the concept is simple, implementing this on your own is
       relatively costly, so Gungho provides a few simple ways to work with
       this problem.

       Gungho::Component::Throttle::Simple throttles by the total number of
       requests sent within a given interval, regardless of where they are
       going. This simple approach works well if your client-side resources
       are limited -- for example, if you don't want your requests to hog too
       much bandwidth, you can limit the overall number of requests being sent.

	 # throttle down to 100 requests / hour
	 components:
	   - Throttle::Simple
	 throttle:
	   simple:
	     max_items: 100
	     interval: 3600

       In most cases, however, you will probably want
       Gungho::Component::Throttle::Domain, which throttles requests on a
       per-domain basis. This way you can, for example, limit the number of
       requests being sent to one host, while letting the remaining time
       slices be used for other hosts.

	 # throttle down to 100 requests / host / hour
	 components:
	   - Throttle::Domain
	 throttle:
	   domain:
	     max_items: 100
	     interval: 3600

       This component uses Data::Throttler or Data::Throttler::Memcached as
       the engine that keeps track of the throttling. Data::Throttler will
       suffice if you are working from a single host. You will need
       Data::Throttler::Memcached if you have a farm of crawlers that may
       reside on different hosts.

       By default Data::Throttler will be used. If you want to override this,
       specify the throttler argument in the configuration:

	 components:
	   - Throttle::Domain
	 throttle:
	   domain:
	     throttler: Data::Throttler::Memcached
	     cache:
	       data: 127.0.0.1:11211
	     max_items: 100
	     interval: 3600

       Starting with version 0.09003, you can stack throttlers. For example,
       you can throttle with Throttle::Simple first and, if Throttle::Simple
       allows the request through, throttle with Throttle::Domain as well to
       make sure that no single host gets overwhelmed.
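
       A stacked setup might look roughly like the sketch below. It assumes
       that stacking is configured by listing both components, and that each
       one reads its own section under throttle as in the single-throttler
       examples above. Neither assumption is confirmed by this page, and the
       limits shown are arbitrary illustrative values.

	 # ASSUMPTION: stacking is enabled by listing both components,
	 # and they are applied in the order given
	 components:
	   - Throttle::Simple
	   - Throttle::Domain
	 throttle:
	   # overall cap: 1000 requests / hour (illustrative value)
	   simple:
	     max_items: 1000
	     interval: 3600
	   # per-host cap: 100 requests / host / hour (illustrative value)
	   domain:
	     max_items: 100
	     interval: 3600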

METHODS
   feature_name
   throttle
   send_request
perl v5.32.0			  2007-11-29	Gungho::Component::Throttle(3)
