# FreeBSD Manual Pages

```
Discrete(3)          User Contributed Perl Documentation          Discrete(3)

NAME
Statistics::Descriptive::Discrete - Compute descriptive statistics for
discrete data sets.

SYNOPSIS
use Statistics::Descriptive::Discrete;

my $stats = Statistics::Descriptive::Discrete->new();
$stats->add_data(1,10,2,0,1,4,5,3,5,2);
print "count = ",$stats->count(),"\n";
print "uniq  = ",$stats->uniq(),"\n";
print "sum = ",$stats->sum(),"\n";
print "min = ",$stats->min(),"\n";
print "max = ",$stats->max(),"\n";
print "mean = ",$stats->mean(),"\n";
print "standard_deviation = ",$stats->standard_deviation(),"\n";
print "variance = ",$stats->variance(),"\n";
print "sample_range = ",$stats->sample_range(),"\n";
print "mode = ",$stats->mode(),"\n";
print "median = ",$stats->median(),"\n";

DESCRIPTION
This module provides basic functions used in descriptive statistics.
It borrows very heavily from Statistics::Descriptive::Full (which is
included with Statistics::Descriptive), with one major difference: this
module is optimized for discretized data, such as data from an A/D
conversion that has a discrete set of possible values.  For example, if
your data is produced by an 8-bit A/D converter, you'd have only 256
possible values in your data set.  Even though you might have a million
data points, you'd only have 256 different values in those million
points.  Instead of storing the entire data set as
Statistics::Descriptive does, this module stores only the values it has
seen and the number of times it has seen each value.

For very large data sets, this storage method results in significant
speed and memory improvements.  In a test case with 2.6 million data
points from a real world application, Statistics::Descriptive::Discrete
took 40 seconds to calculate a set of statistics instead of the 561
seconds required by Statistics::Descriptive::Full.  It also required
only 4MB of RAM instead of the 400MB used by
Statistics::Descriptive::Full for the same data set.

METHODS
$stat = Statistics::Descriptive::Discrete->new();
Create a new statistics object.

$stat->add_data(1,2,3,4,5);
Adds data to the statistics object.  Sets a flag so that the
statistics will be recomputed the next time they're needed.

$stat->add_data_tuple(1,2,42,3);
Adds data to the statistics object where every two elements are a
value and a count (how many times the value occurred).  The above
is equivalent to $stat->add_data(1,1,42,42,42).  Use this when your
data is in a form isomorphic to ($value, $occurrence).

$stat->max();
Returns the maximum value of the data set.

$stat->min();
Returns the minimum value of the data set.

$stat->count();
Returns the total number of elements in the data set.

$stat->uniq();
Returns the total number of unique elements in the data set.  For
example, if your data set is (1,2,2,3,3,3), uniq will return 3.

$stat->sum();
Returns the sum of all the values in the data set.

$stat->mean();
Returns the mean of the data.

$stat->median();
Returns the median value of the data.

$stat->mode();
Returns the mode of the data.

$stat->variance();
Returns the variance of the data.

$stat->standard_deviation();
Returns the standard deviation of the data.

$stat->sample_range();
Returns the sample range (max - min) of the data set.

$stat->get_data();
Returns a copy of the data array.  Note: This array could be very
large and would thus defeat the purpose of using this module.  Make
sure you really need it before using get_data().

NOTE
The interface for this module is almost identical to
Statistics::Descriptive.  This module is incomplete and not fully
tested.

BUGS
o   Code for calculating mode is not as robust as it should be.

o   Other bugs are lurking I'm sure.

TODO
o   Make test suite more robust

o   Add rest of methods (at least ones that don't depend on original
order of data) from Statistics::Descriptive

AUTHOR
Rhet Turnbull, RhetTbull on perlmonks.org, rhettbull at hotmail.com

If you find this code useful, I would appreciate an email letting me
know.

CREDIT
Thanks to the following individuals for finding bugs, providing
feedback, and submitting changes:

o   Peter Dienes for finding and fixing a bug in the variance
calculation.

o   Bill Dueber for suggesting the add_data_tuple method.

COPYRIGHT
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

Portions of this code are from Statistics::Descriptive, which is under
the following copyrights:

This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

Copyright (c) 1998 Andrea Spinelli. All rights reserved.  This program
is free software; you can redistribute it and/or modify it under the
same terms as Perl itself.

Copyright (c) 1994,1995 Jason Kastner. All rights
reserved.  This program is free software; you can redistribute it
and/or modify it under the same terms as Perl itself.
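```

The storage idea described under DESCRIPTION (keep each distinct value once, together with how many times it occurred) can be illustrated with a short pure-Perl sketch. This is not the module's actual implementation: `discrete_stats` and `%count_of` are hypothetical names introduced here, and the use of the sample variance (n - 1 denominator) is an assumption rather than something the manual page states.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Statistics::Descriptive::Discrete:
# summarise a data set from a value => count table instead of the
# full list of data points.
sub discrete_stats {
    my (@data) = @_;

    # Store each distinct value once, with its number of occurrences.
    my %count_of;
    $count_of{$_}++ for @data;

    # Distinct values in ascending numeric order.
    my @values = sort { $a <=> $b } keys %count_of;

    # Total element count and sum, weighted by occurrence.
    my ($n, $sum) = (0, 0);
    for my $v (@values) {
        $n   += $count_of{$v};
        $sum += $v * $count_of{$v};
    }
    return +{ count => 0, uniq => 0 } unless $n;    # empty data set

    my $mean = $sum / $n;

    # Assumption: sample variance with an (n - 1) denominator.
    my $ss = 0;
    $ss += $count_of{$_} * ($_ - $mean) ** 2 for @values;
    my $variance = $n > 1 ? $ss / ($n - 1) : 0;

    return +{
        count        => $n,
        uniq         => scalar @values,
        sum          => $sum,
        min          => $values[0],
        max          => $values[-1],
        mean         => $mean,
        variance     => $variance,
        sample_range => $values[-1] - $values[0],
    };
}

# The data set used in the uniq() description above.
my $s = discrete_stats(1, 2, 2, 3, 3, 3);
printf "count = %d, uniq = %d, mean = %.3f\n", @{$s}{qw(count uniq mean)};
```

Every loop runs over the distinct values rather than over all data points, so with data from an 8-bit A/D converter the table would hold at most 256 entries regardless of how many points were added.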