PT-STALK(1)	      User Contributed Perl Documentation	   PT-STALK(1)

NAME
       pt-stalk	- Collect forensic data	about MySQL when problems occur.

SYNOPSIS
       Usage: pt-stalk [OPTIONS]

       pt-stalk	waits for a trigger condition to occur,	then collects data to
       help diagnose problems.	The tool is designed to	run as a daemon	with
       root privileges,	so that	you can	diagnose intermittent problems that
       you cannot observe directly.  You can also use it to execute a custom
       command,	or to collect data on demand without waiting for the trigger
       to occur.

RISKS
       Percona Toolkit is mature, proven in the	real world, and	well tested,
       but all database	tools can pose a risk to the system and	the database
       server.	Before using this tool,	please:

       o   Read	the tool's documentation

       o   Review the tool's known "BUGS"

       o   Test	the tool on a non-production server

       o   Back up your production server and verify the backups

DESCRIPTION
       Sometimes a problem happens infrequently	and for	a short	time, giving
       you no chance to	see the	system when it happens.	How do you solve
       intermittent MySQL problems when	you can't observe them?	That's why pt-
       stalk exists. In	addition to using it when there's a known problem on
       your servers, it	is a good idea to run pt-stalk all the time, even when
       you think nothing is wrong.  You	will appreciate	the data it collects
       when a problem occurs, because problems such as MySQL lockups or	spikes
       in activity typically leave no evidence to use in root cause analysis.

       pt-stalk	does two things: it watches a MySQL server and waits for a
       trigger condition to occur, and it collects diagnostic data when	that
       trigger occurs.  To avoid false positives caused by short-lived
       problems, the trigger condition must be true at least "--cycles"	times
       before a	"--collect" is triggered.

       To use pt-stalk effectively, you	need to	define a good trigger.	A good
       trigger is sensitive enough to fire reliably when a problem occurs, so
       that you	don't miss a chance to solve problems.	On the other hand, a
       good trigger isn't prone	to false positives, so you don't gather
       information when	the server is functioning normally.

       The most	reliable triggers for MySQL tend to be the number of
       connections to the server, and the number of queries running
       concurrently. These are available in the	SHOW GLOBAL STATUS command as
       Threads_connected and Threads_running.  Sometimes Threads_connected is
       not a reliable indicator	of trouble, but	Threads_running	usually	is.
       Your job, as the	tool's user, is	to define an appropriate trigger
       condition for the tool.	Choose carefully, because the quality of your
       results will depend on the trigger you choose.

       You define the trigger with the "--function", "--variable",
       "--threshold", and "--cycles" options.  The default values for these
       options define a reasonable trigger, but you should adjust them to
       suit your particular system and needs.

       By default, pt-stalk watches MySQL forever until the trigger
       occurs, then it collects	diagnostic data	for a while, and sleeps
       afterwards to avoid repeatedly collecting data if the trigger remains
       true.  The general order	of operations is:

	  while	true; do
	     if	--variable from	--function > --threshold; then
		cycles_true++
		if cycles_true >= --cycles; then
		   --notify-by-email
		   if --collect; then
		      if --disk-bytes-free and --disk-pct-free ok; then
			 (--collect for	--run-time seconds) &
		      fi
		      rm files in --dest older than --retention-time
		   fi
		   iter++
		   cycles_true=0
		fi
		if iter	< --iterations;	then
		   sleep --sleep seconds
		else
		   break
		fi
	     else
		if iter	< --iterations;	then
		   sleep --interval seconds
		else
		   break
		fi
	     fi
	  done
	  rm old --dest	files older than --retention-time
	  if --collect processes are still running; then
	     wait up to	--run-time * 3 seconds
	     kill any remaining	--collect processes
	  fi
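
       The counting part of the loop above can be sketched as a small shell
       function.  This is an illustration only: count_triggers is a
       hypothetical helper, not part of pt-stalk, and it follows the
       pseudocode as written (the counter is not reset when a check falls
       below the threshold), which may differ from what the tool does.

```shell
# Hypothetical helper, not part of pt-stalk: given a threshold, a cycle
# count, and a list of sampled values, print how many times the loop
# sketched above would start a collection.
count_triggers() {
   local threshold="$1" cycles="$2"
   shift 2
   local cycles_true=0 fired=0 value
   for value in "$@"; do
      if [ "$value" -gt "$threshold" ]; then
         cycles_true=$((cycles_true + 1))
         if [ "$cycles_true" -ge "$cycles" ]; then
            fired=$((fired + 1))   # this is where --collect would start
            cycles_true=0
         fi
      fi
   done
   echo "$fired"
}

# Five samples above the default threshold (25) with the default 5 cycles.
count_triggers 25 5 30 31 40 28 26
```

       With four samples instead of five, the same call prints 0, which is
       the effect "--cycles" has on short-lived spikes.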

       The diagnostic data is written to files whose names begin with a
       timestamp, so you can distinguish samples from each other in case the
       tool collects data multiple times.  The pt-sift tool is designed	to
       help you	browse and analyze the resulting data samples.

       Although	this sounds simple enough, in practice there are a number of
       subtleties, such	as detecting when the disk is beginning	to fill	up so
       that the	tool doesn't cause the server to run out of disk space.	 This
       tool handles these types	of potential problems, so it's a good idea to
       use this	tool instead of	writing	something from scratch and possibly
       experiencing some of the	hazards	this tool is designed to avoid.

CONFIGURING
       You can use standard Percona Toolkit configuration files	to set command
       line options.

       You will	probably want to run the tool as a daemon and customize	at
       least the "--threshold".	 Here's	a sample configuration file for
       triggering when there are more than 20 queries running at once:

	 daemonize
	 threshold=20

       If you don't run the tool as root, then you will need to specify
       several options, such as "--pid", "--log", and "--dest"; otherwise the
       tool will probably fail to start.
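
       As a rough sketch, a configuration for a non-root user might point the
       PID file, the log, and the sample directory at locations that user can
       write to.  The helper name write_stalk_conf and the directory layout
       are hypothetical; the option names are the ones documented under
       "OPTIONS".

```shell
# Hypothetical helper: write a pt-stalk configuration file into a directory
# that the invoking user owns. The layout below is only an example.
write_stalk_conf() {
   local dir="$1"
   mkdir -p "$dir/samples" || return 1
   # One option per line, written without the leading dashes.
   cat > "$dir/pt-stalk.conf" <<EOF
daemonize
threshold=20
pid=$dir/pt-stalk.pid
log=$dir/pt-stalk.log
dest=$dir/samples
EOF
}

# Example:
#   write_stalk_conf "$HOME/pt-stalk"
#   pt-stalk --config "$HOME/pt-stalk/pt-stalk.conf"
```

       Note that "--config", when given, must be the first option on the
       command line.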

OPTIONS
       --ask-pass
	   Prompt for a	password when connecting to MySQL.

       --collect
	   default: yes; negatable: yes

	   Collect diagnostic data when	the trigger occurs.  Specify
	   "--no-collect" to make the tool watch the system but	not collect
	   data.

	   See also "--stalk".

       --collect-gdb
	   Collect GDB stacktraces.  This is achieved by attaching to MySQL
	   and printing	stack traces from all threads. This will freeze	the
	   server for some period of time, ranging from	a second or so to much
	   longer on very busy systems with a lot of memory and	many threads
	   in the server.  For this reason, it is disabled by default.
	   However, if you are trying to diagnose a server stall or lockup,
	   freezing the	server causes no additional harm, and the stack	traces
	   can be vital	for diagnosis.

	   In addition to freezing the server, there is	also some risk of the
	   server crashing or performing badly after GDB detaches from it.

       --collect-oprofile
	   Collect oprofile data.  This	is achieved by starting	an oprofile
	   session, letting it run for the collection time, and	then stopping
	   and saving the resulting profile data in the	system's default
	   location.  Please read your system's	oprofile documentation to
	   learn more about this.

       --collect-strace
	   Collect strace data.	This is	achieved by attaching strace to	the
	   server, which will make it run very slowly until strace detaches.
	   The same cautions apply as those listed in --collect-gdb.  You
	   should not enable this option together with --collect-gdb, because
	   GDB and strace can't	attach to the server process simultaneously.

       --collect-tcpdump
	   Collect tcpdump data. This option causes tcpdump to capture all
	   traffic on all interfaces for the port on which MySQL is listening.
	   You can later use pt-query-digest to	decode the MySQL protocol and
	   extract a log of query traffic from it.

       --config
	   type: string

	   Read	this comma-separated list of config files.  If specified, this
	   must	be the first option on the command line.

       --cycles
	   type: int; default: 5

	   How many times "--variable" must be greater than "--threshold"
	   before triggering "--collect".  This	helps prevent false positives,
	   and makes the trigger condition less	likely to fire when the
	   problem recovers quickly.

       --daemonize
	   Daemonize the tool.	This causes the	tool to	fork into the
	   background and log its output as specified in --log.

       --defaults-file
	   short form: -F; type: string

	   Only	read mysql options from	the given file.	 You must give an
	   absolute pathname.

       --dest
	   type: string; default: /var/lib/pt-stalk

	   Where to save diagnostic data from "--collect".  Each time the tool
	   collects data, it writes to a new set of files, which are named
	   with	the current system timestamp.

       --disk-bytes-free
	   type: size; default:	100M

	   Do not "--collect" if the disk has less than	this much free space.
	   This	prevents the tool from filling up the disk with	diagnostic
	   data.

	   If the "--dest" directory contains a	previously captured sample of
	   data, the tool will measure its size	and use	that as	an estimate of
	   how much data is likely to be gathered this time, too.  It will
	   then	be even	more pessimistic, and will refuse to collect data
	   unless the disk has enough free space to hold the sample and	still
	   have	the desired amount of free space.  For example,	if you'd like
	   100MB of free space and the previous	diagnostic sample consumed
	   100MB, the tool won't collect any data unless the disk has 200MB
	   free.
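
	   The arithmetic in this example can be sketched as follows.  The
	   function enough_disk_space is hypothetical and is not the tool's
	   implementation; it assumes that having exactly the required amount
	   free is sufficient, and it works in whole megabytes for simplicity.

```shell
# Hypothetical helper, not pt-stalk's implementation. All sizes are in MB.
# Succeeds only if the free space covers the size of the previous sample
# plus the desired --disk-bytes-free margin.
enough_disk_space() {
   local free_mb="$1" margin_mb="$2" prev_sample_mb="${3:-0}"
   [ "$free_mb" -ge $((margin_mb + prev_sample_mb)) ]
}

# 100MB margin, previous sample of 100MB, only 150MB free.
if enough_disk_space 150 100 100; then
   echo "collect"
else
   echo "skip: need 200MB free, have 150MB"
fi
```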

	   Valid size value suffixes are k, M, G, and T.

       --disk-pct-free
	   type: int; default: 5

	   Do not "--collect" if the disk has less than	this percent free
	   space.  This	prevents the tool from filling up the disk with
	   diagnostic data.

	   This	option works similarly to "--disk-bytes-free" but specifies a
	   percentage margin of	safety instead of a bytes margin of safety.
	   The tool honors both	options, and will not collect any data unless
	   both	margins	are satisfied.

       --function
	   type: string; default: status

	   What to watch for the trigger.  The default value watches "SHOW
	   GLOBAL STATUS", but you can also watch "SHOW PROCESSLIST" or
	   specify a file with your own custom code.  This function supplies
	   the value of "--variable", which is then compared against
	   "--threshold" to see if the trigger condition is met.
	   Additional options may be required as well; see below. Possible
	   values are:

	   o   status

	       Watch "SHOW GLOBAL STATUS" for the trigger.  The	value of
	       "--variable" then defines which status counter is the trigger.

	   o   processlist

	       Watch "SHOW FULL	PROCESSLIST" for the trigger.  The trigger
	       value is	the count of processes whose "--variable" column
	       matches the "--match" option.  For example, to trigger
	       "--collect" when	more than 10 processes are in the "statistics"
	       state, specify:

		  --function processlist \
		  --variable State	 \
		  --match statistics	 \
		  --threshold 10
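
	   To pick a sensible "--threshold" for such a trigger, it can help to
	   compute a similar count by hand.  The sketch below is not how the
	   tool does it: count_state_matches is a hypothetical helper that
	   assumes tab-separated mysql client output with a header row, and
	   it treats the pattern as an awk regular expression.

```shell
# Hypothetical helper: read tab-separated SHOW FULL PROCESSLIST output
# (with a header row) on stdin and print the number of rows whose State
# column matches the given pattern.
count_state_matches() {
   local pattern="$1"
   awk -F'\t' -v pat="$pattern" '
      NR == 1 {                      # header row: locate the State column
         for (i = 1; i <= NF; i++) if ($i == "State") col = i
         next
      }
      col && $col ~ pat { n++ }      # count rows whose State matches
      END { print n + 0 }            # print 0 when nothing matched
   '
}

# Typical use (requires a reachable server):
#   mysql -e "SHOW FULL PROCESSLIST" | count_state_matches statistics
```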

	   In addition,	you can	specify	a file that contains your custom
	   trigger function, written in	Unix shell script.  This can be	a
	   wrapper that	executes anything you wish.  If	the argument to
	   "--function"	is a file, then	it takes precedence over built-in
	   functions, so if there is a file in the working directory named
	   "status" or "processlist" then the tool will use that file even
	   though those are valid built-in values.

	   The file works by providing a function called "trg_plugin", and the
	   tool	simply sources the file	and executes the function.  For
	   example, the	file might contain:

	      trg_plugin() {
		 mysql $EXT_ARGV -e "SHOW ENGINE INNODB	STATUS"	\
		   | grep -c "has waited at"
	      }

	   This	snippet	will count the number of mutex waits inside InnoDB.
	   It illustrates the general principle: the function must output a
	   number, which is then compared to "--threshold" as usual.  The
	   $EXT_ARGV variable contains the MySQL options mentioned in the
	   "SYNOPSIS" above.

	   The file should not alter the tool's	existing global	variables.
	   Prefix any file-specific global variables with "PLUGIN_" or make
	   them	local.
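
	   As a further sketch of these conventions, the trigger file below
	   keeps its only global variable under the "PLUGIN_" prefix and makes
	   everything else local.  The log path and the search pattern are
	   hypothetical; the function counts matching lines among the last
	   100 lines of the file and prints that count.

```shell
# Hypothetical path; override it in the environment before starting pt-stalk.
PLUGIN_ERROR_LOG="${PLUGIN_ERROR_LOG:-/var/log/mysql/error.log}"

trg_plugin() {
   local pattern="Lock wait timeout"   # hypothetical message to look for
   if [ -r "$PLUGIN_ERROR_LOG" ]; then
      # Only look at recent lines so old messages do not keep the trigger true.
      tail -n 100 "$PLUGIN_ERROR_LOG" | grep -c "$pattern"
   else
      echo 0   # always print a number, even when the file is unreadable
   fi
}
```

	   Such a file would be passed as the argument to "--function",
	   together with a "--threshold" suited to the count it prints.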

       --help
	   Print help and exit.

       --host
	   short form: -h; type: string

	   Host	to connect to.

       --interval
	   type: int; default: 1

	   How often to check if the trigger is true, in seconds.

       --iterations
	   type: int

	   How many times to "--collect" diagnostic data.  By default, the
	   tool	runs forever and collects data every time the trigger occurs.
	   Specify "--iterations" to collect data a limited number of times.
	   This	option is also useful with "--no-stalk"	to collect data	once
	   and exit, for example.

       --log
	   type: string; default: /var/log/pt-stalk.log

	   Print all output to this file when daemonized.

       --match
	   type: string

	   The pattern to use when watching SHOW PROCESSLIST.  See
	   "--function"	for details.

       --notify-by-email
	   type: string

	   Send	an email to these addresses for	every "--collect".

       --password
	   short form: -p; type: string

	   Password to use when	connecting.  If	password contains commas they
	   must	be escaped with	a backslash: "exam\,ple"

       --pid
	   type: string; default: /var/run/pt-stalk.pid

	   Create the given PID	file.  The tool	won't start if the PID file
	   already exists and the PID it contains is different than the
	   current PID.	 However, if the PID file exists and the PID it
	   contains is no longer running, the tool will	overwrite the PID file
	   with	the current PID.  The PID file is removed automatically	when
	   the tool exits.

       --plugin
	   type: string

	   Load a plugin to hook into the tool and extend its functionality.
	   The specified file does not need to be executable, nor does its
	   first line need to be a shebang line.  It only needs to define one
	   or more of these Bash functions:

	   before_stalk
	       Called before stalking.

	   before_collect
	       Called when the trigger occurs, before running the "--collect"
	       subprocess in the background.

	   after_collect
	       Called after running a collector	process.  The PID of the
	       collector process is passed as the first	argument.  This	hook
	       is called before	"after_collect_sleep".

	   after_collect_sleep
	       Called after sleeping "--sleep" seconds for the collector
	       process to finish.  This	hook is	called after "after_collect".

	   after_interval_sleep
	       Called after sleeping "--interval" seconds after	each trigger
	       check.

	   after_stalk
	       Called after stalking.  Since pt-stalk stalks forever by
	       default,	this hook is only called if "--iterations" is
	       specified.

	   For example,	a very simple plugin that touches a file when
	   "--collect" is triggered:

	      before_collect() {
		 touch /tmp/foo
	      }

	   Since the plugin is completely sourced (imported) into the tool's
	   namespace, be careful not to	define other functions or global
	   variables that already exist	in the tool.  You should prefix	all
	   plugin-specific functions and global	variables with "plugin_" or
	   "PLUGIN_".

	   Plugins have	access to all command line options but they should not
	   modify them.	 Each option is	a global variable like $OPT_DEST which
	   corresponds to "--dest".  Therefore,	the global variable for	each
	   command line	option is "OPT_" plus the option name in all caps with
	   hyphens replaced by underscores.
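
	   The naming rule can be illustrated with a short function.
	   opt_var_name is a hypothetical helper written for this example; it
	   only derives the name and does not read any option value.

```shell
# Illustration only: derive the global variable name that corresponds to
# a command line option, following the rule described above.
opt_var_name() {
   local name="${1#--}"    # drop the leading dashes
   # Upper-case the letters and turn hyphens into underscores.
   printf 'OPT_%s\n' "$(printf '%s' "$name" | tr 'a-z-' 'A-Z_')"
}

opt_var_name --dest              # OPT_DEST
opt_var_name --disk-bytes-free   # OPT_DISK_BYTES_FREE
```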

	   Plugins can stop the tool by setting the global variable "OKTORUN"
	   to 0.  In this case, the global variable "EXIT_REASON" should also
	   be set to indicate why the tool was stopped.

	   Plugin writers should keep in mind that the file destination	prefix
	   currently in	use should be accessed through the $prefix variable,
	   rather than $OPT_PREFIX.
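
	   Putting these points together, a plugin might look like the sketch
	   below.  It assumes that $prefix is already set when the hooks run,
	   and the "-plugin-note" file name is made up for this example; only
	   $OPT_DEST, $prefix, and the hook names come from the text above.

```shell
# Sketch of a plugin file. It reads option variables but never modifies them.

before_collect() {
   # $OPT_DEST corresponds to --dest; $prefix is the sample prefix in use.
   # The "-plugin-note" suffix is a made-up name for this example.
   echo "collection starting" > "$OPT_DEST/$prefix-plugin-note"
}

after_collect() {
   local collector_pid="$1"   # first argument: PID of the collector process
   echo "collector PID: $collector_pid" >> "$OPT_DEST/$prefix-plugin-note"
}
```

	   The file would be loaded with "--plugin"; it needs neither a
	   shebang line nor execute permission.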

       --mysql-only
	   Trigger only MySQL-related captures, ignoring all others.  The only
	   value collected that is not MySQL-related is the disk space, because
	   it is needed to calculate the free disk space available for writing
	   the result files.  This option is useful for RDS instances.

       --port
	   short form: -P; type: int

	   Port	number to use for connection.

       --prefix
	   type: string

	   The filename	prefix for diagnostic samples.	By default, all	files
	   created by the same "--collect" instance have a timestamp prefix
	   based on the	current	local time, like "2011_12_06_14_02_02",	which
	   is December 6, 2011 at 14:02:02.
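
	   A string of the same shape can be produced with date(1), for
	   instance to pass a custom value to "--prefix".  The function name
	   sample_prefix is hypothetical, and this sketch does not claim to
	   reproduce how the tool builds its default.

```shell
# Build a prefix in the same shape as the example above:
# year_month_day_hour_minute_second, in local time.
sample_prefix() {
   date +%Y_%m_%d_%H_%M_%S
}

sample_prefix
```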

       --retention-count
	   type: int; default: 0

	   Keep	the data for the last N	runs. If N > 0,	the program will keep
	   the data for	the last N runs	and will delete	the older data.

       --retention-size
	   type: int; default: 0

	   Keep up to this many MB of data.  At least one run is kept even if
	   its size exceeds the value of this option.

       --retention-time
	   type: int; default: 30

	   Number of days to retain collected samples.	Any samples that are
	   older will be purged.
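
	   To preview which files an age-based purge would affect, a find(1)
	   call along these lines can be used.  list_expired_samples is a
	   hypothetical helper; it only prints file names and is not the
	   tool's own purge logic.

```shell
# Hypothetical helper: list regular files under a directory whose age,
# counted in whole days as find's -mtime does, is greater than N.
# It only prints names; it does not delete anything.
list_expired_samples() {
   local dest="$1" days="$2"
   find "$dest" -type f -mtime +"$days" -print
}

# Example: list_expired_samples /var/lib/pt-stalk 30
```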

       --run-time
	   type: int; default: 30

	   How long to "--collect" diagnostic data when	the trigger occurs.
	   The value is	in seconds and should not be longer than "--sleep".
	   It is usually not necessary to change this; if the default 30
	   seconds doesn't collect enough data,	running	longer is not likely
	   to help because the system or MySQL server is probably too busy to
	   respond.  In	fact, in many cases a shorter collection period	is
	   appropriate.

	   This	value is used two other	times.	After collecting, the collect
	   subprocess will wait	another	"--run-time" seconds for its commands
	   to finish.  Some commands can take a while if the system is running
	   very slowly (which is likely, given that a collection was
	   triggered).  Since empty files are deleted, the extra wait
	   gives commands time to finish and write their data.	The value is
	   potentially used again just before the tool exits to	wait again for
	   any collect subprocesses to finish.	In most	cases this won't
	   happen because of the aforementioned	extra wait.  If	it happens,
	   the tool will log "Waiting up to N seconds for subprocesses to
	   finish..." where N is three times "--run-time".  In both cases,
	   after waiting, the tool kills all of	its subprocesses.

       --sleep
	   type: int; default: 300

	   How long to sleep after "--collect".	 This prevents the tool	from
	   triggering continuously, which might	be a problem if	the collection
	   process is intrusive.  It also prevents filling up the disk or
	   gathering too much data to analyze reasonably.

       --sleep-collect
	   type: int; default: 1

	   How long to sleep between collection	loop cycles.  This is useful
	   with	"--no-stalk" to	do long	collections.  For example, to collect
	   data	every minute for an hour, specify: "--no-stalk --run-time 3600
	   --sleep-collect 60".

       --socket
	   short form: -S; type: string

	   Socket file to use for connection.

       --stalk
	   default: yes; negatable: yes

	   Watch the server and	wait for the trigger to	occur.	Specify
	   "--no-stalk"	to collect diagnostic data immediately,	that is,
	   without waiting for the trigger to occur.  You probably also	want
	   to specify values for "--interval", "--iterations", and "--sleep".
	   For example,	to immediately collect data for	1 minute then exit,
	   specify:

	      --no-stalk --run-time 60 --iterations 1

	   "--cycles", "--daemonize", "--log" and "--pid" have no effect with
	   "--no-stalk".  Safeguard options, like "--disk-bytes-free" and
	   "--disk-pct-free", are still	respected.

	   See also "--collect".

       --threshold
	   type: int; default: 25

	   The maximum acceptable value	for "--variable".  "--collect" is
	   triggered when the value of "--variable" is greater than
	   "--threshold" for "--cycles"	many times.  Currently,	there is no
	   way to define a lower threshold to check for	a "--variable" value
	   that	is too low.

	   See also "--function".

       --user
	   short form: -u; type: string

	   User	for login if not current user.

       --variable
	   type: string; default: Threads_running

	   The variable	to compare against "--threshold".  See also
	   "--function".

       --verbose
	   type: int; default: 2

	   Print more or less information while	running.  Since	the tool is
	   designed to be a long-running daemon, the default verbosity level
	   only	prints the most	important information.	If you run the tool
	   interactively, you may want to use a	higher verbosity level.

	     LEVEL PRINTS
	     ===== =====================================
	     0	   Errors
	     1	   Warnings
	     2	   Matching triggers and collection info
	     3	   Non-matching	triggers

       --version
	   Print tool's	version	and exit.

ENVIRONMENT
       This tool does not require any environment variables for configuration,
       although its behavior can be changed through several variables.  Keep
       in mind that these are expert settings, and should not be used in most
       cases.

       Specifically, the variables that	can be set are:

       CMD_GDB
       CMD_IOSTAT
       CMD_MPSTAT
       CMD_MYSQL
       CMD_MYSQLADMIN
       CMD_OPCONTROL
       CMD_OPREPORT
       CMD_PMAP
       CMD_STRACE
       CMD_SYSCTL
       CMD_TCPDUMP
       CMD_VMSTAT

       For example, during collection iostat is	called with a -dx argument,
       but because you have an NFS partition, you also need the	-n flag	there.
       Instead of editing the source, you can call pt-stalk as

	   CMD_IOSTAT="iostat -n" pt-stalk ...

       which will do exactly what you need.  Combined with the plugin hooks,
       this gives you fine-grained control of what the tool does.
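
       The general shell idiom behind this kind of override is a default that
       the environment can replace.  The sketch below shows the idiom only;
       resolve_cmd is a hypothetical helper, and how pt-stalk itself handles
       the CMD_* variables is not shown here.

```shell
# Hypothetical illustration of an overridable command: use the value from
# the environment when it is set and non-empty, otherwise the default.
resolve_cmd() {
   local override="$1" default="$2"
   echo "${override:-$default}"
}

iostat_cmd=$(resolve_cmd "${CMD_IOSTAT:-}" "iostat")
echo "would run: $iostat_cmd -dx"
```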

       It is possible to enable "debug" mode in mysqladmin by specifying:

       "CMD_MYSQLADMIN='mysqladmin debug' pt-stalk params ..."

SYSTEM REQUIREMENTS
       This tool requires Bash v3 or newer.  Certain options require other
       programs:

       "--collect-gdb" requires	"gdb"
       "--collect-oprofile" requires "opcontrol" and "opreport"
       "--collect-strace" requires "strace"
       "--collect-tcpdump" requires "tcpdump"

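       Before enabling the optional collectors, you may want to confirm that
       these programs are installed.  The following is a minimal sketch;
       missing_programs is a hypothetical helper, not something shipped with
       the tool.

```shell
# Print each named program that cannot be found in PATH.
missing_programs() {
   local prog
   for prog in "$@"; do
      command -v "$prog" >/dev/null 2>&1 || echo "$prog"
   done
}

# The tool requires Bash v3 or newer; warn if this shell is something else.
bash_major="${BASH_VERSION%%.*}"
if [ "${bash_major:-0}" -lt 3 ]; then
   echo "Bash v3 or newer is required" >&2
fi

# Programs needed by the optional collectors listed above.
missing_programs gdb opcontrol opreport strace tcpdump
```

       An empty result from the last command means all five programs were
       found.
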
BUGS
       For a list of known bugs, see <http://www.percona.com/bugs/pt-stalk>.

       Please report bugs at <https://jira.percona.com/projects/PT>.  Include
       the following information in your bug report:

       o   Complete command-line used to run the tool

       o   Tool	"--version"

       o   MySQL version of all	servers	involved

       o   Output from the tool	including STDERR

       o   Input files (log/dump/config	files, etc.)

       If possible, include debugging output by	running	the tool with
       "PTDEBUG"; see "ENVIRONMENT".

DOWNLOADING
       Visit <http://www.percona.com/software/percona-toolkit/>	to download
       the latest release of Percona Toolkit.  Or, get the latest release from
       the command line:

	  wget percona.com/get/percona-toolkit.tar.gz

	  wget percona.com/get/percona-toolkit.rpm

	  wget percona.com/get/percona-toolkit.deb

       You can also get	individual tools from the latest release:

	  wget percona.com/get/TOOL

       Replace "TOOL" with the name of any tool.

AUTHORS
       Baron Schwartz, Justin Swanhart,	Fernando Ipar, Daniel Nichter, and
       Brian Fraser

ABOUT PERCONA TOOLKIT
       This tool is part of Percona Toolkit, a collection of advanced command-
       line tools for MySQL developed by Percona.  Percona Toolkit was forked
       from two	projects in June, 2011:	Maatkit	and Aspersa.  Those projects
       were created by Baron Schwartz and primarily developed by him and
       Daniel Nichter.	Visit <http://www.percona.com/software/> to learn
       about other free, open-source software from Percona.

COPYRIGHT, LICENSE, AND	WARRANTY
       This program is copyright 2011-2018 Percona LLC and/or its affiliates,
       2010-2011 Baron Schwartz.

       THIS PROGRAM IS PROVIDED	"AS IS"	AND WITHOUT ANY	EXPRESS	OR IMPLIED
       WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
       MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

       This program is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation, version 2; OR the Perl	Artistic License.  On
       UNIX and	similar	systems, you can issue `man perlgpl' or	`man
       perlartistic' to	read these licenses.

       You should have received	a copy of the GNU General Public License along
       with this program; if not, write	to the Free Software Foundation, Inc.,
       59 Temple Place,	Suite 330, Boston, MA  02111-1307  USA.

VERSION
       pt-stalk	3.2.0

perl v5.32.1			  2020-04-23			   PT-STALK(1)
