Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages


home | help
pgreplay(1)		     PostgreSQL	Utilities		   pgreplay(1)

       pgreplay	- PostgreSQL log file replayer for performance tests

       pgreplay	[parse options]	[replay	options] [-d level] [infile]
       pgreplay	-f [parse options] [-o outfile]	[-d level] [infile]
       pgreplay	-r [replay options] [-d	level] [infile]

       pgreplay	reads a	PostgreSQL log file (not a WAL file), extracts the SQL
       statements and executes them  in	 the  same  order  and	relative  time
       against a PostgreSQL database cluster.  A final report gives you	a use-
       ful statistical analysis	of your	workload and its execution.

       In the first form, the log file infile is replayed at the  time	it  is

       With the	-f option, pgreplay will not execute the statements, but write
       them to a `replay file' outfile that can	be  replayed  with  the	 third

       With  the -r option, pgreplay will execute the statements in the	replay
       file infile that	was created by the second form.

       If the execution	of statements gets behind schedule,  warning  messages
       are  issued  that  indicate that	the server cannot handle the load in a
       timely fashion.	The idea is to replay a	real-world  database  workload
       as exactly as possible.

       To  create  a  log file that can	be parsed by pgreplay, you need	to set
       the following parameters	in postgresql.conf:

	      log_min_messages=error (or more)
	      log_min_error_statement=log (or more)
	      log_line_prefix='%m|%u|%d|%c|' (if you don't use CSV logging)
	      lc_messages must be set to English (encoding does	not matter)
	      bytea_output=escape (from	version	9.0 on,	only if	 you  want  to
	      replay the log on	8.4 or earlier)

       The  database  cluster against which you	replay the SQL statements must
       be a clone of the database cluster that generated  the  logs  from  the
       time immediately	before the logs	were generated.

       pgreplay	is useful for performance tests, particularly in the following

       *   You want to compare the performance of your PostgreSQL  application
	   on different	hardware or different operating	systems.

       *   You	want  to  upgrade your database	and want to make sure that the
	   new database	version	does not suffer	from  performance  regressions
	   that	affect you.

       Moreover, pgreplay can give you some feeling as to how your application
       might scale by allowing you to try to replay the	workload at  a	higher
       speed.	Be  warned,  though, that 500 users working at double speed is
       not really the same as 1000 users working at normal speed.

   Parse options:
       -c     Specifies	that the log file is in	'csvlog' format	(highly	recom-
	      mended) and not in 'stderr' format.

       -b timestamp
	      Only  log	 entries  greater  or  equal to	that timestamp will be
	      parsed.  The format is YYYY-MM-DD	HH:MM:SS.FFF like in  the  log
	      file.  An	optional time zone part	will be	ignored.

       -e timestamp
	      Only log entries less or equal to	that timestamp will be parsed.
	      The format is YYYY-MM-DD HH:MM:SS.FFF like in the	log file.   An
	      optional time zone part will be ignored.

       -q     Specifies	 that  a backslash in a	simple string literal will es-
	      cape the following single	quote.	This depends on	 configuration
	      options  like standard_conforming_strings	and is the default for
	      server version 9.0 and less.

       -D database
	      Only log entries related	to  the	 specified  database  will  be
	      parsed  (this  option  can  be specified multiple	times for more
	      than one database).

       -U username
	      Only log entries related	to  the	 specified  username  will  be
	      parsed  (this  option  can  be specified multiple	times for more
	      than one user).

   Replay options:
       -h hostname
	      Host name	where the target database cluster is running  (or  di-
	      rectory  where  the UNIX socket can be found). Defaults to local
	      This works just like the -h option of psql.

       -p port
	      TCP port where the target	database cluster can be	reached.

       -W password
	      By default, pgreplay assumes that	the target database cluster is
	      configured  for trust authentication. With the -W	option you can
	      specify a	password that will be used for all users in the	 clus-

       -s factor
	      Speed  factor  for  replay,  by default 1. This can be any valid
	      positive floating	point number. A	factor less than 1 will	replay
	      the  workload  in	 `slow	motion', while a factor	greater	than 1
	      means `fast forward'.

       -E encoding
	      Specifies	the encoding of	the log	file, which will  be  used  as
	      client  encoding	during	replay.	If it is omitted, your default
	      client encoding will be used.

       -j     If all connections are idle, jump	ahead to the next request  in-
	      stead  of	 sleeping. This	will speed up replay. Execution	delays
	      will still be reported correctly,	but replay statistics will not
	      contain the idle time.

       -X options
	      Extra  connection	options	for replay connections.	 These must be
	      libpq connection options specified in the	 format	 `option=value

   Output options:
       -o outfile
	      specifies	 the  replay file where	the statements will be written
	      for later	replay.

   Debug options:
       -d level
	      Specifies	the trace level	(between 1 and 3).  Increasing	levels
	      will  produce  more  detailed information	about what pgreplay is

       -v     Prints the program version and exits.

       PGHOST Specifies	the default value for the -h option.

       PGPORT Specifies	the default value for the -p option.

	      Specifies	the default value for the -E option.

       pgreplay	can only replay	what is	logged by PostgreSQL.  This  leads  to
       some limitations:

       *   COPY	statements will	not be replayed, because the copy data are not

       *   Fast-path API function calls	are not	logged and  will  not  be  re-
	   played.  Unfortunately, this	includes the Large Object API.

       *   Since  the log file is always in the	server encoding	(which you can
	   specify with	the -E switch of pgreplay),  all  SET  client_encoding
	   statements will be ignored.

       *   Since  the  preparation  time  of prepared statements is not	logged
	   (unless log_min_messages is debug2 or more),	these statements  will
	   be  prepared	 immediately before they are first executed during re-

       *   Because the log file	contains only text, query parameters  and  re-
	   turn	 values	 will always be	in text	and never in binary format. If
	   you use binary mode to, say,	transfer large binary  data,  pgreplay
	   can cause significantly more	network	traffic	than the original run.

       *   Sometimes, if a connection takes longer to complete,	the session ID
	   unexpectedly	changes	in the PostgreSQL log file. This causes	 pgre-
	   play	 to  treat  the	session	as two different ones, resulting in an
	   additional connection. This is arguably a bug in PostgreSQL.

       Written by Laurenz Albe <>.

Jun 2011							   pgreplay(1)


Want to link to this manual page? Use this URL:

home | help