pgreplay log file replayer released

Started by Laurenz Albe | about 16 years ago | 9 messages | general
#1 Laurenz Albe
laurenz.albe@cybertec.at

I announce the first release of pgreplay, version 0.9.0 (Beta).

Project home page: http://pgreplay.projects.postgresql.org/

pgreplay reads a PostgreSQL log file (*not* a WAL file),
extracts the SQL statements and executes them in the same order
and relative time against a PostgreSQL database cluster.

If the execution of statements gets behind schedule, warning
messages are issued that indicate that the server cannot handle
the load in a timely fashion.
The idea is to replay a real-world database workload as exactly
as possible.
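
To make the timing rule concrete, here is a minimal Python sketch of the
idea. It is not pgreplay's actual code: the function name `replay`, the
injected `execute`, `now` and `sleep` callables, and the one-second
`lag_tolerance` are all assumptions chosen for illustration.

```python
import time


def replay(entries, execute, now=time.monotonic, sleep=time.sleep,
           lag_tolerance=1.0):
    """Run statements in log order, keeping their original relative timing.

    entries: list of (logged_at_seconds, sql) pairs sorted by time.
    Returns a list of (lag_seconds, sql) for statements that started late.
    """
    late = []
    if not entries:
        return late
    first_logged = entries[0][0]
    started = now()
    for logged_at, sql in entries:
        # Offset of this statement from the start of the original run.
        due = logged_at - first_logged
        elapsed = now() - started
        if elapsed < due:
            # Ahead of schedule: wait until the statement is due.
            sleep(due - elapsed)
        elif elapsed - due > lag_tolerance:
            # Behind schedule: the server is not keeping up with the load.
            late.append((elapsed - due, sql))
        execute(sql)
    return late
```

Passing the clock and sleep functions as parameters keeps the sketch easy
to try out with a fake clock instead of waiting in real time.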

pgreplay is useful for performance tests, particularly in the
following situations:

* You want to compare the performance of your PostgreSQL
application on different hardware or different operating systems.
* You want to upgrade your database and want to make sure that
the new database version does not suffer from performance
regressions that affect you.

Features:
* Should compile and run on any platform that PostgreSQL supports
* Can replay the workload at different speeds
* Can parse "stderr" and "csvlog" log files
* Can save workload to replay in "replay file" for reuse
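
As a rough illustration of what extracting statements from a "csvlog"
file involves, the Python sketch below pulls the timestamp, session ID
and SQL text out of each record. It assumes the standard csvlog column
order (log_time first, session_id sixth, message fourteenth) and only
recognizes messages that begin with "statement: "; the function name
`statements_from_csvlog` is hypothetical, and pgreplay's own parser is
not shown in this announcement.

```python
import csv

# 0-based positions of the columns used here in PostgreSQL's csvlog format.
LOG_TIME, SESSION_ID, MESSAGE = 0, 5, 13
PREFIX = "statement: "


def statements_from_csvlog(lines):
    """Yield (log_time, session_id, sql) for each logged simple statement.

    lines: an iterable of text lines, for example a file opened with
    newline="" so that multi-line statements inside quotes stay intact.
    """
    for row in csv.reader(lines):
        if len(row) <= MESSAGE:
            continue  # not a complete csvlog record
        message = row[MESSAGE]
        if message.startswith(PREFIX):
            yield row[LOG_TIME], row[SESSION_ID], message[len(PREFIX):]
```

Grouping the results by session ID gives one ordered statement list per
original connection.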

Enjoy!

Yours,
Laurenz Albe

#2 Bruce Momjian
bruce@momjian.us
In reply to: Laurenz Albe (#1)
Re: pgreplay log file replayer released

On Wed, Mar 17, 2010 at 2:06 PM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:

I announce the first release of pgreplay, version 0.9.0 (Beta).

Project home page: http://pgreplay.projects.postgresql.org/

pgreplay reads a PostgreSQL log file (*not* a WAL file),
extracts the SQL statements and executes them in the same order
and relative time against a PostgreSQL database cluster.

Do you have a multi-threaded model that tracks which transactions each
query belonged to and runs them concurrently like they were in the
original setup? That's what I've been looking for.

--
greg

#3 Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Bruce Momjian (#2)
Re: pgreplay log file replayer released

Greg Stark <gsstark@mit.edu> writes:

Do you have a multi-threaded model that tracks which transactions each
query belonged to and runs them concurrently like they were in the
original setup? That's what I've been looking for.

Tsung does that and has been doing it for… quite some time. It even
comes with a recorder which is a PostgreSQL proxy: connect it to your
server, connect your client to it, and let it record a session at a
time.

Then in the configuration you get to choose how many of each session you
want to mix, etc.

http://tsung.erlang-projects.org/

Regards,
--
dim

My TODO has "write a Tsung blog entry (series?) and a tutorial", but
you'll have to wait until after extensions and some other things, or do
it yourself... sorry about that...

#4 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Dimitri Fontaine (#3)
Re: pgreplay log file replayer released

Dimitri Fontaine wrote:

Greg Stark <gsstark@mit.edu> writes:

Do you have a multi-threaded model that tracks which transactions each
query belonged to and runs them concurrently like they were in the
original setup? That's what I've been looking for.

Tsung does that and has been doing it for… quite some time. It even
comes with a recorder which is a PostgreSQL proxy: connect it to your
server, connect your client to it, and let it record a session at a
time.

Then in the configuration you get to choose how many of each session you
want to mix, etc.

http://tsung.erlang-projects.org/

pgreplay is single-threaded, but uses asynchronous query processing,
so multiple connections can be handled simultaneously.

pgreplay will use as many connections as the original run did, and
query order and timing are retained.
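
For readers unfamiliar with this model, the Python sketch below shows the
general idea of one thread driving several sessions at once while keeping
each session's statement order and timing. It uses asyncio purely as an
illustration and is not how pgreplay itself is written; the names
`run_session` and `replay_sessions` and the `execute` coroutine are
assumptions.

```python
import asyncio


async def run_session(session_id, steps, execute, t0, started):
    """Replay one session's statements in order at their original offsets."""
    loop = asyncio.get_running_loop()
    for offset, sql in steps:
        delay = t0 + offset - loop.time()
        if delay > 0:
            await asyncio.sleep(delay)  # wait until the statement is due
        started.append((session_id, sql))
        # While this statement is in flight, other sessions keep running.
        await execute(session_id, sql)


async def replay_sessions(sessions, execute):
    """Replay all sessions concurrently in a single thread.

    sessions: {session_id: [(offset_seconds, sql), ...]}
    Returns the (session_id, sql) pairs in the order they were started.
    """
    started = []
    t0 = asyncio.get_running_loop().time()
    await asyncio.gather(*(
        run_session(sid, steps, execute, t0, started)
        for sid, steps in sessions.items()
    ))
    return started
```

Each session becomes one task, so a slow statement in one session delays
only the later statements of that same session.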

This is the first time I have heard of Tsung; it sounds like a good idea.

I guess it has some advantages over pgreplay; the biggest one that
I can see is that it will see things that are not logged, like
COPY data.

It seems that Tsung currently only supports "basic queries", but I
assume that this can be improved. One thing that Tsung, recording
queries as a proxy, will never be able to handle is encrypted connections,
but I guess that's a minor problem.

On the usability side, Tsung will require that all clients are redirected
to the recording proxy, while pgreplay will only require that the logging
configuration settings on the server are changed. This can be an advantage
in large distributed production environments.

Yours,
Laurenz Albe

#5 Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Laurenz Albe (#4)
Re: pgreplay log file replayer released

"Albe Laurenz" <laurenz.albe@wien.gv.at> writes:

It seems that Tsung currently only supports "basic queries", but I
assume that this can be improved.

In fact from the time when PostgreSQL support was added, some more
Erlang drivers have appeared and some of them cover the entire
protocol. So it should be possible to update Tsung to use them, given
some interest.

One thing that Tsung, recording
queries as proxy, will never be able to handle are encrypted connections,
but I guess that's a minor problem.

Yes, because you typically run the proxy only to record sessions, in
order to prepare the tsung setup. Another way to go from logs is to use
pgfouine, which knows how to prepare a tsung config from PostgreSQL
logs:

http://pgfouine.projects.postgresql.org/tsung.html

On the usability side, Tsung will require that all clients are redirected
to the recording proxy, while pgreplay will only require that the logging
configuration settings on the server are changed. This can be an advantage
in large distributed production environments.

Well, never use the tsung recorder in production. Ever. Run it to
construct your sessions files from your application.

Regards,
--
dim

#6 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Dimitri Fontaine (#5)
Re: pgreplay log file replayer released

Dimitri Fontaine wrote:

One thing that Tsung, recording
queries as proxy, will never be able to handle are encrypted connections,
but I guess that's a minor problem.

Yes, because you typically run the proxy only to record sessions, in
order to prepare the tsung setup. Another way to go from logs is to use
pgfouine, which knows how to prepare a tsung config from PostgreSQL
logs:

http://pgfouine.projects.postgresql.org/tsung.html

That's nice!

On the usability side, Tsung will require that all clients are redirected
to the recording proxy, while pgreplay will only require that the logging
configuration settings on the server are changed. This can be an advantage
in large distributed production environments.

Well, never use the tsung recorder in production. Ever. Run it to
construct your sessions files from your application.

I see. So that case covers the creation of "artificial" session data as
opposed to the above case.

Yours,
Laurenz Albe

#7 Ben
bench@silentmedia.com
In reply to: Dimitri Fontaine (#5)
Re: pgreplay log file replayer released

On Mar 23, 2010, at 4:08 AM, Dimitri Fontaine wrote:

One thing that Tsung, recording
queries as proxy, will never be able to handle are encrypted connections,
but I guess that's a minor problem.

Yes, because you typically run the proxy only to record sessions, in
order to prepare the tsung setup. Another way to go from logs is to use
pgfouine, which knows how to prepare a tsung config from PostgreSQL
logs:

http://pgfouine.projects.postgresql.org/tsung.html

We've actually found that pgfouine, while a great tool for most things, does not do a terribly good job of recreating production sessions for tsung. This isn't pgfouine's fault (although there is a bug there I still need to file); it follows from the fundamental design of Tsung and its origins as a tool to test stateless servers. If I want to replay a log *exactly* as it happened, tsung cannot help me.

I definitely look forward to using pgreplay the next time we need to compare hardware classes on known production load.

#8 Greg Smith
gsmith@gregsmith.com
In reply to: Laurenz Albe (#4)
Re: pgreplay log file replayer released

I just summarized some of the discussion on this thread and created a
wiki page that starts to cover each of the three tools now available for
this job: http://wiki.postgresql.org/wiki/Statement_Playback

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us

#9 Jimmy Zhang
crackeur@comcast.net
In reply to: Ben (#7)
[ANN] VTD-XML 2.9

VTD-XML 2.9, the next generation XML Processing API for SOA and Cloud computing, has been released. Please visit https://sourceforge.net/projects/vtd-xml/files/ to download the latest version.
* Strict Conformance
  * VTD-XML now fully conforms to XML namespace 1.0 spec
* Performance Improvement
  * Significantly improved parsing performance for small XML files
* Expand Core VTD-XML API
  * Adds getPrefixString(), and toNormalizedString2()
* Cutting/Splitting
  * Adds getSiblingElementFragment()
* A number of bug fixes and code enhancement including:
  * Fixes a bug for reading very large XML documents on some platforms
  * Fixes a bug in parsing processing instruction
  * Fixes a bug in outputAndReparse()