Re: Extracting SQL from logs in a usable format - Mailing list pgsql-general

From Chris Ernst
Subject Re: Extracting SQL from logs in a usable format
Date
Msg-id 4B2C1938.4080206@esoft.com
Whole thread Raw
In response to Re: Extracting SQL from logs in a usable format  (Rory Campbell-Lange <rory@campbell-lange.net>)
Responses Re: Extracting SQL from logs in a usable format  (Greg Smith <greg@2ndquadrant.com>)
List pgsql-general
Rory Campbell-Lange wrote:
> On 18/12/09, Chris Ernst (cernst@esoft.com) wrote:
>> I have a project where I need to be able to capture every query from a
>> production system into a file such that I can "replay" them on a staging
>> system.  Does such a thing already exist or should I start writing my
>> own log parser?
>
> I'm sure there are a number of log replay systems already in existence.
> Perhaps you could use Londiste, and introduce a lag to the replication
> process if it is required?
> http://skytools.projects.postgresql.org/doc/londiste.ref.html#toc17
>
> A very simple way of doing this is to log all of the SQL statements by
> setting the postgresql.conf parameter log_min_duration_statement to 0.
> You can then easily parse out the SQL statements from the log file.
> I have done this before quite successfully when wishing to replay a long
> set of SQL statements to test un upgrade of a Postgresql server.

Hi Rory,

Thank you for the quick reply.

Londiste isn't really an option as it (apparently) would only get
INSERT, UPDATE and DELETE queries.  I would want to capture every query
that is run, including SELECTs.  Plus, the production master is already
running slony1, and I don't think they will play nice together.

My goal is to be able to replay a set of actual production traffic on
the staging server, starting from a snapshot at the point where the
statement logging began.  Then make some changes (tweak settings,
upgrade versions, make DDL changes, etc.) and rerun the same set of
statements to analyze the results with pgFouine.

I started writing my own log parser to pull the statements from the
postgres logs, but as I get in to the details, it's not quite as
straight forward as I had thought.  Keeping track of which client
connections have prepared queries, merging the correct arguments and
maintaining concurrency are all critical and far from trivial.

Basically I'm curious if anyone has already created something that does
this or am I treading into uncharted waters?  I've been googling around
a bit and haven't come up with anything yet.


Thanks again,

    - Chris

pgsql-general by date:

Previous
From: Alex -
Date:
Subject: Re: PL/Perl Performance Problems
Next
From: Alvaro Herrera
Date:
Subject: Re: PL/Perl Performance Problems