Re: How can we make beta testing better? - Mailing list pgsql-hackers
From: Jehan-Guillaume de Rorthais
Subject: Re: How can we make beta testing better?
Msg-id: 20140423075514.2ae7ee84@erg
In response to: Re: How can we make beta testing better? (Josh Berkus <josh@agliodbs.com>)
List: pgsql-hackers
On Thu, 17 Apr 2014 16:42:21 -0700 Josh Berkus <josh@agliodbs.com> wrote:

> On 04/15/2014 09:53 PM, Rod Taylor wrote:
> > A documented beta test process/toolset which does the following would help:
> > 1) Enables full query logging
> > 2) Creates a replica of a production DB, record $TIME when it stops.
> > 3) Allow user to make changes (upgrade to 9.4, change hardware, change
> >    kernel settings, ...)
> > 4) Plays queries from the CSV logs starting from $TIME mimicking actual
> >    timing and transaction boundaries
> >
> > If Pg can make it easy to duplicate activities currently going on in
> > production inside another environment, I would be pleased to fire a couple
> > billion queries through it over the next few weeks.
> >
> > #4 should include reporting useful to the project, such as a sampling of
> > queries which performed significantly worse and a few relative performance
> > stats for overall execution time.
>
> So we have some software we've been procrastinating on OSS'ing, which does:
>
> 1) Takes full query CSV logs from a running postgres instance
> 2) Runs them against a target instance in parallel
> 3) Records response times for all queries
>
> tsung and pgreplay also do this, but have some limitations which make
> them impractical for a general set of logs to replay.

I've been working on another tool able to replay scenarios recorded directly
from a network dump (see [pgshark]). It works, it can be totally transparent
from the application's point of view, the tcpdump can run anywhere, and
**ALL** the real traffic can be replayed... but it needs some more work on
reporting and on handling parallel sessions.

The drawback of using libpcap is that you can lose packets while capturing,
and even a very large capture buffer cannot keep you safe over hours of a
high-speed scenario. So it might require multiple captures and tuning the
buffer size to catch 100% of the traffic for the required period.

I tried to quickly write a simple proxy using Perl POE to capture ALL the
traffic safely. My POC did nothing but forward packets, and IIRC a 30s stress
test with 10 or 20 sessions using pgbench showed a drop of ~60% in
performance. But it was a very quick, mono-process/mono-thread POC.

Maybe another path would be to have PostgreSQL itself (which only has the
application level to deal with) generate this traffic dump, in a format we
can feed to pgbench.

> What it would need is:
>
> A) scripting around coordinated backups
> B) Scripting for single-command runs, including changing pg.conf to
>    record data.

Changing the pg.conf is pretty easy with ALTER SYSTEM now (see the sketch in
the P.S. below). But I'm sure we all have some scripts out there doing this
(at least I do).

> C) tools to *analyze* the output data, including error messages.

That's what I lack in pgshark so far.

[pgshark] https://github.com/dalibo/pgshark

Cheers,
--
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com
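P.S. For reference, a minimal, hypothetical sketch of what B) could look like
with ALTER SYSTEM on 9.4, assuming the goal is full CSV query logging that a
replay tool can consume (the exact settings a given tool needs may differ):

    -- turn on full CSV query logging for the capture window
    ALTER SYSTEM SET log_destination = 'csvlog';
    ALTER SYSTEM SET logging_collector = on;          -- needs a server restart to take effect
    ALTER SYSTEM SET log_min_duration_statement = 0;  -- log every statement with its duration
    ALTER SYSTEM SET log_statement = 'none';          -- avoid double-logging statements

    -- apply the settings that do not require a restart
    SELECT pg_reload_conf();

    -- once the capture window is over
    ALTER SYSTEM RESET log_min_duration_statement;
    SELECT pg_reload_conf();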