Re: Simulating Clog Contention - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Simulating Clog Contention
Msg-id CA+U5nMLOH+Cw+njR-p1bVyB5_kS-_0528aSk8KJw+5+POG2O_A@mail.gmail.com
In response to Re: Simulating Clog Contention  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Simulating Clog Contention
List pgsql-hackers
On Thu, Jan 19, 2012 at 2:36 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 12.01.2012 14:31, Simon Riggs wrote:
>>
>> In order to simulate real-world clog contention, we need to use
>> benchmarks that deal with real world situations.
>>
>> Currently, pgbench pre-loads data using COPY and executes a VACUUM so
>> that all hint bits are set on every row of every page of every table.
>> Thus, as pgbench runs it sees zero clog accesses from historical data.
>> As a result, clog access is minimised and the effects of clog
>> contention in the real world go unnoticed.
>>
>> The following patch adds a pgbench option -I to load data using
>> INSERTs, so that we can begin benchmark testing with rows that have
>> large numbers of distinct un-hinted transaction ids. With a database
>> pre-created using this we will be better able to simulate and thus
>> more easily measure clog contention. Note that current clog has space
>> for 1 million xids, so a scale factor of greater than 10 is required
>> to really stress the clog.
>
>
> No doubt this is handy for testing this particular area, but overall I feel
> this is too much of a one-trick pony to include in pgbench.
>
> Alternatively, you could do something like this:

I think the one-trick pony is pgbench. It has exactly one starting
condition for its tests, and that isn't even a real-world condition.

The main point of including the option in pgbench is to have a
utility that produces an initial test condition that works the same
for everyone, so we can accept each other's benchmark results. We both
know that if someone posts that they have done $RANDOMSQL on a table
before running a test, it will just be ignored and people will say
user error. Some people will get it wrong when reproducing things and
we'll have chaos.

The patch exists as a way of testing the clog contention improvement
patches and provides a route to long-term regression testing to
confirm that the solution(s) still work.
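
For reference, the arithmetic behind the "scale factor of greater than
10" note quoted above can be sketched roughly as follows. This is only
an illustration, not part of the patch: it assumes the INSERT-based load
gives every pgbench_accounts row its own transaction id, uses pgbench's
100,000 account rows per unit of scale factor, and takes the 1 million
xid figure straight from the quoted text. The function names are
hypothetical.

```python
# Assumption: each INSERT runs in its own transaction, so every
# pgbench_accounts row carries a distinct, un-hinted xid after loading.
ROWS_PER_SCALE_UNIT = 100_000   # pgbench_accounts rows per unit of scale factor
CLOG_XID_CAPACITY = 1_000_000   # "space for 1 million xids", as quoted above


def distinct_xids(scale: int) -> int:
    """Distinct xids left on pgbench_accounts after an INSERT-based load."""
    return scale * ROWS_PER_SCALE_UNIT


def exceeds_clog(scale: int) -> bool:
    """True when the load leaves more distinct xids than the quoted clog figure."""
    return distinct_xids(scale) > CLOG_XID_CAPACITY


if __name__ == "__main__":
    # Print scale factor, resulting xid count, and whether it exceeds the figure.
    for scale in (1, 10, 11, 100):
        print(scale, distinct_xids(scale), exceeds_clog(scale))
```

Under those assumptions a scale factor of 10 only just reaches the
quoted figure, which is why anything meant to stress the clog needs to
go above it.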

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

