Re: pgbench: INSERT workload, FK indexes, filler fix - Mailing list pgsql-hackers

From David Christensen
Subject Re: pgbench: INSERT workload, FK indexes, filler fix
Msg-id lzsg0yuj1h.fsf@veeddrois.attlocal.net
In response to Re: pgbench: INSERT workload, FK indexes, filler fix  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
Fabien COELHO writes:

> Hello Greg,
>
> Some quick feedback about the patch and the arguments.
>
> Filling: having an empty string/NULL has been bothering me for some time. However there is a
> significant impact on the client/server network stream while initializing or running queries, which
> means that older pgbench performance reports would not be comparable to newer ones, which is a pain even
> if the new results do make sense, as you noted in a comment. I'm okay with breaking that, but it
> would require a consensus: People would run pgbench on a previous install, upgrade, run pgbench
> again, and report a massive performance regression. Who will have to deal with that noise?

I agree that it is a behavior change, but "filler" that literally includes nothing but a NULL bitmap
or minimal-length column isn't really measuring what it sets out to measure, so to me it seems like
we need to bite the bullet and just start doing what we claim to already be doing; this is something
that has been inaccurate for a long time, and continuing to keep it inaccurate in the name of
consistency seems to be the wrong tack to take here.  (My argument to the group at large, not you
specifically.)

I assume that we will need to include a big note in the documentation about the behavior change,
perhaps even a note in the output of pgbench itself; the "right" answer can be bikeshedded about.

> A workaround could be to add new workloads with different names, and leave the previous workloads
> more or less as is.

You're basically suggesting "tpcb-like-traditional" and "tpcb-like-actual"? :-)  I guess that would
be an approach of sorts, though more than one of the built-ins needed to change in this, and I
question how useful expanding these workloads will be.

> "--insert-only" as a short hand for "-b insert-only": I do not think this is really needed to save 1
> char. Also note that "-b i" would probably work.

Fair; I was just mirroring the existing structure.
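
For illustration only, the unambiguous-prefix matching that makes "-b i" plausible could be sketched
as below.  This is Python, not pgbench's actual C code; resolve_builtin is a hypothetical helper, and
the name list assumes the proposed "insert-only" builtin is added next to the three existing ones.

```python
# Hypothetical sketch of unambiguous-prefix lookup; not pgbench's real implementation.
# "insert-only" is the builtin proposed in this thread; the other three already exist.
BUILTINS = ["tpcb-like", "simple-update", "select-only", "insert-only"]


def resolve_builtin(prefix, names=BUILTINS):
    """Return the single builtin name starting with prefix, else raise ValueError."""
    matches = [name for name in names if name.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"no builtin script matches {prefix!r}")
    raise ValueError(f"ambiguous prefix {prefix!r}: {', '.join(matches)}")
```

Under that assumption "i" resolves to "insert-only", while "s" is rejected because it matches both
"simple-update" and "select-only".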

> extra indexes: I'm ok on principle. Do we want an option for that though? Isn't adding "i" to -I
> enough? Also I do not like much the code which modifies the -I provided string to add a "i".

To me it seems disingenuous to set up a situation where you'd have FKs with no indexes, which is why
I'd added that modification; unless you're talking about something different?

>> After bouncing the possibilities around a little, David and I thought this
>> specific set of changes might be the right amount of change for one PG
>> version.
>
> Hmmm. I was hoping for more changes:-) Eg the current error handling patch would be great.

I'm happy to continue working on improving this part of the program.

>> benchmark noise from where I started at with PG.  The $750 USD AMD retail
>> chip in my basement lab pushes 1M TPS of prepared SELECT statements over
>> sockets.  Plus or minus 84 bytes per row in a benchmark database doesn't
>> worry me so much anymore.
>
> AFAICR the space is actually allocated by pg and filled with blanks, just not transferred by the
> protocol? For an actual network connection I guess the effect should be quite noticeable.

This patchset included filling with actual bytes, not just padding (or implied padding via
char(n)).  Depending on how this is invoked, it could definitely add some network overhead (though
I'd be surprised if it pushed it over a single packet relative to the original size of the query).

>> [...]
>> I personally would prefer to see pgbench lead by example here, that tables
>> related this way should be indexed with FKs by default, as the Right Way to
>> do such things.
>
> I do agree that the default should be the good choices, and that some manual effort should be done
> to get the bad ones. The only issue is that people do not like change.

Heh, you are not wrong here.  Hopefully we can get some consensus about this being the right way
forward.

Best,

David


