Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions? - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?
Date
Msg-id AE62A5AA-500F-405B-BBE6-FBA4A99CB17A@yesql.se
Whole thread Raw
In response to Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?
List pgsql-hackers
> On 29 Jul 2020, at 10:32, Michael Paquier <michael@paquier.xyz> wrote:
> 
> On Wed, Jul 01, 2020 at 06:24:18PM +0900, Michael Paquier wrote:
>> I am not sure either, still it looks worth thinking about it.
>> Attached is a rebased version of the last patch.  What this currently
>> holds is just the switch to heap_multi_insert() for the three catalogs
>> pg_attribute, pg_depend and pg_shdepend.  One point that looks worth
>> debating about is to how much to cap the data inserted at once.  This
>> uses 64kB for all three, with a number of slots chosen based on the
>> size of each record, similarly to what we do for COPY.
> 
> I got an extra round of review done for this patch.  

Thanks!

> While on it, I have done some measurements to see the difference in
> WAL produced and get an idea of the gain.

> On HEAD, with a table that has 1300 attributes, this leads to 563kB of 
> WAL produced.  With the patch, we get down to 505kB.  That's an
> extreme case of course, but that's nice a nice gain.
> 
> A second test, after creating a database from a template that has
> roughly 10k entries in pg_shdepend (10k empty tables actually), showed
> a reduction from 2158kB to 1762kB in WAL.

Extreme cases for sure, but more importantly, there should be no cases when
this would cause an increase wrt the status quo.

> Finally comes the third catalog, pg_depend, and there is one thing
> that makes me itching about this part.  We do a lot of useless work
> for the allocation and destruction of the slots when there are pinned
> dependencies, and there can be a lot of them.  Just by running the
> main regression tests, it is easy to see that in 0003 we still do a
> lot of calls of recordMultipleDependencies() for one single
> dependency, and that most of these are actually pinned.  So we finish
> by doing a lot of slot manipulation to insert nothing at the end,
> contrary to the counterparts with pg_shdepend and pg_attribute.  

Maybe it'd be worth pre-computing by a first pass which tracks pinned objects
in a bitmap; with a second pass which then knows how many and which to insert
into slots?

> In
> short, I think that for now it would be fine to commit a patch that
> does the multi-INSERT optimization for pg_attribute and pg_shdepend,
> but that we need more careful work for pg_depend.

Fair enough, let's break out pg_depend and I'll have another go at that.

cheers ./daniel



pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: PATCH: Add uri percent-encoding for binary data
Next
From: Thomas Munro
Date:
Subject: Re: recovering from "found xmin ... from before relfrozenxid ..."