
Re: Custom table AMs need to include heapam.h because of BulkInsertState - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Custom table AMs need to include heapam.h because of BulkInsertState
Date	July 16, 2019 18:46:11
Msg-id	20190716184611.6jy7g4t4kongaeg7@alap3.anarazel.de
In response to	Re: Custom table AMs need to include heapam.h because of BulkInsertState (David Rowley <david.rowley@2ndquadrant.com>)
Responses	Re: Custom table AMs need to include heapam.h because of BulkInsertState Re: Custom table AMs need to include heapam.h because of BulkInsertState
List	pgsql-hackers


Hi,

Sorry for not chiming in again earlier, I was a bit exhausted...

On 2019-07-03 19:46:06 +1200, David Rowley wrote:
> I think the only objection to doing it the way [2] did was, if there
> are more than MAX_PARTITION_BUFFERS partitions then we may end up
> evicting the CopyMultiInsertBuffer out of the CopyMultiInsertInfo and
> thus cause a call to table_finish_bulk_insert() before we're done with
> the copy.

Right.

> It's not impossible that this could happen many times for a
> given partition.  I agree that a working version of [2] is cleaner
> than [1] but it's just the thought of those needless calls.

I think it's fairly important to optimize this. E.g. emitting
unnecessary fsyncs, as would happen for heap, adds a pretty huge
constant cost to bulk loading.

> For [1], I wasn't very happy with the way it turned out which is why I
> ended up suggesting a few other ideas. I just don't really like either
> of them any better than [1], so I didn't chase those up, and that's
> why I ended up going for [2].

Yea, I don't like [1] either - they all seem too tied to copy.c's
usage.  Ideas:

1) Have ExecFindPartition() return via a bool* whether the partition is
   being accessed for the first time. In copy.c push the partition onto
   a list of to-be-bulk-finished tables.
2) Add an execPartition.c function that returns all the used tables from
   a PartitionTupleRouting*.

Both seem cleaner to me than your proposals in [1], albeit not perfect
either. I think knowing which partitions are referenced is a reasonable
thing to want from the partition machinery. But using bulk-insert etc.
seems outside of execPartition.c's remit, so doing that in copy.c seems
to make sense.
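
To make idea 1) a bit more concrete, here is a rough standalone sketch.
It is not PostgreSQL code: PartitionStub, FinishList, find_partition(),
route_tuple() and finish_all() are all hypothetical stand-ins, and the
modulo "routing" is only there so the example compiles on its own. The
point is just the shape: the lookup reports first access through a
bool*, the caller records the partition once, and the finish step runs
once per recorded partition at the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 64			/* arbitrary limit for this sketch */

/* Hypothetical stand-in for per-partition state (not ResultRelInfo). */
typedef struct PartitionStub
{
	int			id;
	bool		initialized;	/* set once routing has touched it */
	int			finish_calls;	/* counts simulated finish calls */
} PartitionStub;

/* Hypothetical list kept by the caller (the role copy.c would play). */
typedef struct FinishList
{
	PartitionStub *items[MAX_TRACKED];
	size_t		nitems;
} FinishList;

/*
 * Stand-in for the partition lookup: picks a partition by key modulo
 * nparts and reports via *is_first whether this is its first access.
 */
static PartitionStub *
find_partition(PartitionStub *parts, size_t nparts, unsigned key,
			   bool *is_first)
{
	PartitionStub *p = &parts[key % nparts];

	*is_first = !p->initialized;
	p->initialized = true;
	return p;
}

/* Caller side: remember each partition the first time it is used. */
static void
route_tuple(PartitionStub *parts, size_t nparts, unsigned key,
			FinishList *list)
{
	bool		is_first;
	PartitionStub *p = find_partition(parts, nparts, key, &is_first);

	if (is_first)
	{
		assert(list->nitems < MAX_TRACKED);
		list->items[list->nitems++] = p;
	}
	/* the tuple itself would be buffered or inserted here */
}

/* End of load: run the (simulated) finish step once per partition. */
static void
finish_all(FinishList *list)
{
	for (size_t i = 0; i < list->nitems; i++)
		list->items[i]->finish_calls++;
	list->nitems = 0;
}
```

With this shape, evicting a buffer in the middle of the load never
triggers the finish step; only finish_all() does, and each partition
that was used sees it exactly once.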

Greetings,

Andres Freund
