Thread: Custom table AMs need to include heapam.h because of BulkInsertState

Custom table AMs need to include heapam.h because of BulkInsertState

From
Michael Paquier
Date:
Hi all,

I have been playing lately with the table AM API to do some stuff, and
I got surprised that in the minimum set of headers which needs to be
included for a table AM we have a hard dependency with heapam.h for
BulkInsertState and vacuum.h for VacuumParams.

I am fine to live with the dependency with vacuum.h as it is not that
strange.  However for BulkInsertState we get a hard dependency with a
heap-related area and it seems to me that we had better move that part
out of heapam.c, as we want a clear dependency cut with the heap AM
for any new custom table AM.

I'd like to think that the best way to deal with that and reduce the
confusion would be to move anything related to bulk inserts into their
own header/file, meaning the following set:
- ReleaseBulkInsertStatePin
- GetBulkInsertState
- FreeBulkInsertState
There is the argument that we could also move that part into tableam.h
itself though as some of the rather generic table-related callbacks,
but that seems grotty.  So I think that we could just move that stuff
as backend/access/common/bulkinsert.c.

Thoughts?
--
Michael

Attachment

Re: Custom table AMs need to include heapam.h because ofBulkInsertState

From
Andres Freund
Date:
Hi,

On 2019-06-01 15:09:24 -0400, Michael Paquier wrote:
> I have been playing lately with the table AM API to do some stuff, and
> I got surprised that in the minimum set of headers which needs to be
> included for a table AM we have a hard dependency with heapam.h for
> BulkInsertState and vacuum.h for VacuumParams.

I've noted this before as a future todo.


> I'd like to think that the best way to deal with that and reduce the
> confusion would be to move anything related to bulk inserts into their
> own header/file, meaning the following set:
> - ReleaseBulkInsertStatePin
> - GetBulkInsertState
> - FreeBulkInsertState
> There is the argument that we could also move that part into tableam.h
> itself though as some of the rather generic table-related callbacks,
> but that seems grotty.  So I think that we could just move that stuff
> as backend/access/common/bulkinsert.c.

Yea, I think we should do that at some point. But I'm not sure this is
the right design. Bulk insert probably needs to rather be something
that's allocated inside the AM.

Greetings,

Andres Freund



Re: Custom table AMs need to include heapam.h because ofBulkInsertState

From
Michael Paquier
Date:
On Sat, Jun 01, 2019 at 12:19:43PM -0700, Andres Freund wrote:
> Yea, I think we should do that at some point. But I'm not sure this is
> the right design. Bulk insert probably needs to rather be something
> that's allocated inside the AM.

Yeah, actually you may be right that I am not taking the correct path
here.  At quick glance it looks that there is a strong relationship
between the finish_bulk_insert callback and the bistate free already,
so we could do much better than moving the code around.  Perhaps we
could just have a TODO?  As one of the likely-doable items.
--
Michael

Attachment

Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
Robert Haas
Date:
On Sat, Jun 1, 2019 at 3:09 PM Michael Paquier <michael@paquier.xyz> wrote:
> I am fine to live with the dependency with vacuum.h as it is not that
> strange.  However for BulkInsertState we get a hard dependency with a
> heap-related area and it seems to me that we had better move that part
> out of heapam.c, as we want a clear dependency cut with the heap AM
> for any new custom table AM.

Yeah, I noticed this, too.  +1 for doing something about it.  Not sure
exactly what is the best approach.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Custom table AMs need to include heapam.h because ofBulkInsertState

From
Michael Paquier
Date:
On Tue, Jun 04, 2019 at 10:18:03AM -0400, Robert Haas wrote:
> On Sat, Jun 1, 2019 at 3:09 PM Michael Paquier <michael@paquier.xyz> wrote:
>> I am fine to live with the dependency with vacuum.h as it is not that
>> strange.  However for BulkInsertState we get a hard dependency with a
>> heap-related area and it seems to me that we had better move that part
>> out of heapam.c, as we want a clear dependency cut with the heap AM
>> for any new custom table AM.
>
> Yeah, I noticed this, too.  +1 for doing something about it.  Not sure
> exactly what is the best approach.

One thing which is a bit tricky is that for example with COPY FROM we
have a code path which is able to release a buffer held by the bulk
insert state.  So I think that we could get easily out by combining
the bistate free path with finish_bulk_insert, create the bistate
within the AM when doing a single or multi tuple insert, and having
one extra callback to release a buffer held.  Still this last bit does
not completely feel right in terms of flexibility and readability.

Note as well that we never actually use bistate when calling
table_tuple_insert_speculative() on HEAD.  I guess that the argument
is here for consistency with the tuple_insert callback.  Could we do
something separately about that?
--
Michael

Attachment

Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
Robert Haas
Date:
On Thu, Jun 6, 2019 at 10:29 PM Michael Paquier <michael@paquier.xyz> wrote:
> One thing which is a bit tricky is that for example with COPY FROM we
> have a code path which is able to release a buffer held by the bulk
> insert state.

Are you talking about the call to ReleaseBulkInsertStatePin, or something else?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
Robert Haas
Date:
On Sat, Jun 1, 2019 at 3:20 PM Andres Freund <andres@anarazel.de> wrote:
> > I'd like to think that the best way to deal with that and reduce the
> > confusion would be to move anything related to bulk inserts into their
> > own header/file, meaning the following set:
> > - ReleaseBulkInsertStatePin
> > - GetBulkInsertState
> > - FreeBulkInsertState
> > There is the argument that we could also move that part into tableam.h
> > itself though as some of the rather generic table-related callbacks,
> > but that seems grotty.  So I think that we could just move that stuff
> > as backend/access/common/bulkinsert.c.
>
> Yea, I think we should do that at some point. But I'm not sure this is
> the right design. Bulk insert probably needs to rather be something
> that's allocated inside the AM.

As far as I can see, any on-disk, row-oriented, block-based AM is
likely to want the same implementation as the heap.  Column stores
might want to pin multiple buffers, and an in-memory AM might have a
completely different set of requirements, but something like zheap
really has no reason to depart from what the heap does.  I think it's
really important that new table AMs not only have the option to do
something different than the heap in any particular area, but that
they also have the option to do the SAME thing as the heap without
having to duplicate a bunch of code.  So I think it would be
reasonable to start by doing some pure code movement here, along the
lines proposed by Michael -- not sure if src/backend/access/common is
right or if it should be src/backend/access/table -- and then add the
abstraction afterwards.  Worth noting is ReadBufferBI() also needs
moving and is a actually a bigger problem than the functions that
Michael mentioned, because the other functions are accessible if
you're willing to stoop to including heap-specific headers, but that
function is static and you'll have to just copy-and-paste it.  Uggh.

Here's a draft design for adding some abstraction, roughly modeled on
the abstraction Andres added for TupleTableSlots:

1. a BulkInsertState becomes a struct whose only member is a pointer
to const BulkInsertStateOps *const ops

2. that structure has a member for each defined operation on a BulkInsertState:

void (*free)(BulkInsertState *);
void (*release_pin)(BulkInsertState *); // maybe rename to make it more generic

3. table AM gets a new member BulkInsertState
*(*create_bistate)(Relation Rel) and a corresponding function
table_create_bistate(), analogous to table_create_slot(), which can
call the constructor function for the appropriate type of
BulkInsertState and return the result

4. each type of BulkInsertState has its own functions for making use
of it, akin to ReadBufferBI.  That particular function signature is
only likely to be correct for something that does more-or-less what
the existing type of BulkInsertState does; if you're using a
column-store that pins multiple buffers or something, you'll need your
own code path.  But that's OK, because ReadBufferBI or whatever other
functions you have are only going to get called from AM-specific code,
which will know what type of BulkInsertState they have got, because
they are in control of which kind of BulkInsertState gets created for
their relations as per point #4, so they can just call the right
functions.

5. The current implementation of BulkInsertState gets renamed to
BlockBulkInsertState (or something else) and is used by heap and any
AMs that like it.

I see Michael's point about the relationship between
finish_bulk_insert() and the BulkInsertState, and maybe if we could
figure that out we could avoid the need for a BulkInsertState to have
a free method (or maybe any methods at all, in which case it could
just be an opaque struct, like a Node). However, it looks to me as
though copy.c can create a bunch of BulkInsertStates but only call
finish_bulk_insert() once, so unless that's a bug in need of fixing I
don't quite see how to make that approach work.

Comments?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Custom table AMs need to include heapam.h because ofBulkInsertState

From
Andres Freund
Date:
Hi,

(David, see bottom if you're otherwise not interested).

On 2019-06-07 09:48:29 -0400, Robert Haas wrote:
> On Sat, Jun 1, 2019 at 3:20 PM Andres Freund <andres@anarazel.de> wrote:
> > > I'd like to think that the best way to deal with that and reduce the
> > > confusion would be to move anything related to bulk inserts into their
> > > own header/file, meaning the following set:
> > > - ReleaseBulkInsertStatePin
> > > - GetBulkInsertState
> > > - FreeBulkInsertState
> > > There is the argument that we could also move that part into tableam.h
> > > itself though as some of the rather generic table-related callbacks,
> > > but that seems grotty.  So I think that we could just move that stuff
> > > as backend/access/common/bulkinsert.c.
> >
> > Yea, I think we should do that at some point. But I'm not sure this is
> > the right design. Bulk insert probably needs to rather be something
> > that's allocated inside the AM.
> 
> As far as I can see, any on-disk, row-oriented, block-based AM is
> likely to want the same implementation as the heap.

I'm pretty doubtful about that. It'd e.g. would make a ton of sense to
keep the VM pinned, even for heap. You could also do a lot better with
toast.  And for zheap we'd - unless we change the design - quite
possibly benefit from keeping the last needed tpd buffer around.


> Here's a draft design for adding some abstraction, roughly modeled on
> the abstraction Andres added for TupleTableSlots:

Hm, I'm not sure I see the need for a vtable based approach here. Won't
every AM know exactly what they need / have?  I'm not convinced it's
worthwhile to treat that separately from the tableam. I.e.  have a
BulkInsertState struct with *no* members, and then, as you suggest:

> 
> 3. table AM gets a new member BulkInsertState
> *(*create_bistate)(Relation Rel) and a corresponding function
> table_create_bistate(), analogous to table_create_slot(), which can
> call the constructor function for the appropriate type of
> BulkInsertState and return the result

but also route the following through the AM:

> 2. that structure has a member for each defined operation on a BulkInsertState:
> 
> void (*free)(BulkInsertState *);
> void (*release_pin)(BulkInsertState *); // maybe rename to make it more generic

Where free would just be part of finish_bulk_insert, and release_pin a
new callback.


> 4. each type of BulkInsertState has its own functions for making use
> of it, akin to ReadBufferBI.

Right, I don't think that's avoidable unfortunately.



> I see Michael's point about the relationship between
> finish_bulk_insert() and the BulkInsertState, and maybe if we could
> figure that out we could avoid the need for a BulkInsertState to have
> a free method (or maybe any methods at all, in which case it could
> just be an opaque struct, like a Node).

Right, so we actually eneded up at the same place. And you found a bug:

> However, it looks to me as though copy.c can create a bunch of
> BulkInsertStates but only call finish_bulk_insert() once, so unless
> that's a bug in need of fixing I don't quite see how to make that
> approach work.

That is a bug. Not a currently "active" one with in-core AMs (no
dangerous bulk insert flags ever get set for partitioned tables), but we
obviously need to fix it nevertheless.

Robert, seems we'll have to - regardless of where we come down on fixing
this bug - have to make copy use multiple BulkInsertState's, even in the
CIM_SINGLE (with proute) case. Or do you have a better idea?

David, any opinions on how to best fix this? It's not extremely obvious
how to do so best in the current setup of the partition actually being
hidden somewhere a few calls away (i.e. the table_open happening in
ExecInitPartitionInfo()).

Greetings,

Andres Freund



Re: Custom table AMs need to include heapam.h because ofBulkInsertState

From
Michael Paquier
Date:
On Fri, Jun 07, 2019 at 08:55:36AM -0400, Robert Haas wrote:
> Are you talking about the call to ReleaseBulkInsertStatePin, or something else?

Yes, I was referring to ReleaseBulkInsertStatePin()
--
Michael

Attachment

Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
David Rowley
Date:
On Sat, 8 Jun 2019 at 04:51, Andres Freund <andres@anarazel.de> wrote:
>
> On 2019-06-07 09:48:29 -0400, Robert Haas wrote:
> > However, it looks to me as though copy.c can create a bunch of
> > BulkInsertStates but only call finish_bulk_insert() once, so unless
> > that's a bug in need of fixing I don't quite see how to make that
> > approach work.
>
> That is a bug. Not a currently "active" one with in-core AMs (no
> dangerous bulk insert flags ever get set for partitioned tables), but we
> obviously need to fix it nevertheless.
>
> Robert, seems we'll have to - regardless of where we come down on fixing
> this bug - have to make copy use multiple BulkInsertState's, even in the
> CIM_SINGLE (with proute) case. Or do you have a better idea?
>
> David, any opinions on how to best fix this? It's not extremely obvious
> how to do so best in the current setup of the partition actually being
> hidden somewhere a few calls away (i.e. the table_open happening in
> ExecInitPartitionInfo()).

That's been overlooked. I agree it's not a bug with heap, since
heapam_finish_bulk_insert() only does anything there when we're
skipping WAL, which we don't do in copy.c for partitioned tables.
However, who knows what other AMs will need, so we'd better fix that.

My proposed patch is attached.

I ended up moving the call to CopyMultiInsertInfoCleanup() down to
after we call table_finish_bulk_insert for the main table. This might
not be required but I lack imagination right now to what AMs might put
in the finish_bulk_insert call, so doing this seems safer.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
Robert Haas
Date:
On Fri, Jun 7, 2019 at 12:51 PM Andres Freund <andres@anarazel.de> wrote:
> > As far as I can see, any on-disk, row-oriented, block-based AM is
> > likely to want the same implementation as the heap.
>
> I'm pretty doubtful about that. It'd e.g. would make a ton of sense to
> keep the VM pinned, even for heap. You could also do a lot better with
> toast.  And for zheap we'd - unless we change the design - quite
> possibly benefit from keeping the last needed tpd buffer around.

That's fair enough to a point, but I'm not trying to enforce code
reuse; I'm trying to make it possible.  If it's good enough for the
heap, which is really the gold standard for AMs until somebody manages
to do better, it's entirely reasonable for somebody else to want to
just do it the way the heap does.  We gain nothing by making that
difficult.

> > Here's a draft design for adding some abstraction, roughly modeled on
> > the abstraction Andres added for TupleTableSlots:
>
> Hm, I'm not sure I see the need for a vtable based approach here. Won't
> every AM know exactly what they need / have?  I'm not convinced it's
> worthwhile to treat that separately from the tableam. I.e.  have a
> BulkInsertState struct with *no* members, and then, as you suggest:

Hmm, so what we would we do here?   Just 'struct BulkInsertState;
typedef struct BulkInsertState BulkInsertState;' ... and then never
actually define the struct anywhere?

> > 3. table AM gets a new member BulkInsertState
> > *(*create_bistate)(Relation Rel) and a corresponding function
> > table_create_bistate(), analogous to table_create_slot(), which can
> > call the constructor function for the appropriate type of
> > BulkInsertState and return the result
>
> but also route the following through the AM:
>
> > 2. that structure has a member for each defined operation on a BulkInsertState:
> >
> > void (*free)(BulkInsertState *);
> > void (*release_pin)(BulkInsertState *); // maybe rename to make it more generic
>
> Where free would just be part of finish_bulk_insert, and release_pin a
> new callback.

OK, that's an option.  I guess we'd change free_bulk_insert to take
the BulkInsertState as an additional option?

> Robert, seems we'll have to - regardless of where we come down on fixing
> this bug - have to make copy use multiple BulkInsertState's, even in the
> CIM_SINGLE (with proute) case. Or do you have a better idea?

Nope.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
David Rowley
Date:
On Mon, 10 Jun 2019 at 11:45, David Rowley <david.rowley@2ndquadrant.com> wrote:
>
> On Sat, 8 Jun 2019 at 04:51, Andres Freund <andres@anarazel.de> wrote:
> > David, any opinions on how to best fix this? It's not extremely obvious
> > how to do so best in the current setup of the partition actually being
> > hidden somewhere a few calls away (i.e. the table_open happening in
> > ExecInitPartitionInfo()).
>
> That's been overlooked. I agree it's not a bug with heap, since
> heapam_finish_bulk_insert() only does anything there when we're
> skipping WAL, which we don't do in copy.c for partitioned tables.
> However, who knows what other AMs will need, so we'd better fix that.
>
> My proposed patch is attached.
>
> I ended up moving the call to CopyMultiInsertInfoCleanup() down to
> after we call table_finish_bulk_insert for the main table. This might
> not be required but I lack imagination right now to what AMs might put
> in the finish_bulk_insert call, so doing this seems safer.

Andres, do you want to look at this before I look again?

Do you see any issue with calling table_finish_bulk_insert() when the
partition's CopyMultiInsertBuffer is evicted from the
CopyMultiInsertInfo rather than at the end of the copy? It can mean
that we call the function multiple times per partition. I assume the
function is only really intended to flush bulk inserted tuple to the
storage, so calling it more than once would just mean an inefficiency
rather than a bug.

Let me know your thoughts.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
Andres Freund
Date:
Hi,

On June 12, 2019 6:42:11 PM PDT, David Rowley <david.rowley@2ndquadrant.com> wrote:
>On Mon, 10 Jun 2019 at 11:45, David Rowley
><david.rowley@2ndquadrant.com> wrote:
>>
>> On Sat, 8 Jun 2019 at 04:51, Andres Freund <andres@anarazel.de>
>wrote:
>> > David, any opinions on how to best fix this? It's not extremely
>obvious
>> > how to do so best in the current setup of the partition actually
>being
>> > hidden somewhere a few calls away (i.e. the table_open happening in
>> > ExecInitPartitionInfo()).
>>
>> That's been overlooked. I agree it's not a bug with heap, since
>> heapam_finish_bulk_insert() only does anything there when we're
>> skipping WAL, which we don't do in copy.c for partitioned tables.
>> However, who knows what other AMs will need, so we'd better fix that.
>>
>> My proposed patch is attached.
>>
>> I ended up moving the call to CopyMultiInsertInfoCleanup() down to
>> after we call table_finish_bulk_insert for the main table. This might
>> not be required but I lack imagination right now to what AMs might
>put
>> in the finish_bulk_insert call, so doing this seems safer.
>
>Andres, do you want to look at this before I look again?
>
>Do you see any issue with calling table_finish_bulk_insert() when the
>partition's CopyMultiInsertBuffer is evicted from the
>CopyMultiInsertInfo rather than at the end of the copy? It can mean
>that we call the function multiple times per partition. I assume the
>function is only really intended to flush bulk inserted tuple to the
>storage, so calling it more than once would just mean an inefficiency
>rather than a bug.
>
>Let me know your thoughts.

I'm out on vacation until Monday (very needed, pretty damn exhausted). So I can't really give you a in depth answer
rightnow. 

Off the cuff, I'd say it's worthwhile to try somewhat hard to avoid superfluous finish calls. They can be quite
expensive(fsync), and we ought to have nearly all the state for doing it only as much as necessary. Possibly we need
onebool per partition to track whether any rows where inserted, but thats peanuts in comparison to all the other state. 

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
David Rowley
Date:
On Fri, 14 Jun 2019 at 07:53, Andres Freund <andres@anarazel.de> wrote:
>
> On June 12, 2019 6:42:11 PM PDT, David Rowley <david.rowley@2ndquadrant.com> wrote:
> >Do you see any issue with calling table_finish_bulk_insert() when the
> >partition's CopyMultiInsertBuffer is evicted from the
> >CopyMultiInsertInfo rather than at the end of the copy? It can mean
> >that we call the function multiple times per partition. I assume the
> >function is only really intended to flush bulk inserted tuple to the
> >storage, so calling it more than once would just mean an inefficiency
> >rather than a bug.
> >
> >Let me know your thoughts.
>
> I'm out on vacation until Monday (very needed, pretty damn exhausted). So I can't really give you a in depth answer
rightnow. 
>
> Off the cuff, I'd say it's worthwhile to try somewhat hard to avoid superfluous finish calls. They can be quite
expensive(fsync), and we ought to have nearly all the state for doing it only as much as necessary. Possibly we need
onebool per partition to track whether any rows where inserted, but thats peanuts in comparison to all the other state. 

No worries. I'll just park this patch here until you're ready to give it a look.

With the attached version I'm just calling table_finish_bulk_insert()
once per partition at the end of CopyFrom().  We've got an array with
all the ResultRelInfos we touched in the proute, so it's mostly a
matter of looping over that array and calling the function on each
ResultRelInfo's ri_RelationDesc.  However, to make it more complex,
PartitionTupleRouting is private to execPartition.c so we can't do
this directly... After staring at my screen for a while, I decided to
write a function that calls a callback function on each ResultRelInfo
in the PartitionTupleRouting.

The three alternative ways I thought of were:

1) Put PartitionTupleRouting back into execPartition.h and write the
loop over each ResultRelInfo in copy.c.
2) Write a specific function in execPartition.c that calls
table_finish_bulk_insert()
3) Modify ExecCleanupTupleRouting to pass in the ti_options and a bool
to say if it should call table_finish_bulk_insert() or not.

I didn't really like either of those. For #1, I'd rather keep it
private. For #2, it just seems a bit too specific a function to go
into execPartition.c. For #3 I really don't want to slow down
ExecCleanupTupleRouting() any. I designed those to be as fast as
possible since they're called for single-row INSERTs into partitioned
tables. Quite a bit of work went into PG12 to make those fast.

Of course, someone might see one of the alternatives as better than
what the patch does, so comments welcome.

The other thing I noticed is that we call
table_finish_bulk_insert(cstate->rel, ti_options); in copy.c
regardless of if we've done any bulk inserts or not. Perhaps that
should be under an if (insertMethod != CIM_SINGLE)

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: Custom table AMs need to include heapam.h because ofBulkInsertState

From
Michael Paquier
Date:
On Sat, Jun 15, 2019 at 12:25:12PM +1200, David Rowley wrote:
> With the attached version I'm just calling table_finish_bulk_insert()
> once per partition at the end of CopyFrom().  We've got an array with
> all the ResultRelInfos we touched in the proute, so it's mostly a
> matter of looping over that array and calling the function on each
> ResultRelInfo's ri_RelationDesc.  However, to make it more complex,
> PartitionTupleRouting is private to execPartition.c so we can't do
> this directly... After staring at my screen for a while, I decided to
> write a function that calls a callback function on each ResultRelInfo
> in the PartitionTupleRouting.

Don't take me bad, but I find the solution of defining and using a new
callback to call the table AM callback not really elegant, and keeping
all table AM callbacks called at a higher level than the executor
makes the code easier to follow.  Shouldn't we try to keep any calls
to table_finish_bulk_insert() within copy.c for each partition
instead?

> The other thing I noticed is that we call
> table_finish_bulk_insert(cstate->rel, ti_options); in copy.c
> regardless of if we've done any bulk inserts or not. Perhaps that
> should be under an if (insertMethod != CIM_SINGLE)

Yeah, good point.
--
Michael

Attachment

Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
David Rowley
Date:
On Mon, 24 Jun 2019 at 22:16, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Jun 15, 2019 at 12:25:12PM +1200, David Rowley wrote:
> > With the attached version I'm just calling table_finish_bulk_insert()
> > once per partition at the end of CopyFrom().  We've got an array with
> > all the ResultRelInfos we touched in the proute, so it's mostly a
> > matter of looping over that array and calling the function on each
> > ResultRelInfo's ri_RelationDesc.  However, to make it more complex,
> > PartitionTupleRouting is private to execPartition.c so we can't do
> > this directly... After staring at my screen for a while, I decided to
> > write a function that calls a callback function on each ResultRelInfo
> > in the PartitionTupleRouting.
>
> Don't take me bad, but I find the solution of defining and using a new
> callback to call the table AM callback not really elegant, and keeping
> all table AM callbacks called at a higher level than the executor
> makes the code easier to follow.  Shouldn't we try to keep any calls
> to table_finish_bulk_insert() within copy.c for each partition
> instead?

I'm not quite sure if I follow you since the call to
table_finish_bulk_insert() is within copy.c still.

The problem was that PartitionTupleRouting is private to
execPartition.c, and we need a way to determine which of the
partitions we routed tuples to. It seems inefficient to flush all of
them if only a small number had tuples inserted into them and to me,
it seems inefficient to add some additional tracking in CopyFrom(),
like a hash table to store partition Oids that we inserted into. Using
PartitionTupleRouting makes sense. It's just a question of how to
access it, which is not so easy due to it being private.

I did suggest a few other ways that we could solve this. I'm not so
clear on which one of those you're suggesting or if you're thinking of
something new.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
David Rowley
Date:
On Mon, 24 Jun 2019 at 23:12, David Rowley <david.rowley@2ndquadrant.com> wrote:
>
> On Mon, 24 Jun 2019 at 22:16, Michael Paquier <michael@paquier.xyz> wrote:
> >
> > Don't take me bad, but I find the solution of defining and using a new
> > callback to call the table AM callback not really elegant, and keeping
> > all table AM callbacks called at a higher level than the executor
> > makes the code easier to follow.  Shouldn't we try to keep any calls
> > to table_finish_bulk_insert() within copy.c for each partition
> > instead?
>
> I'm not quite sure if I follow you since the call to
> table_finish_bulk_insert() is within copy.c still.
>
> The problem was that PartitionTupleRouting is private to
> execPartition.c, and we need a way to determine which of the
> partitions we routed tuples to. It seems inefficient to flush all of
> them if only a small number had tuples inserted into them and to me,
> it seems inefficient to add some additional tracking in CopyFrom(),
> like a hash table to store partition Oids that we inserted into. Using
> PartitionTupleRouting makes sense. It's just a question of how to
> access it, which is not so easy due to it being private.
>
> I did suggest a few other ways that we could solve this. I'm not so
> clear on which one of those you're suggesting or if you're thinking of
> something new.

Any further thoughts on this Michael?

Or Andres? Do you have a preference to which of the approaches
(mentioned upthread) I use for the fix?

If I don't hear anything I'll probably just push the first fix. The
inefficiency does not affect heap, so likely the people with the most
interest in improving that will be authors of other table AMs that
actually do something during table_finish_bulk_insert() for
partitions. We could revisit this in PG13 if someone comes up with a
need to improve things here.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
David Rowley
Date:
On Sun, 30 Jun 2019 at 17:54, David Rowley <david.rowley@2ndquadrant.com> wrote:

> Any further thoughts on this Michael?
>
> Or Andres? Do you have a preference to which of the approaches
> (mentioned upthread) I use for the fix?
>
> If I don't hear anything I'll probably just push the first fix. The
> inefficiency does not affect heap, so likely the people with the most
> interest in improving that will be authors of other table AMs that
> actually do something during table_finish_bulk_insert() for
> partitions. We could revisit this in PG13 if someone comes up with a
> need to improve things here.

I've pushed the original patch plus a small change to only call
table_finish_bulk_insert() for the target of the copy when we're using
bulk inserts.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Custom table AMs need to include heapam.h because ofBulkInsertState

From
Michael Paquier
Date:
On Tue, Jul 02, 2019 at 01:26:26AM +1200, David Rowley wrote:
> I've pushed the original patch plus a small change to only call
> table_finish_bulk_insert() for the target of the copy when we're using
> bulk inserts.

Yes, sorry for coming late at the party here.  What I meant previously
is that I did not find the version published at [1] to be natural with
its structure to define an executor callback which then calls a
callback for table AMs, still I get your point that it would be better
to try to avoid unnecessary fsync calls on partitions where no tuples
have been redirected with a COPY.  The version 1 of the patch attached
at [2] felt much more straight-forward and cleaner by keeping all the
table AM callbacks within copy.c.

This has been reverted as of f5db56f, still it seems to me that this
was moving in the right direction.

[1]: https://postgr.es/m/CAKJS1f95sB21LBF=1MCsEV+XLtA_JC3mtXx5kgDuHDsOGoWhKg@mail.gmail.com
[2]: https://postgr.es/m/CAKJS1f_0t-K0_3xe+erXPQ-jgaOb6tRZayErCXF2RpGdUVMt9g@mail.gmail.com
--
Michael

Attachment

Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
David Rowley
Date:
On Wed, 3 Jul 2019 at 19:35, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Jul 02, 2019 at 01:26:26AM +1200, David Rowley wrote:
> > I've pushed the original patch plus a small change to only call
> > table_finish_bulk_insert() for the target of the copy when we're using
> > bulk inserts.
>
> Yes, sorry for coming late at the party here.  What I meant previously
> is that I did not find the version published at [1] to be natural with
> its structure to define an executor callback which then calls a
> callback for table AMs, still I get your point that it would be better
> to try to avoid unnecessary fsync calls on partitions where no tuples
> have been redirected with a COPY.  The version 1 of the patch attached
> at [2] felt much more straight-forward and cleaner by keeping all the
> table AM callbacks within copy.c.
>
> This has been reverted as of f5db56f, still it seems to me that this
> was moving in the right direction.

I think the only objection to doing it the way [2] did was, if there
are more than MAX_PARTITION_BUFFERS partitions then we may end up
evicting the CopyMultiInsertBuffer out of the CopyMultiInsertInfo and
thus cause a call to table_finish_bulk_insert() before we're done with
the copy. It's not impossible that this could happen many times for a
given partition.  I agree that a working version of [2] is cleaner
than [1] but it's just the thought of those needless calls.

For [1], I wasn't very happy with the way it turned out which is why I
ended up suggesting a few other ideas. I just don't really like either
of them any better than [1], so I didn't chase those up, and that's
why I ended up going for [2]. If you think any of the other ideas I
suggested are better (apart from [2]) then let me know and I can see
about writing a patch. Otherwise, I plan to just fix [2] and push.

> [1]: https://postgr.es/m/CAKJS1f95sB21LBF=1MCsEV+XLtA_JC3mtXx5kgDuHDsOGoWhKg@mail.gmail.com
> [2]: https://postgr.es/m/CAKJS1f_0t-K0_3xe+erXPQ-jgaOb6tRZayErCXF2RpGdUVMt9g@mail.gmail.com

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
David Rowley
Date:
On Wed, 3 Jul 2019 at 19:35, Michael Paquier <michael@paquier.xyz> wrote:
> This has been reverted as of f5db56f, still it seems to me that this
> was moving in the right direction.

I've pushed this again, this time with the cleanup code done in the right order.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Custom table AMs need to include heapam.h because ofBulkInsertState

From
Michael Paquier
Date:
Hi David,

On Wed, Jul 10, 2019 at 09:40:59PM +1200, David Rowley wrote:
> On Wed, 3 Jul 2019 at 19:35, Michael Paquier <michael@paquier.xyz> wrote:
>> This has been reverted as of f5db56f, still it seems to me that this
>> was moving in the right direction.
>
> I've pushed this again, this time with the cleanup code done in the
> right order.

I have spent some time lately analyzing f7c830f as I was curious about
the logic behind it, and FWIW the result looks good.  Thanks!
--
Michael

Attachment

Re: Custom table AMs need to include heapam.h because ofBulkInsertState

From
Andres Freund
Date:
Hi,

Sorry for not chiming in again earlier, I was a bit exhausted...


On 2019-07-03 19:46:06 +1200, David Rowley wrote:
> I think the only objection to doing it the way [2] did was, if there
> are more than MAX_PARTITION_BUFFERS partitions then we may end up
> evicting the CopyMultiInsertBuffer out of the CopyMultiInsertInfo and
> thus cause a call to table_finish_bulk_insert() before we're done with
> the copy.

Right.


> It's not impossible that this could happen many times for a
> given partition.  I agree that a working version of [2] is cleaner
> than [1] but it's just the thought of those needless calls.

I think it's fairly important to optimize this. E.g. emitting
unnecessary fsyncs as it'd happen for heap is a pretty huge constant to
add to bulk loading.


> For [1], I wasn't very happy with the way it turned out which is why I
> ended up suggesting a few other ideas. I just don't really like either
> of them any better than [1], so I didn't chase those up, and that's
> why I ended up going for [2].

Yea, I don't like [1] either - they all seems too tied to copy.c's
usage.  Ideas:

1) Have ExecFindPartition() return via a bool* whether the partition is
   being accessed for the first time. In copy.c push the partition onto
   a list of to-be-bulk-finished tables.
2) Add a execPartition.c function that returns all the used tables from
   a PartitionTupleRouting*.

both seem cleaner to me than your proposals in [1], albeit not perfect
either. I think knowing which partitions are referenced is a reasonable
thing to want from the partition machinery. But using bulk-insert etc
seems outside of execPartition.c's remit, so doing that in copy.c seems
to make sense.


Greetings,

Andres Freund



Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
David Rowley
Date:
On Wed, 17 Jul 2019 at 06:46, Andres Freund <andres@anarazel.de> wrote:
> 1) Have ExecFindPartition() return via a bool* whether the partition is
>    being accessed for the first time. In copy.c push the partition onto
>    a list of to-be-bulk-finished tables.
> 2) Add a execPartition.c function that returns all the used tables from
>    a PartitionTupleRouting*.

#2 seems better than #1 as it does not add overhead to ExecFindPartition().

Are you thinking this should go back into v12, or just for v13 only?

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Custom table AMs need to include heapam.h because ofBulkInsertState

From
Andres Freund
Date:
Hi,

On 2019-07-18 11:29:37 +1200, David Rowley wrote:
> On Wed, 17 Jul 2019 at 06:46, Andres Freund <andres@anarazel.de> wrote:
> > 1) Have ExecFindPartition() return via a bool* whether the partition is
> >    being accessed for the first time. In copy.c push the partition onto
> >    a list of to-be-bulk-finished tables.
> > 2) Add a execPartition.c function that returns all the used tables from
> >    a PartitionTupleRouting*.
> 
> #2 seems better than #1 as it does not add overhead to ExecFindPartition().

I don't see how #1 would add meaningful overhead compared to the other
costs of that function.  Wouldn't it just be adding if (isnew) *isnew =
false; to the "/* ResultRelInfo already built */" branch, and the
reverse to the else?  That got to be several orders of magnitude cheaper
than e.g. FormPartitionKeyDatum() which is unconditionally executed?


> Are you thinking this should go back into v12, or just for v13 only?

Not sure, tbh. Probably depends a bit on how complicated it'd look?

Greetings,

Andres Freund



Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
David Rowley
Date:
On Thu, 18 Jul 2019 at 11:36, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2019-07-18 11:29:37 +1200, David Rowley wrote:
> > On Wed, 17 Jul 2019 at 06:46, Andres Freund <andres@anarazel.de> wrote:
> > > 1) Have ExecFindPartition() return via a bool* whether the partition is
> > >    being accessed for the first time. In copy.c push the partition onto
> > >    a list of to-be-bulk-finished tables.
> > > 2) Add a execPartition.c function that returns all the used tables from
> > >    a PartitionTupleRouting*.
> >
> > #2 seems better than #1 as it does not add overhead to ExecFindPartition().
>
> I don't see how #1 would add meaningful overhead compared to the other
> costs of that function.  Wouldn't it just be adding if (isnew) *isnew =
> false; to the "/* ResultRelInfo already built */" branch, and the
> reverse to the else?

Yes

> That got to be several orders of magnitude cheaper
> than e.g. FormPartitionKeyDatum() which is unconditionally executed?

Probably.

However, I spent quite a bit of time trying to make that function as
fast as possible in v12, and since #2 seems like a perfectly good
alternative, I'd rather go with that than to add pollution to
ExecFindPartition's signature. Also, #2 seems better since it keeps
CopyFrom() from having to maintain a list. I think we all agreed
somewhere that that code is more complex than we'd all like it to be.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Custom table AMs need to include heapam.h because ofBulkInsertState

From
Andres Freund
Date:
Hi,

On 2019-07-18 11:57:44 +1200, David Rowley wrote:
> However, I spent quite a bit of time trying to make that function as
> fast as possible in v12, and since #2 seems like a perfectly good
> alternative, I'd rather go with that than to add pollution to
> ExecFindPartition's signature. Also, #2 seems better since it keeps
> CopyFrom() from having to maintain a list. I think we all agreed
> somewhere that that code is more complex than we'd all like it to be.

Fair enough.

One last thought for #1: I was wondering about is whether a the bool *
approach might be useful for nodeModifyTable.c too? I thought that maybe
that could be used to avoid some checks for setting up per partition
state, but it seems not to be the case ATM.

Greetings,

Andres Freund



Re: Custom table AMs need to include heapam.h because of BulkInsertState

From
David Rowley
Date:
On Wed, 17 Jul 2019 at 06:46, Andres Freund <andres@anarazel.de> wrote:
> 2) Add a execPartition.c function that returns all the used tables from
>    a PartitionTupleRouting*.

Here's a patch which implements it that way.

I struggled a bit to think of a good name for the execPartition.c
function. I ended up with ExecGetRoutedToRelations. I'm open to better
ideas.

I also chose to leave the change of function signatures done in
f7c830f1a in place. I don't think the additional now unused parameter
is that out of place. Also, the function is inlined, so removing it
wouldn't help performance any.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment