Thread: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
Hi, I've been doing some testing today, and it seems heap_sync is somewhat confused by partitioned tables. I'm doing a COPY into a partitioned table (lineitem from TPC-H partitioned per month) like this: --------------------------------------------------------------------- BEGIN; CREATE TABLE lineitem (...) PARTITION BY RANGE (l_shipdate); CREATE TABLE lineitem_1992_01 PARTITION OF lineitem FOR VALUES FROM ('1992-01-01') TO ('1992-02-01'); ... CREATE TABLE lineitem_1998_12 PARTITION OF lineitem FOR VALUES FROM ('1998-12-01') TO ('1999-01-01'); COPY INTO lineitem FROM .... COMMIT; --------------------------------------------------------------------- I'm observing COPY failing like this: ERROR: could not open file "base/16384/16427": No such file or directory where the relfilenode matches the "lineitem" relation (which is guaranteed to be empty, as it's partitioned, so it's not surprising the relfilenode does not exist). After adding an Assert(false) into mdopen(), which is where the error comes from, I got this backtrace (full backtrace attached): ... #3 0x0000000000766387 in mdopen (...) at md.c:609 #4 0x00000000007674b7 in mdnblocks (...) at md.c:876 #5 0x0000000000767a61 in mdimmedsync (...) at md.c:1034 #6 0x00000000004c62b5 in heap_sync (...) at heapam.c:9399 #7 0x00000000005a2bdd in CopyFrom (...) at copy.c:2890 #8 0x00000000005a1621 in DoCopy (...) at copy.c:992 ... So apparently CopyFrom() invokes heap_sync() on the partitioned relation, which attempts to do mdimmedsync() only on the root. That seems like a bug to me. Obviously this only applies to wal_level=minimal. There are multiple callers of heap_sync(), but only the COPY seems to be affected by this, because the rest can't see partitioned tables. So it seems heap_sync() needs to be improved to sync all partitions. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)
From
David Rowley
Date:
On 19 September 2018 at 12:21, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > So apparently CopyFrom() invokes heap_sync() on the partitioned > relation, which attempts to do mdimmedsync() only on the root. That > seems like a bug to me. > > Obviously this only applies to wal_level=minimal. There are multiple > callers of heap_sync(), but only the COPY seems to be affected by this, > because the rest can't see partitioned tables. > > So it seems heap_sync() needs to be improved to sync all partitions. Wouldn't it be better to modify copy.c to just perform the heap_sync on just the partitions it touches? After https://commitfest.postgresql.org/19/1690/ it's very efficient to know which partitions were used. The bulk load might just be hitting a single partition, so it makes sense to try to just do the ones we need to, rather than needlessly performing heap_sync() on all of them. For the back branches, the current form of PartitionDispatch is not particularly inefficient to just skip over the uninitialized partitions. 6b78231d918 did recently just make the PartitionDispatch struct's members private to execPartition.c, so doing this will probably require some API function that can be passed a callback function to do what we need. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Michael Paquier
Date:
On Wed, Sep 19, 2018 at 01:14:10PM +1200, David Rowley wrote: > On 19 September 2018 at 12:21, Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: >> So apparently CopyFrom() invokes heap_sync() on the partitioned >> relation, which attempts to do mdimmedsync() only on the root. That >> seems like a bug to me. And the root should not have any physical storage either, so attempting to sync it is wrong. >> Obviously this only applies to wal_level=minimal. There are multiple >> callers of heap_sync(), but only the COPY seems to be affected by this, >> because the rest can't see partitioned tables. >> >> So it seems heap_sync() needs to be improved to sync all partitions. > > Wouldn't it be better to modify copy.c to just perform the heap_sync > on just the partitions it touches? Yeah, my gut is telling me that this would be the best approach for now, still I am not sure that this is the best move in the long term. All the other callers of heap_sync don't care about partitioned tables, so we could add an assertion on RELKIND_PARTITIONED_TABLE. Still, one crazy argument against that approach would be that we'd need to do the same thing again if one day CTAS or matviews allow some sort of automatic creation of partitions. -- Michael
Attachment
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Andres Freund
Date:
On 2018-09-19 12:06:47 +0900, Michael Paquier wrote: > On Wed, Sep 19, 2018 at 01:14:10PM +1200, David Rowley wrote: > > Wouldn't it be better to modify copy.c to just perform the heap_sync > > on just the partitions it touches? > > Yeah, my gut is telling me that this would be the best approach for now, > still I am not sure that this is the best move in the long term. ISTM heap_sync() would be the entirely wrong layer to handle partitioning. For several reasons: 1) With pluggable storage, we want to have multiple different table implementations, doing the syncing on the heap_* for partitions would thus be wrong. 2) In just about all cases we only want to sync a few partitions, there's not really a use-case for doing syncs across all partitions imo. > All the other callers of heap_sync don't care about partitioned > tables, so we could add an assertion on RELKIND_PARTITIONED_TABLE. Or rather, it should assert the expected relkinds? Greetings, Andres Freund
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Michael Paquier
Date:
On Wed, Sep 19, 2018 at 02:48:58PM -0700, Andres Freund wrote: > On 2018-09-19 12:06:47 +0900, Michael Paquier wrote: >> Yeah, my gut is telling me that this would be the best approach for now, >> still I am not sure that this is the best move in the long term. > > ISTM heap_sync() would be the entirely wrong layer to handle > partitioning. For several reasons: 1) With pluggable storage, we want to > have multiple different table implementations, doing the syncing on the > heap_* for partitions would thus be wrong. 2) In just about all cases we > only want to sync a few partitions, there's not really a use-case for > doing syncs across all partitions imo. I haven't considered things from the angle of 1), which is a very good point. 2) is also a good argument. >> All the other callers of heap_sync don't care about partitioned >> tables, so we could add an assertion on RELKIND_PARTITIONED_TABLE. > > Or rather, it should assert the expected relkinds? Yeah, I think that we are coming back to what heap_create assumes in term of which relkinds should have storage or not, so a global macro able to do the work would be adapted perhaps? -- Michael
Attachment
Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)
From
David Rowley
Date:
On Thu, 20 Sep 2018 at 11:31, Andres Freund <andres@anarazel.de> wrote: > > On 2018-09-19 12:06:47 +0900, Michael Paquier wrote: > > On Wed, Sep 19, 2018 at 01:14:10PM +1200, David Rowley wrote: > > > Wouldn't it be better to modify copy.c to just perform the heap_sync > > > on just the partitions it touches? > > > > Yeah, my gut is telling me that this would be the best approach for now, > > still I am not sure that this is the best move in the long term. > > ISTM heap_sync() would be the entirely wrong layer to handle > partitioning. For several reasons: 1) With pluggable storage, we want to > have multiple different table implementations, doing the syncing on the > heap_* for partitions would thus be wrong. 2) In just about all cases we > only want to sync a few partitions, there's not really a use-case for > doing syncs across all partitions imo. I started looking into this and noticed that during the check where we set the hi_options bits to determine if we can skip WAL and FSM checks, we only check the table that's the target of the COPY command. That's fine for normal tables, but for partitioned tables, it's pretty wrong. We could end up skipping FSM checks for partitions that have concurrent DML being performed which could bloat the table. We could also skip writing WAL for partitions that should have had WAL written. To fix this we'd need to also know each partition that we're about to route tuples to was also created or truncated in the same transaction, but we can't know that just yet since we've not opened up any partitions to check yet. I think the best back-patchable fix for this is just to disable the optimization for partitioned tables. I also noticed that we support COPY FREEZE for partitioned tables and attempt to just check if the partitioned table was created or truncated in the current transaction. This is also wrong as even if the partitioned table had been just created, it's partitions may have already existed and actually be visible to other transactions. It seems best to just error out if anyone tries to do COPY FREEZE with a partitioned table as there is no option, in this case, to build per-partition hi_options lazily since we must error out if FREEZE cannot be supported before the COPY starts processing tuples. > > All the other callers of heap_sync don't care about partitioned > > tables, so we could add an assertion on RELKIND_PARTITIONED_TABLE. > > Or rather, it should assert the expected relkinds? I think an Assert() in heap_sync is a good idea. As far as I know, it should only ever be used for normal tables or materialized views. So perhaps best to assert the rel is one of those? I've attached a patch to assist discussion on the problem with setting the incorrect hi_options for partitioned tables. Comments are welcome. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Amit Langote
Date:
Hi, On 2018/09/28 10:23, David Rowley wrote: > On Thu, 20 Sep 2018 at 11:31, Andres Freund <andres@anarazel.de> wrote: >> >> On 2018-09-19 12:06:47 +0900, Michael Paquier wrote: >>> On Wed, Sep 19, 2018 at 01:14:10PM +1200, David Rowley wrote: >>>> Wouldn't it be better to modify copy.c to just perform the heap_sync >>>> on just the partitions it touches? >>> >>> Yeah, my gut is telling me that this would be the best approach for now, >>> still I am not sure that this is the best move in the long term. >> >> ISTM heap_sync() would be the entirely wrong layer to handle >> partitioning. For several reasons: 1) With pluggable storage, we want to >> have multiple different table implementations, doing the syncing on the >> heap_* for partitions would thus be wrong. 2) In just about all cases we >> only want to sync a few partitions, there's not really a use-case for >> doing syncs across all partitions imo. > > I started looking into this and noticed that during the check where we > set the hi_options bits to determine if we can skip WAL and FSM > checks, we only check the table that's the target of the COPY command. > That's fine for normal tables, but for partitioned tables, it's pretty > wrong. We could end up skipping FSM checks for partitions that have > concurrent DML being performed which could bloat the table. We could > also skip writing WAL for partitions that should have had WAL written. > > To fix this we'd need to also know each partition that we're about to > route tuples to was also created or truncated in the same transaction, > but we can't know that just yet since we've not opened up any > partitions to check yet. I think the best back-patchable fix for this > is just to disable the optimization for partitioned tables. Agreed. Data could go into partitions all of which may not pass the tests used to select the value of hi_options being used to enable this optimization. So, it's better to avoid the optimization until we find a way to set hi_option's value per partition. Doing it upfront for all partitions doesn't seem like a good idea as you already said, so we may not consider patching it like that in back-branches. > I also noticed that we support COPY FREEZE for partitioned tables and > attempt to just check if the partitioned table was created or > truncated in the current transaction. This is also wrong as even if > the partitioned table had been just created, it's partitions may have > already existed and actually be visible to other transactions. It > seems best to just error out if anyone tries to do COPY FREEZE with a > partitioned table as there is no option, in this case, to build > per-partition hi_options lazily since we must error out if FREEZE > cannot be supported before the COPY starts processing tuples. Yeah, the same argument seems to apply here. Not all partitions might be created in the same sub-transaction. >>> All the other callers of heap_sync don't care about partitioned >>> tables, so we could add an assertion on RELKIND_PARTITIONED_TABLE. >> >> Or rather, it should assert the expected relkinds? > > I think an Assert() in heap_sync is a good idea. As far as I know, it > should only ever be used for normal tables or materialized views. So > perhaps best to assert the rel is one of those? Agree with Andres that we should assert relkinds that heap_sync is designed to handle, instead of, say, asserting that no partitioned tables are passed to it. > I've attached a patch to assist discussion on the problem with setting > the incorrect hi_options for partitioned tables. > > Comments are welcome. Thank you for creating the patch and Tomas for reporting the bug. Looking at the patch itself, does it seem like both the newly added comments repeat the same point (that we'll need per-partition hi_options to enable these optimizations) and are pretty close to each other? Regards, Amit
Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)
From
David Rowley
Date:
On 28 September 2018 at 14:25, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > Looking at the patch itself, does it seem like both the newly added > comments repeat the same point (that we'll need per-partition hi_options > to enable these optimizations) and are pretty close to each other? Thanks for looking at this. I don't agree that we can skip explaining why one of the optimisations can't be applied just because we've explained why a similar optimisation cannot be applied somewhere close by. I think that the WAL/FSM optimisation can fairly easily be improved on and probably fixed in PG12 as we can just lazily determine per-partition if it can be applied to that partition or not. For the FREEZE optimisation, since we ERROR out in cases where it's requested but is not possible, it does not seem likely we'll ever fix that since to do that we'd need to determine that all partitions have just been truncated or were only just created in this transaction. Since we've both recently done a bit of work in the area of speeding up COPY, then I doubt either of us would like to go and slow it down again by adding a pre-check that goes and opens all the partitions before the copy begins. That's going to have a huge negative performance impact on small copies to 1 partition when there are many partitions attached. So in this regard, you'll notice that the comments are not that similar. One explains that we could improve on it, and the other attempts to mention that it would be surprising if we performed a FREEZE for some partitions but not others. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Michael Paquier
Date:
On Fri, Sep 28, 2018 at 02:46:30PM +1200, David Rowley wrote: > I don't agree that we can skip explaining why one of the optimisations > can't be applied just because we've explained why a similar > optimisation cannot be applied somewhere close by. I think that the > WAL/FSM optimisation can fairly easily be improved on and probably > fixed in PG12 as we can just lazily determine per-partition if it can > be applied to that partition or not. Have you guys looked at what the following patch does for partitions and how it interacts with it? https://commitfest.postgresql.org/19/528/ The proposed patch is missing the point that documentation also mentions the optimizations for COPY with wal_level = minimal: <para> <command>COPY</command> is fastest when used within the same transaction as an earlier <command>CREATE TABLE</command> or <command>TRUNCATE</command> command. In such cases no WAL needs to be written, because in case of an error, the files containing the newly loaded data will be removed anyway. However, this consideration only applies when <xref linkend="guc-wal-level"/> is <literal>minimal</literal> as all commands must write WAL otherwise. </para> -- Michael
Attachment
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Amit Langote
Date:
On 2018/09/28 11:46, David Rowley wrote: > On 28 September 2018 at 14:25, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> Looking at the patch itself, does it seem like both the newly added >> comments repeat the same point (that we'll need per-partition hi_options >> to enable these optimizations) and are pretty close to each other? > > Thanks for looking at this. > > I don't agree that we can skip explaining why one of the optimisations > can't be applied just because we've explained why a similar > optimisation cannot be applied somewhere close by. I think that the > WAL/FSM optimisation can fairly easily be improved on and probably > fixed in PG12 as we can just lazily determine per-partition if it can > be applied to that partition or not. > > For the FREEZE optimisation, since we ERROR out in cases where it's > requested but is not possible, it does not seem likely we'll ever fix > that since to do that we'd need to determine that all partitions have > just been truncated or were only just created in this transaction. > Since we've both recently done a bit of work in the area of speeding > up COPY, then I doubt either of us would like to go and slow it down > again by adding a pre-check that goes and opens all the partitions > before the copy begins. That's going to have a huge negative > performance impact on small copies to 1 partition when there are many > partitions attached. > > So in this regard, you'll notice that the comments are not that > similar. One explains that we could improve on it, and the other > attempts to mention that it would be surprising if we performed a > FREEZE for some partitions but not others. Okay, I see the difference. I didn't really intend to ask to remove the second comment altogether; just the part which repeats the why (that actual relations that the data will go into are not handy for inspection at this point). Looking at the second comment: +/* + * We currently disallow COPY FREEZE on partitioned tables. The + * reason for this is that here we're only able to determine details + * about the partitioned table. The actual partitions are not opened + * until we start routing tuples to them, so we cannot know in advance + * if the partition has just been created or not. It may be possible + * to relax this, but we would need to store hi_options per partition + * and it would possibly mean that we'd freeze some partitions but not + * others. An outright ERROR seems better than surprising behavior. + */ Does this comment mean that we *may* consider removing this ERROR in the future and continue with the optimization on per-partition basis even if, as it says, it might mean that some partitions will contain frozen rows while others not? Thanks, Amit
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Amit Langote
Date:
On 2018/09/28 12:12, Michael Paquier wrote: > On Fri, Sep 28, 2018 at 02:46:30PM +1200, David Rowley wrote: >> I don't agree that we can skip explaining why one of the optimisations >> can't be applied just because we've explained why a similar >> optimisation cannot be applied somewhere close by. I think that the >> WAL/FSM optimisation can fairly easily be improved on and probably >> fixed in PG12 as we can just lazily determine per-partition if it can >> be applied to that partition or not. > > Have you guys looked at what the following patch does for partitions and > how it interacts with it? > https://commitfest.postgresql.org/19/528/ Just looked at that patch and noticed that the following hunk won't cope if COPY's target table is partitioned: diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 7674369613..7b9a7af2d2 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -2416,10 +2416,8 @@ CopyFrom(CopyState cstate) { hi_options |= HEAP_INSERT_SKIP_FSM; - if (!XLogIsNeeded() && - cstate->rel->trigdesc == NULL && - RelationGetNumberOfBlocks(cstate->rel) == 0) - hi_options |= HEAP_INSERT_SKIP_WAL; + if (!XLogIsNeeded() && RelationGetNumberOfBlocks(cstate->rel) == 0) + hi_options |= HEAP_INSERT_SKIP_WAL; } /* RelationGetNumberOfBlocks would blow up if passed a partitioned table to it. Applying David's patch will take care of that though. > The proposed patch is missing the point that documentation also mentions > the optimizations for COPY with wal_level = minimal: > <para> > <command>COPY</command> is fastest when used within the same > transaction as an earlier <command>CREATE TABLE</command> or > <command>TRUNCATE</command> command. In such cases no WAL > needs to be written, because in case of an error, the files > containing the newly loaded data will be removed anyway. > However, this consideration only applies when > <xref linkend="guc-wal-level"/> is <literal>minimal</literal> as all commands > must write WAL otherwise. > </para> I might be wrong but I'm not sure if we should mention here that this optimization is not applied to partitioned tables due to what appears to be a implementation-level limitation? Thanks, Amit
Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)
From
David Rowley
Date:
On 28 September 2018 at 15:12, Michael Paquier <michael@paquier.xyz> wrote: > On Fri, Sep 28, 2018 at 02:46:30PM +1200, David Rowley wrote: >> I don't agree that we can skip explaining why one of the optimisations >> can't be applied just because we've explained why a similar >> optimisation cannot be applied somewhere close by. I think that the >> WAL/FSM optimisation can fairly easily be improved on and probably >> fixed in PG12 as we can just lazily determine per-partition if it can >> be applied to that partition or not. > > Have you guys looked at what the following patch does for partitions and > how it interacts with it? > https://commitfest.postgresql.org/19/528/ I've glanced at it. I don't think we're taking anything in the wrong direction. The patch looks like it would need rebased if this gets in first. > The proposed patch is missing the point that documentation also mentions > the optimizations for COPY with wal_level = minimal: > <para> > <command>COPY</command> is fastest when used within the same > transaction as an earlier <command>CREATE TABLE</command> or > <command>TRUNCATE</command> command. In such cases no WAL > needs to be written, because in case of an error, the files > containing the newly loaded data will be removed anyway. > However, this consideration only applies when > <xref linkend="guc-wal-level"/> is <literal>minimal</literal> as all commands > must write WAL otherwise. > </para> I've edited that in the attached patch. Also reworded a comment that Amit mentioned and made a small change to the COPY FREEZE docs to mention no support for partitioned tables. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Tomas Vondra
Date:
On 09/28/2018 06:02 AM, Amit Langote wrote: > On 2018/09/28 12:12, Michael Paquier wrote: >> On Fri, Sep 28, 2018 at 02:46:30PM +1200, David Rowley wrote: >>> I don't agree that we can skip explaining why one of the optimisations >>> can't be applied just because we've explained why a similar >>> optimisation cannot be applied somewhere close by. I think that the >>> WAL/FSM optimisation can fairly easily be improved on and probably >>> fixed in PG12 as we can just lazily determine per-partition if it can >>> be applied to that partition or not. >> >> Have you guys looked at what the following patch does for partitions and >> how it interacts with it? >> https://commitfest.postgresql.org/19/528/ > > Just looked at that patch and noticed that the following hunk won't cope > if COPY's target table is partitioned: > > diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c > index 7674369613..7b9a7af2d2 100644 > --- a/src/backend/commands/copy.c > +++ b/src/backend/commands/copy.c > @@ -2416,10 +2416,8 @@ CopyFrom(CopyState cstate) > { > hi_options |= HEAP_INSERT_SKIP_FSM; > > - if (!XLogIsNeeded() && > - cstate->rel->trigdesc == NULL && > - RelationGetNumberOfBlocks(cstate->rel) == 0) > - hi_options |= HEAP_INSERT_SKIP_WAL; > + if (!XLogIsNeeded() && RelationGetNumberOfBlocks(cstate->rel) == 0) > + hi_options |= HEAP_INSERT_SKIP_WAL; > } > > /* > > RelationGetNumberOfBlocks would blow up if passed a partitioned table to it. > > Applying David's patch will take care of that though. > >> The proposed patch is missing the point that documentation also mentions >> the optimizations for COPY with wal_level = minimal: >> <para> >> <command>COPY</command> is fastest when used within the same >> transaction as an earlier <command>CREATE TABLE</command> or >> <command>TRUNCATE</command> command. In such cases no WAL >> needs to be written, because in case of an error, the files >> containing the newly loaded data will be removed anyway. >> However, this consideration only applies when >> <xref linkend="guc-wal-level"/> is <literal>minimal</literal> as all commands >> must write WAL otherwise. >> </para> > > I might be wrong but I'm not sure if we should mention here that this > optimization is not applied to partitioned tables due to what appears to > be a implementation-level limitation? > Aren't most limitations implementation-level? ;-) IMHO it's entirely appropriate to mention this in user-level docs. Based on the current wording, it's quite natural to assume the optimization applies to both partitioned and non-partitioned tables. And in fact it does not. It will still work for individual partitions, though. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)
From
David Rowley
Date:
On 28 September 2018 at 19:22, David Rowley <david.rowley@2ndquadrant.com> wrote: > I've edited that in the attached patch. Also reworded a comment that > Amit mentioned and made a small change to the COPY FREEZE docs to > mention no support for partitioned tables. Added to the November 'fest. https://commitfest.postgresql.org/20/1811/ -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Steve Singer
Date:
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation: tested, failed --- a/doc/src/sgml/perform.sgml +++ b/doc/src/sgml/perform.sgml @@ -1534,9 +1534,10 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse; <command>TRUNCATE</command> command. In such cases no WAL needs to be written, because in case of an error, the files containing the newly loaded data will be removed anyway. - However, this consideration only applies when - <xref linkend="guc-wal-level"/> is <literal>minimal</literal> as all commands - must write WAL otherwise. + However, this consideration only applies for non-partitioned + tables when <xref linkend="guc-wal-level"/> is + <literal>minimal</literal> as all commands must write WAL + otherwise. </para> Would this be better as "However, this consideration only applies for non-partitioned and the wal_level is minimal" (Switching the 'when' to 'and the') --- a/doc/src/sgml/ref/copy.sgml +++ b/doc/src/sgml/ref/copy.sgml @@ -222,9 +222,10 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable Requests copying the data with rows already frozen, just as they would be after running the <command>VACUUM FREEZE</command> command. This is intended as a performance option for initial data loading. - Rows will be frozen only if the table being loaded has been created - or truncated in the current subtransaction, there are no cursors - open and there are no older snapshots held by this transaction. + Rows will be frozen only for non-partitioned tables if the table + being loaded has been created or truncated in the current + subtransaction, there are no cursors open and there are no older + snapshots held by this transaction. Similar concern with above When I read this comment it makes me think that the table is partitioned then the rows will be frozen even if the table wascreated by an earlier transaction. Do we want to instead say "Rows will be frozen if the table is not partitioned and the table being loaded has been...." Other than that the patch looks fine and works as expected. The new status of this patch is: Waiting on Author
Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)
From
David Rowley
Date:
Thanks for looking at this. On 1 November 2018 at 16:07, Steve Singer <steve@ssinger.info> wrote: > --- a/doc/src/sgml/perform.sgml > +++ b/doc/src/sgml/perform.sgml > @@ -1534,9 +1534,10 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse; > <command>TRUNCATE</command> command. In such cases no WAL > needs to be written, because in case of an error, the files > containing the newly loaded data will be removed anyway. > - However, this consideration only applies when > - <xref linkend="guc-wal-level"/> is <literal>minimal</literal> as all commands > - must write WAL otherwise. > + However, this consideration only applies for non-partitioned > + tables when <xref linkend="guc-wal-level"/> is > + <literal>minimal</literal> as all commands must write WAL > + otherwise. > </para> > > Would this be better as "However, this consideration only applies for non-partitioned and the wal_level is minimal" > (Switching the 'when' to 'and the') I ended up going with: However, this consideration only applies when <xref linkend="guc-wal-level"/> is <literal>minimal</literal> for non-partitioned tables as all commands must write WAL otherwise. > --- a/doc/src/sgml/ref/copy.sgml > +++ b/doc/src/sgml/ref/copy.sgml > @@ -222,9 +222,10 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable > Requests copying the data with rows already frozen, just as they > would be after running the <command>VACUUM FREEZE</command> command. > This is intended as a performance option for initial data loading. > - Rows will be frozen only if the table being loaded has been created > - or truncated in the current subtransaction, there are no cursors > - open and there are no older snapshots held by this transaction. > + Rows will be frozen only for non-partitioned tables if the table > + being loaded has been created or truncated in the current > + subtransaction, there are no cursors open and there are no older > + snapshots held by this transaction. > > Similar concern with above I've changed this to: Rows will be frozen only if the table being loaded has been created or truncated in the current subtransaction, there are no cursors open and there are no older snapshots held by this transaction. It is currently not possible to perform a <command>COPY FREEZE</command> on a partitioned table. Which is just the original text with the new sentence tagged onto the end. > Other than that the patch looks fine and works as expected. > > The new status of this patch is: Waiting on Author Thanks for looking at this. I've attached an updated patch. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Steve Singer
Date:
On Thu, 1 Nov 2018, David Rowley wrote: > > I ended up going with: > > However, this consideration only applies when > <xref linkend="guc-wal-level"/> is <literal>minimal</literal> for > non-partitioned tables as all commands must write WAL otherwise. > > > I've changed this to: > > Rows will be frozen only if the table being loaded has been created > or truncated in the current subtransaction, there are no cursors > open and there are no older snapshots held by this transaction. It is > currently not possible to perform a <command>COPY FREEZE</command> on > a partitioned table. > > Which is just the original text with the new sentence tagged onto the end. > >> Other than that the patch looks fine and works as expected. >> >> The new status of this patch is: Waiting on Author > > Thanks for looking at this. I've attached an updated patch. I am happy with the changes. I think the patch is ready for a committer. > > -- > David Rowley http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > Steve
Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)
From
David Rowley
Date:
On 2 November 2018 at 15:43, Steve Singer <steve@ssinger.info> wrote: > I am happy with the changes. > I think the patch is ready for a committer. Many thanks for your review. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Alvaro Herrera
Date:
Here are versions for branches 10 and 11. The main change is that the COPY test didn't have the partitioned table, because it was recently introduced (0d5f05cde011) so I backpatched that part also. It's a bit useless, but I'd rather backpatch the same thing rather than have different lines there ... (The version for 10 only differs in the perform.sgml hunk.) I intend to get this pushed on Monday. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Alvaro Herrera
Date:
FWIW I didn't like the error message: cannot perform FREEZE on a partitioned table but then I noticed other messages also say "cannot perform FREEZE". I think they should all say "cannot perform COPY FREEZE" instead. Not something for this patch to fix, though, I think. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Tomas Vondra
Date:
On 11/17/18 4:53 PM, Alvaro Herrera wrote: > Here are versions for branches 10 and 11. The main change is that the > COPY test didn't have the partitioned table, because it was recently > introduced (0d5f05cde011) so I backpatched that part also. It's a bit > useless, but I'd rather backpatch the same thing rather than have > different lines there ... > The patch seems fine to me. It's a bit unfortunate that we simply disable the optimization on partitioned tables instead of fixing it somehow, but I guess it's better than nothing and no one has a very good idea how to make it work. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: heap_sync seems rather oblivious to partitioned tables(wal_level=minimal)
From
Alvaro Herrera
Date:
On 2018-Nov-19, Tomas Vondra wrote: > On 11/17/18 4:53 PM, Alvaro Herrera wrote: > > Here are versions for branches 10 and 11. The main change is that the > > COPY test didn't have the partitioned table, because it was recently > > introduced (0d5f05cde011) so I backpatched that part also. It's a bit > > useless, but I'd rather backpatch the same thing rather than have > > different lines there ... > > The patch seems fine to me. Pushed now, thanks. > It's a bit unfortunate that we simply disable the optimization on > partitioned tables instead of fixing it somehow, but I guess it's better > than nothing and no one has a very good idea how to make it work. Yeah, I think leaving it in the current state is worse than disabling it, since an error is thrown anyway. In any case, I guess when we have a patch it won't be backpatchable. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)
From
David Rowley
Date:
On Tue, 20 Nov 2018 at 01:47, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > The patch seems fine to me. Thanks for looking at it. > It's a bit unfortunate that we simply disable the optimization on > partitioned tables instead of fixing it somehow, but I guess it's better > than nothing and no one has a very good idea how to make it work. Yeah, I agree. I added various comments in the patch to explain why we're not going to the trouble of allowing it. I had hoped that would do for now. Lately, there's quite a bit of effort going into reducing various overheads to having a partitioned table with high numbers of partitions. A patch to speed up INSERT was just committed, for example. Perhaps COPY is less critical to make faster for inserting small numbers of rows, but I'd not rule out someone complaining sometime about the overhead of checking all partitions are compatible with the freeze optimisation before doing the COPY, although, perhaps we can just tell those people not to use the FREEZE option. Anyway, I was mostly interested in fixing the bug. Making COPY FREEZE work with partitioned tables sounds like a new feature. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: heap_sync seems rather oblivious to partitioned tables (wal_level=minimal)
From
David Rowley
Date:
(Returning from leave) On Tue, 20 Nov 2018 at 03:19, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Pushed now, thanks. Many thanks for pushing. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services