Thread: Statistics import and export: difference in statistics of materialized view dumped
Statistics import and export: difference in statistics of materialized view dumped
From
Ashutosh Bapat
Date:
Hi Jeff, Corey, After fixing the statistics difference in dumps of tables with indexes, I now see difference in statistics of materialized view dump in the test I am developing at [1] (see the latest patches there). I see following difference in the dump from the original regression database and the dump taken from the database where the dump is restored @@ -441198,8 +441198,8 @@ SELECT * FROM pg_catalog.pg_restore_relation_stats( 'version', '180000'::integer, 'relation', 'public.mvtest_bb'::regclass, - 'relpages', '1'::integer, - 'reltuples', '1'::real, + 'relpages', '0'::integer, + 'reltuples', '-1'::real, 'relallvisible', '0'::integer ); -- @@ -441218,8 +441218,8 @@ SELECT * FROM pg_catalog.pg_restore_relation_stats( 'version', '180000'::integer, 'relation', 'public.mvtest_tm'::regclass, - 'relpages', '1'::integer, - 'reltuples', '3'::real, + 'relpages', '0'::integer, + 'reltuples', '-1'::real, 'relallvisible', '0'::integer ); -- @@ -441238,8 +441238,8 @@ SELECT * FROM pg_catalog.pg_restore_relation_stats( 'version', '180000'::integer, 'relation', 'public.mvtest_tvmm'::regclass, - 'relpages', '1'::integer, - 'reltuples', '1'::real, + 'relpages', '0'::integer, + 'reltuples', '-1'::real, 'relallvisible', '0'::integer ); -- @@ -448468,9 +448468,9 @@ SELECT * FROM pg_catalog.pg_restore_relation_stats( 'version', '180000'::integer, 'relation', 'public.tableam_tblmv_heap2'::regclass, - 'relpages', '1'::integer, - 'reltuples', '1'::real, - 'relallvisible', '1'::integer + 'relpages', '0'::integer, + 'reltuples', '-1'::real, + 'relallvisible', '0'::integer ); These are materialised views created in the test matview.sql and create_am.sql. When I tried to reproduce the issue outside the test using the attached scripts. The SQL is just copied from matview.sql. But both the dumps (from original and restored databases) do not show any difference. But if I run "make installcheck", take dump of regression database, restore it, take dump of restored database, I am able to see the following difference *** 458089,458097 **** SELECT * FROM pg_catalog.pg_restore_relation_stats( 'version', '180000'::integer, 'relation', 'public.tableam_tblmv_heap2'::regclass, ! 'relpages', '1'::integer, ! 'reltuples', '1'::real, ! 'relallvisible', '1'::integer ); --- 458089,458097 ---- SELECT * FROM pg_catalog.pg_restore_relation_stats( 'version', '180000'::integer, 'relation', 'public.tableam_tblmv_heap2'::regclass, ! 'relpages', '0'::integer, ! 'reltuples', '-1'::real, ! 'relallvisible', '0'::integer ); This seems to be a real problem since the statistics is going back i.e. useful statistics is being reset. [1] https://www.postgresql.org/message-id/CAExHW5sBbMki6Xs4XxFQQF3C4Wx3wxkLAcySrtuW3vrnOxXDNQ%40mail.gmail.com -- Best Wishes, Ashutosh Bapat
Attachment
Re: Statistics import and export: difference in statistics of materialized view dumped
From
Tom Lane
Date:
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes: > After fixing the statistics difference in dumps of tables with > indexes, I now see difference in statistics of materialized view dump > in the test I am developing at [1] (see the latest patches there). Are you doing the restore in parallel by any chance? I had a todo item to revisit the dependencies that pg_dump is creating for stats items, because they looked wrong to me, ie inadequate to guarantee correct restore order. regards, tom lane
Re: Statistics import and export: difference in statistics of materialized view dumped
From
Jeff Davis
Date:
On Tue, 2025-03-11 at 10:17 -0400, Tom Lane wrote: > Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes: > > After fixing the statistics difference in dumps of tables with > > indexes, I now see difference in statistics of materialized view > > dump > > in the test I am developing at [1] (see the latest patches there). > > Are you doing the restore in parallel by any chance? I had a todo > item to revisit the dependencies that pg_dump is creating for stats > items, because they looked wrong to me, ie inadequate to guarantee > correct restore order. It's creating a dependency on the relation and a boundary dependency on the postDataBound (unless it's an index, or an MV that got pushed to SECTION_POST_DATA). I suspect what we need here is a dependency on the MV *data*, because that's doing a heap swap, which resets the stats. Looking into it. Regards, Jeff Davis
Re: Statistics import and export: difference in statistics of materialized view dumped
From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes: > On Tue, 2025-03-11 at 10:17 -0400, Tom Lane wrote: >> Are you doing the restore in parallel by any chance? I had a todo >> item to revisit the dependencies that pg_dump is creating for stats >> items, because they looked wrong to me, ie inadequate to guarantee >> correct restore order. > It's creating a dependency on the relation and a boundary dependency on > the postDataBound (unless it's an index, or an MV that got pushed to > SECTION_POST_DATA). > I suspect what we need here is a dependency on the MV *data*, because > that's doing a heap swap, which resets the stats. Looking into it. Right, that was what I was thinking, but hadn't had time to look in detail. The postDataBound dependency isn't real helpful here, we could lose that if we had the data dependency. regards, tom lane
Re: Statistics import and export: difference in statistics of materialized view dumped
From
Jeff Davis
Date:
On Tue, 2025-03-11 at 11:26 -0400, Tom Lane wrote: > Right, that was what I was thinking, but hadn't had time to look in > detail. The postDataBound dependency isn't real helpful here, we > could lose that if we had the data dependency. Attached a patch. It's a bit messier than I expected, so I'm open to other suggestions. The reason is because materialized view data is also pushed to RESTORE_PASS_POST_ACL, so we need to do the same for the statistics (otherwise the dependency is just ignored). Regards, Jeff Davis
Attachment
Re: Statistics import and export: difference in statistics of materialized view dumped
From
Ashutosh Bapat
Date:
On Wed, Mar 12, 2025 at 4:08 AM Jeff Davis <pgsql@j-davis.com> wrote: > > On Tue, 2025-03-11 at 11:26 -0400, Tom Lane wrote: > > Right, that was what I was thinking, but hadn't had time to look in > > detail. The postDataBound dependency isn't real helpful here, we > > could lose that if we had the data dependency. > > Attached a patch. > > It's a bit messier than I expected, so I'm open to other suggestions. > The reason is because materialized view data is also pushed to > RESTORE_PASS_POST_ACL, so we need to do the same for the statistics > (otherwise the dependency is just ignored). I ran my test with this patch (we have to remove 0003 patch in my test which uses --no-statistics option). It failed with following differences @@ -452068,8 +452068,8 @@ SELECT * FROM pg_catalog.pg_restore_relation_stats( 'version', '180000'::integer, 'relation', 'public.mvtest_aa'::regclass, - 'relpages', '2'::integer, - 'reltuples', '1'::real, + 'relpages', '0'::integer, + 'reltuples', '-1'::real, 'relallvisible', '0'::integer ); -- @@ -452097,8 +452097,8 @@ SELECT * FROM pg_catalog.pg_restore_relation_stats( 'version', '180000'::integer, 'relation', 'public.mvtest_tm_type'::regclass, - 'relpages', '2'::integer, - 'reltuples', '3'::real, + 'relpages', '0'::integer, + 'reltuples', '-1'::real, 'relallvisible', '0'::integer ); -- @@ -452111,8 +452111,8 @@ SELECT * FROM pg_catalog.pg_restore_relation_stats( 'version', '180000'::integer, 'relation', 'public.mvtest_tvmm_expr'::regclass, - 'relpages', '2'::integer, - 'reltuples', '1'::real, + 'relpages', '0'::integer, + 'reltuples', '-1'::real, 'relallvisible', '0'::integer ); --=== stderr === === EOF === The previous differences have disappeared but new differences have appeared. -- Best Wishes, Ashutosh Bapat
Re: Statistics import and export: difference in statistics of materialized view dumped
From
Ashutosh Bapat
Date:
On Wed, Mar 12, 2025 at 4:29 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > I ran my test with this patch (we have to remove 0003 patch in my test > which uses --no-statistics option). It failed with following > differences > @@ -452068,8 +452068,8 @@ > SELECT * FROM pg_catalog.pg_restore_relation_stats( > 'version', '180000'::integer, > 'relation', 'public.mvtest_aa'::regclass, > - 'relpages', '2'::integer, > - 'reltuples', '1'::real, > + 'relpages', '0'::integer, > + 'reltuples', '-1'::real, > 'relallvisible', '0'::integer > ); > -- > @@ -452097,8 +452097,8 @@ > SELECT * FROM pg_catalog.pg_restore_relation_stats( > 'version', '180000'::integer, > 'relation', 'public.mvtest_tm_type'::regclass, > - 'relpages', '2'::integer, > - 'reltuples', '3'::real, > + 'relpages', '0'::integer, > + 'reltuples', '-1'::real, > 'relallvisible', '0'::integer > ); > -- > @@ -452111,8 +452111,8 @@ > SELECT * FROM pg_catalog.pg_restore_relation_stats( > 'version', '180000'::integer, > 'relation', 'public.mvtest_tvmm_expr'::regclass, > - 'relpages', '2'::integer, > - 'reltuples', '1'::real, > + 'relpages', '0'::integer, > + 'reltuples', '-1'::real, > 'relallvisible', '0'::integer > ); > --=== stderr === > === EOF === > > The previous differences have disappeared but new differences have appeared. Pulled the latest sources but the test is still failing with the same differences. -- Best Wishes, Ashutosh Bapat
Re: Statistics import and export: difference in statistics of materialized view dumped
From
Jeff Davis
Date:
On Thu, 2025-03-27 at 17:07 +0530, Ashutosh Bapat wrote: > Pulled the latest sources but the test is still failing with the same > differences. The attached set of patches (your 0001 and 0002, combined with my patch v2j-0003) applied on master (058b5152f0) are passing the pg_upgrade test suite for me. Are you saying that the tests don't work for you even when v2j-0003 is applied? Or are you saying that your tests are failing on master, and that v2j-0002 should be committed? Regards, Jeff Davis
Attachment
Re: Statistics import and export: difference in statistics of materialized view dumped
From
Ashutosh Bapat
Date:
On Fri, Mar 28, 2025 at 10:41 AM Jeff Davis <pgsql@j-davis.com> wrote: > > On Thu, 2025-03-27 at 17:07 +0530, Ashutosh Bapat wrote: > > Pulled the latest sources but the test is still failing with the same > > differences. > > The attached set of patches (your 0001 and 0002, combined with my patch > v2j-0003) applied on master (058b5152f0) are passing the pg_upgrade > test suite for me. > > Are you saying that the tests don't work for you even when v2j-0003 is > applied? Or are you saying that your tests are failing on master, and > that v2j-0002 should be committed? When I applied v1 it didn't pass. But applying v2j-0003, the test passes. I don't think I understand the patch fully. But I have a comment. + * However, the section may be updated later for materialized views. + * Matview stats depend on the matview data, because REFRESH + * MATERIALIZED VIEW replaces the storage and resets the stats, and + * the matview data is in SECTION_POST_DATA. Also, the materialized + * SECTION_POST_DATA to resolve some kinds of dependency problems (see + * repairMatViewBoundaryMultiLoop()). + */ It feels like we can simplify this as: REFRESH MATERIALIZED VIEW replaces storage and resets the stats hence the Matview stats should be printed after printing REFRESH MATERIALIZED VIEW command in SECTION_POST_DATA. Also the materialized view ... . Hence the section may be updated later for materialized views. -- Best Wishes, Ashutosh Bapat
Re: Statistics import and export: difference in statistics of materialized view dumped
From
Jeff Davis
Date:
On Fri, 2025-03-28 at 14:53 +0530, Ashutosh Bapat wrote: > When I applied v1 it didn't pass. I applied v1 on top of master as of March 15 (771ba90298), and then took your two changes adding the tests, and it passed. Version v2j is just rebased forward, which involved a trivial merge conflict. > But applying v2j-0003, the test passes. Great. > It feels like we can simplify this as: REFRESH MATERIALIZED VIEW > replaces storage and resets the stats hence the Matview stats should > be printed after printing REFRESH MATERIALIZED VIEW command in > SECTION_POST_DATA. Also the materialized view ... . Hence the section > may be updated later for materialized views. Done, with a few modifications. Hopefully it's simpler. Tom suggested that there may be other dependency problems lurking, but I believe the patch is an improvement, so I committed it now to unblock your work. Regards, Jeff Davis