Thread: pgsql: Transfer statistics during pg_upgrade.
Transfer statistics during pg_upgrade. Add support to pg_dump for dumping stats, and use that during pg_upgrade so that statistics are transferred during upgrade. In most cases this removes the need for a costly re-analyze after upgrade. Some statistics are not transferred, such as extended statistics or statistics with a custom stakind. Now pg_dump accepts the options --schema-only, --no-schema, --data-only, --no-data, --statistics-only, and --no-statistics; which allow all combinations of schema, data, and/or stats. The options are named this way to preserve compatibility with the previous --schema-only and --data-only options. Statistics are in SECTION_DATA, unless the object itself is in SECTION_POST_DATA. The stats are represented as calls to pg_restore_relation_stats() and pg_restore_attribute_stats(). Author: Corey Huinker, Jeff Davis Reviewed-by: Jian He Discussion: https://postgr.es/m/CADkLM=fzX7QX6r78fShWDjNN3Vcr4PVAnvXxQ4DiGy6V=0bCUA@mail.gmail.com Discussion: https://postgr.es/m/CADkLM%3DcB0rF3p_FuWRTMSV0983ihTRpsH%2BOCpNyiqE7Wk0vUWA%40mail.gmail.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/1fd1bd871012732e3c6c482667d2f2c56f1a9395 Modified Files -------------- doc/src/sgml/ref/pg_dump.sgml | 75 ++++- doc/src/sgml/ref/pg_dumpall.sgml | 41 ++- doc/src/sgml/ref/pg_restore.sgml | 48 ++- doc/src/sgml/ref/pgupgrade.sgml | 9 + src/bin/pg_dump/pg_backup.h | 2 + src/bin/pg_dump/pg_backup_archiver.c | 83 +++-- src/bin/pg_dump/pg_backup_archiver.h | 3 +- src/bin/pg_dump/pg_backup_directory.c | 2 +- src/bin/pg_dump/pg_dump.c | 415 ++++++++++++++++++++++++- src/bin/pg_dump/pg_dump.h | 12 + src/bin/pg_dump/pg_dump_sort.c | 37 ++- src/bin/pg_dump/pg_dumpall.c | 24 +- src/bin/pg_dump/pg_restore.c | 25 +- src/bin/pg_dump/t/001_basic.pl | 18 ++ src/bin/pg_dump/t/002_pg_dump.pl | 106 ++++++- src/bin/pg_upgrade/dump.c | 3 +- src/bin/pg_upgrade/option.c | 7 + src/bin/pg_upgrade/pg_upgrade.h | 1 + src/bin/pg_upgrade/t/002_pg_upgrade.pl | 7 + src/test/perl/PostgreSQL/Test/AdjustUpgrade.pm | 10 + src/test/recovery/t/027_stream_regress.pl | 4 +- src/tools/pgindent/typedefs.list | 1 + 22 files changed, 858 insertions(+), 75 deletions(-)
On 2025-02-20 Th 4:29 AM, Jeff Davis wrote: > Transfer statistics during pg_upgrade. Small nit. I notice this commit has introduced a couple of dubious uses of "mututally exclusive": doc/src/sgml/ref/pg_dump.sgml: This option is mutually exclusive to <option>--data-only</option> doc/src/sgml/ref/pg_dump.sgml- and <option>--statistics-only</option>. -- doc/src/sgml/ref/pg_restore.sgml: This option is mutually exclusive of <option>--data-only</option> doc/src/sgml/ref/pg_restore.sgml- and <option>--statistics-only</option>. I don't think this is idiomatic usage, and it's possibly not correct usage. At best it is jarring. One does not normally follow "mutually exclusive" with a preposition. I would replace the these with something like "This option cannot be used with ...". If you want to use the phrase, it should be something like "This option and ... are mutually exclusive." cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Thu, 20 Feb 2025 at 22:29, Jeff Davis <jdavis@postgresql.org> wrote: > Add support to pg_dump for dumping stats, and use that during > pg_upgrade so that statistics are transferred during upgrade. In most > cases this removes the need for a costly re-analyze after upgrade. I was surprised to see when I did pg_dump -T just_this_table that I also got statistics data for all tables in the database. Have filtering stats for tables not in the -T list been overlooked? David
On Tue, 2025-04-08 at 17:24 +1200, David Rowley wrote: > I was surprised to see when I did pg_dump -T just_this_table that I > also got statistics data for all tables in the database. -T means "exclude this pattern" I tried a few basic variations and they seemed to do what I expected: CREATE TABLE mytable1(i INT); INSERT INTO mytable1 VALUES(1); CREATE TABLE mytable2(i INT); INSERT INTO mytable2 VALUES(2); CREATE TABLE mytable3(i INT); INSERT INTO mytable3 VALUES(3); ANALYZE; $ ./bin/pg_dump -t mytable1 postgres | grep "Name: mytable" -- Name: mytable1; Type: TABLE; Schema: public; Owner: jdavis -- Data for Name: mytable1; Type: TABLE DATA; Schema: ... -- Statistics for Name: mytable1; Type: STATISTICS DATA; ... $ ./bin/pg_dump -T mytable1 postgres | grep "Name: mytable" -- Name: mytable2; Type: TABLE; Schema: public; Owner: jdavis -- Name: mytable3; Type: TABLE; Schema: public; Owner: jdavis -- Data for Name: mytable2; Type: TABLE DATA; Schema: ... -- Data for Name: mytable3; Type: TABLE DATA; Schema: ... -- Statistics for Name: mytable2; Type: STATISTICS DATA; ... -- Statistics for Name: mytable3; Type: STATISTICS DATA; ... If you still see a problem, please let me know. Regards, Jeff Davis