Hi hackers,
I was wondering: is there a good reason, other than memory concerns, for pg_dump's
getPublicationTables() to iterate through all tables one by one, querying each time
to see whether the table has a corresponding publication?
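
For context, the existing code issues one catalog query per table, roughly along
these lines (paraphrased for illustration, not quoted verbatim from the source;
variable names follow pg_dump's usual conventions):

    /* rough paraphrase of the current per-table approach in getPublicationTables() */
    for (i = 0; i < numTables; i++)
    {
        TableInfo  *tbinfo = &tblinfo[i];

        /* skip tables that can't be publication members or aren't being dumped */
        if (tbinfo->relkind != RELKIND_RELATION || !tbinfo->dobj.dump)
            continue;

        resetPQExpBuffer(query);
        appendPQExpBuffer(query,
                          "SELECT pr.tableoid, pr.oid, p.pubname "
                          "FROM pg_catalog.pg_publication_rel pr, "
                          "     pg_catalog.pg_publication p "
                          "WHERE pr.prrelid = '%u' AND p.oid = pr.prpubid",
                          tbinfo->dobj.catId.oid);

        res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
        /* ... build a PublicationRelInfo for each returned row ... */
        PQclear(res);
    }

So the number of round trips scales with the number of tables, regardless of how
many publications exist.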
Attached is a patch that instead builds the publication relations by querying for
all publications at once and then looking up the corresponding TableInfo entries
via tblinfoindex. Users are likely to have many more tables than publications, so
this should be considerably faster, especially when there are millions of tables.
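
Roughly, the new shape is a single query plus a lookup per returned row. This is
only a sketch of the idea (the attached patch is the authoritative version; the
exact query text and error handling here are illustrative):

    /* sketch of the single-query approach */
    PQExpBuffer query = createPQExpBuffer();
    PGresult   *res;
    int         ntups;

    /* one query over every publication membership, instead of one per table */
    appendPQExpBufferStr(query,
                         "SELECT pr.tableoid, pr.oid, pr.prrelid, p.pubname "
                         "FROM pg_catalog.pg_publication_rel pr "
                         "JOIN pg_catalog.pg_publication p ON p.oid = pr.prpubid");

    res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);
    ntups = PQntuples(res);

    for (int i = 0; i < ntups; i++)
    {
        Oid         prrelid = atooid(PQgetvalue(res, i, 2));
        TableInfo  *tbinfo = findTableByOid(prrelid);   /* lookup via tblinfoindex */

        if (tbinfo == NULL || !tbinfo->dobj.dump)
            continue;

        /* ... build a PublicationRelInfo for tbinfo, as the per-table code does ... */
    }

    PQclear(res);
    destroyPQExpBuffer(query);

Since the result set only contains rows for tables that are actually in a
publication, the work scales with the number of publication memberships rather
than with the total number of tables.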
Setup:
time pg_dump --schema-only -d postgres -f dump.sql -v 2> dump.log
10k tables, each table having its own publication.
Before patch:
real 0m6.409s
user 0m0.234s
sys 0m0.584s
With patch:
real 0m5.483s
user 0m0.399s
sys 0m0.443s
10k tables, no publications.
Before patch:
real 0m6.193s
user 0m0.276s
sys 0m0.531s
With patch:
real 0m5.261s
user 0m0.254s
sys 0m0.377s
Thanks,
John H