Re: Can we get rid of repeated queries from pg_dump? - Mailing list pgsql-general
From | hubert depesz lubaczewski |
---|---|
Subject | Re: Can we get rid of repeated queries from pg_dump? |
Date | |
Msg-id | 20210826160644.GA3120@depesz.com Whole thread Raw |
In response to | Re: Can we get rid of repeated queries from pg_dump? (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Can we get rid of repeated queries from pg_dump?
|
List | pgsql-general |
On Thu, Aug 26, 2021 at 10:20:29AM -0400, Tom Lane wrote: > Well, you could move it forward by doing the legwork to identify which > queries are worth merging. Is it really sane to do a global "select > format_type() from pg_type" query and save all the results on the client > side? I wonder whether there are cases where that'd be a net loss. > You could do the experimentation to figure that out without necessarily > having the C skills to make pg_dump actually do it. So, I got some info. First, some stats. The DB contains: - 14 extensions - 1 aggregate - 107 functions - 5 schemas - 5 sequences - 188 logged tables - 1 unlogged table - 206 "normal" indexes - 30 unique indexes - 15 materialized views - 16 triggers - 87 types - 26 views pg_dump -s of it is ~ 670kB. Interestingly, while dumping (pg_dump -s -v), we can see progress going on, and then, after: ==== ... pg_dump: reading publications pg_dump: reading publication membership pg_dump: reading subscriptions pg_dump: reading dependency data pg_dump: saving encoding = UTF8 pg_dump: saving standard_conforming_strings = on pg_dump: saving search_path = ==== It stops (progress visible in console). And then, in pg logs I see queries like: #v+ SELECT proretset, prosrc, probin, provolatile, proisstrict, prosecdef, lanname, proconfig, procost, prorows, pg_catalog.pg_get_function_arguments(p.oid) AS funcargs, pg_catalog.pg_get_function_identity_arguments(p.oid) AS funciargs, pg_catalog.pg_get_function_result(p.oid) AS funcresult, proleakproof, array_to_string(protrftypes, ' ') AS protrftypes, proparallel, prokind, prosupport, NULL AS prosqlbody FROM pg_catalog.pg_proc p, pg_catalog.pg_language l WHERE p.oid = '43875'::pg_catalog.oid AND l.oid = p.prolang #v- Now for query stats. To dump it all, pg_dump needed 9173 queries (logged by log_min_duration_statement = 0 for this user). I extracted all queries to separate files, and made stats. In total there were only 4257 unique queries. Then I checked for repeated queries. Top 10 most repeated offenders were: 615 times : SELECT pg_catalog.format_type('25'::pg_catalog.oid, NULL) 599 times : SELECT pg_catalog.format_type('23'::pg_catalog.oid, NULL) 579 times : SELECT pg_catalog.format_type('2281'::pg_catalog.oid, NULL) 578 times : SELECT pg_catalog.format_type('41946'::pg_catalog.oid, NULL) 523 times : SELECT pg_catalog.format_type('701'::pg_catalog.oid, NULL) 459 times : SELECT pg_catalog.format_type('42923'::pg_catalog.oid, NULL) 258 times : SELECT pg_catalog.format_type('16'::pg_catalog.oid, NULL) 176 times : SELECT pg_catalog.format_type('19'::pg_catalog.oid, NULL) 110 times : SELECT pg_catalog.format_type('21'::pg_catalog.oid, NULL) 106 times : SELECT pg_catalog.format_type('42604'::pg_catalog.oid, NULL) In total, there were 5000 queries: SELECT pg_catalog.format_type('[0-9]+'::pg_catalog.oid, NULL) But there were only 83 separate oids that were scanned. The only other repeated command was: SELECT pg_catalog.set_config('search_path', '', false); and it was called only twice. Based on my reading of queries in order it seems to follow the pattern of: One call for: SELECT proretset, prosrc, probin, provolatile, proisstrict, prosecdef, lanname, proconfig, procost, prorows, pg_catalog.pg_get_function_arguments(p.oid)AS funcargs, pg_catalog.pg_get_function_identity_arguments(p.oid) AS funciargs, pg_catalog.pg_get_function_re sult(p.oid) AS funcresult, proleakproof, array_to_string(protrftypes, ' ') AS protrftypes, proparallel, prokind, prosupport, NULL AS prosqlbody FROM pg_catalog.pg_proc p, pg_catalog.pg_language l WHERE p.oid = 'SOME_NUMBER'::pg_catalog.oidAND l.oid = p.prolang and then one or more: SELECT pg_catalog.format_type('SOME_NUMBER'::pg_catalog.oid, NULL) In one case, after proc query, there were 94 concecutive pg_catalog.format_type queries. I hope it helps. Best regards, depesz
pgsql-general by date: