Thread: Re: Can we get rid of repeated queries from pg_dump?

Re: Can we get rid of repeated queries from pg_dump?

From
Tom Lane
Date:
[ redirecting to -hackers ]

I wrote:
> I experimented with the attached, very quick-n-dirty patch to collect
> format_type results during the initial scan of pg_type, instead.  On the
> regression database in HEAD, it reduces the number of queries pg_dump
> issues from 3260 to 2905; but I'm having a hard time detecting any net
> performance change.

I decided that that patch wasn't too safe, because it applies
format_type() to pg_type rows that we have no reason to trust the
longevity of.  I think it could fall over if some concurrent process
were busy dropping a temp table, for example.

So here's a version that just does plain caching of the results
of retail getFormattedTypeName() calls.  This visibly adds no
queries that were not done before, so it should be safe enough.
And there can't be any cases that it makes slower, either.

            regards, tom lane

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 6adbd20778..6259e68aa4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -5232,6 +5232,7 @@ getTypes(Archive *fout, int *numTypes)
         tyinfo[i].dobj.name = pg_strdup(PQgetvalue(res, i, i_typname));
         tyinfo[i].dobj.namespace =
             findNamespace(atooid(PQgetvalue(res, i, i_typnamespace)));
+        tyinfo[i].ftypname = NULL;    /* may get filled later */
         tyinfo[i].rolname = pg_strdup(PQgetvalue(res, i, i_rolname));
         tyinfo[i].typacl = pg_strdup(PQgetvalue(res, i, i_typacl));
         tyinfo[i].rtypacl = pg_strdup(PQgetvalue(res, i, i_rtypacl));
@@ -18892,12 +18893,11 @@ findDumpableDependencies(ArchiveHandle *AH, const DumpableObject *dobj,
  *
  * This does not guarantee to schema-qualify the output, so it should not
  * be used to create the target object name for CREATE or ALTER commands.
- *
- * TODO: there might be some value in caching the results.
  */
 static char *
 getFormattedTypeName(Archive *fout, Oid oid, OidOptions opts)
 {
+    TypeInfo   *typeInfo;
     char       *result;
     PQExpBuffer query;
     PGresult   *res;
@@ -18910,6 +18910,11 @@ getFormattedTypeName(Archive *fout, Oid oid, OidOptions opts)
             return pg_strdup("NONE");
     }

+    /* see if we have the result cached in the type's TypeInfo record */
+    typeInfo = findTypeByOid(oid);
+    if (typeInfo && typeInfo->ftypname)
+        return pg_strdup(typeInfo->ftypname);
+
     query = createPQExpBuffer();
     appendPQExpBuffer(query, "SELECT pg_catalog.format_type('%u'::pg_catalog.oid, NULL)",
                       oid);
@@ -18922,6 +18927,10 @@ getFormattedTypeName(Archive *fout, Oid oid, OidOptions opts)
     PQclear(res);
     destroyPQExpBuffer(query);

+    /* cache a copy for later requests */
+    if (typeInfo)
+        typeInfo->ftypname = pg_strdup(result);
+
     return result;
 }

diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index f5e170e0db..29af845ece 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -166,9 +166,11 @@ typedef struct _typeInfo
     DumpableObject dobj;

     /*
-     * Note: dobj.name is the pg_type.typname entry.  format_type() might
-     * produce something different than typname
+     * Note: dobj.name is the raw pg_type.typname entry.  ftypname is the
+     * result of format_type(), which will be quoted if needed, and might be
+     * schema-qualified too.
      */
+    char       *ftypname;
     char       *rolname;        /* name of owner, or empty string */
     char       *typacl;
     char       *rtypacl;

Re: Can we get rid of repeated queries from pg_dump?

From
Tom Lane
Date:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> My point in here is that potential optimizations regarding queries for
> pg_proc might speed up dumps for more people - as they might use things
> like postgis, but never realized that it can be much faster.

Agreed, but as I said upthread, fixing that looks like it will be
rather invasive.  Meanwhile, I went ahead and pushed the two
simple improvements discussed so far.

            regards, tom lane