Re: pg_dump and thousands of schemas - Mailing list pgsql-performance

From Tatsuo Ishii
Subject Re: pg_dump and thousands of schemas
Date
Msg-id 20120530.180620.600165924826262795.t-ishii@sraoss.co.jp
Whole thread Raw
In response to Re: pg_dump and thousands of schemas  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pg_dump and thousands of schemas
Re: pg_dump and thousands of schemas
List pgsql-performance
> Yeah, Jeff's experiments indicated that the remaining bottleneck is lock
> management in the server.  What I fixed so far on the pg_dump side
> should be enough to let partial dumps run at reasonable speed even if
> the whole database contains many tables.  But if psql is taking
> AccessShareLock on lots of tables, there's still a problem.

Ok, I modified the part of pg_dump where tremendous number of LOCK
TABLE are issued. I replace them with single LOCK TABLE with multiple
tables. With 100k tables LOCK statements took 13 minutes in total, now
it only takes 3 seconds. Comments?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3461f3e..cc1ffd7 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3832,6 +3832,7 @@ getTables(Archive *fout, int *numTables)
     int            i_reloptions;
     int            i_toastreloptions;
     int            i_reloftype;
+    bool        lock_needed = false;

     /* Make sure we are in proper schema */
     selectSourceSchema(fout, "pg_catalog");
@@ -4273,15 +4274,21 @@ getTables(Archive *fout, int *numTables)
          * NOTE: it'd be kinda nice to lock other relations too, not only
          * plain tables, but the backend doesn't presently allow that.
          */
-        if (tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)
+        if (i == 0)
         {
             resetPQExpBuffer(query);
-            appendPQExpBuffer(query,
-                              "LOCK TABLE %s IN ACCESS SHARE MODE",
+            appendPQExpBuffer(query,"LOCK TABLE ");
+        }
+
+        if (tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)
+        {
+            if (lock_needed)
+                appendPQExpBuffer(query,",");
+            appendPQExpBuffer(query,"%s",
                          fmtQualifiedId(fout,
                                         tblinfo[i].dobj.namespace->dobj.name,
                                         tblinfo[i].dobj.name));
-            ExecuteSqlStatement(fout, query->data);
+            lock_needed = true;
         }

         /* Emit notice if join for owner failed */
@@ -4290,6 +4297,12 @@ getTables(Archive *fout, int *numTables)
                       tblinfo[i].dobj.name);
     }

+    if (lock_needed)
+    {
+        appendPQExpBuffer(query, " IN ACCESS SHARE MODE");
+        ExecuteSqlStatement(fout, query->data);
+    }
+
     if (lockWaitTimeout && fout->remoteVersion >= 70300)
     {
         ExecuteSqlStatement(fout, "SET statement_timeout = 0");

pgsql-performance by date:

Previous
From: Tatsuo Ishii
Date:
Subject: Re: pg_dump and thousands of schemas
Next
From: Murat Tasan
Date:
Subject: does the query planner consider work_mem?