Home > mailing lists

Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers

From	Bruce Momjian
Subject	Re: Further pg_upgrade analysis for many tables
Date	November 24, 2012 20:12:27
Msg-id	20121124171220.GE9382@momjian.us Whole thread Raw
In response to	Re: Further pg_upgrade analysis for many tables (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Tree view

Added to TODO:
Improve cache lookup speed for sessions accessing many relations
http://archives.postgresql.org/pgsql-hackers/2012-11/msg00356.php
 

---------------------------------------------------------------------------

On Fri, Nov  9, 2012 at 12:50:34AM -0500, Tom Lane wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
> > Are sure the server you are dumping out of is head?
> 
> I experimented a bit with dumping/restoring 16000 tables matching
> Bruce's test case (ie, one serial column apiece).  The pg_dump profile
> seems fairly flat, without any easy optimization targets.  But
> restoring the dump script shows a rather interesting backend profile:
> 
> samples  %        image name               symbol name
> 30861    39.6289  postgres                 AtEOXact_RelationCache
> 9911     12.7268  postgres                 hash_seq_search
> 2682      3.4440  postgres                 init_sequence
> 2218      2.8482  postgres                 _bt_compare
> 2120      2.7223  postgres                 hash_search_with_hash_value
> 1976      2.5374  postgres                 XLogInsert
> 1429      1.8350  postgres                 CatalogCacheIdInvalidate
> 1282      1.6462  postgres                 LWLockAcquire
> 973       1.2494  postgres                 LWLockRelease
> 702       0.9014  postgres                 hash_any
> 
> The hash_seq_search time is probably mostly associated with
> AtEOXact_RelationCache, which is run during transaction commit and scans
> the relcache hashtable looking for tables created in the current
> transaction.  So that's about 50% of the runtime going into that one
> activity.
> 
> There are at least three ways we could whack that mole:
> 
> * Run the psql script in --single-transaction mode, as I was mumbling
> about the other day.  If we were doing AtEOXact_RelationCache only once,
> rather than once per CREATE TABLE statement, it wouldn't be a problem.
> Easy but has only a narrow scope of applicability.
> 
> * Keep a separate list (or data structure of your choice) so that
> relcache entries created in the current xact could be found directly
> rather than having to scan the whole relcache.  That'd add complexity
> though, and could perhaps be a net loss for cases where the relcache
> isn't so bloated.
> 
> * Limit the size of the relcache (eg by aging out
> not-recently-referenced entries) so that we aren't incurring O(N^2)
> costs for scripts touching N tables.  Again, this adds complexity and
> could be counterproductive in some scenarios.
> 
>             regards, tom lane

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +

pgsql-hackers by date:

From: Andres Freund
Date: 24 November 2012, 18:50:13
Subject: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update

From: Tom Lane
Date: 24 November 2012, 20:26:19
Subject: Re: Proposal for Allow postgresql.conf values to be changed via SQL

Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers

Previous

Next