Re: Further pg_upgrade analysis for many tables - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Further pg_upgrade analysis for many tables
Msg-id: CA+TgmoZ383n+06T1Em1UGPSP276=uQM_Vbf6ZCHsA_7nYXpf3g@mail.gmail.com
In response to: Re: Further pg_upgrade analysis for many tables (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Fri, Nov 9, 2012 at 12:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> Are sure the server you are dumping out of is head?
>
> I experimented a bit with dumping/restoring 16000 tables matching
> Bruce's test case (ie, one serial column apiece). The pg_dump profile
> seems fairly flat, without any easy optimization targets. But
> restoring the dump script shows a rather interesting backend profile:
>
> samples   %        image name  symbol name
> 30861     39.6289  postgres    AtEOXact_RelationCache
> 9911      12.7268  postgres    hash_seq_search
> 2682       3.4440  postgres    init_sequence
> 2218       2.8482  postgres    _bt_compare
> 2120       2.7223  postgres    hash_search_with_hash_value
> 1976       2.5374  postgres    XLogInsert
> 1429       1.8350  postgres    CatalogCacheIdInvalidate
> 1282       1.6462  postgres    LWLockAcquire
> 973        1.2494  postgres    LWLockRelease
> 702        0.9014  postgres    hash_any
>
> The hash_seq_search time is probably mostly associated with
> AtEOXact_RelationCache, which is run during transaction commit and scans
> the relcache hashtable looking for tables created in the current
> transaction. So that's about 50% of the runtime going into that one
> activity.
>
> There are at least three ways we could whack that mole:
>
> * Run the psql script in --single-transaction mode, as I was mumbling
> about the other day. If we were doing AtEOXact_RelationCache only once,
> rather than once per CREATE TABLE statement, it wouldn't be a problem.
> Easy but has only a narrow scope of applicability.
>
> * Keep a separate list (or data structure of your choice) so that
> relcache entries created in the current xact could be found directly
> rather than having to scan the whole relcache. That'd add complexity
> though, and could perhaps be a net loss for cases where the relcache
> isn't so bloated.
>
> * Limit the size of the relcache (eg by aging out
> not-recently-referenced entries) so that we aren't incurring O(N^2)
> costs for scripts touching N tables.
> Again, this adds complexity and
> could be counterproductive in some scenarios.

Although there may be some workloads that access very large numbers of
tables repeatedly, I bet that's not typical. Rather, I bet that a
session which accesses 10,000 tables is most likely to access them just
once each - and right now we don't handle that case very well; this is
not the first complaint about big relcaches causing problems. On the
flip side, we don't want workloads that exceed some baked-in cache size
to fall off a cliff. So I think we should be looking for a solution
that doesn't put a hard limit on the size of the relcache, but does
provide at least some latitude to get rid of old entries.

So maybe something like this: add a flag to each relcache entry
indicating whether or not it has been used. After adding 1024 entries
to the relcache, scan all the entries: clear the flag if it's set,
flush the entry if it's already clear. This allows the size of the
relcache to grow without bound, but only if we're continuing to access
the old tables in between adding new ones to the mix.

As an additional safeguard, we could count the number of toplevel SQL
commands that have been executed and require that a flush not be
performed more often than, say, every 64 toplevel SQL commands. That
way, if a single operation on an inheritance parent with many children
sucks a lot of stuff into the relcache, we'll avoid cleaning it out too
quickly.

Maybe this is all too ad hoc, but I feel like we don't need to
overengineer this. The existing system is fine in 99% of the cases, so
we really only need to find a way to detect the really egregious case
where we are doing a never-ending series of one-time table accesses and
apply a very light tap to avoid the pain point in that case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company