Re: Make deparsing of column defaults faster - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Make deparsing of column defaults faster
Date
Msg-id CAMkU=1zniSch8BsZZL1ngqC9QGfhR=nkCpp6RAdf_sQFBDV42Q@mail.gmail.com
Whole thread Raw
In response to Re: Make deparsing of column defaults faster  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: Make deparsing of column defaults faster  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers
On Thu, Jul 5, 2018 at 10:45 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 29.06.18 05:15, Jeff Janes wrote:
> Since pg_dump calls pg_get_expr once over and over again on the same
> table consecutively, perhaps we could cache the column alias assignments
> in a single-entry cache, so if it is called on the same table as last
> time it just re-uses the aliases from last time.  I am not planning on
> working on that, I don't know where such a cache could be stored such
> that is freed and invalidated at the appropriate times.

I looked into that.  deparse_context_for() is actually not that
expensive on its own, well below one second, but it gets somewhat
expensive when you call it 1600 times for one table.  So to address that
case, we can cache the deparse context between calls in the fn_extra
field of pg_get_expr.  The attached patch does that.  This makes the
pg_dump -s times pretty much constant even with 1600 columns with
defaults.  psql \d should benefit similarly.  I haven't seen any other
cases where we'd expect hundreds of related objects to deparse.  (Do
people have hundreds of policies per table?)

One case that your patch doesn't improve (neither does my posted one) is check constraints.  To fix that, pg_get_constraintdef_worker would also need to grow a cache as well.  I don't know how often people put check constraints on most of the columns of a table.  Some people like their NOT NULL constraints to be named, not implicit.

But from the bigger picture of making pg_upgrade faster, a major issue is that while pg_dump -s gets faster for the column default case, the restore of that dump is still slow (again, my posted patch also doesn't fix that).  In that case it is deparse_context_for called from StoreAttrDefault which is slow.

(I suppose you could create scenarios with very many such tables to make
the overhead visible again.)

With partitioning, I think anything which happens once is likely to happen many times. But I don't see any extra improvement from applying my patch over yours (the bottleneck is shifted off to pg_dump itself), and yours is definitely cleaner and slightly more general.  I think that yours would be worthwhile, even if not the last word on the subject.

Cheers,

Jeff

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: How can we submit code patches that implement our (pending)patents?
Next
From: Tom Lane
Date:
Subject: Re: peripatus build failures....