Re: Make deparsing of column defaults faster - Mailing list pgsql-hackers

From: Justin Pryzby
Subject: Re: Make deparsing of column defaults faster
Date:
Msg-id: 20180705165827.GK7025@telsasoft.com
In response to: Re: Make deparsing of column defaults faster (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses: Re: Make deparsing of column defaults faster (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List: pgsql-hackers
On Mon, Jun 04, 2018 at 10:00:53PM -0400, Peter Eisentraut wrote:
> On 6/4/18 20:55, Jeff Janes wrote:
> > Since defaults can't contain Vars, this patch converts the second
> > parameter to zero in places where pg_get_expr is invoked on column
> > defaults.
> 
> My in-progress generated columns patch removes that assumption (since a
> generated column generally refers to another column of the same table).
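
For context, pg_get_expr()'s second argument is only used to resolve Vars (column
references) in the expression, so for plain column defaults it can be zero and the
server can skip building the per-table alias context entirely.  A rough,
self-contained libpq sketch of the kind of call Jeff is describing -- not the
actual pg_dump query or his patch, and the function and variable names here are
made up:

/* Illustrative sketch only -- not pg_dump's actual query or Jeff's patch.
 * Because column defaults cannot contain Vars, the relation OID that
 * pg_get_expr() would use to resolve column references can be passed as 0,
 * letting the server skip deparse_context_for() for every call. */
#include <stdio.h>
#include <libpq-fe.h>

static void
show_column_defaults(PGconn *conn, const char *table_oid)
{
    const char *params[1] = {table_oid};
    PGresult   *res = PQexecParams(conn,
                                   "SELECT adnum, "
                                   /* previously: pg_get_expr(adbin, adrelid) */
                                   "pg_catalog.pg_get_expr(adbin, 0) AS adsrc"
                                   "  FROM pg_catalog.pg_attrdef"
                                   " WHERE adrelid = $1::pg_catalog.oid",
                                   1, NULL, params, NULL, NULL, 0);

    if (PQresultStatus(res) == PGRES_TUPLES_OK)
    {
        for (int i = 0; i < PQntuples(res); i++)
            printf("attnum %s DEFAULT %s\n",
                   PQgetvalue(res, i, 0), PQgetvalue(res, i, 1));
    }

    PQclear(res);
}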

On Thu, Jul 05, 2018 at 04:45:07PM +0200, Peter Eisentraut wrote:
> On 29.06.18 05:15, Jeff Janes wrote:
> > Since pg_dump calls pg_get_expr once over and over again on the same
> > table consecutively, perhaps we could cache the column alias assignments
> > in a single-entry cache, so if it is called on the same table as last
> > time it just re-uses the aliases from last time.  I am not planning on
> 
> I looked into that.  deparse_context_for() is actually not that
> expensive on its own, well below one second, but it gets somewhat
> expensive when you call it 1600 times for one table.  So to address that
> case, we can cache the deparse context between calls in the fn_extra
> field of pg_get_expr.  The attached patch does that.  This makes the
> pg_dump -s times pretty much constant even with 1600 columns with
> defaults.

I checked on one customer database running PG10.4, for which pg_dump -s takes 8
minutes.

I imported the existing schema into PG12dev (which itself took 25 minutes) and compared:
patched: 2m33.616s
unpatched: 7m19.578s

Note that I've recently reduced the number of child tables in this DB (by
repartitioning tables from daily to monthly granularity), thereby reducing the
total number of columns (and column defaults) across the largest partitioned
tables by a factor of 30, and shrinking the pg_dump -s output from 120MB (six
months ago) to 51MB.  I expect this patch would've saved even more before that
cleanup.

> How realistic is this use case?  Is it worth it?

Note that this affects pg_upgrade, which is how the issue originally came up [0].
(But I believe the deparsing is done by the old server during pg_upgrade's pg_dump
run, so a pg_upgrade to v11 couldn't benefit unless this were also included in
PG10.5.)

[pryzbyj@database postgresql]$ grep -c 'SET DEFAULT' /srv/cdrperfbackup/ts/2018-07-04/pg_dump-section=pre-data
183915

Justin

[0] https://www.postgresql.org/message-id/CAMkU%3D1x-e%2BmaqefhM1yMeSiJ8J9Z%2BSJHgW7c9bqo3E3JMG4iJA%40mail.gmail.com

