Home > mailing lists

Re: Drastic performance loss in assert-enabled build in HEAD - Mailing list pgsql-hackers

From	Nicolas Barbier
Subject	Re: Drastic performance loss in assert-enabled build in HEAD
Date	April 5, 2013 09:17:44
Msg-id	CAP-rdTbbp9_a469p99n0atm_6tbb0iaUQx_XNH6cMVdsfYAvsg@mail.gmail.com Whole thread Raw
In response to	Re: Drastic performance loss in assert-enabled build in HEAD (Noah Misch <noah@leadboat.com>)
Responses	Re: Drastic performance loss in assert-enabled build in HEAD
List	pgsql-hackers

Tree view

2013/4/5 Noah Misch <noah@leadboat.com>:

> On Thu, Apr 04, 2013 at 12:28:01PM +0200, Nicolas Barbier wrote:
>
>> +1. Having unlogged matviews without having incremental updates yet,
>> isn't super useful anyway.
>
> I would have surmised the opposite: since an unlogged MV requires a full
> refresh at unpredictable moments, logged MVs will be preferred where a refresh
> is prohibitively expensive.

That sounds like a good reason to choose your matview to be logged indeed.

I.e., very expensive to rebuild → choose logged

The opposite is also true: If your matview is not so expensive to
rebuild, why would it matter that much if it is logged? (Rebuilding
would be a tad slower, but it is not that slow to start with, so who
cares?)

I.e., not so expensive to rebuild → logged or unlogged are fine

This would mean “always choose logged,” except for the restricted case
of “incremental updates of a matview that is not so expensive to
rebuild” that I describe next:

> Why might unlogged-MV applications desire incremental updates more acutely
> than logged-MV applications?

My reasoning was more like: If you have incremental updates, there
will probably be some overhead linked to executing any transaction
that updates the base tables, namely for storing the changesets
somewhere. I imagined it could at least be this storing of changesets
that one would want to be unlogged, lest it slowing down the commit of
most transactions that don’t even touch the matview.

(Note that losing any changeset is the same as losing the contents of
the whole matview in the “always guaranteed to be 100% up-to-date when
read” case.)

Of course, because of the previous reasoning, we should always make a
matview logged if it is very expensive to rebuild. That leaves the
case of a not-so-expensive matview: It is smart to make it unlogged
when the changesets are stored in a way similar to what I just
described.

Anyway, I conclude that (especially as we don’t have incremental
updates yet), unlogged matviews are, as things stand now, not very
useful.

As a sidenote, I see two ways to avoid storing changesets as part of a
commit that changes the base tables. Choosing any of those would
invalidate my previous logic, and even more diminish the need for
unlogged matviews:

(1) WAL is used as the source for the changesets; Andres’ logical
replication work comes to mind. Probably mostly useful for the “always
guaranteed to be 100% up-to-date when read” case.

(2) The base tables are scanned and the xmin/xmax values are used to
determine what changed since last time. This probably requires keeping
VACUUM from removing rows that are still needed to determine how to
change any matviews that depend on that base table. Probably mostly
useful for the case where bulk-incremental updates are performed only
sporadically, and where the base tables are not so big that they
cannot usefully be scanned in full.

Nicolas

--
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?

pgsql-hackers by date:

From: Florian Pflug
Date: 05 April 2013, 08:34:56
Subject: Re: corrupt pages detected by enabling checksums

From: Matthias
Date: 05 April 2013, 11:54:43
Subject: Re: Hash Join cost estimates

Re: Drastic performance loss in assert-enabled build in HEAD - Mailing list pgsql-hackers

Previous

Next