Re: Removing unneeded self joins - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Removing unneeded self joins
Date	May 17, 2018 05:11:22
Msg-id	28694.1526523082@sss.pgh.pa.us
In response to	Re: Removing unneeded self joins (David Rowley <david.rowley@2ndquadrant.com>)
Responses	Re: Removing unneeded self joins
List	pgsql-hackers

David Rowley <david.rowley@2ndquadrant.com> writes:
> On 17 May 2018 at 11:00, Andres Freund <andres@anarazel.de> wrote:
>> Wonder if we shouldn't just cache an estimated relation size in the
>> relcache entry till then. For planning purposes we don't need to be
>> accurate, and usually activity that drastically expands relation size
>> will trigger relcache activity before long. Currently there are plenty
>> of workloads where the lseek(SEEK_END) calls show up pretty prominently.

> While I'm in favour of speeding that up, I think we'd get complaints
> if we used a stale value.

Yeah, that scares me too.  We'd then be in a situation where (arguably)
any relation extension should force a relcache inval.  Not good.
I do not buy Andres' argument that the value is noncritical, either ---
particularly during initial population of a table, where the size could
go from zero to something-significant before autoanalyze gets around
to noticing.

I'm a bit skeptical of the idea of maintaining an accurate relation
size in shared memory, too.  AIUI, a lot of the problem we see with
lseek(SEEK_END) has to do with contention inside the kernel for access
to the single-point-of-truth where the file's size is kept.  Keeping
our own copy would eliminate kernel-call overhead, which can't hurt,
but it won't improve the contention angle.
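
For readers unfamiliar with the mechanism under discussion, here is a minimal
sketch of how a file's size in blocks can be obtained with lseek(SEEK_END),
and of how a cached copy of that value goes stale once the file is extended.
This is not PostgreSQL's actual code: the names sketch_nblocks,
sketch_stale_gap, SketchRelSizeCache and SKETCH_BLOCK_SIZE are hypothetical,
the 8192-byte block size is assumed to match the default BLCKSZ, and the
temporary file under /tmp merely stands in for a relation segment.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed block size for this sketch; PostgreSQL's default BLCKSZ is 8192. */
#define SKETCH_BLOCK_SIZE 8192L

/* Hypothetical stand-in for a size estimate cached alongside a relation. */
typedef struct SketchRelSizeCache
{
    long cached_nblocks;
} SketchRelSizeCache;

/* Return the number of whole blocks in the file open on fd, or -1 on error. */
long
sketch_nblocks(int fd, long block_size)
{
    off_t end;

    assert(block_size > 0);
    /* One kernel call per lookup: seek to the end to learn the file size. */
    end = lseek(fd, 0, SEEK_END);
    if (end < 0)
        return -1;
    return (long) (end / block_size);
}

/*
 * Cache the size of an empty temporary file, extend the file by
 * blocks_added blocks, and return how far the cached value now lags
 * behind the real size. Returns -1 on error.
 */
long
sketch_stale_gap(long blocks_added)
{
    char path[] = "/tmp/sketch_relsizeXXXXXX";
    SketchRelSizeCache cache;
    long actual;
    int fd;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* the file disappears once fd is closed */

    /* The file is empty here, so the cached estimate is zero blocks. */
    cache.cached_nblocks = sketch_nblocks(fd, SKETCH_BLOCK_SIZE);

    /* Simulate initial population: the file grows, the cache is not told. */
    if (ftruncate(fd, (off_t) blocks_added * SKETCH_BLOCK_SIZE) != 0)
    {
        close(fd);
        return -1;
    }

    actual = sketch_nblocks(fd, SKETCH_BLOCK_SIZE);
    close(fd);

    if (cache.cached_nblocks < 0 || actual < 0)
        return -1;
    return actual - cache.cached_nblocks;
}
```

Calling sketch_stale_gap(n) reports a lag of n blocks, which is the
zero-to-something-significant case described above. Note that the sketch only
illustrates staleness and the per-lookup kernel call; it does not model the
in-kernel contention.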

            regards, tom lane
