Re: Removing unneeded self joins - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: Removing unneeded self joins
Date
Msg-id 9c1a30c1-2c2d-50f4-dbd7-b4b50f6ddec4@postgrespro.ru
Whole thread Raw
In response to Re: Removing unneeded self joins  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers

On 17.05.2018 05:19, Andres Freund wrote:
> On 2018-05-16 22:11:22 -0400, Tom Lane wrote:
>> David Rowley <david.rowley@2ndquadrant.com> writes:
>>> On 17 May 2018 at 11:00, Andres Freund <andres@anarazel.de> wrote:
>>>> Wonder if we shouldn't just cache an estimated relation size in the
>>>> relcache entry till then. For planning purposes we don't need to be
>>>> accurate, and usually activity that drastically expands relation size
>>>> will trigger relcache activity before long. Currently there's plenty
>>>> workloads where the lseeks(SEEK_END) show up pretty prominently.
>>> While I'm in favour of speeding that up, I think we'd get complaints
>>> if we used a stale value.
>> Yeah, that scares me too.  We'd then be in a situation where (arguably)
>> any relation extension should force a relcache inval.  Not good.
>> I do not buy Andres' argument that the value is noncritical, either ---
>> particularly during initial population of a table, where the size could
>> go from zero to something-significant before autoanalyze gets around
>> to noticing.
> I don't think every extension needs to force a relcache inval. It'd
> instead be perfectly reasonable to define a rule that an inval is
> triggered whenever crossing a 10% relation size boundary. Which'll lead
> to invalidations for the first few pages, but much less frequently
> later.
>
>
>> I'm a bit skeptical of the idea of maintaining an accurate relation
>> size in shared memory, too.  AIUI, a lot of the problem we see with
>> lseek(SEEK_END) has to do with contention inside the kernel for access
>> to the single-point-of-truth where the file's size is kept.  Keeping
>> our own copy would eliminate kernel-call overhead, which can't hurt,
>> but it won't improve the contention angle.
> A syscall is several hundred instructions. An unlocked read - which'll
> be be sufficient in many cases, given that the value can quickly be out
> of date anyway - is a few cycles. Even with a barrier you're talking a
> few dozen cycles.  So I can't see how it'd not improve the contention.
>
> But the main reason for keeping it in shmem is less the lseek avoidance
> - although that's nice, context switches aren't great - but to make
> relation extension need far less locking.
>
> Greetings,
>
> Andres Freund
>

I completely agree with Andreas. In my multithreaded Postgres prototype 
file description cache (shared by all threads) becomes bottleneck 
exactly because of each query execution requires
access to file system (lseek) to provide optimizer estimation of the 
relation size, despite to the fact that all database fits in memory.
Well, this is certainly specific of shared descriptor's pool in my 
prototype, but the fact the we have to perform lseek at each query 
compilation seems to be annoying in any case.

And there is really no problem that cached relation size estimation is 
not precise.  It really can be invalidated even if relation size is 
changed more than some threshold value (1Mb?) or lease time for cached 
value is expired.
May be it is reasonable to implement specific invalidation for relation 
size esimation, to avoid complete invalidation and reconstruction of 
relation description and all dependent objects.
In this case time-based invalidation seems to be the easiest choice to 
implement. Repeating lseek each 10 or 1 second seems to have no 
noticeable impact on performance and relation size can not dramatically 
changed during this time.



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



pgsql-hackers by date:

Previous
From: Paul Guo
Date:
Subject: [PATCH] Use access() to check file existence in GetNewRelFileNode().
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Infinite loop on master shutdown