Re: A performance issue with Memoize - Mailing list pgsql-hackers

From Anthonin Bonnefoy
Subject Re: A performance issue with Memoize
Date
Msg-id CAO6_XqrYGoMJH1UmETZ+Y==uppQBsuEkiPdhZxeBOj61R=c1hw@mail.gmail.com
Whole thread Raw
In response to Re: A performance issue with Memoize  (Alexander Lakhin <exclusion@gmail.com>)
Responses Re: A performance issue with Memoize
List pgsql-hackers
Hi,

I've seen a similar issue with the following query (tested on the current head):

EXPLAIN ANALYZE SELECT * FROM tenk1 t1
LEFT JOIN LATERAL (SELECT t1.two, tenk2.hundred, tenk2.two FROM tenk2) t2
ON t1.hundred = t2.hundred WHERE t1.hundred < 5;
                                                               QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=8.46..1495.10 rows=50000 width=256)
(actual time=0.860..111.013 rows=50000 loops=1)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=8.16..376.77 rows=500
width=244) (actual time=0.798..1.418 rows=500 loops=1)
         Recheck Cond: (hundred < 5)
         Heap Blocks: exact=263
         ->  Bitmap Index Scan on tenk1_hundred  (cost=0.00..8.04
rows=500 width=0) (actual time=0.230..0.230 rows=500 loops=1)
               Index Cond: (hundred < 5)
   ->  Memoize  (cost=0.30..4.89 rows=100 width=12) (actual
time=0.009..0.180 rows=100 loops=500)
         Cache Key: t1.hundred
         Cache Mode: logical
         Hits: 0  Misses: 500  Evictions: 499  Overflows: 0  Memory Usage: 5kB
         ->  Index Scan using tenk2_hundred on tenk2  (cost=0.29..4.88
rows=100 width=12) (actual time=0.007..0.124 rows=100 loops=500)
               Index Cond: (hundred = t1.hundred)
 Planning Time: 0.661 ms
 Execution Time: 113.076 ms
(14 rows)

The memoize's cache key only uses t1.hundred while the nested loop has
two changed parameters: the lateral var t1.two and t1.hundred. This
leads to a chgParam that is always different and the cache is purged
on each rescan.

Regards,
Anthonin

On Sat, Jan 27, 2024 at 5:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
>
> Hello,
>
> 27.01.2024 00:09, Tom Lane wrote:
> > David Rowley <dgrowleyml@gmail.com> writes:
> >> On Sat, 27 Jan 2024 at 09:41, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> drongo and fairywren are consistently failing the test case added
> >>> by this commit.  I'm not quite sure why the behavior of Memoize
> >>> would be platform-specific when we're dealing with integers,
> >>> but ...
> >> Maybe snprintf(buf, "%.*f", 0, 5.0 / 2.0); results in "3" on those
> >> rather than "2"?
> >> Looking at the code in fmtfloat(), we fallback on the built-in snprintf.
> > Maybe ... I don't have a better theory.
>
> FWIW, I've found where this behaviour is documented:
>
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/sprintf-sprintf-l-swprintf-swprintf-l-swprintf-l?view=msvc-170
>
> (I've remembered a case with test/sql/partition_prune from 2020, where
> sprintf on Windows worked the other way.)
>
>
> Best regards,
> Alexander
>
>



pgsql-hackers by date:

Previous
From: Amul Sul
Date:
Subject: Re: Add system identifier to backup manifest
Next
From: Michael Paquier
Date:
Subject: Re: Fix bugs not to discard statistics when changing stats_fetch_consistency