Re: increasing effective_cache_size slows down join queries by a factor of 4000x - Mailing list pgsql-general

From A Shaposhnikov
Subject Re: increasing effective_cache_size slows down join queries by a factor of 4000x
Date
Msg-id CA+1Wm9VPLD1JwzT12=361Je0Jm4K9GXwGWMRLUyzfa5mg_zLzA@mail.gmail.com
In response to Re: increasing effective_cache_size slows down join queries by a factor of 4000x  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: increasing effective_cache_size slows down join queries by a factor of 4000x
List pgsql-general
I made a mistake yesterday when I claimed that the created statistics
changed the row counts in the estimates - they did not; I had looked
at the wrong query. In the correct query plan, the row estimate still
differs from the actual row count by many orders of magnitude:

Nested Loop  (cost=1.01..27584834.53 rows=59608439 width=164) (actual
time=0.047..2.723 rows=854 loops=1)

The planner completely ignores the LIMIT in the SQL statement when
estimating row counts - the estimates do not change when I change it.
The business logic really needs only 1000 rows in the result; I never
need the whole table. The query always runs fast when the limit is
<= 46; when the limit is >= 47, it sometimes chooses to do a merge
join on 2 tables with hundreds of millions of rows instead of using
the indexes. The runtime difference is 4000x.

I migrated the data to the latest PostgreSQL 14.1. Both versions run
either the slow plan or the fast plan seemingly at random. The only
reliable way to make it choose the fast plan is to decrease
effective_cache_size to a value 20x lower than the memory available in
the system. Dropping and recreating the statistics on the join table
makes no difference in the estimated row counts or query plans.
PostgreSQL seems to be caching something internally and choosing the
query plan at random - sometimes fast, sometimes 4000x slower, without
any change in configuration or statistics. Such randomness is clearly
unacceptable. I think I will have to try the "hint" extension
suggested by Imre to make it consistently choose the fast plan.
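A sketch of what pinning the fast plan could look like with the pg_hint_plan extension (the hint names and index are real pg_hint_plan syntax; the query, aliases, and join table are the same illustrative ones as above, not the actual schema):

```sql
-- pg_hint_plan must be installed, then loaded per session
-- (or via session_preload_libraries):
LOAD 'pg_hint_plan';

-- The leading comment forces a nested loop between t and tc and an
-- index scan on data_pkey, regardless of the planner's cost estimates:
/*+ NestLoop(t tc) IndexScan(t data_pkey) */
SELECT *
FROM data t
JOIN data_class tc ON tc.data_id = t.id   -- hypothetical join table
WHERE t.id > 205284974
ORDER BY t.id
LIMIT 1000;
```

The trade-off is that hints freeze the plan shape even if the data distribution later changes, so they are best treated as a workaround while the estimation problem itself is investigated.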



On Wed, Feb 2, 2022 at 3:34 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 2/2/22 22:10, Artyom Shaposhnikov wrote:
> > the row estimate became ~1000x smaller with the stat in place, so it
> > looks like it grossly miscalculates the query plans without the stats
> > for large tables representing M:M relations.
> >
>
> Well, if the estimates are significantly off (and 3 orders of magnitude
> certainly qualifies), then all bets are off. There's no magical option
> that'd fix planning in such conditions.
>
> Ultimately, fixing the estimates (e.g. by creating extended statistics)
> is the right "fix" as it gives the optimizer the information needed to
> pick the right plan.
>
> > On Wed, Feb 2, 2022 at 11:47 AM Michael Lewis <mlewis@entrata.com> wrote:
> >>
> >> What does the row estimate look like on the scan of data table with that statistic in place? Anytime the stats give a mis-estimate this far off, I wouldn't expect that plans would be optimal except by luck.
> >>
> >> Index Scan using data_pkey on data t (cost=0.57..21427806.53 rows=58785023 width=131) (actual time=0.024..0.482 rows=854 loops=1)
> >> Index Cond: (id > 205284974)
> >
>
> It'd be interesting to see the plans without the LIMIT, as that makes
> the "actual" values low simply by terminating early.
>
>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


