Re: Bad selectivity estimate when using a sub query to determineWHERE condition - Mailing list pgsql-performance

From Justin Pryzby
Subject Re: Bad selectivity estimate when using a sub query to determineWHERE condition
Date
Msg-id 20200210191304.GF1412@telsasoft.com
Whole thread Raw
In response to Bad selectivity estimate when using a sub query to determine WHERE condition  (Chris Borckholder <chris.borckholder@bitpanda.com>)
Responses Re: Bad selectivity estimate when using a sub query to determineWHERE condition  (Chris Borckholder <chris.borckholder@bitpanda.com>)
List pgsql-performance
On Mon, Feb 10, 2020 at 11:34:01AM +0100, Chris Borckholder wrote:
> I have a large table of immutable events that need to be aggregated
> regularly to derive statistics. To improve the performance, that table is
> rolled up every 15minutes, so that online checks can aggregate rolled up
> data and combine it with latest events created after the last roll up.
> 
> To implement this a query is executed that selects only events after the
> time of the last rollup.
> That time is determined dynamically based on a log table.

Perhaps that could be done as an indexed column in the large table, rather
than querying a 2nd log table.
Possibly with a partial index on that column: WHERE unprocessed='t'.

> When using a sub select or CTE to get the latest roll up time, the query
> planner fails to recognize that a most of the large table would be filtered
> out by the condition and tries a sequential scan instead of an index scan.
> When using the literal value for the WHERE condition, the plan correctly
> uses an index scan, which is much faster.
> 
> I analyzed the involved tables and increased the collected histogram, but
> the query plan did not improve. Is there a way to help the query planner
> recognize this in the dynamic case?

Also, if you used partitioning with pgostgres since v11, then I think most
partitions would be excluded:

https://www.postgresql.org/docs/12/release-12.html
|Allow partition elimination during query execution (David Rowley, Beena Emerson)
|Previously, partition elimination only happened at planning time, meaning many joins and prepared queries could not
usepartition elimination.
 

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=499be013de65242235ebdde06adb08db887f0ea5

https://www.postgresql.org/about/featurematrix/detail/332/

Justin



pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Bad selectivity estimate when using a sub query to determine WHERE condition
Next
From: Amol Tarte
Date:
Subject: Re: Writing 1100 rows per second