Re: Duplicate WHERE condition changes performance and plan - Mailing list pgsql-performance

From Justin Pryzby
Subject Re: Duplicate WHERE condition changes performance and plan
Date
Msg-id 20200420023357.GR26953@telsasoft.com
Whole thread Raw
In response to Duplicate WHERE condition changes performance and plan  ("singh400@gmail.com" <singh400@gmail.com>)
Responses Re: Duplicate WHERE condition changes performance and plan
List pgsql-performance
On Wed, Apr 15, 2020 at 08:55:53PM +0100, singh400@gmail.com wrote:
> We have an odd issue where specifying the same where clause twice causes PG
> to pick a much more efficent plan. We would like to know why.

> Query B (this is the 'fast' query):
> UPDATE problem_instance SET processed = false
> FROM problem
> WHERE problem.id = problem_instance.problem_id
> AND problem.status != 2
> AND problem.status != 2
> AND processed = true;

When you specify redundant condition, it results in an underestimate, as
expected:

Index Scan using problem_id_idx1 on public.problem (cost=0.28..624.68 ROWS=73 width=14) (actual time=0.011..0.714
ROWS=841loops=1)
 
    Filter: ((problem.status <> 2) AND (problem.status <> 2))

In this case, doing an index scans on problem_instance is apparently faster
than an seq scan.

I think the duplicate condition is fooling the planner, and by chance it's
giving a better plan.  That might indicate that your settings aren't ideal.
Maybe random_page_cost should be lower, which would encourage index scans.  If
you're using SSD storage, or if the DB is small compared with shared_buffers or
RAM, then random_page_cost should be closer to seq_page_cost.

How large are the indexes? problem_id_idx1 ?

On Mon, Apr 20, 2020 at 01:50:17AM +0100, singh400@gmail.com wrote:
> Even after upgrading my local install of PG to "PostgreSQL 12.2,
> compiled by Visual C++ build 1914, 64-bit" and I'm still seeing the
> same behaviour.

> Server Configuration:
> https://gist.github.com/indy-singh/8386d59206af042d365e5cd49fbae68f
> shared_buffers     2GB     configuration file
> effective_cache_size     6GB     configuration file

Note, until v10, the documentation said this:

https://www.postgresql.org/docs/9.6/runtime-config-resource.html
|Also, on Windows, large values for shared_buffers aren't as effective. You may
|find better results keeping the setting relatively low and using the operating
|system cache more instead. The useful range for shared_buffers on Windows
|systems is generally from 64MB to 512MB.

It would be interesting to know

-- 
Justin



pgsql-performance by date:

Previous
From: "singh400@gmail.com"
Date:
Subject: Re: Duplicate WHERE condition changes performance and plan
Next
From: Rick Vincent
Date:
Subject: RE: Postgres not using index on views