Query rewrite(optimization) using constraints - Mailing list pgsql-hackers

From Lily Liu
Subject Query rewrite(optimization) using constraints
Date
Msg-id CAJ8RQMEkDGuV=sGc45QCb5kdv7DvRbXV+mcOUhzuBw_hg7eUUQ@mail.gmail.com
Whole thread Raw
Responses Re: Query rewrite(optimization) using constraints  (Justin Pryzby <pryzby@telsasoft.com>)
Re: Query rewrite(optimization) using constraints  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

Hi, hackers


I notice that postgres use constraints to optimize/rewrite queries in limited cases as https://www.postgresql.org/docs/current/runtime-config-query.html#GUC-CONSTRAINT-EXCLUSION


I propose two new types of query rewrites using constraints here

1) Remove DISTINCT 

A simple example is SELECT DISTINCT(name) FROM R. If there is a unique constraint on the name column. The DISTINCT keyword can be removed safely. Query plans without the DISTINCT keyword might be much cheaper since DISTINCT is expensive.


2) Add LIMIT 1

An example of this optimization will be SELECT name from R WHERE name = ‘foo’. If there is a unique constraint on the name column, the selection result has at most one record. Therefore, we can add LIMIT 1 safely. If the original query plan performs a sequential scan on the R, adding LIMIT 1 might speed up the query because of the early return.


We designed an algorithm to decide if 1), 2) can be performed safely. Rewriting queries manually and experimenting on a table with 10K records shows 2X ~ 3X improvement for both rewrites. We have some other rewrite rules, but the two are most obvious ones. With this feature, the optimizer can consider the query plans both before and after the rewrite and choose the one with minimum cost. 


Will that feature be useful? How hard to implement the feature in the current system? Any thoughts or comments are highly appreciated!


Best,

Lily

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Parallel vacuum workers prevent the oldest xmin from advancing
Next
From: Andres Freund
Date:
Subject: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)