Re: Use extended statistics to estimate (Var op Var) clauses - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Use extended statistics to estimate (Var op Var) clauses
Date
Msg-id d9c6c669-61aa-b55b-82d6-cec5dad66c2d@enterprisedb.com
In response to Re: Use extended statistics to estimate (Var op Var) clauses  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Use extended statistics to estimate (Var op Var) clauses  (Zhihong Yu <zyu@yugabyte.com>)
Re: Use extended statistics to estimate (Var op Var) clauses  (Mark Dilger <mark.dilger@enterprisedb.com>)
List pgsql-hackers
Hi,

The attached patch series has been modified to improve estimates for
these special clauses (Var op Var with the same Var on both sides) even
without extended statistics. This is done in 0001, and it seems fairly
simple and cheap.
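
To illustrate why such clauses can be estimated cheaply, here is a minimal
sketch in Python. It is not the code from 0001; the function name, the
operator sets and the `null_frac` parameter are all hypothetical. It relies
only on the fact that for a non-NULL value, (a = a) is always true and
(a < a) is always false, while a NULL on either side makes the clause not
true.

```python
from typing import Optional

# Operators for which (a op a) is always true when a is not NULL.
ALWAYS_TRUE = {"=", "<=", ">="}
# Operators for which (a op a) is always false when a is not NULL.
ALWAYS_FALSE = {"<", ">", "<>"}


def same_var_selectivity(op: str, null_frac: float) -> Optional[float]:
    """Hypothetical helper: selectivity of (a op a) from the NULL fraction.

    Returns None for operators this sketch does not recognize, meaning the
    caller would keep whatever estimate it used before.
    """
    if not 0.0 <= null_frac <= 1.0:
        raise ValueError("null_frac must be within [0, 1]")
    if op in ALWAYS_TRUE:
        # NULL op NULL yields NULL, which a WHERE clause treats as not true,
        # so only the non-NULL rows match.
        return 1.0 - null_frac
    if op in ALWAYS_FALSE:
        return 0.0
    return None


eq_sel = same_var_selectivity("=", 0.25)    # all non-NULL rows match
lt_sel = same_var_selectivity("<", 0.25)    # no rows match
other = same_var_selectivity("~~", 0.25)    # unknown operator, no opinion
```

The only per-column input is the NULL fraction, which is why a check of this
kind would be cheap compared to consulting extended statistics.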

The 0002 part is still the same patch as on 2021/07/20. Part 0003 fixes
handling of those clauses so that we don't treat them as simple, but it
does that by tweaking statext_is_compatible_clause(), as suggested by
Dean. It does work, although it's a bit more invasive than simply
checking the shape of the clause in statext_mcv_clauselist_selectivity().

I do have results for the randomly generated queries, and this does 
improve the situation a lot - pretty much all the queries with (a=a) or 
(a<a) clauses had terrible estimates, and this fixes that.
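
As a rough illustration of how far off such estimates can be, the sketch
below compares the actual matching fraction on a made-up column with fixed
default selectivities. The two constants match DEFAULT_EQ_SEL and
DEFAULT_INEQ_SEL in PostgreSQL's selfuncs.h, but that the planner fell back
to exactly these values for the queries in question is an assumption of this
example, not something stated in this message. The sample data and the
helper name are hypothetical.

```python
import operator

# Values of DEFAULT_EQ_SEL / DEFAULT_INEQ_SEL from PostgreSQL's selfuncs.h.
# Assumption: these are the fallback estimates used for (a = a) and (a < a).
DEFAULT_EQ_SEL = 0.005
DEFAULT_INEQ_SEL = 0.3333333333333333


def actual_fraction(column, predicate):
    """Fraction of rows for which predicate(a, a) is true; NULLs never match."""
    matches = sum(1 for v in column if v is not None and predicate(v, v))
    return matches / len(column)


# Made-up sample column with two NULLs (None) out of ten rows.
column = [1, 2, 2, 3, None, 5, 8, None, 13, 21]

eq_actual = actual_fraction(column, operator.eq)   # non-NULL rows only
lt_actual = actual_fraction(column, operator.lt)   # never true for a < a

print(f"a = a: actual {eq_actual:.3f}, default estimate {DEFAULT_EQ_SEL:.3f}")
print(f"a < a: actual {lt_actual:.3f}, default estimate {DEFAULT_INEQ_SEL:.3f}")
```

Under that assumption, the equality clause is badly underestimated and the
inequality clause is overestimated, which is consistent with the kind of
misestimates described above.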

That being said, I'm still not sure whether this is an issue in
real-world applications, or whether we're only solving a problem created
by the synthetic queries from the randomized generator. But the checks
seem fairly cheap, so maybe it doesn't matter too much.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
