Re: PoC: using sampling to estimate joins / complex conditions - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: PoC: using sampling to estimate joins / complex conditions
Date	March 22, 2022 02:35:41
Msg-id	20220321233541.k5eoere76wxvxkpl@alap3.anarazel.de
In response to	Re: PoC: using sampling to estimate joins / complex conditions (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses	Re: PoC: using sampling to estimate joins / complex conditions (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List	pgsql-hackers

Hi,

On 2022-01-21 01:06:37 +0100, Tomas Vondra wrote:
> Yeah, I haven't updated some of the test output because some of those
> changes are a bit wrong (and I think that's fine for a PoC patch). I
> should have mentioned that in the message, though. Sorry about that.

Given that the patch hasn't been updated since January and that it's a PoC in
the final CF, it seems like it should at least be moved to the next CF? Or
perhaps returned?

I've just marked it as waiting-on-author for now - iirc that leads to fewer
reruns by cfbot once it's failing...

> 2) The correlated samples are currently built using a query, executed
> through SPI in a loop. So given a "driving" sample of 30k rows, we do
> 30k lookups - that'll take time, even if we do that just once and cache
> the results.

Ugh, yea, that's going to increase overhead by at least a few factors.

> I'm sure there's room for some improvement, though - for example
> we don't need to fetch all columns included in the statistics object,
> but just stuff referenced by the clauses we're estimating. That could
> improve chance of using IOS etc.

Yea. Even just avoiding SPI / planner + executor seems likely to be a
big win.

It seems one more of the cases where we really need logic to recognize "cheap"
vs "expensive" plans, so that we only do sampling when useful. I don't think
that's solved just by having a declarative syntax.

Greetings,

Andres Freund
