Re: Merging statistics from children instead of re-sampling everything - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Merging statistics from children instead of re-sampling everything
Msg-id 078c36f0-fc9c-b2d9-2bae-8eebafcefe93@enterprisedb.com
In response to Merging statistics from children instead of re-sampling everything  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
Hi,

I'd like to point out two less obvious things about how this relates to
Tom's proposal [1] and patch [2] from 2015. Tom approached the problem
from a different direction, essentially allowing a Var to be associated
with a list of statistics instead of just one.

So it's a somewhat orthogonal solution, and it has pros and cons. The
pro is that it can ignore statistics for eliminated partitions, thus
producing better estimates. The con is that it requires all the places
dealing with VariableStatData to assume there's a list of statistics,
not just one, making the code more complex and more CPU-intensive (with
sufficiently many partitions).

However, it seems to me we could easily combine those two things - we
can merge the statistics (the way I proposed here), so that each Var
still has just one VariableStatData. That'd mean the various places don't
need to start dealing with a list, and it'd still allow ignoring stats
for eliminated partitions.
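
A minimal sketch of that combination, in Python rather than planner C code.
Everything here is hypothetical: PartitionStats is an illustrative stand-in,
not VariableStatData or any PostgreSQL struct, and the row-count-weighted
averaging is just one plausible merge rule, not necessarily the one used in
the patch. It only shows that pruned partitions can be skipped while callers
still receive a single statistics object.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionStats:
    """Hypothetical per-partition column statistics (not a PostgreSQL struct)."""
    reltuples: float                         # estimated row count of the partition
    null_frac: float                         # fraction of NULLs in the column
    mcv: dict = field(default_factory=dict)  # value -> frequency within the partition


def merge_stats(parts, surviving):
    """Merge the stats of the surviving partitions into one PartitionStats.

    Fractions are averaged with reltuples as weights, so a large partition
    contributes proportionally more. Simplification: a value missing from a
    partition's MCV list is treated as having frequency 0 there.
    """
    kept = [parts[i] for i in surviving]
    total = sum(p.reltuples for p in kept)
    if total == 0:
        return PartitionStats(reltuples=0.0, null_frac=0.0)

    null_frac = sum(p.null_frac * p.reltuples for p in kept) / total
    mcv = {}
    for p in kept:
        for value, freq in p.mcv.items():
            mcv[value] = mcv.get(value, 0.0) + freq * p.reltuples / total
    return PartitionStats(reltuples=total, null_frac=null_frac, mcv=mcv)


parts = [
    PartitionStats(1000, 0.0, {"a": 0.5}),
    PartitionStats(3000, 0.2, {"a": 0.1, "b": 0.4}),
    PartitionStats(6000, 0.5, {"c": 0.9}),  # assume this one was pruned
]
merged = merge_stats(parts, surviving=[0, 1])
```

With the third partition eliminated, its NULL fraction and its value "c" do
not leak into the merged result, yet estimation code would still see one
statistics object per Var.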

Of course, that assumes the merge is cheaper than processing the list of
statistics, but I find that plausible, especially if the list needs to be
processed multiple times (e.g. when considering different join orders,
filters and so on).
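
The cost argument can be illustrated with a toy model. It assumes, purely
for illustration, that touching one partition's statistics costs one unit
whether during a merge or during a lookup; in practice a merge step is
probably more expensive than a lookup, so the break-even point would shift.
The function names are made up for this sketch.

```python
def list_cost(n_partitions, n_lookups):
    """Units of work when every lookup walks the whole list of statistics."""
    return n_partitions * n_lookups


def merged_cost(n_partitions, n_lookups):
    """Units of work when stats are merged once, then each lookup reads one object."""
    return n_partitions + n_lookups


# 100 surviving partitions, statistics consulted 20 times during planning
many_lookups = (list_cost(100, 20), merged_cost(100, 20))

# Same partitions, but the statistics are consulted only once
single_lookup = (list_cost(100, 1), merged_cost(100, 1))
```

Under these assumptions the merge pays off as soon as the statistics are
consulted more than a couple of times, and loses slightly when they are
consulted only once.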

Haven't tried, but might be worth exploring in the future.


regards


[1] https://www.postgresql.org/message-id/7363.1426537103@sss.pgh.pa.us

[2] https://www.postgresql.org/message-id/22598.1425686096@sss.pgh.pa.us

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
