On 10/5/22 13:37, Laurenz Albe wrote:
> On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:
>> I've pushed the last version, and backpatched it to 10 (not sure I'd
>> call it a bugfix, but I certainly agree with Justin it's worth
>> mentioning in the docs, even on older branches).
>
> I'd like to suggest an improvement to this. The current wording could
> be read to mean that dead tuples won't get cleaned up in partitioned tables.
>
>
> By the way, where are the statistics of a partitioned tables used? The actual
> tables scanned are always the partitions, and in the execution plans that
> I have seen, the optimizer always used the statistics of the partitions.
For example, it is used to estimate selectivity of join clause:
CREATE TABLE test (id integer, val integer) PARTITION BY hash (id);
CREATE TABLE test_0 PARTITION OF test
FOR VALUES WITH (modulus 2, remainder 0);
CREATE TABLE test_1 PARTITION OF test
FOR VALUES WITH (modulus 2, remainder 1);
INSERT INTO test (SELECT q, q FROM generate_series(1,10) AS q);
VACUUM ANALYZE test;
INSERT INTO test (SELECT q, q%2 FROM generate_series(11,200) AS q);
VACUUM ANALYZE test_0,test_1;
EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF)
SELECT * FROM test t1, test t2 WHERE t1.id = t2.val;
VACUUM ANALYZE test;
EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF)
SELECT * FROM test t1, test t2 WHERE t1.id = t2.val;
Here without actual statistics on parent table we make wrong prediction.
--
Regards
Andrey Lepikhov
Postgres Professional