Home > mailing lists

Stats for inheritance trees - Mailing list pgsql-hackers

From	Tom Lane
Subject	Stats for inheritance trees
Date	December 28, 2009 18:41:17
Msg-id	2674.1262040064@sss.pgh.pa.us Whole thread Raw
Responses	Re: Stats for inheritance trees Re: Stats for inheritance trees Re: Stats for inheritance trees
List	pgsql-hackers

Tree view

Following up on the discussion here
http://archives.postgresql.org/message-id/4B3875C6020000250002D97D@gw.wicourts.gov
I'd like to propose making the following changes that would allow saner
planning for queries involving inheritance:

1. Currently the primary key of pg_statistic is (starelid, staattnum)
indicating the table and column the stats entry is for. I propose
adding a bool stainherit to the pkey. "false" means the stats entry
is for just that table column, ie, the traditional interpretation.
"true" means the stats entry covers that column and all its inheritance
children. Such entries could be used directly by the planner in cases
where it currently punts and delivers a default estimate.

2. When ANALYZE is invoked on a table that has inheritance children,
it will perform its normal duties for just that table (creating or
updating entries with stainherit = false) and then perform a second
scan that covers that table and all its children. This will be used
to create or update entries with stainherit = true. It might be
possible to avoid scanning the parent table itself twice, but I won't
contort the code too much to avoid that, since in most practical
applications the parent is empty or small anyway.

3. Ideally autovacuum would know enough to perform ANALYZEs on
inheritance parents after enough churn has occurred in their child
table(s). I am not entirely clear about a good way to do that.
We could have it just directly force an ANALYZE on parent(s) of any
table it has chosen to ANALYZE, but that might be overkill --- in
particular leading to excess ANALYZEs when several children receive
a lot of updates.

Even without a really smart solution to #3, this would be a big step
forward for inheritance queries.

BTW, while at it I'm inclined to add a non-unique index on
pg_inherits.inhparent, so that find_inheritance_children won't have to
seqscan pg_inherits anymore. It's surprising people haven't complained
about that before. The code says
* XXX might be a good idea to create an index on pg_inherits' inhparent * field, so that we can use an indexscan
insteadof sequential scan here. * However, in typical databases pg_inherits won't have enough entries to *
justifyan indexscan...

but we've long since learned that people stress databases in odd ways.

Comments?
regards, tom lane

pgsql-hackers by date:

From: "Kevin Grittner"
Date: 28 December 2009, 18:36:07
Subject: Re: Admission Control Policy

From: Andres Freund
Date: 28 December 2009, 18:55:06
Subject: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

Stats for inheritance trees - Mailing list pgsql-hackers

Previous

Next