Thread: Self-generating statistics?
Something that might be worth considering: Many of the performance problems on pgsql-general seem to be related to the fact that no analyze is performed after the creationon the tables, so maybe this might be an option to fix that (in future releases): when a table has no statistics at all, and the first seq-scanon the table is performed, it might improve further performance if this seq-scan is used to get table statistics too. This should not betoo expensive since reading the table has to be done only once. Further queries will have at least preliminary statistics at hand. I'm not sure how (CPU) expensive statistic-gathering is, but if most of the work is reading the tuples, it might be a winto do this. Regards, Mario Weilguni
Mario Weilguni <mweilguni@sime.com> writes: > Many of the performance problems on pgsql-general seem to be related > to the fact that no analyze is performed after the creation on the > tables Well, there are lots of other ways an incompetent DBA can screw up a database. The need to VACUUM and ANALYZE is stated clearly in the docs. Providing workarounds for negligence isn't the right path to get started down, IMHO. That said, the general idea of a self-tuning database system has merit, IMHO. For example, this paper proposes a histogram data structure that can be updated fairly cheaply based on data gathered from query execution: http://citeseer.nj.nec.com/255752.html A bunch of industry players (IBM, Microsoft, etc.) are putting some work into this area (IBM calls it "autonomic computing", for example). It might be an interesting area to look at in the future... Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
On Mon, Oct 21, 2002 at 08:23:33AM +0200, Mario Weilguni wrote: > Something that might be worth considering: > > Many of the performance problems on pgsql-general seem to be > related to the fact that no analyze is performed after the creation > on the tables, so maybe this might be an option to fix that (in > future releases): when a table has no statistics at all, and the > first seq-scan on the table is It's never the case that a table has no statistics at all. It has default ones. Maybe they're right; it's hard to know. Someone has posted on gborg an anto-vacuum daemon that might be of use in this situation. A -- ---- Andrew Sullivan 204-4141 Yonge Street Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110