Re: ANALYZE to be ignored by VACUUM - Mailing list pgsql-hackers
From | Dawid Kuroczko |
---|---|
Subject | Re: ANALYZE to be ignored by VACUUM |
Date | |
Msg-id | 758d5e7f0802200845v7b5dde22v1bd82c49755a1a53@mail.gmail.com Whole thread Raw |
In response to | Re: ANALYZE to be ignored by VACUUM (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>) |
Responses |
Re: ANALYZE to be ignored by VACUUM
|
List | pgsql-hackers |
On Feb 19, 2008 8:31 AM, ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> wrote: > Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes: > > > In my workload, ANALYZE takes long time (1min at statistics_target = 10, > > > and 5min at 100), but the updated table needs to be vacuumed every 30 seconds > > > because seqscans run on the table repeatedly. > > > > There is something *seriously* wrong with that. If vacuum can complete > > in under 30 seconds, how can analyze take a minute? (I'm also wondering > > whether you'll still need such frantic vacuuming with HOT...) > > There are two tables here: > [S] A small table, that is frequently updated and seqscan-ed > [L] A large table, that takes a long time to be analyzed > > The table [S] should be vacuumed every 30 seconds, because dead tuples > affects the performance of seqscan seriously. HOT and autovacuum are > very useful here *unless* long transactions begins. > Analyzing [L] takes 1-5 minutes and both HOT and vacuums don't work > during it. I want to use statistics_target = 100 at heart for more > accurate statistics, but I'm using 10 instead because of avoiding > long transactions by analyze. I am sure the idea is not original, yet still I would like to know how hard would it be to support local (per table) oldest visible XIDs. I mean, when transaction start you need to keep all tuples with xmin >= oldest_xid in all tables, because who knows what table will that transaction like to touch. But then again, there is relatively large list of cases when we don't need to hold vacuum on _all_ relations. These include: SELECTs in auto-commit mode -- provided the SELECT is not something fancy (not immutable PL-functions), we just need to keep a snapshot of affected tables. DMLs in auto-commit mode -- provided no PL-functions or triggers are in effect. WITH HOLD CURSORS. Yes, I know, WITH HOLD cursor on first COMMIT will create a copy of rows to be returned (which can take a looong time in some cases), but perhaps it could be possible to just "lock" the table from vacuuming and skip the temporary store. And lots of other, when done in auto-commit. Like ALTER TABLEs, CREATE TABLE AS SELECT, COPY, etc... I am sure that such an idea isn't original. What are the main obstacles in making it happen except "time&money"? :) Regards, Dawid
pgsql-hackers by date: