Re: [PERFORM] encouraging index-only scans - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [PERFORM] encouraging index-only scans |
Date | |
Msg-id | 20130906020043.GH27195@momjian.us Whole thread Raw |
In response to | Re: [PERFORM] encouraging index-only scans (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [PERFORM] encouraging index-only scans
Re: [PERFORM] encouraging index-only scans |
List | pgsql-hackers |
On Thu, Sep 5, 2013 at 09:10:06PM -0400, Robert Haas wrote: > On Thu, Sep 5, 2013 at 8:14 PM, Bruce Momjian <bruce@momjian.us> wrote: > > Actually, I now realize it is more complex than that, and worse. There > > are several questions to study to understand when pg_class.relallvisible > > is updated (which is used to determine if index-only scans are a good > > optimization choice), and when VM all-visible bits are set so heap pages > > can be skipped during index-only scans: > > > > 1) When are VM bits set: > > vacuum (non-full) > > analyze (only some random pages) > > Analyze doesn't set visibility-map bits. It only updates statistics > about how many are set. Sorry, yes you are correct. > > The calculus we should use to determine when we need to run vacuum has > > changed with index-only scans, and I am not sure we ever fully addressed > > this. > > Yeah, we didn't. I think the hard part is figuring out what behavior > would be best. Counting inserts as well as updates and deletes would > be a simple approach, but I don't have much confidence in it. My > experience is that having vacuum or analyze kick in during a bulk-load > operation is a disaster. We'd kinda like to come up with a way to > make vacuum run after the bulk load is complete, maybe, but how would > we identify that time, and there are probably cases where that's not > right either. I am unsure how we have gone a year with index-only scans and I am just now learning that it only works well with update/delete workloads or by running vacuum manually. I only found this out going back over January emails. Did other people know this? Was it not considered a serious problem? Well, our logic has been that vacuum is only for removing expired rows. I think we either need to improve that, or somehow make sequential scans update the VM map, and then find a way to trigger update of relallvisible even without inserts. Ideas ----- I think we need to detect tables that do not have VM bits set and try to determine if they should be vacuumed. If a table has most of its VM bits set, there in need to vacuum it for VM bit setting. Autovacuum knows how many pages are in the table via its file size, and it can scan the VM map to see how many pages are _not_ marked all-visible. If the VM map has many pages that are _not_ marked as all-visible, and change count since last vacuum is low, those pages might now be all-visible and vacuum might find them. One problem is that a long-running transaction is not going to update relallvisible until commit, so you might be vacuuming a table that is being modified, e.g. bulk loads. Do we have any way of detecting if a backend is modifying a table? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
pgsql-hackers by date: