Re: the big picture for index-only scans - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: the big picture for index-only scans |
Date | |
Msg-id | BANLkTim1rfyYey45m_Cc4xzcn=ugwKMbCQ@mail.gmail.com Whole thread Raw |
In response to | Re: the big picture for index-only scans (Cédric Villemain <cedric.villemain.debian@gmail.com>) |
Responses |
Re: the big picture for index-only scans
Re: the big picture for index-only scans Re: the big picture for index-only scans |
List | pgsql-hackers |
On Tue, May 10, 2011 at 10:58 AM, Cédric Villemain <cedric.villemain.debian@gmail.com> wrote: > ANALYZE can do the stats job for 'free' on the pages it collects > anyway. So that looks like a good idea. > I believe the really lazy vacuum is another topic; even if it will > improve the performance of the index only scan to have tables already > vacuuumed, the stats should expose that and the function > cost_index(_only?)() taking care of that. I basically agree. The connection is that - as we use the all-visible for more things, the performance penalty for failing to vacuum (say) an insert-only table will continue to grow. Still, as you say, clearly a separate topic. > The temptation is high to estimate the cost of an "index_scan(only) + > ordered(by ctid) table pages fetch if heap required". (this is what I > understood from heikki suggestion 3-4. and it makes sense). It may be > easier to implement both at once but I didn't find the branch in the > Heikki's git repos. (probably removed since the long time) I was thinking about this as well, at least if I understand you correctly. That would be similar to a bitmap index scan, and I think it would be a great thing to have, not only because it would allow us to get the advantages of index-only scans in situations that are well-suited to our current bitmap scans, but also because it could be batched. You could allocate a buffer of work_mem bytes and fill it up with TIDs; then, when it's full, you sort the buffer and start doing the necessary heap fetches in physical order. If you still need more rows, you can clear the buffer and go around for another pass. > Based on ANALYZE stats for the visibility, I believe cost_index and > cost_index_only should be very similar functions (well, atm, I don't > see the point to split it in 2 functions). Yeah, I would more imagine modifying the existing function. >> Any thoughts welcome. Incidentally, if anyone else feels like working >> on this, feel free to let me know and I'm happy to step away, from all >> of it or from whatever part someone else wants to tackle. I'm mostly >> working on this because it's something that I think we really need to >> get done, more than having a burning desire to be the one who does it. > > Indexonly scans are welcome! > I believe I can help on 3 and 4, but (really) not sure for 1 and 2. Well, I have code for #1, and just need reviews, and #2 shouldn't be that hard, and with luck I'll twist Bruce's arm into doing it (*waves to Bruce*). So #3 and #4 are the next thing to tackle. Any thoughts on what/how you'd like to contribute there? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: