Thread: Another idea for index-only scans
I have added another idea for index-only scans to the TODO list: > A third idea would be for a heap scan to check if all rows are visible > and if so set a per-table flag which can be checked by index scans. > Any change to the table would have to clear the flag. To detect > changes during the heap scan a counter could be set at the start and > checked at the end --- if it is the same, the table has not been > modified --- any table change would increment the counter. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On 8/15/07, Bruce Momjian <bruce@momjian.us> wrote: > I have added another idea for index-only scans to the TODO list: > > > A third idea would be for a heap scan to check if all rows are visible > > and if so set a per-table flag which can be checked by index scans. > > Any change to the table would have to clear the flag. To detect > > changes during the heap scan a counter could be set at the start and > > checked at the end --- if it is the same, the table has not been > > modified --- any table change would increment the counter. Perhaps this is naive (or discussed and discarded... if so, I couldn't find it, but I apologize if that's the case), but wouldn't recording the xid of non-readonly transactions, at commit time, and at the table level, be equivalent to the flag and remove the need for a counter? Readers could just check the last-modification-xid at the beginning and end of their scans to test for heap stability. I suppose that would require a write-exclusive lock on some metadata for each modified table during each commit... so perhaps it's a non-starter right there. --miker
Bruce, > I have added another idea for index-only scans to the TODO list: > > A third idea would be for a heap scan to check if all rows are > > visible and if so set a per-table flag which can be checked by index > > scans. Any change to the table would have to clear the flag. To > > detect changes during the heap scan a counter could be set at the > > start and checked at the end --- if it is the same, the table has not > > been modified --- any table change would increment the counter. Seems marginal at best. Checking overlap between the index and the FSM/DSM and only check dirty pages seems more intelligent, and able to cover a larger number of cases. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
"Bruce Momjian" <bruce@momjian.us> writes: > I have added another idea for index-only scans to the TODO list: > >> A third idea would be for a heap scan to check if all rows are visible >> and if so set a per-table flag which can be checked by index scans. >> Any change to the table would have to clear the flag. To detect >> changes during the heap scan a counter could be set at the start and >> checked at the end --- if it is the same, the table has not been >> modified --- any table change would increment the counter. I think I would prefer to address this in the same infrastructure as the dead-space-map. That way you're not dependent on having no updates happening on the table at all. Any tuples on pages which contain no in-doubt tuples could have their visibility check skipped but when you come across a tuple on a page which has been modified since the last vacuum then you have to check the visibility. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
<p><font size="2">A hybrid scan approach combined with this idea would fit nicely - provide results for tids that are directlyvisible and set a bit in a bitmap for those that need recheck and extend recheck to take a bitmap (wait - it alreadydoes :-)<br /><br /> - Luke<br /><br /> Msg is shrt cuz m on ma treo<br /><br /> -----Original Message-----<br />From: Gregory Stark [<a href="mailto:stark@enterprisedb.com">mailto:stark@enterprisedb.com</a>]<br /> Sent: Wednesday,August 15, 2007 02:58 PM Eastern Standard Time<br /> To: Bruce Momjian<br /> Cc: PostgreSQL-development<br/> Subject: Re: [HACKERS] Another idea for index-only scans<br /><br /> "Bruce Momjian" <bruce@momjian.us>writes:<br /><br /> > I have added another idea for index-only scans to the TODO list:<br /> ><br/> >> A third idea would be for a heap scan to check if all rows are visible<br /> >> and if so seta per-table flag which can be checked by index scans.<br /> >> Any change to the table would have to clear theflag. To detect<br /> >> changes during the heap scan a counter could be set at the start and<br /> >> checked at the end --- if it is the same, the table has not been<br /> >> modified --- any table changewould increment the counter.<br /><br /> I think I would prefer to address this in the same infrastructure as the<br/> dead-space-map. That way you're not dependent on having no updates happening<br /> on the table at all. Any tupleson pages which contain no in-doubt tuples<br /> could have their visibility check skipped but when you come acrossa tuple on<br /> a page which has been modified since the last vacuum then you have to check<br /> the visibility.<br/><br /> --<br /> Gregory Stark<br /> EnterpriseDB <a href="http://www.enterprisedb.com">http://www.enterprisedb.com</a><br/><br /> ---------------------------(end of broadcast)---------------------------<br/> TIP 4: Have you searched our list archives?<br /><br /> <a href="http://archives.postgresql.org">http://archives.postgresql.org</a><br/></font>
Gregory Stark wrote: > "Bruce Momjian" <bruce@momjian.us> writes: > > > I have added another idea for index-only scans to the TODO list: > > > >> A third idea would be for a heap scan to check if all rows are visible > >> and if so set a per-table flag which can be checked by index scans. > >> Any change to the table would have to clear the flag. To detect > >> changes during the heap scan a counter could be set at the start and > >> checked at the end --- if it is the same, the table has not been > >> modified --- any table change would increment the counter. > > I think I would prefer to address this in the same infrastructure as the > dead-space-map. That way you're not dependent on having no updates happening > on the table at all. Any tuples on pages which contain no in-doubt tuples > could have their visibility check skipped but when you come across a tuple on > a page which has been modified since the last vacuum then you have to check > the visibility. Yea, the bitmap/page idea is already on the TODO list. This was just a less granular idea. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Aug 15, 2007, at 1:54 PM, Gregory Stark wrote: >>> A third idea would be for a heap scan to check if all rows are >>> visible >>> and if so set a per-table flag which can be checked by index >>> scans. >>> Any change to the table would have to clear the flag. To detect >>> changes during the heap scan a counter could be set at the >>> start and >>> checked at the end --- if it is the same, the table has not been >>> modified --- any table change would increment the counter. > > I think I would prefer to address this in the same infrastructure > as the > dead-space-map. That way you're not dependent on having no updates > happening > on the table at all. Any tuples on pages which contain no in-doubt > tuples > could have their visibility check skipped but when you come across > a tuple on > a page which has been modified since the last vacuum then you have > to check > the visibility. The advantage to Bruce's idea is that it sounds pretty simple to implement. While it wouldn't be of use for many general cases, it *would* be useful for read-only tables, ie: old partitions. -- Decibel!, aka Jim Nasby decibel@decibel.org EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote: > I have added another idea for index-only scans to the TODO list: > > > A third idea would be for a heap scan to check if all rows are visible > > and if so set a per-table flag which can be checked by index scans. > > Any change to the table would have to clear the flag. To detect > > changes during the heap scan a counter could be set at the start and > > checked at the end --- if it is the same, the table has not been > > modified --- any table change would increment the counter. > This sounds useful for read-only or read-mostly tables. However, it also sounds a little dangerous. If you test your application performance, but not thoroughly enough, you might end up with a surprise when going into production. Regards,Jeff Davis
Jeff Davis wrote: > On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote: > > I have added another idea for index-only scans to the TODO list: > > > > > A third idea would be for a heap scan to check if all rows are visible > > > and if so set a per-table flag which can be checked by index scans. > > > Any change to the table would have to clear the flag. To detect > > > changes during the heap scan a counter could be set at the start and > > > checked at the end --- if it is the same, the table has not been > > > modified --- any table change would increment the counter. > > This sounds useful for read-only or read-mostly tables. I think it's too coarse-grained to be really useful. If it was one bit per page it could work, but one bit per relation is going to be reset too frequently. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, Aug 16, 2007 at 04:06:35PM -0400, Alvaro Herrera wrote: > Jeff Davis wrote: > > On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote: > > > I have added another idea for index-only scans to the TODO list: > > > > > > > A third idea would be for a heap scan to check if all rows are visible > > > > and if so set a per-table flag which can be checked by index scans. > > > > Any change to the table would have to clear the flag. To detect > > > > changes during the heap scan a counter could be set at the start and > > > > checked at the end --- if it is the same, the table has not been > > > > modified --- any table change would increment the counter. > > > > This sounds useful for read-only or read-mostly tables. > > I think it's too coarse-grained to be really useful. If it was one bit > per page it could work, but one bit per relation is going to be reset > too frequently. Not for the most common use cases for table partitioning. -- Decibel!, aka Jim Nasby decibel@decibel.org EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Alvaro Herrera <alvherre@commandprompt.com> writes: >> On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote: >>> A third idea would be for a heap scan to check if all rows are visible >>> and if so set a per-table flag which can be checked by index scans. > I think it's too coarse-grained to be really useful. If it was one bit > per page it could work, but one bit per relation is going to be reset > too frequently. Another problem it would have is that the flag would be a single point of contention. regards, tom lane
Decibel! wrote: > The advantage to Bruce's idea is that it sounds pretty simple to > implement. While it wouldn't be of use for many general cases, it > *would* be useful for read-only tables, ie: old partitions. Wouldn't the mostcommon case by foreign key checks against tables that essentially map application enums to display strings? This is a rather common scenario. It would be nice if such tables (which are typically small) could be retained in each backend process with a simple check that the cached data is still valid. James