Thread: Another idea for index-only scans

Another idea for index-only scans

From
Bruce Momjian
Date:
I have added another idea for index-only scans to the TODO list:

>   A third idea would be for a heap scan to check if all rows are visible
>   and if so set a per-table flag which can be checked by index scans.
>   Any change to the table would have to clear the flag.  To detect
>   changes during the heap scan a counter could be set at the start and
>   checked at the end --- if it is the same, the table has not been
>   modified --- any table change would increment the counter.

--  Bruce Momjian  <bruce@momjian.us>          http://momjian.us EnterpriseDB
http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Another idea for index-only scans

From
"Mike Rylander"
Date:
On 8/15/07, Bruce Momjian <bruce@momjian.us> wrote:
> I have added another idea for index-only scans to the TODO list:
>
> >   A third idea would be for a heap scan to check if all rows are visible
> >   and if so set a per-table flag which can be checked by index scans.
> >   Any change to the table would have to clear the flag.  To detect
> >   changes during the heap scan a counter could be set at the start and
> >   checked at the end --- if it is the same, the table has not been
> >   modified --- any table change would increment the counter.

Perhaps this is naive (or discussed and discarded... if so, I couldn't
find it, but I apologize if that's the case), but wouldn't recording
the xid of non-readonly transactions, at commit time, and at the table
level, be equivalent to the flag and remove the need for a counter?
Readers could just check the last-modification-xid at the beginning
and end of their scans to test for heap stability.

I suppose that would require a write-exclusive lock on some metadata
for each modified table during each commit... so perhaps it's a
non-starter right there.

--miker


Re: Another idea for index-only scans

From
Josh Berkus
Date:
Bruce,

> I have added another idea for index-only scans to the TODO list:
> >   A third idea would be for a heap scan to check if all rows are
> > visible and if so set a per-table flag which can be checked by index
> > scans. Any change to the table would have to clear the flag.  To
> > detect changes during the heap scan a counter could be set at the
> > start and checked at the end --- if it is the same, the table has not
> > been modified --- any table change would increment the counter.

Seems marginal at best.  Checking overlap between the index and the FSM/DSM 
and only check dirty pages seems more intelligent, and able to cover a 
larger number of cases.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco


Re: Another idea for index-only scans

From
Gregory Stark
Date:
"Bruce Momjian" <bruce@momjian.us> writes:

> I have added another idea for index-only scans to the TODO list:
>
>>   A third idea would be for a heap scan to check if all rows are visible
>>   and if so set a per-table flag which can be checked by index scans.
>>   Any change to the table would have to clear the flag.  To detect
>>   changes during the heap scan a counter could be set at the start and
>>   checked at the end --- if it is the same, the table has not been
>>   modified --- any table change would increment the counter.

I think I would prefer to address this in the same infrastructure as the
dead-space-map. That way you're not dependent on having no updates happening
on the table at all. Any tuples on pages which contain no in-doubt tuples
could have their visibility check skipped but when you come across a tuple on
a page which has been modified since the last vacuum then you have to check
the visibility.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Re: Another idea for index-only scans

From
"Luke Lonergan"
Date:
<p><font size="2">A hybrid scan approach combined with this idea would fit nicely - provide results for tids that are
directlyvisible and set a bit in a bitmap for those that need recheck and extend recheck to take a bitmap (wait - it
alreadydoes :-)<br /><br /> - Luke<br /><br /> Msg is shrt cuz m on ma treo<br /><br />  -----Original Message-----<br
/>From:   Gregory Stark [<a href="mailto:stark@enterprisedb.com">mailto:stark@enterprisedb.com</a>]<br /> Sent:  
Wednesday,August 15, 2007 02:58 PM Eastern Standard Time<br /> To:     Bruce Momjian<br /> Cc:    
PostgreSQL-development<br/> Subject:        Re: [HACKERS] Another idea for index-only scans<br /><br /> "Bruce Momjian"
<bruce@momjian.us>writes:<br /><br /> > I have added another idea for index-only scans to the TODO list:<br />
><br/> >>   A third idea would be for a heap scan to check if all rows are visible<br /> >>   and if so
seta per-table flag which can be checked by index scans.<br /> >>   Any change to the table would have to clear
theflag.  To detect<br /> >>   changes during the heap scan a counter could be set at the start and<br />
>>  checked at the end --- if it is the same, the table has not been<br /> >>   modified --- any table
changewould increment the counter.<br /><br /> I think I would prefer to address this in the same infrastructure as
the<br/> dead-space-map. That way you're not dependent on having no updates happening<br /> on the table at all. Any
tupleson pages which contain no in-doubt tuples<br /> could have their visibility check skipped but when you come
acrossa tuple on<br /> a page which has been modified since the last vacuum then you have to check<br /> the
visibility.<br/><br /> --<br />   Gregory Stark<br />   EnterpriseDB          <a
href="http://www.enterprisedb.com">http://www.enterprisedb.com</a><br/><br /> ---------------------------(end of
broadcast)---------------------------<br/> TIP 4: Have you searched our list archives?<br /><br />                <a
href="http://archives.postgresql.org">http://archives.postgresql.org</a><br/></font> 

Re: Another idea for index-only scans

From
Bruce Momjian
Date:
Gregory Stark wrote:
> "Bruce Momjian" <bruce@momjian.us> writes:
> 
> > I have added another idea for index-only scans to the TODO list:
> >
> >>   A third idea would be for a heap scan to check if all rows are visible
> >>   and if so set a per-table flag which can be checked by index scans.
> >>   Any change to the table would have to clear the flag.  To detect
> >>   changes during the heap scan a counter could be set at the start and
> >>   checked at the end --- if it is the same, the table has not been
> >>   modified --- any table change would increment the counter.
> 
> I think I would prefer to address this in the same infrastructure as the
> dead-space-map. That way you're not dependent on having no updates happening
> on the table at all. Any tuples on pages which contain no in-doubt tuples
> could have their visibility check skipped but when you come across a tuple on
> a page which has been modified since the last vacuum then you have to check
> the visibility.

Yea, the bitmap/page idea is already on the TODO list.  This was just a
less granular idea.

--  Bruce Momjian  <bruce@momjian.us>          http://momjian.us EnterpriseDB
http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Another idea for index-only scans

From
Decibel!
Date:
On Aug 15, 2007, at 1:54 PM, Gregory Stark wrote:
>>>   A third idea would be for a heap scan to check if all rows are  
>>> visible
>>>   and if so set a per-table flag which can be checked by index  
>>> scans.
>>>   Any change to the table would have to clear the flag.  To detect
>>>   changes during the heap scan a counter could be set at the  
>>> start and
>>>   checked at the end --- if it is the same, the table has not been
>>>   modified --- any table change would increment the counter.
>
> I think I would prefer to address this in the same infrastructure  
> as the
> dead-space-map. That way you're not dependent on having no updates  
> happening
> on the table at all. Any tuples on pages which contain no in-doubt  
> tuples
> could have their visibility check skipped but when you come across  
> a tuple on
> a page which has been modified since the last vacuum then you have  
> to check
> the visibility.

The advantage to Bruce's idea is that it sounds pretty simple to  
implement. While it wouldn't be of use for many general cases, it  
*would* be useful for read-only tables, ie: old partitions.
-- 
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)




Re: Another idea for index-only scans

From
Jeff Davis
Date:
On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:
> I have added another idea for index-only scans to the TODO list:
> 
> >   A third idea would be for a heap scan to check if all rows are visible
> >   and if so set a per-table flag which can be checked by index scans.
> >   Any change to the table would have to clear the flag.  To detect
> >   changes during the heap scan a counter could be set at the start and
> >   checked at the end --- if it is the same, the table has not been
> >   modified --- any table change would increment the counter.
> 

This sounds useful for read-only or read-mostly tables.

However, it also sounds a little dangerous. If you test your application
performance, but not thoroughly enough, you might end up with a surprise
when going into production.

Regards,Jeff Davis



Re: Another idea for index-only scans

From
Alvaro Herrera
Date:
Jeff Davis wrote:
> On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:
> > I have added another idea for index-only scans to the TODO list:
> > 
> > >   A third idea would be for a heap scan to check if all rows are visible
> > >   and if so set a per-table flag which can be checked by index scans.
> > >   Any change to the table would have to clear the flag.  To detect
> > >   changes during the heap scan a counter could be set at the start and
> > >   checked at the end --- if it is the same, the table has not been
> > >   modified --- any table change would increment the counter.
> 
> This sounds useful for read-only or read-mostly tables.

I think it's too coarse-grained to be really useful.  If it was one bit
per page it could work, but one bit per relation is going to be reset
too frequently.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Another idea for index-only scans

From
Decibel!
Date:
On Thu, Aug 16, 2007 at 04:06:35PM -0400, Alvaro Herrera wrote:
> Jeff Davis wrote:
> > On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:
> > > I have added another idea for index-only scans to the TODO list:
> > >
> > > >   A third idea would be for a heap scan to check if all rows are visible
> > > >   and if so set a per-table flag which can be checked by index scans.
> > > >   Any change to the table would have to clear the flag.  To detect
> > > >   changes during the heap scan a counter could be set at the start and
> > > >   checked at the end --- if it is the same, the table has not been
> > > >   modified --- any table change would increment the counter.
> >
> > This sounds useful for read-only or read-mostly tables.
>
> I think it's too coarse-grained to be really useful.  If it was one bit
> per page it could work, but one bit per relation is going to be reset
> too frequently.

Not for the most common use cases for table partitioning.
--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: Another idea for index-only scans

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
>> On Wed, 2007-08-15 at 11:54 -0400, Bruce Momjian wrote:
>>> A third idea would be for a heap scan to check if all rows are visible
>>> and if so set a per-table flag which can be checked by index scans.

> I think it's too coarse-grained to be really useful.  If it was one bit
> per page it could work, but one bit per relation is going to be reset
> too frequently.

Another problem it would have is that the flag would be a single point
of contention.
        regards, tom lane


Re: Another idea for index-only scans

From
James Mansion
Date:
Decibel! wrote:
> The advantage to Bruce's idea is that it sounds pretty simple to 
> implement. While it wouldn't be of use for many general cases, it 
> *would* be useful for read-only tables, ie: old partitions.

Wouldn't the mostcommon case by foreign key checks against tables that 
essentially map application enums to display strings?  This is a rather 
common scenario.  It would be nice if such tables (which are typically 
small) could be retained in each backend process with a simple check 
that the cached data is still valid.

James