Thread: Page access pattern in query plan using index scan

Page access pattern in query plan using index scan

From
Jack Orenstein
Date:
Suppose I have a table as follows:

     testdb=> \d person
                   Table "public.person"
        Column   |          Type           | Modifiers
     ------------+-------------------------+-----------
      id         | integer                 | not null
      age        | integer                 |
      other_info | character varying(1000) |
     Indexes: person_pkey primary key btree (id),
              idx_person_age btree (age)

Here is the execution plan for a not very selective query on age:

     testdb=> explain select * from person where age between 30 and 40;
                                        QUERY PLAN
     --------------------------------------------------------------------------------
      Index Scan using idx_person_age on person  (cost=0.00..17.08 rows=5 width=524)
        Index Cond: ((age >= 30) AND (age <= 40))
     (2 rows)

What is the pattern of access to data pages? I can think of two likely
answers:

1) The index is scanned for ages 30 through 40. As each index entry is
scanned, a row is retrieved.

2) The index is scanned for ages 30 and 40. The index entries are
sorted so that rows on the same page are grouped together, increasing
the odds that a needed page will be present in the shared buffers.

I'm using 7.3.4, and will be upgrading to 7.4.2 soon.

Jack Orenstein



Re: Page access pattern in query plan using index scan

From
Alvaro Herrera
Date:
On Wed, Jun 02, 2004 at 08:38:58PM -0400, Jack Orenstein wrote:

> What is the pattern of access to data pages? I can think of two likely
> answers:
>
> 1) The index is scanned for ages 30 through 40. As each index entry is
> scanned, a row is retrieved.

This one.  There have been noises about doing the second, but it's non
trivial and there's no hacker currently working on it.  Not to be
expected on the next version, I'd think.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
Licensee shall have no right to use the Licensed Software
for productive or commercial use. (Licencia de StarOffice 6.0 beta)


Re: Page access pattern in query plan using index scan

From
Jack Orenstein
Date:
Alvaro Herrera wrote:
> On Wed, Jun 02, 2004 at 08:38:58PM -0400, Jack Orenstein wrote:
>
>
>>What is the pattern of access to data pages? I can think of two likely
>>answers:
>>
>>1) The index is scanned for ages 30 through 40. As each index entry is
>>scanned, a row is retrieved.
>
>
> This one.  There have been noises about doing the second, but it's non
> trivial and there's no hacker currently working on it.  Not to be
> expected on the next version, I'd think.

Can you recommend an application-level workaround that will access pages
in the right order?

Jack Orenstein


Re: Page access pattern in query plan using index scan

From
Alvaro Herrera
Date:
On Wed, Jun 02, 2004 at 11:22:53PM -0400, Jack Orenstein wrote:
> Alvaro Herrera wrote:
> >On Wed, Jun 02, 2004 at 08:38:58PM -0400, Jack Orenstein wrote:
> >
> >
> >>What is the pattern of access to data pages? I can think of two likely
> >>answers:
> >>
> >>1) The index is scanned for ages 30 through 40. As each index entry is
> >>scanned, a row is retrieved.
> >
> >This one.  There have been noises about doing the second, but it's non
> >trivial and there's no hacker currently working on it.  Not to be
> >expected on the next version, I'd think.
>
> My naive guess about this is that you read the index entries, each one
> contains a page number, and you sort by the page number. The set of
> index entries could be large, requiring a potentially expensive sort.
> I don't know enough about query plan internals to know if this sort of
> plan can even be expressed currently, so maybe a major extension would
> be needed. And then the optimizer would have to be extended to
> consider the two approaches. Is this why you say the problem is
> non-trivial, or is there some additional set of issues that I'm
> missing?

Part of the problem is that the sort is not expressable in the current
code.  Another part is that this kind of index scan is different and
violates some assumptions of the current indexscan, so it's not best in
all scenarios.  Several planner/optimizer improvements are needed for
this to work.  This has been discussed several times on the hackers'
list.  If you want to read them, search for "bitmap indexes" on
http://www.pgsql.ru.  Note that this seems to be different from what
other RDBMSs consider a bitmap index to be.

Regarding your other question:  I don't know any application workaround.
There probably isn't any (unless you know how the tuples are sorted on
the heap, which is quite unlikely).

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"No hay cielo posible sin hundir nuestras raíces
en la profundidad de la tierra"                        (Malucha Pinto)


Re: Page access pattern in query plan using index scan

From
Greg Stark
Date:
Jack Orenstein <jao@geophile.com> writes:

> Alvaro Herrera wrote:
>
> > On Wed, Jun 02, 2004 at 08:38:58PM -0400, Jack Orenstein wrote:
> >
> > >What is the pattern of access to data pages? I can think of two likely
> > >answers:
> > >
> > >1) The index is scanned for ages 30 through 40. As each index entry is
> > >scanned, a row is retrieved.
> >
> > This one.  There have been noises about doing the second, but it's non
> > trivial and there's no hacker currently working on it.  Not to be
> > expected on the next version, I'd think.

I think the main problem is dealing with locking issues. As long as the index
scan is using the tuples it sees right away it doesn't care if someone else
changes some tuple it hasn't gotten to yet or has already seen. If you read
them all and then sort them it's possible that while you're dealing with your
local copy that someone else comes along and does something that you would
want to know about.

Actually I'm not exactly clear on what types of changes would cause problems,
and thinking about it now I don't see how this is any different than a
materialize or other plan that stores intermediate results. So there's
something I'm missing.

> Can you recommend an application-level workaround that will access pages
> in the right order?

Well you could try CLUSTER-ing your table on that index.

Alternatively you could try partitioning your table so the optimizer can use a
sequential scan to efficiently get the records you want. But there's no native
support for partitioning so you would have to roll your own entirely manually.
That would mean e.g. having tables cheques_jan, cheques_feb, cheques_mar,...
And a union of all the months. Any query that only spanned one month would use
the monthly table so it could do a sequential scan of just that month. I don't
recommend this unless you have a _lot_ of data that you often deal with in
particular chunks. Even then unless you're purging and archiving data in those
chunks it's probably still not worth it.

--
greg

Re: Page access pattern in query plan using index scan

From
Jack Orenstein
Date:
Greg Stark wrote:

>>Can you recommend an application-level workaround that will access pages
>>in the right order?
>
>
> Well you could try CLUSTER-ing your table on that index.

For my application, the problem with CLUSTER is that it reclusters the entire
table. So as the table grows, the cost of CLUSTER goes up.

What would be really nice is something like "cluster recent". The idea is to
cluster only rows added since the last cluster. This amortizes the time and storage
costs of clustering across the invocations of cluster in a much more palatable way.
If such a command were used this wisely, (e.g. whenever the table grows by 20%) then
I suspect applications would obtain most of the benefit of a sort following an index
lookup. (And I'm guessing this is a much simpler development task.)

Admittedly, this would not work so well in cases where the index keys change in value.
But this would be a huge win in situations (like mine) where the keys never change.

Jack Orenstein


Re: Page access pattern in query plan using index scan

From
Tom Lane
Date:
Jack Orenstein <jao@geophile.com> writes:
> What would be really nice is something like "cluster recent". The idea is to
> cluster only rows added since the last cluster.

And you would put them where?

            regards, tom lane