Home > mailing lists

Re: index fragmentation on insert-only table with non-unique column - Mailing list pgsql-performance

From	Peter Geoghegan
Subject	Re: index fragmentation on insert-only table with non-unique column
Date	May 25, 2016 07:16:45
Msg-id	CAH2-WznDpnJaabA1tQht5rUZRQUp3YnQ21QbA-ePZ3xLm_X7ww@mail.gmail.com Whole thread Raw
In response to	index fragmentation on insert-only table with non-unique column (Justin Pryzby <pryzby@telsasoft.com>)
Responses	Re: index fragmentation on insert-only table with non-unique column Re: index fragmentation on insert-only table with non-unique column Re: index fragmentation on insert-only table with non-unique column
List	pgsql-performance

Tree view

On Tue, May 24, 2016 at 10:39 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> I was able to see great improvement without planner parameters by REINDEX the
> timestamp index.  My theory is that the index/planner doesn't handle well the
> case of many tuples with same column value, and returns pages out of logical
> order.  Reindex fixes that, rewriting the index data with pages in order
> (confirmed with pageinspect), which causes index scans to fetch heap data more
> or less monotonically (if not consecutively).  strace shows that consecutive
> read()s are common (without intervening seeks).  I gather this allows the OS
> readahead to kick in.

The basic problem is that the B-Tree code doesn't maintain this
property. However, B-Tree index builds will create an index that
initially has this property, because the tuplesort.c code happens to
sort index tuples with a CTID tie-breaker.

> Postgres seems to assume that the high degree of correlation of the table
> column seen in pg_stats is how it will get data from the index scan, which
> assumption seems to be very poor on what turns out to be a higly fragmented
> index.  Is there a way to help it to understand otherwise??

Your complaint is vague. Are you complaining about the planner making
a poor choice? I don't think that's the issue here, because you never
made any firm statement about the planner making a choice that was
worth than an alternative that it had available.

If you're arguing for the idea that B-Trees should reliably keep
tuples in order by a tie-break condition, that seems difficult to
implement, and likely not worth it in practice.

--
Peter Geoghegan

pgsql-performance by date:

From: Justin Pryzby
Date: 24 May 2016, 20:38:26
Subject: index fragmentation on insert-only table with non-unique column

From: Stephen Frost
Date: 25 May 2016, 07:26:29
Subject: Re: index fragmentation on insert-only table with non-unique column

Re: index fragmentation on insert-only table with non-unique column - Mailing list pgsql-performance

Previous

Next