Home > mailing lists

Re: Index on two columns not used - Mailing list pgsql-performance

From	Péter Kovács
Subject	Re: Index on two columns not used
Date	October 23, 2006 20:07:59
Msg-id	453D4B37.1000006@chemaxon.hu Whole thread
In response to	Re: Index on two columns not used (Markus Schaber <schabi@logix-tt.com>)
List	pgsql-performance

Tree view

Markus,

Thank you for your kind explanation.

Peter

Markus Schaber wrote:
> Hi, Peter,
>
> Péter Kovács wrote:
>
>> Sorry for the amateurish question, but what are "heap tuples"?
>>
>> Also, my understanding is that the following statement applies only for
>> composite indexes: "PostgreSQL can't use the values stored in the index
>> to check the join condition". I assume that PostgreSQL will be able to
>> use single-column-indexes for join conditions. Is this correct?
>>
>
> Both questions are tightly related:
>
> First, the "heap" is the part of the table where the actual tuples are
> stored.
>
> PostgreSQL uses an MVCC system, that means that multiple versions (with
> their transaction information) of a single row can coexist in the heap.
> This allows for higher concurrency in the backend.
>
> Now, the index basically stores pointers like "pages 23 and 42 contain
> rows with value 'foo'", but version information is not replicated to the
> index pages, this keeps the index' size requirements low.
>
> Additionally, in most UPDATE cases, the new row version will fit into
> the same page as the old version. In this case, the index does not have
> to be changed, which is an additional speed improvement.
>
> But when accessing the data via the index, it can only give a
> preselection of pages that contain interesting data, and PostgreSQL has
> to look into the actual heap pages to check whether there really are row
> versions that are visible in the current transaction.
>
>
> A further problem is that some GIST index types are lossy, that means
> the index does not retain the full information, but only an
> approximation, for efficiency reasons.
>
> A prominent example are the PostGIS geometry indices, they only store
> the bounding box (4 float values) instead of the whole geometry (may be
> millions of double precision coordinates). So it may be necessary to
> re-check the condition with the real data, using the lossy index for
> preselection.
>
> HTH,
> Markus
>

pgsql-performance by date:

From: "Craig A. James"
Date: 23 October 2006, 19:38:50
Subject: Re: Best COPY Performance

From: Stuart Bishop
Date: 24 October 2006, 04:01:31
Subject: Re: Slow functional indexes?

Re: Index on two columns not used - Mailing list pgsql-performance

Previous

Next