Re: Index on two columns not used

From: Péter Kovács
Subject: Re: Index on two columns not used
Date: ,
Msg-id: 453D4B37.1000006@chemaxon.hu
(view: Whole thread, Raw)
In response to: Re: Index on two columns not used  (Markus Schaber)
List: pgsql-performance

Tree view

Index on two columns not used  (Arnaud Lesauvage, )
 Re: Index on two columns not used  ("Heikki Linnakangas", )
  Re: Index on two columns not used  (Arnaud Lesauvage, )
   Re: Index on two columns not used  ("Heikki Linnakangas", )
    Re: Index on two columns not used  (Arnaud Lesauvage, )
     Re: Index on two columns not used  ("Heikki Linnakangas", )
      Re: Index on two columns not used  (Arnaud Lesauvage, )
  Re: Index on two columns not used  (Péter Kovács, )
   Re: Index on two columns not used  (Markus Schaber, )
    Re: Index on two columns not used  (Alvaro Herrera, )
     Re: Index on two columns not used  (Markus Schaber, )
      Re: Index on two columns not used  (Tom Lane, )
       Re: Index on two columns not used  (Markus Schaber, )
    Re: Index on two columns not used  (Péter Kovács, )
 Re: Index on two columns not used  (Tom Lane, )
  Re: Index on two columns not used  (Arnaud Lesauvage, )

Markus,

Thank you for your kind explanation.

Peter

Markus Schaber wrote:
> Hi, Peter,
>
> Péter Kovács wrote:
>
>> Sorry for the amateurish question, but what are "heap tuples"?
>>
>> Also, my understanding is that the following statement applies only for
>> composite indexes: "PostgreSQL can't use the values stored in the index
>> to check the join condition". I assume that PostgreSQL will be able to
>> use single-column-indexes for join conditions. Is this correct?
>>
>
> Both questions are tightly related:
>
> First, the "heap" is the part of the table where the actual tuples are
> stored.
>
> PostgreSQL uses an MVCC system, that means that multiple versions (with
> their transaction information) of a single row can coexist in the heap.
> This allows for higher concurrency in the backend.
>
> Now, the index basically stores pointers like "pages 23 and 42 contain
> rows with value 'foo'", but version information is not replicated to the
> index pages, this keeps the index' size requirements low.
>
> Additionally, in most UPDATE cases, the new row version will fit into
> the same page as the old version. In this case, the index does not have
> to be changed, which is an additional speed improvement.
>
> But when accessing the data via the index, it can only give a
> preselection of pages that contain interesting data, and PostgreSQL has
> to look into the actual heap pages to check whether there really are row
> versions that are visible in the current transaction.
>
>
> A further problem is that some GIST index types are lossy, that means
> the index does not retain the full information, but only an
> approximation, for efficiency reasons.
>
> A prominent example are the PostGIS geometry indices, they only store
> the bounding box (4 float values) instead of the whole geometry (may be
> millions of double precision coordinates). So it may be necessary to
> re-check the condition with the real data, using the lossy index for
> preselection.
>
> HTH,
> Markus
>


pgsql-performance by date:

From: Péter Kovács
Date:
Subject: Re: Index on two columns not used
From: Stuart Bishop
Date:
Subject: Re: Slow functional indexes?