Markus,
Thank you for your kind explanation.
Peter
Markus Schaber wrote:
> Hi, Peter,
>
> Péter Kovács wrote:
>
>> Sorry for the amateurish question, but what are "heap tuples"?
>>
>> Also, my understanding is that the following statement applies only for
>> composite indexes: "PostgreSQL can't use the values stored in the index
>> to check the join condition". I assume that PostgreSQL will be able to
>> use single-column-indexes for join conditions. Is this correct?
>>
>
> Both questions are tightly related:
>
> First, the "heap" is the part of the table where the actual tuples are
> stored.
>
> PostgreSQL uses an MVCC system, that means that multiple versions (with
> their transaction information) of a single row can coexist in the heap.
> This allows for higher concurrency in the backend.
>
> Now, the index basically stores pointers like "pages 23 and 42 contain
> rows with value 'foo'", but version information is not replicated to the
> index pages, this keeps the index' size requirements low.
>
> Additionally, in most UPDATE cases, the new row version will fit into
> the same page as the old version. In this case, the index does not have
> to be changed, which is an additional speed improvement.
>
> But when accessing the data via the index, it can only give a
> preselection of pages that contain interesting data, and PostgreSQL has
> to look into the actual heap pages to check whether there really are row
> versions that are visible in the current transaction.
>
>
> A further problem is that some GIST index types are lossy, that means
> the index does not retain the full information, but only an
> approximation, for efficiency reasons.
>
> A prominent example are the PostGIS geometry indices, they only store
> the bounding box (4 float values) instead of the whole geometry (may be
> millions of double precision coordinates). So it may be necessary to
> re-check the condition with the real data, using the lossy index for
> preselection.
>
> HTH,
> Markus
>