Re: Why does CREATE INDEX CONCURRENTLY need two scans? - Mailing list pgsql-general

From	Michael Paquier
Subject	Re: Why does CREATE INDEX CONCURRENTLY need two scans?
Date	April 1, 2015 02:08:44
Msg-id	CAB7nPqSWkNm0UveY6xnr=cn4X9LS469NdHiG40P2XeH7VfHxOA@mail.gmail.com
In response to	Why does CREATE INDEX CONCURRENTLY need two scans? (Joshua Ma <josh@benchling.com>)
Responses	Re: Why does CREATE INDEX CONCURRENTLY need two scans? Re: Why does CREATE INDEX CONCURRENTLY need two scans?
List	pgsql-general

On Wed, Apr 1, 2015 at 9:43 AM, Joshua Ma <josh@benchling.com> wrote:

> Hi all,
>
> I was curious about why CONCURRENTLY needs two scans to complete - from the documentation on HOT (access/heap/README.HOT), it looks like the process is:
>
> 1) insert pg_index entry, wait for relevant in-progress txns to finish (before marking index open for inserts, so HOT updates won't write incorrect index entries)
> 2) build index in 1st snapshot, mark index open for inserts
> 3) in 2nd snapshot, validate index and insert missing tuples since first snapshot, mark index valid for searches
>
> Why are two scans necessary? What would break if it did something like the following?
>
> 1) insert pg_index entry, wait for relevant txns to finish, mark index open for inserts
> 2) build index in a single snapshot, mark index valid for searches
>
> Wouldn't new inserts update the index correctly? Between the snapshot and index-updating txns afterwards, wouldn't all updates be covered?

When an index is built with index_build, only the tuples visible at the start of the first scan are included in the index. A second scan is needed to add index entries for the tuples that were inserted into the table during the build phase.
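
To illustrate the point, here is a toy model in Python. It is only a sketch and not PostgreSQL's implementation: the table and index are plain dicts, and the names build_index and validate_index are made up for this example. It assumes, as in the three-step process quoted above, that the index is not yet open for inserts while the first scan runs.

```python
def build_index(snapshot_rows):
    """First scan: index only the rows visible in the build snapshot."""
    return dict(snapshot_rows)


def validate_index(index, table):
    """Second scan: add entries for rows that the index is missing."""
    missing = [row_id for row_id in table if row_id not in index]
    for row_id in missing:
        index[row_id] = table[row_id]
    return missing


# Toy table: row id -> indexed key.
table = {1: "a", 2: "b"}

# Snapshot taken at the start of the first scan.
snapshot = dict(table)

# A concurrent insert lands while the build is running. The index is not
# open for inserts yet, so nothing adds an entry for this row.
table[3] = "c"

index = build_index(snapshot)
row_3_indexed_after_build = 3 in index  # False: row 3 is not in the snapshot

# The second scan picks up what the first one could not see.
missing = validate_index(index, table)
```

After the first scan the index has no entry for row 3. Only the second pass brings the index back in line with the table.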
--

Michael
