Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements - Mailing list pgsql-hackers

From Sergey Sargsyan
Subject Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date
Msg-id CAMAof6-4xaV3QE2ErYJaJhu6qjFn99sWyo_HQeBhHikZM3GexA@mail.gmail.com
Whole thread Raw
In response to Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Mihail Nikalayeu <mihailnikalayeu@gmail.com>)
Responses Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
List pgsql-hackers
Hey Mihail,

I've started looking at the patches today, mostly the STIR part. Seems solid, but I've got a question about validation. Why are we still grabbing tids from the main index and sorting them?

I think it's to avoid duplicate errors when adding tuples from STIP to the main index, but couldn't we just suppress that error during validation and skip the new tuple insertion if it already exists?

The main index may get huge after building, and iterating over it in a single thread and then sorting tids can be time consuming.

At least I guess one can skip it when STIP is empty. But, I think we could skip it altogether by figuring out what to do with duplicates, making concurrent and non-concurrent index creation almost identical in speed (only locking and atomicity would differ).

p.s. I noticed that `stip.c` has a lot of functions that don't follow the Postgres coding style of return type on separate line.

On Mon, Jun 16, 2025, 6:41 PM Mihail Nikalayeu <mihailnikalayeu@gmail.com> wrote:
Hello, everyone!

Rebased, patch structure and comments available here [0]. Quick
introduction poster - here [1].

Best regards,
Mikhail.

[0]: https://www.postgresql.org/message-id/flat/CADzfLwVOcZ9mg8gOG%2BKXWurt%3DMHRcqNv3XSECYoXyM3ENrxyfQ%40mail.gmail.com#52c97e004b8f628473124c05e3bf2da1
[1]: https://www.postgresql.org/message-id/attachment/176651/STIR-poster.pdf

pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: CHECKPOINT unlogged data
Next
From: Peter Geoghegan
Date:
Subject: Returning nbtree posting list TIDs in DESC order during backwards scans