Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements - Mailing list pgsql-hackers
From | Michail Nikolaev |
---|---|
Subject | Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements |
Date | |
Msg-id | CANtu0ohFr7OzNSbxqBhUpR0mXDYyt0Xt6+=Tbq0EC7as7kr+Lg@mail.gmail.com Whole thread Raw |
In response to | Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
Responses |
Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
|
List | pgsql-hackers |
Hello, Matthias!
> While waiting for this, here are some initial comments on the github diffs:
Thanks for your review!
While stress testing the POC, I found some issues unrelated to the patch that need to be fixed first.
This is [1] and [2].
>> Additional index is lightweight and does not produce any WAL.
> That doesn't seem to be what I see in the current patchset:
Persistence is passed as parameter [3] and set to RELPERSISTENCE_UNLOGGED for auxiliary indexes [4].
> - I notice you've added a new argument to
> heapam_index_build_range_scan. I think this could just as well be
> implemented by reading the indexInfo->ii_Concurrent field, as the
> values should be equivalent, right?
Not always; currently, it is set by ResetSnapshotsAllowed[5].
We fall back to regular index build if there is a predicate or expression in the index (which should be considered "safe" according to [6]).
However, we may remove this check later.
Additionally, there is no sense in resetting the snapshot if we already have an xmin assigned to the backend for some reason.
> In heapam_index_build_range_scan, it seems like you're popping the
> snapshot and registering a new one while holding a tuple from
> heap_getnext(), thus while holding a page lock. I'm not so sure that's
> OK, expecially when catalogs are also involved (specifically for
> expression indexes, where functions could potentially be updated or
> dropped if we re-create the visibility snapshot)
Yeah, good catch.
Initially, I implemented a different approach by extracting the catalog xmin to a separate horizon [7]. It might be better to return to this option.
> In heapam_index_build_range_scan, you pop the snapshot before the
> returned heaptuple is processed and passed to the index-provided
> callback. I think that's incorrect, as it'll change the visibility of
> the returned tuple before it's passed to the index's callback. I think
> the snapshot manipulation is best added at the end of the loop, if we
> add it at all in that function.
Yes, this needs to be fixed as well.
> The snapshot reset interval is quite high, at 500ms. Why did you
> configure it that low, and didn't you make this configurable?
It is just a random value for testing purposes.
I don't think there is a need to make it configurable.
Getting a new snapshot is a cheap operation now, so we can do it more often if required.
Internally, I was testing it with a 0ms interval.
> You seem to be using WAL in the STIR index, while it doesn't seem
> that relevant for the use case of auxiliary indexes that won't return
> any data and are only used on the primary. It would imply that the
> data is being sent to replicas and more data being written than
> strictly necessary, which to me seems wasteful.
It just looks like an index with WAL, but as mentioned above, it is unlogged in actual usage.
> The locking in stirinsert can probably be improved significantly if
> we use things like atomic operations on STIR pages. We'd need an
> exclusive lock only for page initialization, while share locks are
> enough if the page's data is modified without WAL. That should improve
> concurrent insert performance significantly, as it would further
> reduce the length of the exclusively locked hot path.
Hm, good idea. I'll check it later.
Best regards & thanks again,
Mikhail
[1]: https://www.postgresql.org/message-id/CANtu0ohHmYXsK5bxU9Thcq1FbELLAk0S2Zap0r8AnU3OTmcCOA%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CANtu0ojga8s9%2BJ89cAgLzn2e-bQgy3L0iQCKaCnTL%3Dppot%3Dqhw%40mail.gmail.com
[3]: https://github.com/postgres/postgres/compare/master...michail-nikolaev:postgres:new_index_concurrently_approach#diff-50abc48efcc362f0d3194aceba6969429f46fa1f07a119e555255545e6655933R93
[4]: https://github.com/michail-nikolaev/postgres/blob/e2698ca7c814a5fa5d4de8a170b7cae83034cade/src/backend/catalog/index.c#L1600
[5]: https://github.com/michail-nikolaev/postgres/blob/e2698ca7c814a5fa5d4de8a170b7cae83034cade/src/backend/catalog/index.c#L2657
[6]: https://github.com/michail-nikolaev/postgres/blob/e2698ca7c814a5fa5d4de8a170b7cae83034cade/src/backend/commands/indexcmds.c#L1129
[7]: https://github.com/postgres/postgres/commit/38b243d6cc7358a44cb1a865b919bf9633825b0c
> While waiting for this, here are some initial comments on the github diffs:
Thanks for your review!
While stress testing the POC, I found some issues unrelated to the patch that need to be fixed first.
This is [1] and [2].
>> Additional index is lightweight and does not produce any WAL.
> That doesn't seem to be what I see in the current patchset:
Persistence is passed as parameter [3] and set to RELPERSISTENCE_UNLOGGED for auxiliary indexes [4].
> - I notice you've added a new argument to
> heapam_index_build_range_scan. I think this could just as well be
> implemented by reading the indexInfo->ii_Concurrent field, as the
> values should be equivalent, right?
Not always; currently, it is set by ResetSnapshotsAllowed[5].
We fall back to regular index build if there is a predicate or expression in the index (which should be considered "safe" according to [6]).
However, we may remove this check later.
Additionally, there is no sense in resetting the snapshot if we already have an xmin assigned to the backend for some reason.
> In heapam_index_build_range_scan, it seems like you're popping the
> snapshot and registering a new one while holding a tuple from
> heap_getnext(), thus while holding a page lock. I'm not so sure that's
> OK, expecially when catalogs are also involved (specifically for
> expression indexes, where functions could potentially be updated or
> dropped if we re-create the visibility snapshot)
Yeah, good catch.
Initially, I implemented a different approach by extracting the catalog xmin to a separate horizon [7]. It might be better to return to this option.
> In heapam_index_build_range_scan, you pop the snapshot before the
> returned heaptuple is processed and passed to the index-provided
> callback. I think that's incorrect, as it'll change the visibility of
> the returned tuple before it's passed to the index's callback. I think
> the snapshot manipulation is best added at the end of the loop, if we
> add it at all in that function.
Yes, this needs to be fixed as well.
> The snapshot reset interval is quite high, at 500ms. Why did you
> configure it that low, and didn't you make this configurable?
It is just a random value for testing purposes.
I don't think there is a need to make it configurable.
Getting a new snapshot is a cheap operation now, so we can do it more often if required.
Internally, I was testing it with a 0ms interval.
> You seem to be using WAL in the STIR index, while it doesn't seem
> that relevant for the use case of auxiliary indexes that won't return
> any data and are only used on the primary. It would imply that the
> data is being sent to replicas and more data being written than
> strictly necessary, which to me seems wasteful.
It just looks like an index with WAL, but as mentioned above, it is unlogged in actual usage.
> The locking in stirinsert can probably be improved significantly if
> we use things like atomic operations on STIR pages. We'd need an
> exclusive lock only for page initialization, while share locks are
> enough if the page's data is modified without WAL. That should improve
> concurrent insert performance significantly, as it would further
> reduce the length of the exclusively locked hot path.
Hm, good idea. I'll check it later.
Best regards & thanks again,
Mikhail
[1]: https://www.postgresql.org/message-id/CANtu0ohHmYXsK5bxU9Thcq1FbELLAk0S2Zap0r8AnU3OTmcCOA%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CANtu0ojga8s9%2BJ89cAgLzn2e-bQgy3L0iQCKaCnTL%3Dppot%3Dqhw%40mail.gmail.com
[3]: https://github.com/postgres/postgres/compare/master...michail-nikolaev:postgres:new_index_concurrently_approach#diff-50abc48efcc362f0d3194aceba6969429f46fa1f07a119e555255545e6655933R93
[4]: https://github.com/michail-nikolaev/postgres/blob/e2698ca7c814a5fa5d4de8a170b7cae83034cade/src/backend/catalog/index.c#L1600
[5]: https://github.com/michail-nikolaev/postgres/blob/e2698ca7c814a5fa5d4de8a170b7cae83034cade/src/backend/catalog/index.c#L2657
[6]: https://github.com/michail-nikolaev/postgres/blob/e2698ca7c814a5fa5d4de8a170b7cae83034cade/src/backend/commands/indexcmds.c#L1129
[7]: https://github.com/postgres/postgres/commit/38b243d6cc7358a44cb1a865b919bf9633825b0c
pgsql-hackers by date: