Thread: Index recreation details with REINDEX TABLE CONCURRENTLY
Hello, We're working on setting up some regular jobs to reindex tables where we wind up generating a lot of index bloat. We're planning on using REINDEX ... CONCURRENTLY. We'd like to reindex all of the indexes on the tables. In some of our environments, these tables are very large and under high load, and we want to minimize the resource consumption of index recreation if possible. With that in mind, my question is: does REINDEX TABLE CONCURRENTLY operate on the indexes of the table in parallel, or sequentially? If in parallel, I imagine we would see less DB resource utilization by updating one index at a time. Thanks in advance for any help you can provide. Best, Matthew
On 5/10/23 10:04, Matthew Planchard wrote: > Hello, > > We're working on setting up some regular jobs to reindex tables where we > wind up generating a lot of index bloat. We're planning on using REINDEX > ... CONCURRENTLY. We'd like to reindex all of the indexes on the tables. > > In some of our environments, these tables are very large and under high > load, and we want to minimize the resource consumption of index > recreation if possible. > > With that in mind, my question is: does REINDEX TABLE CONCURRENTLY > operate on the indexes of the table in parallel, or sequentially? If > in parallel, I imagine we would see less DB resource utilization by > updating one index at a time. https://www.postgresql.org/docs/11/sql-createindex.html The CREATE INDEX page certainly mentions parallel index builds. It would definitely be very good to know whether REINDEX TABLE CONCURRENTLY can have multiple "threads" leveraging table data already in cache so as not to have to reread the whole table each time. That (concurrent CREATE INDEX statements) really sped things up Back In The Day). -- Born in Arizona, moved to Babylonia.
How do I get off this list? Respectfully, James Fraley Anima Sana in Corpore Sano On Wednesday, May 10, 2023 at 01:59:37 PM EDT, Matthew Planchard <matthew@specprotected.com> wrote: Hello, We're working on setting up some regular jobs to reindex tables where we wind up generating a lot of index bloat. We're planning on using REINDEX ... CONCURRENTLY. We'd like to reindex all of the indexes on the tables. In some of our environments, these tables are very large and under high load, and we want to minimize the resource consumption of index recreation if possible. With that in mind, my question is: does REINDEX TABLE CONCURRENTLY operate on the indexes of the table in parallel, or sequentially? If in parallel, I imagine we would see less DB resource utilization by updating one index at a time. Thanks in advance for any help you can provide. Best, Matthew
On Wed, 2023-05-10 at 10:04 -0500, Matthew Planchard wrote: > We're working on setting up some regular jobs to reindex tables where we > wind up generating a lot of index bloat. We're planning on using REINDEX > ... CONCURRENTLY. We'd like to reindex all of the indexes on the tables. > > In some of our environments, these tables are very large and under high > load, and we want to minimize the resource consumption of index > recreation if possible. > > With that in mind, my question is: does REINDEX TABLE CONCURRENTLY > operate on the indexes of the table in parallel, or sequentially? If > in parallel, I imagine we would see less DB resource utilization by > updating one index at a time. If you use REINDEX TABLE CONCURRENTLY, the indexes will be built one after the other. Set "max_parallel_maintenance_workers" to 0 to keep the resource utilization low (at the price of a longer duration). Yours, Laurenz Albe
Hi James,
If you have an account, go to: https://lists.postgresql.org/manage/
Click "unsubscribe" next to any/all mailing lists that you wish to unsubscribe from.
If you don't have an account, the instructions (per email client) are here:
-Katie
From: James Fraley <jamesfraley@yahoo.com>
Sent: Wednesday, May 10, 2023 5:01:18 PM
To: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>; Matthew Planchard <matthew@specprotected.com>
Subject: Re: Index recreation details with REINDEX TABLE CONCURRENTLY
Sent: Wednesday, May 10, 2023 5:01:18 PM
To: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>; Matthew Planchard <matthew@specprotected.com>
Subject: Re: Index recreation details with REINDEX TABLE CONCURRENTLY
Attention : courriel externe | external email
How do I get off this list?
Respectfully,
James Fraley
Anima Sana in Corpore Sano
On Wednesday, May 10, 2023 at 01:59:37 PM EDT, Matthew Planchard <matthew@specprotected.com> wrote:
Hello,
We're working on setting up some regular jobs to reindex tables where we
wind up generating a lot of index bloat. We're planning on using REINDEX
... CONCURRENTLY. We'd like to reindex all of the indexes on the tables.
In some of our environments, these tables are very large and under high
load, and we want to minimize the resource consumption of index
recreation if possible.
With that in mind, my question is: does REINDEX TABLE CONCURRENTLY
operate on the indexes of the table in parallel, or sequentially? If
in parallel, I imagine we would see less DB resource utilization by
updating one index at a time.
Thanks in advance for any help you can provide.
Best,
Matthew
How do I get off this list?
Respectfully,
James Fraley
Anima Sana in Corpore Sano
On Wednesday, May 10, 2023 at 01:59:37 PM EDT, Matthew Planchard <matthew@specprotected.com> wrote:
Hello,
We're working on setting up some regular jobs to reindex tables where we
wind up generating a lot of index bloat. We're planning on using REINDEX
... CONCURRENTLY. We'd like to reindex all of the indexes on the tables.
In some of our environments, these tables are very large and under high
load, and we want to minimize the resource consumption of index
recreation if possible.
With that in mind, my question is: does REINDEX TABLE CONCURRENTLY
operate on the indexes of the table in parallel, or sequentially? If
in parallel, I imagine we would see less DB resource utilization by
updating one index at a time.
Thanks in advance for any help you can provide.
Best,
Matthew
That’s great, thank you very much!
On Thu, May 11, 2023 at 11:09 Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Wed, 2023-05-10 at 10:04 -0500, Matthew Planchard wrote:
> We're working on setting up some regular jobs to reindex tables where we
> wind up generating a lot of index bloat. We're planning on using REINDEX
> ... CONCURRENTLY. We'd like to reindex all of the indexes on the tables.
>
> In some of our environments, these tables are very large and under high
> load, and we want to minimize the resource consumption of index
> recreation if possible.
>
> With that in mind, my question is: does REINDEX TABLE CONCURRENTLY
> operate on the indexes of the table in parallel, or sequentially? If
> in parallel, I imagine we would see less DB resource utilization by
> updating one index at a time.
If you use REINDEX TABLE CONCURRENTLY, the indexes will be built one
after the other. Set "max_parallel_maintenance_workers" to 0 to keep
the resource utilization low (at the price of a longer duration).
Yours,
Laurenz Albe