Home > mailing lists

Building lots of indices in parallel - Mailing list pgsql-general

From	Ron
Subject	Building lots of indices in parallel
Date	February 12, 2021 06:31:40
Msg-id	7fa62c6a-dfab-4d0e-17cc-e4030f83fcc9@gmail.com Whole thread Raw
List	pgsql-general

Tree view

We're migrating a big database (200ish tables 30 of which have bytea or xml 
fields; the Oracle size is 10TB) to Postgresql 12.5 (RDS, if it matters), 
and after the loads are completed, it will be time to create the secondary 
indices (tables currently only have PK constraints), and they of course need 
to be built as fast as possible.

The first things to do are based in Postgres itself:
ALTER TABLE foo SET (parallel_workers = 4);
SET max_parallel_maintenance_workers TO 4;
CREATE INDEX foo_ind2 ON foo (f1);

That creates one index very quickly, but only one index at a time, wasting 
the other CPUs (there are 48 in the VM, and 384GB RAM).

What is the impact on locking/contention of using something like GNU 
Parallel (especially when it tries to create multiple indices on the same 
table in parallel)?

Is there a better way to quickly create lots of indices?

-- 
Angular momentum makes the world go 'round.

pgsql-general by date:

From: Martin Ritchie
Date: 12 February 2021, 05:42:35
Subject: Re: Encryption of Data Specific to a Tenant in PostgreSQL database | General Idea

From: Matthias Apitz
Date: 12 February 2021, 13:24:37
Subject: kernel.shmmax and kernel.shmall for Linux server

Building lots of indices in parallel - Mailing list pgsql-general

Previous

Next