Thread: BUG #18166: 100 Gb 18000000 records table update
The following bug has been logged on the website:

Bug reference:      18166
Logged by:          Ruslan Ganeev
Email address:      ruslan.ganeev@list.ru
PostgreSQL version: 12.6
Operating system:   CentOS 8
Description:

Greetings. Could you help us with the following? We have a table named
«svjul» that contains 18 million records and occupies 100 GB of disk
space. The table «svjul» contains 3 columns:
- ID (Type: integer) - the key
- enddate (Type: date)
- Data (Type: jsonb) - this column contains the fields «OGRN» and
  «DataVip». The value of «OGRN» is not unique: several records may
  contain the same OGRN.

We wrote a script that sets enddate = '2022-12-31' for every record
whose «DataVip» value is not the maximum for its OGRN; for the other
records the script sets enddate = null.

The problem is that the script runs for 6 hours, and most of that time
is spent updating the indexes. Could you tell us whether there is any
way to update the records without rebuilding the indexes, or whether
there is some programming technique for solving this kind of problem?

Sorry for my English, I'm not a native speaker.

Best regards

-- Our script --

-- 0 minutes
drop table if exists tempsvjul1;
create table tempsvjul1 (
    id bigint not null,
    code text not null,
    datapostuch text,
    datavip text,
    constraint tempsvjul1_pkey primary key (id)
);

-- 30 minutes
insert into tempsvjul1 (id, code, datapostuch, datavip)
select id,
       data ->> 'OGRN',
       data -> 'SvUchetNO' ->> 'DataPostUch',
       data ->> 'DataVip'
from svjul;

-- 1 minute
create index idx_deduplicate_tempsvjul1_btree_codedpostdvipid
    on tempsvjul1 using btree (code, datapostuch desc, datavip desc, id desc);

-- 0 minutes
drop table if exists tempsvjul2;
create table tempsvjul2 (
    id bigint not null,
    constraint tempsvjul2_pkey primary key (id)
);

-- 2 minutes: keep, per OGRN, the id of the newest record
insert into tempsvjul2 (id)
select distinct on (code) id
from tempsvjul1
order by code, datapostuch desc, datavip desc, id desc;

-- 4 minutes without indexes, 6 hours with indexes
update svjul
set code = coalesce((select tempsvjul1.code
                     from tempsvjul1
                     where tempsvjul1.id = svjul.id), '???'),
    enddate = case when exists (select 1
                                from tempsvjul2
                                where tempsvjul2.id = svjul.id)
                   then null
                   else '2023-10-05'::date
              end;

-- 0 minutes
drop table tempsvjul1;
drop table tempsvjul2;
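As an aside on the script itself: the two correlated subqueries in the
final UPDATE can be folded into a single join pass. A sketch, assuming
(as the script guarantees) that every svjul.id has a row in tempsvjul1,
since tempsvjul1 is populated from svjul itself:

-- Same effect as the script's final UPDATE, but with one join pass
-- instead of two subquery probes per row.
update svjul
set    code    = coalesce(t1.code, '???'),
       enddate = case when t2.id is not null  -- id survived deduplication
                      then null
                      else '2023-10-05'::date
                 end
from   tempsvjul1 t1
       left join tempsvjul2 t2 on t2.id = t1.id
where  t1.id = svjul.id;

Note this only speeds up the lookup side; the 6-hour cost comes from
maintaining the indexes on svjul for 18 million new row versions, which
this rewrite does not change.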
PG Bug reporting form <noreply@postgresql.org> writes:
> We wrote a script that sets enddate = '2022-12-31' for every record
> whose «DataVip» value is not the maximum for its OGRN; for the other
> records the script sets enddate = null.

> The problem is that the script runs for 6 hours, and most of that time
> is spent updating the indexes.

This is not a bug.  However ... a common workaround for bulk updates
like that is to drop all the table's indexes and then recreate them
afterwards.  It's often quicker than doing row-by-row index updates.

			regards, tom lane
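For instance, a minimal sketch of that drop-and-recreate approach; the
secondary index names below are hypothetical, since the thread never
says which indexes exist on «svjul»:

-- Capture the full CREATE INDEX statements first, so the indexes can
-- be recreated exactly (pg_indexes.indexdef holds the definition).
select indexdef
from pg_indexes
where schemaname = 'public'
  and tablename  = 'svjul'
  and indexname <> 'svjul_pkey';   -- keep the primary key for lookups

-- Hypothetical secondary indexes; substitute the real names.
drop index if exists idx_svjul_enddate;
drop index if exists idx_svjul_data;

-- The bulk UPDATE now maintains only the primary-key index.
update svjul
set enddate = case when exists (select 1
                                from tempsvjul2
                                where tempsvjul2.id = svjul.id)
                   then null
                   else '2023-10-05'::date
              end;

-- One sequential build per index instead of 18 million row-by-row
-- index insertions.
create index idx_svjul_enddate on svjul (enddate);
create index idx_svjul_data on svjul using gin (data);

If writes must continue while the indexes are rebuilt, CREATE INDEX
CONCURRENTLY avoids the write lock, at the cost of a slower build.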
Thank you very much
--
Sent from Mail.ru for Android