Thread: BUG #17586: Look like there are an another bug in REINDEX INDEX CONCURRENTLY of pg_toast indexes
BUG #17586: Look like there are an another bug in REINDEX INDEX CONCURRENTLY of pg_toast indexes
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 17586 Logged by: Maxim Boguk Email address: maxim.boguk@gmail.com PostgreSQL version: 14.4 Operating system: Ubuntu Linux Description: Details about the data structure pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w partition of exports_new table (Partition key: RANGE (epoch)) During the reindex job there were quite a lot new inserts into exports_new_2022_32w table via partition head. 2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local] [vxid:12/63545267 txid:354388805] [INSERT] LOG: duration: 2711.995 ms statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch, field_ids) (SELECT field_board_id, (field_id % 256)::int field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids) t GROUP BY 1,2,3) 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY pg_toast.pg_toast_2502234687_index 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/1.0", size 8192 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY pg_toast.pg_toast_2502234687_index 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/0.0", size 8192 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY pg_toast.pg_toast_2502234687_index 2022-08-11 02:47:28.965 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp1187482.0.sharedfileset/2.0", size 252370944 2022-08-11 02:47:28.965 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY pg_toast.pg_toast_2502234687_index 2022-08-11 02:47:29.003 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp1187482.0.sharedfileset/1.0", size 253599744 2022-08-11 02:47:29.003 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY pg_toast.pg_toast_2502234687_index 2022-08-11 02:47:29.041 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp1187482.0.sharedfileset/0.0", size 251535360 2022-08-11 02:47:29.041 UTC 1187482 postgres@***_exports from [local] [vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY pg_toast.pg_toast_2502234687_index 2022-08-11 02:47:30.023 UTC 1175180 ***_push@***_exports from [local] [vxid:12/63545272 txid:354389107] [INSERT] LOG: duration: 1991.666 ms statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch, field_ids) (SELECT field_board_id, (field_id % 256)::int field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids) t GROUP BY 1,2,3) 2022-08-11 02:47:55.962 UTC 1175180 ***_push@***_exports from [local] [vxid:12/63545274 txid:354389164] [INSERT] LOG: duration: 4027.248 ms statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch, field_ids) (SELECT field_board_id, (field_id % 256)::int field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids) t GROUP BY 1,2,3) 2022-08-11 02:48:02.883 UTC 1175180 ***_push@***_exports from [local] [vxid:12/63545275 txid:354389277] [INSERT] LOG: duration: 1055.236 ms statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch, field_ids) (SELECT field_board_id, (field_id % 256)::int field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids) t GROUP BY 1,2,3) 2022-08-11 02:48:24.914 UTC 1175180 ***_push@***_exports from [local] [vxid:12/63545278 txid:354389390] [INSERT] LOG: duration: 2020.759 ms statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch, field_ids) (SELECT field_board_id, (field_id % 256)::int field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids) t GROUP BY 1,2,3) 2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from [local] [vxid:10/63478386 txid:0] [SELECT] ERROR: missing chunk number 0 for toast value 2546878653 in pg_toast_2502234687 2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from [local] [vxid:10/63478386 txid:0] [SELECT] STATEMENT: SELECT field_ids FROM exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND epoch = $3 2022-08-11 02:48:49.651 UTC 1187482 postgres@***_exports from [local] [vxid:14/0 txid:0] [REINDEX] LOG: duration: 187669.153 ms statement: REINDEX INDEX CONCURRENTLY pg_toast.pg_toast_2502234687_index 2022-08-11 02:48:50.048 UTC 1184504 ***_field_sync@***_exports from [local] [vxid:10/63478398 txid:0] [SELECT] ERROR: missing chunk number 0 for toast value 2546878502 in pg_toast_2502234687 2022-08-11 02:48:50.048 UTC 1184504 ***_field_sync@***_exports from [local] [vxid:10/63478398 txid:0] [SELECT] STATEMENT: SELECT field_ids FROM exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND epoch = $3 and so on... A follow-up secondary manual REINDEX INDEX CONCURRENTLY pg_toast.pg_toast_2502234687_index resolved the problem with damaged toast. Problems do not always happen (rarely) and I cannot build a quick reproducer so far. Kind Regards, Maxim
Re: BUG #17586: Look like there are an another bug in REINDEX INDEX CONCURRENTLY of pg_toast indexes
From
Andres Freund
Date:
Hi, On 2022-08-15 11:36:52 +0000, PG Bug reporting form wrote: > Bug reference: 17586 > Logged by: Maxim Boguk > Email address: maxim.boguk@gmail.com > PostgreSQL version: 14.4 > Operating system: Ubuntu Linux > Description: > > Details about the data structure > pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w > partition of exports_new table (Partition key: RANGE (epoch)) > > During the reindex job there were quite a lot new inserts into > exports_new_2022_32w table via partition head. > > 2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local] > [vxid:12/63545267 txid:354388805] [INSERT] LOG: duration: 2711.995 ms > statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch, > field_ids) > (SELECT field_board_id, (field_id % 256)::int > field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT > field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids) > t > GROUP BY 1,2,3) > > > 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local] > [vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path > "base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192 > [vxid:10/63478386 txid:0] [SELECT] ERROR: missing chunk number 0 for toast > value 2546878653 in pg_toast_2502234687 > 2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from [local] > [vxid:10/63478386 txid:0] [SELECT] STATEMENT: SELECT field_ids FROM > exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND epoch = > $3 Hm. What is the setup of this system? Do you have a hot_standby_feedback enabled replica? Does it use replication slots? Did it reconnect at the time? This issue could be due to the xmin horizon going backward, rather than a bug in reindex concurrently itself. Too bad we don't have an errcontext set up for this, it'd be quite helpful to know what row the missing toast value is in, so we could check its visibility. Greetings, Andres Freund
Re: BUG #17586: Look like there are an another bug in REINDEX INDEX CONCURRENTLY of pg_toast indexes
From
Maxim Boguk
Date:
On Tue, Aug 16, 2022 at 1:01 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2022-08-15 11:36:52 +0000, PG Bug reporting form wrote:
> Bug reference: 17586
> Logged by: Maxim Boguk
> Email address: maxim.boguk@gmail.com
> PostgreSQL version: 14.4
> Operating system: Ubuntu Linux
> Description:
>
> Details about the data structure
> pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w
> partition of exports_new table (Partition key: RANGE (epoch))
>
> During the reindex job there were quite a lot new inserts into
> exports_new_2022_32w table via partition head.
>
> 2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local]
> [vxid:12/63545267 txid:354388805] [INSERT] LOG: duration: 2711.995 ms
> statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
> field_ids)
> (SELECT field_board_id, (field_id % 256)::int
> field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
> field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
> t
> GROUP BY 1,2,3)
>
>
> 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
> [vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path
> "base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192
> [vxid:10/63478386 txid:0] [SELECT] ERROR: missing chunk number 0 for toast
> value 2546878653 in pg_toast_2502234687
> 2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from [local]
> [vxid:10/63478386 txid:0] [SELECT] STATEMENT: SELECT field_ids FROM
> exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND epoch =
> $3
Hm. What is the setup of this system? Do you have a hot_standby_feedback
enabled replica? Does it use replication slots? Did it reconnect at the time?
This issue could be due to the xmin horizon going backward, rather than a bug
in reindex concurrently itself.
Too bad we don't have an errcontext set up for this, it'd be quite helpful to
know what row the missing toast value is in, so we could check its visibility.
Hi,
hot_standby_feedback enabled replica - yes (but with minimal activity)
use replication slots - no and no other apps with replication slots
Did it reconnect at the time - most definitely no, pg_stat_replication shows
backend_start | 2022-06-30 07:36:51.095073+00
for replication connection.
Also I have log_replication_commands enabled and nothing about replication logged during the last week.
--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/
Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678
LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
Senior Postgresql DBA
https://dataegret.com/
Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678
LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
Re: BUG #17586: Look like there are an another bug in REINDEX INDEX CONCURRENTLY of pg_toast indexes
From
Andres Freund
Date:
Hi, On 2022-08-16 10:34:14 +0200, Maxim Boguk wrote: > On Tue, Aug 16, 2022 at 1:01 AM Andres Freund <andres@anarazel.de> wrote: > > On 2022-08-15 11:36:52 +0000, PG Bug reporting form wrote: > > > Bug reference: 17586 > > > Logged by: Maxim Boguk > > > Email address: maxim.boguk@gmail.com > > > PostgreSQL version: 14.4 > > > Operating system: Ubuntu Linux > > > Description: > > > > > > Details about the data structure > > > pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w > > > partition of exports_new table (Partition key: RANGE (epoch)) > > > > > > During the reindex job there were quite a lot new inserts into > > > exports_new_2022_32w table via partition head. > > > > > > > > 2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local] > > > [vxid:12/63545267 txid:354388805] [INSERT] LOG: duration: 2711.995 ms > > > statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, > > epoch, > > > field_ids) > > > (SELECT field_board_id, (field_id % 256)::int > > > field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT > > > field_board_id, epoch, unnest(field_ids) AS field_id FROM > > temp_export_ids) > > > t > > > GROUP BY 1,2,3) > > > > > > > > > 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local] > > > [vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path > > > "base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192 > > > > > > > > > [vxid:10/63478386 txid:0] [SELECT] ERROR: missing chunk number 0 for > > toast > > > value 2546878653 in pg_toast_2502234687 > > > 2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from > > [local] > > > [vxid:10/63478386 txid:0] [SELECT] STATEMENT: SELECT field_ids FROM > > > exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND > > epoch = > > > $3 > > > > Hm. What is the setup of this system? Do you have a hot_standby_feedback > > enabled replica? Does it use replication slots? Did it reconnect at the > > time? > > > > This issue could be due to the xmin horizon going backward, rather than a > > bug > > in reindex concurrently itself. > > > > > > Too bad we don't have an errcontext set up for this, it'd be quite helpful > > to > > know what row the missing toast value is in, so we could check its > > visibility. > > > > Hi, > hot_standby_feedback enabled replica - yes (but with minimal activity) > use replication slots - no and no other apps with replication slots > Did it reconnect at the time - most definitely no, pg_stat_replication > shows > backend_start | 2022-06-30 07:36:51.095073+00 > for replication connection. > Also I have log_replication_commands enabled and nothing about replication > logged during the last week. Hm. How often does this occur? And could you run a server with a bit of added debugging? Otherwise I don't see how we can narrow this down, unfortunately. Greetings, Andres Freund