Thread: BUG #18556: Parallel operation error during CREATE TABLE AS statement
BUG #18556: Parallel operation error during CREATE TABLE AS statement
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18556 Logged by: Shaun Thomas Email address: shaun@tembo.io PostgreSQL version: 16.3 Operating system: Ubuntu 22.04.4 LTS Description: We have been witnessing an intermittent error from a Postgres 16.3 instance if a table is created using a SELECT statement. The error is: ERROR: cannot start commands during a parallel operation The statement itself is very innocuous: CREATE TEMP TABLE parallel_bonk AS ( SELECT document_key, full_document FROM some_schema.source_table WHERE document_key ->> '_id' IN ('xyz', 'pdq') ); Just to eliminate some potential variables, this is a 54-million row HEAP table with the following definition: Column | Type | Collation | Nullable | Default -----------------------------------+--------------------------+-----------+----------+--------- _id | jsonb | | not null | operation_type | text | | | full_document | jsonb | | | ns_db | text | | | ns_coll | text | | | to_db | text | | | to_coll | text | | | document_key | jsonb | | not null | update_description_updated_fields | jsonb | | | update_description_removed_fields | text[] | | | cluster_time | jsonb | | | txn_number | bigint | | | lsid_id | text | | | lsid_uid | text | | | kafka_topic | text | | not null | kafka_partition | integer | | not null | kafka_offset | bigint | | not null | kafka_timestamp | timestamp with time zone | | not null | Indexes: "source_table_pkey" PRIMARY KEY, btree (document_key) "source_table_kafka_offset_idx" UNIQUE, btree (kafka_offset DESC) Note that there are no triggers, rules, constraints, partitions, or any parallel-unsafe entities associated with this table. The associated query plan looks like this: QUERY PLAN -------------------------------------------------------------------------------------------- Gather (cost=1000.00..4045627.93 rows=544751 width=479) Workers Planned: 8 -> Parallel Seq Scan on source_table (cost=0.00..3990152.83 rows=68094 width=479) Filter: ((document_key ->> '_id'::text) = ANY ('{xyz,pdq}'::text[])) JIT: Functions: 4 Options: Inlining true, Optimization true, Expressions true, Deforming true Note that the problem persists regardless of the JIT setting. This appears to be related to the parallel workers in some manner. We haven't been able to distill this down to a simple or reliable repro case yet, as the problem seems to occur entirely at random, suggesting it's some kind of parallel operation race condition. There have been other similar discussions regarding the dbt Postgres ETL toolkit. It appears this ETL kit utilizes a loading procedure that is more likely to trigger this edge case.
Re: BUG #18556: Parallel operation error during CREATE TABLE AS statement
From
Aleksander Alekseev
Date:
Hi, > We have been witnessing an intermittent error from a Postgres 16.3 instance > if a table is created using a SELECT statement. The error is: > > ERROR: cannot start commands during a parallel operation > > [...] I tried to reproduce the issue on debug and release builds of PostgreSQL `master` and `REL_16_STABLE` on different hardware and didn't succeed so far. Here are the queries I used: ``` CREATE TABLE IF NOT EXISTS phonebook( id SERIAL PRIMARY KEY, name TEXT, phone BIGINT); INSERT INTO phonebook SELECT i, 'name_' || i, i # 0xDEADBEEF FROM generate_series(1, 54_000_000) AS i; SET parallel_tuple_cost = 0; -- execute several times: -- make sure Parallel Seq Scan is used with EXPLAIN ANALYZE CREATE [TEMP] TABLE pb_subset AS ( SELECT id, name FROM phonebook WHERE phone > [choose a number] ); ``` If you could provide minimal steps that reproduce the issue on your hardware this would be helpful. Please let us know whether the issue reproduces on PostgreSQL 17 beta, or only on 16.3. Also please give us a little more details about the hardware. -- Best regards, Aleksander Alekseev
I need to rescind this report.
It turns out the customer in question had set the default table access method to columnar, and in this case, it was the Hydra columnar extension causing the issue. It turns out that this is a known bug:
They had thought it was fixed with this PR:
But that turned out not to be the case. I re-submitted a bug report with a more thorough writeup to them:
They seem to think there's basically no way to fix this with the current extension APIs. If someone here happens to know a workaround, I'm sure they would be receptive.
Cheers!
Re: BUG #18556: Parallel operation error during CREATE TABLE AS statement
From
Aleksander Alekseev
Date:
Hi, > They seem to think there's basically no way to fix this with the current extension APIs. If someone here happens to knowa workaround, I'm sure they would be receptive. To my knowledge TableAM was never designed to build columnar storages. The main idea was implementing an undo-logged engine and maybe index-organized tables. If Hydra developers know how to improve TableAMs in order to make columnar storage implementation easier / more efficient I'm certain their input will be welcomed in the pgsql-hackers@ mailing list. -- Best regards, Aleksander Alekseev