RE: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers
From: tsunakawa.takay@fujitsu.com
Subject: RE: Parallel INSERT (INTO ... SELECT ...)
Msg-id: OSBPR01MB298274BCB8959932191EA9A6FECB0@OSBPR01MB2982.jpnprd01.prod.outlook.com
In response to: Re: Parallel INSERT (INTO ... SELECT ...) (Greg Nancarrow <gregn4422@gmail.com>)
Responses: Re: Parallel INSERT (INTO ... SELECT ...)
List: pgsql-hackers
From: Greg Nancarrow <gregn4422@gmail.com>
> Firstly, in order to perform parallel-safety checks in the case of partitions,
> the patch currently recursively locks/unlocks (AccessShareLock) each partition
> during such checks (as each partition may itself be a partitioned table). Is
> there a better way of performing the parallel-safety checks and reducing the
> locking requirements?

First of all, as you demonstrated the planning time and execution time of parallel insert, I think the increased planning time is negligible when parallel insert is intentionally used for loading a large amount of data. However, it is a problem if the overhead is imposed on OLTP transactions. Does the overhead occur with the default values of max_parallel_workers_per_gather = 2 and max_parallel_workers = 8? (A way to confirm those settings is sketched below my signature.)

To avoid this heavy checking during planning, I'm wondering if we can have an attribute in pg_class, something like relhasindex and relhastriggers (see the second sketch below). The concerning point is that we have to maintain the accuracy of the value when dropping ancillary objects around the table/partition.

> Secondly, I found that when running "make check-world", the
> "partition-concurrent-attach" test fails, because it is expecting a partition
> constraint to be violated on insert, while an "alter table attach partition ..."
> is concurrently being executed in another transaction. Because of the partition
> locking done by the patch's parallel-safety checking code, the insert blocks on
> the exclusive lock held by the "alter table" in the other transaction until the
> transaction ends, so the insert ends up successfully completing (and thus fails
> the test) when the other transaction ends. To overcome this test failure, the
> patch code was updated to instead perform a conditional lock on the partition,
> and on failure (i.e. because of an exclusive lock held somewhere else), just
> assume it's parallel-unsafe because the parallel-safety can't be determined
> without blocking on the lock. This is not ideal, but I'm not sure of what other
> approach could be used and I am somewhat reluctant to change that test. If
> anybody is familiar with the "partition-concurrent-attach" test, any ideas or
> insights would be appreciated.

That test looks sane. I think what we should do is to disable parallel operation during that test (see the last sketch below). It looks like some other existing test cases disable parallel query by setting max_parallel_workers_per_gather to 0. It's not strange that some tests fail with some configurations; autovacuum, for example, is disabled in many places of the regression tests.

Rather, I don't think we should introduce the trick of using ConditionalLockAcquire(). Otherwise, the insert would be executed in a serial fashion without the user knowing it -- "What? The insert suddenly slowed down severalfold today, and it didn't finish within the planned maintenance window. What's wrong?"

Regards
Takayuki Tsunakawa
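
P.S. To make the question above concrete, this is a minimal way to confirm which parallel-query settings are in effect; 2 and 8 are the shipped defaults for these two GUCs:

    -- Report the parallel-query GUCs for the current session.
    SHOW max_parallel_workers_per_gather;  -- default: 2
    SHOW max_parallel_workers;             -- default: 8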
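
For the pg_class idea, a rough sketch. relhasindex and relhastriggers are existing columns; relparallelsafe is purely hypothetical and only shows where such a precomputed flag could live:

    -- Existing per-relation flags that the system already maintains:
    SELECT relname, relhasindex, relhastriggers
    FROM pg_class
    WHERE oid = 'my_partitioned_table'::regclass;  -- example table name

    -- Hypothetical analogue (no such column exists today): a flag the planner
    -- could read instead of recursively locking every partition.
    -- SELECT relparallelsafe FROM pg_class
    -- WHERE oid = 'my_partitioned_table'::regclass;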
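
And the kind of per-test setting I mean for partition-concurrent-attach, assuming we are willing to pin that test to serial plans, as some other tests already do:

    -- Force serial plans for the test session so the planner's parallel-safety
    -- checks (and hence the partition locking) are never exercised.
    SET max_parallel_workers_per_gather = 0;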