Hi Lakshmi, Hayato,
Thanks a lot for your input!
I'm not sure why the VACUUM phase takes longer than in the serial
run. We could potentially get a clue with a profiler. I know there is an
ongoing effort to introduce parallel heap vacuum [1], which I expect will
help with this.
I have applied your code comments in the attached v2 patch. Answers to
your questions are below.
> Also, why is -j accepted in case of non-partitions?
For non-partitioned tables, each worker loads a separate range of rows
via its own connection in parallel.
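To illustrate the idea (a Python sketch only, not the actual pgbench C code;
the function name and the remainder-distribution detail are illustrative):

```python
def worker_ranges(total_rows, n_workers):
    """Divide total_rows into contiguous per-worker ranges.

    Each worker COPYs its own [start, end) slice over its own
    connection; any remainder is spread over the first workers.
    """
    base, rem = divmod(total_rows, n_workers)
    ranges = []
    start = 0
    for i in range(n_workers):
        size = base + (1 if i < rem else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

# e.g. the 10M pgbench_accounts rows at scale 100, split across 10 workers
print(worker_ranges(10_000_000, 10)[0])  # (0, 1000000)
```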
> Copying seems to be divided into chunks per COPY_BATCH_SIZE. Is it really
> essential to parallelize the initialization? I feel it may optimize even
> serialized case thus can be discussed independently.
You're right that the COPY batching is an independent optimization. I
wanted to see how fast I could get this patch, so I looked for
bottlenecks in the new code with a profiler, and this was one of them. I
agree it makes sense to apply it to the serial case separately.
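For reference, the batching idea is simply to buffer rows and flush once
per COPY_BATCH_SIZE rows instead of once per row (Python sketch; the
constant's value and the send_batch callback are illustrative, not what
the patch uses):

```python
COPY_BATCH_SIZE = 1000  # illustrative value


def copy_in_batches(rows, send_batch, batch_size=COPY_BATCH_SIZE):
    """Buffer rows and hand them to send_batch() in fixed-size chunks,
    so each COPY round trip carries many rows instead of one."""
    buf = []
    for row in rows:
        buf.append(row)
        if len(buf) == batch_size:
            send_batch(buf)
            buf = []
    if buf:
        send_batch(buf)  # flush the final partial batch


batches = []
copy_in_batches(range(2500), batches.append, batch_size=1000)
print([len(b) for b in batches])  # [1000, 1000, 500]
```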
> Per my understanding, each thread creates its tables, and all of them are
> attached to the parent table. Is it right? I think it needs more code
> changes, and I am not sure it is critical to make initialization faster.
Yes, that's correct. Each worker creates its assigned partitions as
standalone tables, loads data into them, and then the main thread
attaches them all to the parent after loading completes. This avoids
AccessExclusiveLock contention on the parent table during parallel
loading and lets each worker use COPY FREEZE on its standalone table.
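In SQL terms, the per-worker sequence is roughly the following (a sketch
only; the table name and partition bounds are illustrative, and COPY
FREEZE requires the table to have been created or truncated in the same
transaction):

```sql
-- each worker, on its own connection:
CREATE TABLE pgbench_accounts_1 (LIKE pgbench_accounts INCLUDING ALL);
BEGIN;
TRUNCATE pgbench_accounts_1;               -- makes FREEZE applicable
COPY pgbench_accounts_1 FROM STDIN (FREEZE);
COMMIT;

-- main thread, after all workers finish loading:
ALTER TABLE pgbench_accounts
    ATTACH PARTITION pgbench_accounts_1
    FOR VALUES FROM (1) TO (1000001);
```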
> So I suggest using the incremental approach. The first patch only
> parallelizes
> the data load, and the second patch implements the CREATE TABLE and
> ALTER TABLE
> ATTACH PARTITION. You can benchmark three patterns, master, 0001, and
> 0001 + 0002, then compare the results. IIUC, this is the common
> approach to
> reduce the patch size and make them more reviewable.
Thanks for the recommendation; I extracted 0001 and 0002 as per your
suggestion. I will see if I can split it further, as that indeed helps
with the review.
Results are similar to the previous runs.
master
pgbench -i -s 100 -j 10
done in 20.95 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 14.51 s, vacuum 0.27 s, primary keys 6.16 s).
pgbench -i -s 100 -j 10 --partitions=10
done in 29.73 s (drop tables 0.00 s, create tables 0.02 s, client-side
generate 16.33 s, vacuum 8.72 s, primary keys 4.67 s).
0001
pgbench -i -s 100 -j 10
done in 18.75 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 6.51 s, vacuum 5.73 s, primary keys 6.50 s).
pgbench -i -s 100 -j 10 --partitions=10
done in 29.33 s (drop tables 0.00 s, create tables 0.02 s, client-side
generate 16.48 s, vacuum 7.59 s, primary keys 5.24 s).
0001 + 0002
pgbench -i -s 100 -j 10
done in 18.12 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 6.64 s, vacuum 5.81 s, primary keys 5.65 s).
pgbench -i -s 100 -j 10 --partitions=10
done in 14.38 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 7.97 s, vacuum 1.55 s, primary keys 4.85 s).
Looking forward to your feedback.
[1]:
https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H24AiJeNoPGCmZNSwL1YA%40mail.gmail.com
--
Thanks,
Mircea Cadariu