Re: parallel data loading for pgbench -i - Mailing list pgsql-hackers

From: Mircea Cadariu
Subject: Re: parallel data loading for pgbench -i
Msg-id: 10868918-cdf9-49dc-99af-8e8ccd6e368c@gmail.com
In response to: Re: parallel data loading for pgbench -i (lakshmi <lakshmigcdac@gmail.com>)
List: pgsql-hackers
Hi Lakshmi, Hayato,


Thanks a lot for your input!

I'm not sure why the VACUUM phase takes longer than in the serial run.
We could potentially get a clue with a profiler. I know there is an
ongoing effort to introduce parallel heap vacuum [1], which I expect
will help with this.

I've applied the code comments you provided in the attached v2 patch.
Answers to your questions are below.

> Also, why is -j accepted in case of non-partitions?
For non-partitioned tables, each worker loads a separate range of rows 
via its own connection in parallel.
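For illustration, here's roughly how a total row count could be split into
contiguous per-worker ranges. This is a sketch of the idea, not the patch's
actual code; function and variable names are mine.

```python
def split_rows(total_rows, n_workers):
    """Divide total_rows into n_workers contiguous [start, end) ranges.
    Earlier workers absorb the remainder so every row is covered exactly once."""
    base, rem = divmod(total_rows, n_workers)
    ranges = []
    start = 0
    for w in range(n_workers):
        count = base + (1 if w < rem else 0)
        ranges.append((start, start + count))
        start += count
    return ranges

# Scale factor 100 -> 10,000,000 pgbench_accounts rows over 10 workers.
print(split_rows(100 * 100_000, 10))
```

Each worker would then COPY only its own [start, end) slice over its own
connection.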

> Copying seems to be divided into chunks per COPY_BATCH_SIZE. Is it really
> essential to parallelize the initialization? I feel it may optimize even
> serialized case thus can be discussed independently.
You're right that the COPY batching is an independent optimization. I
wanted to see how fast I could get this patch, so I looked for
bottlenecks in the new code with a profiler, and this was one of them.
I agree it makes sense to apply it to the serialized case separately.
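The batching idea itself is simple; a minimal sketch (assuming rows are
accumulated and flushed once a chunk of COPY_BATCH_SIZE rows is full,
names are illustrative):

```python
def batched(rows, batch_size):
    """Yield successive lists of at most batch_size rows, mirroring the
    idea of flushing a COPY buffer once per chunk instead of per row."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush any trailing partial chunk
        yield batch
```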

> Per my understanding, each thread creates its tables, and all of them are
> attached to the parent table. Is it right? I think it needs more code
> changes, and I am not sure it is critical to make initialization faster.
Yes, that's correct. Each worker creates its assigned partitions as 
standalone tables, loads data into them, and then the main thread 
attaches them all to the parent after loading completes. This avoids 
AccessExclusiveLock contention on the parent table during parallel 
loading and allows each worker to use COPY FREEZE on its standalone table.

> So I suggest using the incremental approach. The first patch only 
> parallelizes
> the data load, and the second patch implements the CREATE TABLE and 
> ALTER TABLE
> ATTACH PARTITION. You can benchmark three patterns, master, 0001, and
> 0001 + 0002, then compare the results. IIUC, this is the common 
> approach to
> reduce the patch size and make them more reviewable.

Thanks for the recommendation. I extracted 0001 and 0002 as per your 
suggestion. I will see if I can split it further, as that does indeed 
help with review.

Results are similar to the previous runs.

master

pgbench -i -s 100 -j 10
done in 20.95 s (drop tables 0.00 s, create tables 0.01 s, client-side 
generate 14.51 s, vacuum 0.27 s, primary keys 6.16 s).

pgbench -i -s 100 -j 10 --partitions=10
done in 29.73 s (drop tables 0.00 s, create tables 0.02 s, client-side 
generate 16.33 s, vacuum 8.72 s, primary keys 4.67 s).


0001
pgbench -i -s 100 -j 10
done in 18.75 s (drop tables 0.00 s, create tables 0.01 s, client-side 
generate 6.51 s, vacuum 5.73 s, primary keys 6.50 s).

pgbench -i -s 100 -j 10 --partitions=10
done in 29.33 s (drop tables 0.00 s, create tables 0.02 s, client-side 
generate 16.48 s, vacuum 7.59 s, primary keys 5.24 s).

0002
pgbench -i -s 100 -j 10
done in 18.12 s (drop tables 0.00 s, create tables 0.01 s, client-side 
generate 6.64 s, vacuum 5.81 s, primary keys 5.65 s).

pgbench -i -s 100 -j 10 --partitions=10
done in 14.38 s (drop tables 0.00 s, create tables 0.01 s, client-side 
generate 7.97 s, vacuum 1.55 s, primary keys 4.85 s).


Looking forward to your feedback.

[1]: 
https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H24AiJeNoPGCmZNSwL1YA%40mail.gmail.com

-- 
Thanks,
Mircea Cadariu
