parallel data loading for pgbench -i - Mailing list pgsql-hackers

From Mircea Cadariu
Subject parallel data loading for pgbench -i
Date
Msg-id cb014f00-66b2-4328-a65e-d11c681c9f45@gmail.com
List pgsql-hackers
Hi,

I propose a patch for speeding up pgbench -i through multithreading.

To enable it, pass -j followed by the number of worker threads to use.

Here are some results I got on my laptop:


master

---

-i -s 100
done in 20.95 s (drop tables 0.00 s, create tables 0.01 s, client-side 
generate 14.51 s, vacuum 0.27 s, primary keys 6.16 s).

-i -s 100 --partitions=10
done in 29.73 s (drop tables 0.00 s, create tables 0.02 s, client-side 
generate 16.33 s, vacuum 8.72 s, primary keys 4.67 s).


patch (-j 10)

---

-i -s 100 -j 10
done in 18.64 s (drop tables 0.00 s, create tables 0.01 s, client-side 
generate 5.82 s, vacuum 6.89 s, primary keys 5.93 s).

-i -s 100 -j 10 --partitions=10
done in 14.66 s (drop tables 0.00 s, create tables 0.01 s, client-side 
generate 8.42 s, vacuum 1.55 s, primary keys 4.68 s).

The speedup is more significant for the partitioned case: since each 
worker creates and fills its own partition, every worker can use COPY 
FREEZE, which reduces the subsequent vacuum cost.
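For context, COPY FREEZE is only permitted when the target table was created or truncated in the same (sub)transaction, which is why per-worker partitions make it possible. A sketch of what each worker effectively does (not the patch's actual code; the partition name and bounds here are illustrative):

```sql
-- Illustrative only: a worker creates its own partition and loads it
-- in the same transaction, satisfying COPY FREEZE's requirement that
-- the table be created or truncated in the current (sub)transaction.
BEGIN;
CREATE TABLE pgbench_accounts_1
    PARTITION OF pgbench_accounts
    FOR VALUES FROM (MINVALUE) TO (1000001);
COPY pgbench_accounts_1 FROM stdin WITH (FREEZE);
COMMIT;
```

Rows loaded this way are written already frozen, so vacuum has far less to do afterwards.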

For the non-partitioned case the speedup is smaller, though I observe 
it improves somewhat at larger scale factors. Once parallel vacuum 
support is merged, the time should drop further.

I still need to update the docs and tests and better integrate the code 
with its surroundings, among other things. I'd appreciate any feedback 
on what I have so far. Thanks!

Kind regards,

Mircea Cadariu


Attachment
