Thread: Use COPY for populating all pgbench tables
Hello,

We (Neon) have noticed that pgbench can be quite slow to populate data over higher-latency connections, and higher scale factors exacerbate the problem. Some employees work on a different continent from the databases they might be benchmarking. By moving pgbench to use COPY for populating all tables, we can reduce the time pgbench takes for this particular step.

I had hoped to include benchmarks, but unfortunately I won't have them until next month. I can follow up in a future email.

--
Tristan Partin
Neon (https://neon.tech)
Attachment
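The latency arithmetic behind this proposal can be sketched with a toy model. The numbers below (round-trip time, per-row CPU cost, rows per round trip) are illustrative assumptions, not measurements from the patch; the point is only that a COPY stream amortizes round trips far better than statement-at-a-time loading:

```python
# Rough model of why row-at-a-time loading suffers on high-latency links.
# All parameters are assumptions for illustration, not pgbench internals.

def load_time_s(rows, rtt_ms, rows_per_round_trip, per_row_cost_us=5):
    """Estimate wall-clock time as network round trips plus per-row CPU work."""
    round_trips = rows / rows_per_round_trip
    return round_trips * (rtt_ms / 1000) + rows * per_row_cost_us / 1e6

rows = 50_000_000      # pgbench -s 500 generates 50M account rows
rtt = 100              # ms, assumed Austin -> Ireland latency

# Hypothetical batched statements vs. a COPY stream that only syncs rarely:
batched = load_time_s(rows, rtt, rows_per_round_trip=1_000)
copied = load_time_s(rows, rtt, rows_per_round_trip=1_000_000)
print(f"batched: {batched:.0f} s, copy: {copied:.0f} s")
```

Under these assumptions the batched path is dominated entirely by round trips, which is consistent with the order-of-magnitude gap reported later in the thread.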
On Tue May 23, 2023 at 12:33 PM CDT, Tristan Partin wrote:
> I wanted to come with benchmarks, but unfortunately I won't have them
> until next month. I can follow-up in a future email.

I finally got around to benchmarking.

master:

$ ./build/src/bin/pgbench/pgbench -i -s 500 CONNECTION_STRING
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_branches" does not exist, skipping
NOTICE: table "pgbench_history" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
50000000 of 50000000 tuples (100%) done (elapsed 260.93 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 1414.26 s (drop tables 0.20 s, create tables 0.82 s, client-side generate 1280.43 s, vacuum 2.55 s, primary keys 130.25 s).

patchset:

$ ./build/src/bin/pgbench/pgbench -i -s 500 CONNECTION_STRING
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_branches" does not exist, skipping
NOTICE: table "pgbench_history" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
50000000 of 50000000 tuples (100%) of pgbench_accounts done (elapsed 243.82 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 375.66 s (drop tables 0.14 s, create tables 0.73 s, client-side generate 246.27 s, vacuum 2.77 s, primary keys 125.75 s).

I am located in Austin, Texas. The server was located in Ireland. Just wanted to put some distance between the client and server. To summarize, that is about an 80% reduction in client-side data generation time. Note that I used Neon to collect these results.

--
Tristan Partin
Neon (https://neon.tech)
On Thu, 8 Jun 2023 at 07:16, Tristan Partin <tristan@neon.tech> wrote:
>
> master:
>
> 50000000 of 50000000 tuples (100%) done (elapsed 260.93 s, remaining 0.00 s)
> vacuuming...
> creating primary keys...
> done in 1414.26 s (drop tables 0.20 s, create tables 0.82 s, client-side generate 1280.43 s, vacuum 2.55 s, primary keys 130.25 s).
>
> patchset:
>
> 50000000 of 50000000 tuples (100%) of pgbench_accounts done (elapsed 243.82 s, remaining 0.00 s)
> vacuuming...
> creating primary keys...
> done in 375.66 s (drop tables 0.14 s, create tables 0.73 s, client-side generate 246.27 s, vacuum 2.77 s, primary keys 125.75 s).

I've also previously found pgbench -i to be slow. It was a while ago, and IIRC, it was due to printfPQExpBuffer() being a bottleneck inside pgbench.

On seeing your email, it makes me wonder if PG16's hex integer literals might help here. These should be much faster to generate in pgbench and also to parse on the postgres side.

I wrote a quick and dirty patch to try that, and I'm not really getting the performance increases I'd have expected. I also tested with your patch, and it does not look that impressive either when running pgbench on the same machine as postgres.

pgbench copy speedup

** master
drowley@amd3990x:~$ pgbench -i -s 1000 postgres
100000000 of 100000000 tuples (100%) done (elapsed 74.15 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 95.71 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 74.45 s, vacuum 0.12 s, primary keys 21.13 s).

** David's Patched
drowley@amd3990x:~$ pgbench -i -s 1000 postgres
100000000 of 100000000 tuples (100%) done (elapsed 69.64 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 90.22 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 69.91 s, vacuum 0.12 s, primary keys 20.18 s).
** Tristan's patch
drowley@amd3990x:~$ pgbench -i -s 1000 postgres
100000000 of 100000000 tuples (100%) of pgbench_accounts done (elapsed 77.44 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 98.64 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 77.47 s, vacuum 0.12 s, primary keys 21.04 s).

I'm interested to see what numbers you get. You'd need to test on PG16, however. I left the old code in place to generate the decimal numbers for versions < 16.

David
Attachment
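PG16 accepts non-decimal integer literals (0x..., 0o..., 0b...), which is the feature David's quick patch leans on. A hedged micro-sketch of the idea (formatting rows in hex instead of decimal; this illustrates the literal format only, not pgbench's actual printfPQExpBuffer() path):

```python
# Row formatters for a tab-separated COPY payload. The hex variant emits
# PG16-style 0x literals; hex formatting avoids repeated division by 10,
# which is where David expected the generation-side win to come from.

def decimal_row(aid, bid):
    return f"{aid}\t{bid}\t0\t\n"

def hex_row(aid, bid):
    return f"0x{aid:x}\t0x{bid:x}\t0\t\n"

row = hex_row(100_000_000, 1_000)
# A PG16 server parses 0x5f5e100 to the same value the decimal literal gives:
assert int(row.split("\t")[0], 16) == 100_000_000
```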
I guess that COPY will still be slower than generating the data server-side (--init-steps=...G...)?

What I'd really like to see is providing all the pgbench functions also on the server, specifically the various random(...) functions - random_exponential(...), random_gaussian(...), random_zipfian(...) - so that custom data generation could also be done easily server-side with matching distributions.

On Thu, Jun 8, 2023 at 7:34 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Thu, 8 Jun 2023 at 07:16, Tristan Partin <tristan@neon.tech> wrote:
> >
> > master:
> >
> > 50000000 of 50000000 tuples (100%) done (elapsed 260.93 s, remaining 0.00 s)
> > vacuuming...
> > creating primary keys...
> > done in 1414.26 s (drop tables 0.20 s, create tables 0.82 s, client-side generate 1280.43 s, vacuum 2.55 s, primary keys 130.25 s).
> >
> > patchset:
> >
> > 50000000 of 50000000 tuples (100%) of pgbench_accounts done (elapsed 243.82 s, remaining 0.00 s)
> > vacuuming...
> > creating primary keys...
> > done in 375.66 s (drop tables 0.14 s, create tables 0.73 s, client-side generate 246.27 s, vacuum 2.77 s, primary keys 125.75 s).
>
> I've also previously found pgbench -i to be slow. It was a while ago,
> and IIRC, it was due to the printfPQExpBuffer() being a bottleneck
> inside pgbench.
>
> On seeing your email, it makes me wonder if PG16's hex integer
> literals might help here. These should be much faster to generate in
> pgbench and also parse on the postgres side.
>
> I wrote a quick and dirty patch to try that and I'm not really getting
> the same performance increases as I'd have expected. I also tested
> with your patch too and it does not look that impressive either when
> running pgbench on the same machine as postgres.
>
> pgbench copy speedup
>
> ** master
> drowley@amd3990x:~$ pgbench -i -s 1000 postgres
> 100000000 of 100000000 tuples (100%) done (elapsed 74.15 s, remaining 0.00 s)
> vacuuming...
> creating primary keys...
> done in 95.71 s (drop tables 0.00 s, create tables 0.01 s, client-side
> generate 74.45 s, vacuum 0.12 s, primary keys 21.13 s).
>
> ** David's Patched
> drowley@amd3990x:~$ pgbench -i -s 1000 postgres
> 100000000 of 100000000 tuples (100%) done (elapsed 69.64 s, remaining 0.00 s)
> vacuuming...
> creating primary keys...
> done in 90.22 s (drop tables 0.00 s, create tables 0.01 s, client-side
> generate 69.91 s, vacuum 0.12 s, primary keys 20.18 s).
>
> ** Tristan's patch
> drowley@amd3990x:~$ pgbench -i -s 1000 postgres
> 100000000 of 100000000 tuples (100%) of pgbench_accounts done (elapsed
> 77.44 s, remaining 0.00 s)
> vacuuming...
> creating primary keys...
> done in 98.64 s (drop tables 0.00 s, create tables 0.01 s, client-side
> generate 77.47 s, vacuum 0.12 s, primary keys 21.04 s).
>
> I'm interested to see what numbers you get. You'd need to test on
> PG16 however. I left the old code in place to generate the decimal
> numbers for versions < 16.
>
> David
On Thu Jun 8, 2023 at 12:33 AM CDT, David Rowley wrote:
> On Thu, 8 Jun 2023 at 07:16, Tristan Partin <tristan@neon.tech> wrote:
> >
> > master:
> >
> > 50000000 of 50000000 tuples (100%) done (elapsed 260.93 s, remaining 0.00 s)
> > vacuuming...
> > creating primary keys...
> > done in 1414.26 s (drop tables 0.20 s, create tables 0.82 s, client-side generate 1280.43 s, vacuum 2.55 s, primary keys 130.25 s).
> >
> > patchset:
> >
> > 50000000 of 50000000 tuples (100%) of pgbench_accounts done (elapsed 243.82 s, remaining 0.00 s)
> > vacuuming...
> > creating primary keys...
> > done in 375.66 s (drop tables 0.14 s, create tables 0.73 s, client-side generate 246.27 s, vacuum 2.77 s, primary keys 125.75 s).
>
> I've also previously found pgbench -i to be slow. It was a while ago,
> and IIRC, it was due to the printfPQExpBuffer() being a bottleneck
> inside pgbench.
>
> On seeing your email, it makes me wonder if PG16's hex integer
> literals might help here. These should be much faster to generate in
> pgbench and also parse on the postgres side.

Do you have a link to some docs? I have not heard of the feature. Definitely feels like a worthy cause.

> I wrote a quick and dirty patch to try that and I'm not really getting
> the same performance increases as I'd have expected. I also tested
> with your patch too and it does not look that impressive either when
> running pgbench on the same machine as postgres.

I didn't expect my patch to increase performance in all workloads. I was mainly aiming to fix high-latency connections. Based on your results, that looks like a 4% reduction in performance of client-side data generation. I had thought maybe it is worth having a flag to keep the old way too, but I am not sure a 4% hit is really that big of a deal.

> pgbench copy speedup
>
> ** master
> drowley@amd3990x:~$ pgbench -i -s 1000 postgres
> 100000000 of 100000000 tuples (100%) done (elapsed 74.15 s, remaining 0.00 s)
> vacuuming...
> creating primary keys...
> done in 95.71 s (drop tables 0.00 s, create tables 0.01 s, client-side
> generate 74.45 s, vacuum 0.12 s, primary keys 21.13 s).
>
> ** David's Patched
> drowley@amd3990x:~$ pgbench -i -s 1000 postgres
> 100000000 of 100000000 tuples (100%) done (elapsed 69.64 s, remaining 0.00 s)
> vacuuming...
> creating primary keys...
> done in 90.22 s (drop tables 0.00 s, create tables 0.01 s, client-side
> generate 69.91 s, vacuum 0.12 s, primary keys 20.18 s).
>
> ** Tristan's patch
> drowley@amd3990x:~$ pgbench -i -s 1000 postgres
> 100000000 of 100000000 tuples (100%) of pgbench_accounts done (elapsed
> 77.44 s, remaining 0.00 s)
> vacuuming...
> creating primary keys...
> done in 98.64 s (drop tables 0.00 s, create tables 0.01 s, client-side
> generate 77.47 s, vacuum 0.12 s, primary keys 21.04 s).
>
> I'm interested to see what numbers you get. You'd need to test on
> PG16 however. I left the old code in place to generate the decimal
> numbers for versions < 16.

I will try to test this soon and follow up on the thread. I definitely see no problems with your patch as is, though. I would be more than happy to rebase my patches on yours.

--
Tristan Partin
Neon (https://neon.tech)
On Tue, May 23, 2023 at 1:33 PM Tristan Partin <tristan@neon.tech> wrote:
We (Neon) have noticed that pgbench can be quite slow to populate data
in regard to higher latency connections. Higher scale factors exacerbate
this problem. Some employees work on a different continent than the
databases they might be benchmarking. By moving pgbench to use COPY for
populating all tables, we can reduce some of the time pgbench takes for
this particular step.
When latency is continent-scale, pgbench should be run with server-side table generation instead of using COPY at all, for any table. The default COPY-based pgbench generation is only intended for testing where the client and server are very close on the network.
Unfortunately there's no simple command line option to change just that one thing about how pgbench runs. You have to construct a command line that documents each and every step you want instead. You probably just want this form:
$ pgbench -i -I dtGvp -s 500
That's server-side table generation with all the usual steps. I use this instead of COPY in pgbench-tools so much now, basically whenever I'm talking to a cloud system, that I have a simple 0/1 config option to switch between the modes, and this long weird one is the default now.
Try that out, and once you see the numbers my bet is you'll see extending which tables get COPY isn't needed by your use case anymore. Basically, if you are close enough to use COPY instead of server-side generation, you are close enough that every table besides accounts will not add up to enough time to worry about optimizing the little ones.
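The letters in Greg's command map to the initialization steps documented for pgbench's --init-steps option. As a quick reference (this mapping is transcribed from the pgbench documentation, not from the patch):

```python
# pgbench --init-steps letters, per the documentation. Steps run in the
# order given on the command line.
INIT_STEPS = {
    "d": "drop any existing pgbench tables",
    "t": "create the tables",
    "g": "generate data client-side (COPY from pgbench)",
    "G": "generate data server-side",
    "v": "vacuum",
    "p": "create primary keys",
    "f": "create foreign keys",
}

def describe(steps):
    return [INIT_STEPS[s] for s in steps]

print(describe("dtgvp"))   # the default sequence
print(describe("dtGvp"))   # Greg's server-side generation variant
```

The only difference between the default and Greg's command is swapping client-side generation (g) for server-side generation (G).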
On Fri Jun 9, 2023 at 8:24 AM CDT, Gregory Smith wrote: > On Tue, May 23, 2023 at 1:33 PM Tristan Partin <tristan@neon.tech> wrote: > > > We (Neon) have noticed that pgbench can be quite slow to populate data > > in regard to higher latency connections. Higher scale factors exacerbate > > this problem. Some employees work on a different continent than the > > databases they might be benchmarking. By moving pgbench to use COPY for > > populating all tables, we can reduce some of the time pgbench takes for > > this particular step. > > > > When latency is continent size high, pgbench should be run with server-side > table generation instead of using COPY at all, for any table. The default > COPY based pgbench generation is only intended for testing where the client > and server are very close on the network. > > Unfortunately there's no simple command line option to change just that one > thing about how pgbench runs. You have to construct a command line that > documents each and every step you want instead. You probably just want > this form: > > $ pgbench -i -I dtGvp -s 500 > > That's server-side table generation with all the usual steps. I use this > instead of COPY in pgbench-tools so much now, basically whenever I'm > talking to a cloud system, that I have a simple 0/1 config option to switch > between the modes, and this long weird one is the default now. > > Try that out, and once you see the numbers my bet is you'll see extending > which tables get COPY isn't needed by your use case anymore. Basically, if > you are close enough to use COPY instead of server-side generation, you are > close enough that every table besides accounts will not add up to enough > time to worry about optimizing the little ones. Thanks for your input Greg. I'm sure you're correct that server-side data generation would probably fix the problem. I guess I am trying to understand if there are any downsides to just committing this anyway. 
I haven't done any more testing, but David's email only showed a 4% performance drop in the workload he tested. Combine this with his hex patch and we would see an overall performance improvement when everything is said and done. It seems like this patch would still be good for client-side high scale factor data generation (when server and client are close), but I would need to do testing to confirm. -- Tristan Partin Neon (https://neon.tech)
David, I think you should submit this patch standalone. I don't see any reason this shouldn't be reviewed and committed when fully fleshed out. -- Tristan Partin Neon (https://neon.tech)
On Fri, Jun 9, 2023 at 6:24 AM Gregory Smith <gregsmithpgsql@gmail.com> wrote:
>
> Unfortunately there's no simple command line option to change just that one thing about how pgbench runs. You have to construct a command line that documents each and every step you want instead. You probably just want this form:
>
> $ pgbench -i -I dtGvp -s 500

The steps are severely under-documented in the pgbench --help output. Grepping that output, I could not find any explanation of these steps, so I dug through the code and found them in runInitSteps(). Just as I was thinking of writing a patch to remedy that, just to be sure, I checked the online docs and, sure enough, they are well-documented under pgbench [1].

I think at least a pointer to the pgbench docs should be mentioned in the pgbench --help output; an average user may not rush to read the code to find the explanation, but a hint about where to find more details on what the letters in --init-steps mean would save them a lot of time.

[1]: https://www.postgresql.org/docs/15/pgbench.html

Best regards,
Gurjeet
http://Gurje.et
On Fri Jun 9, 2023 at 12:25 PM CDT, Gurjeet Singh wrote:
> On Fri, Jun 9, 2023 at 6:24 AM Gregory Smith <gregsmithpgsql@gmail.com> wrote:
> >
> > Unfortunately there's no simple command line option to change just that one thing about how pgbench runs. You have to construct a command line that documents each and every step you want instead. You probably just want this form:
> >
> > $ pgbench -i -I dtGvp -s 500
>
> The steps are severely under-documented in pgbench --help output.
> Grepping that output I could not find any explanation of these steps,
> so I dug through the code and found them in runInitSteps(). Just as I
> was thinking of writing a patch to remedy that, just to be sure, I
> checked online docs and sure enough, they are well-documented under
> pgbench [1].
>
> I think at least a pointer to the pgbench docs should be mentioned
> in the pgbench --help output; an average user may not rush to read the
> code to find the explanation, but a hint to where to find more details
> about what the letters in --init-steps mean, would save them a lot of
> time.
>
> [1]: https://www.postgresql.org/docs/15/pgbench.html
>
> Best regards,
> Gurjeet
> http://Gurje.et

I think this would be nice to have if you wanted to submit a patch.

--
Tristan Partin
Neon (https://neon.tech)
On Fri, Jun 9, 2023 at 1:25 PM Gurjeet Singh <gurjeet@singh.im> wrote:
> $ pgbench -i -I dtGvp -s 500
The steps are severely under-documented in pgbench --help output.
I agree it's not easy to find information. I just went through double checking I had the order recently enough to remember what I did. The man pages have this:
> Each step is invoked in the specified order. The default is dtgvp.
Which was what I wanted to read. Meanwhile the --help output says:
> -I, --init-steps=[dtgGvpf]+ (default "dtgvp")
Which has the information without much context for what it's used for. I'd welcome some text expansion that adds a minimal but functional improvement to the existing help output; I don't have such text.
On Fri, Jun 9, 2023 at 5:42 PM Gregory Smith <gregsmithpgsql@gmail.com> wrote:
>
> On Fri, Jun 9, 2023 at 1:25 PM Gurjeet Singh <gurjeet@singh.im> wrote:
>>
>> > $ pgbench -i -I dtGvp -s 500
>>
>> The steps are severely under-documented in pgbench --help output.
>
> I agree it's not easy to find information. I just went through double checking I had the order recently enough to remember what I did. The man pages have this:
>
> > Each step is invoked in the specified order. The default is dtgvp.
>
> Which was what I wanted to read. Meanwhile the --help output says:
>
> > -I, --init-steps=[dtgGvpf]+ (default "dtgvp")
>
> Which has the information without much context for what it's used for. I'd welcome some text expansion that adds a minimal but functional improvement to the existing help output; I don't have such text.

Please see the attached two variants of the patch. Variant 1 simply tells the reader to consult the pgbench documentation. The second variant provides a description for each of the letters, as the documentation does. The committer can pick the one they find suitable.

Best regards,
Gurjeet
http://Gurje.et
Attachment
On Fri, Jun 9, 2023 at 6:20 PM Gurjeet Singh <gurjeet@singh.im> wrote:
> On Fri, Jun 9, 2023 at 5:42 PM Gregory Smith <gregsmithpgsql@gmail.com> wrote:
> > On Fri, Jun 9, 2023 at 1:25 PM Gurjeet Singh <gurjeet@singh.im> wrote:
> >>
> >> > $ pgbench -i -I dtGvp -s 500
> >>
> >> The steps are severely under-documented in pgbench --help output.
> >
> > I agree it's not easy to find information. I just went through double checking I had the order recently enough to remember what I did. The man pages have this:
> >
> > > Each step is invoked in the specified order. The default is dtgvp.
> >
> > Which was what I wanted to read. Meanwhile the --help output says:
> >
> > > -I, --init-steps=[dtgGvpf]+ (default "dtgvp")
> >
> > Which has the information without much context for what it's used for. I'd welcome some text expansion that adds a minimal but functional improvement to the existing help output; I don't have such text.
>
> Please see attached 2 variants of the patch. Variant 1 simply tells
> the reader to consult pgbench documentation. The second variant
> provides a description for each of the letters, as the documentation
> does. The committer can pick the one they find suitable.

(I was short on time, so had to keep it short in the above email. To justify this additional email, I have added 2 more options to choose from. :)

The text ", in the specified order" is an important detail that should be included irrespective of the rest of the patch.

My preference would be to use the first variant, since the second one feels too wordy for --help output. I believe we'll have to choose between these two; the alternatives will not make anyone happy. These two variants show the two extremes: bare minimum vs. everything but the kitchen sink. So one may feel the need to find a middle ground and provide a "sufficient, but not too much" variant. I have attempted that in variants 3 and 4; see attached.
The third variant does away with the list of steps, and uses a paragraph to describe the letters. And the fourth variant makes that paragraph terse. In the order of preference I'd choose variant 1, then 2. Variants 3 and 4 feel like a significant degradation from variant 2. Attached samples.txt shows the snippets of --help output of each of the variants/patches, for comparison. Best regards, Gurjeet http://Gurje.et
Attachment
I think I am partial to number 2. Removing a context switch always leads to more productivity. -- Tristan Partin Neon (https://neon.tech)
On Thu Jun 8, 2023 at 11:38 AM CDT, Tristan Partin wrote:
> On Thu Jun 8, 2023 at 12:33 AM CDT, David Rowley wrote:
> > On Thu, 8 Jun 2023 at 07:16, Tristan Partin <tristan@neon.tech> wrote:
> > >
> > > master:
> > >
> > > 50000000 of 50000000 tuples (100%) done (elapsed 260.93 s, remaining 0.00 s)
> > > vacuuming...
> > > creating primary keys...
> > > done in 1414.26 s (drop tables 0.20 s, create tables 0.82 s, client-side generate 1280.43 s, vacuum 2.55 s, primary keys 130.25 s).
> > >
> > > patchset:
> > >
> > > 50000000 of 50000000 tuples (100%) of pgbench_accounts done (elapsed 243.82 s, remaining 0.00 s)
> > > vacuuming...
> > > creating primary keys...
> > > done in 375.66 s (drop tables 0.14 s, create tables 0.73 s, client-side generate 246.27 s, vacuum 2.77 s, primary keys 125.75 s).
> >
> > I've also previously found pgbench -i to be slow. It was a while ago,
> > and IIRC, it was due to the printfPQExpBuffer() being a bottleneck
> > inside pgbench.
> >
> > On seeing your email, it makes me wonder if PG16's hex integer
> > literals might help here. These should be much faster to generate in
> > pgbench and also parse on the postgres side.
> >
> > I wrote a quick and dirty patch to try that and I'm not really getting
> > the same performance increases as I'd have expected. I also tested
> > with your patch too and it does not look that impressive either when
> > running pgbench on the same machine as postgres.
>
> I didn't expect my patch to increase performance in all workloads. I was
> mainly aiming to fix high-latency connections. Based on your results
> that looks like a 4% reduction in performance of client-side data
> generation. I had thought maybe it is worth having a flag to keep the
> old way too, but I am not sure a 4% hit is really that big of a deal.
>
> > I'm interested to see what numbers you get. You'd need to test on
> > PG16 however. I left the old code in place to generate the decimal
> > numbers for versions < 16.
>
> I will try to test this soon and follow up on the thread. I definitely
> see no problems with your patch as is though. I would be more than happy
> to rebase my patches on yours.

Finally got around to doing more benchmarking. Using an EC2 instance hosted in Ireland, and my client laptop in Austin, Texas. Workload: pgbench -i -s 500

master (9aee26a491)

done in 1369.41 s (drop tables 0.21 s, create tables 0.72 s, client-side generate 1336.44 s, vacuum 1.02 s, primary keys 31.03 s).
done in 1318.31 s (drop tables 0.21 s, create tables 0.72 s, client-side generate 1282.67 s, vacuum 1.02 s, primary keys 33.69 s).

copy

done in 307.42 s (drop tables 0.21 s, create tables 0.82 s, client-side generate 270.95 s, vacuum 1.02 s, primary keys 34.42 s).

david

done in 1311.14 s (drop tables 0.72 s, create tables 0.72 s, client-side generate 1274.98 s, vacuum 0.94 s, primary keys 33.79 s).
done in 1340.18 s (drop tables 0.14 s, create tables 0.59 s, client-side generate 1304.78 s, vacuum 0.92 s, primary keys 33.75 s).

copy + david

done in 348.70 s (drop tables 0.23 s, create tables 0.72 s, client-side generate 312.94 s, vacuum 0.92 s, primary keys 33.90 s).

I ran two tests for master and your patch, David. For the last test, I adapted your patch onto mine. I am still seeing the huge performance gains on my branch.

--
Tristan Partin
Neon (https://neon.tech)
Here is a v2. It cleans up the output when printing to a tty. The last "x of y tuples" line gets overwritten now, so the final output looks like: dropping old tables... creating tables... generating data (client-side)... vacuuming... creating primary keys... done in 0.14 s (drop tables 0.01 s, create tables 0.01 s, client-side generate 0.05 s, vacuum 0.03 s, primary keys 0.03 s). -- Tristan Partin Neon (https://neon.tech)
Again, I forget to actually attach. Holy guacamole. -- Tristan Partin Neon (https://neon.tech)
Attachment
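The v2 tty cleanup described above boils down to rewriting the progress line in place and erasing it when the run finishes. A hedged sketch of that behavior (pgbench's real implementation is C; the escape sequence and line format here are illustrative):

```python
import sys

def progress_output(done, total, is_tty):
    """Return the text to emit for one progress update."""
    line = f"{done} of {total} tuples ({100 * done // total}%) done"
    if not is_tty:
        return line + "\n"        # non-tty (logs): one line per update
    out = "\r" + line             # tty: overwrite the previous progress line
    if done == total:
        out += "\r\x1b[K"         # erase the line once generation finishes
    return out

def report_progress(done, total, stream=sys.stdout, is_tty=None):
    if is_tty is None:
        is_tty = stream.isatty()
    stream.write(progress_output(done, total, is_tty))
```

With this shape, the final terminal output shows only the phase lines ("generating data (client-side)...", "vacuuming...", etc.), matching the v2 sample output above.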
On Wed, Jun 14, 2023 at 10:58:06AM -0500, Tristan Partin wrote:
> Again, I forget to actually attach. Holy guacamole.

Looks rather OK seen from here; applied 0001 while browsing the series. I have a few comments about 0002.

 static void
-initGenerateDataClientSide(PGconn *con)
+initBranch(PGconn *con, PQExpBufferData *sql, int64 curr)
+{
+	/* "filler" column defaults to NULL */
+	printfPQExpBuffer(sql,
+			  INT64_FORMAT "\t0\t\n",
+			  curr + 1);
+}
+
+static void
+initTeller(PGconn *con, PQExpBufferData *sql, int64 curr)
+{
+	/* "filler" column defaults to NULL */
+	printfPQExpBuffer(sql,
+			  INT64_FORMAT "\t" INT64_FORMAT "\t0\t\n",
+			  curr + 1, curr / ntellers + 1);
+}
+
+static void
+initAccount(PGconn *con, PQExpBufferData *sql, int64 curr)
+{
+	/* "filler" column defaults to blank padded empty string */
+	printfPQExpBuffer(sql,
+			  INT64_FORMAT "\t" INT64_FORMAT "\t0\t\n",
+			  curr + 1, curr / naccounts + 1);
+}

I was looking at that, and while this strikes me as a duplication for the second and third ones, I'm OK with the use of a callback to fill in the data, even if naccounts and ntellers are equivalent to the "base" number given to initPopulateTable(), because the "filler" column has different expectations for each table. These routines don't need a PGconn argument, by the way.

 	/* use COPY with FREEZE on v14 and later without partitioning */
 	if (partitions == 0 && PQserverVersion(con) >= 140000)
-		copy_statement = "copy pgbench_accounts from stdin with (freeze on)";
+		copy_statement_fmt = "copy %s from stdin with (freeze on)";
 	else
-		copy_statement = "copy pgbench_accounts from stdin";
+		copy_statement_fmt = "copy %s from stdin";

This seems a bit incorrect, because partitioning only applies to pgbench_accounts, no? This change means that the teller and branch tables would not benefit from FREEZE under a pgbench compiled with this patch and a backend version of 14 or newer if --partitions is used.

So, perhaps "partitions" should be an argument of initPopulateTable() specified for each table?
--
Michael
Attachment
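The per-table callback structure Michael reviews above can be sketched in a language-neutral way. This is illustrative Python mirroring the patch's initBranch/initTeller/initAccount shape, not the actual C; the default ratios (10 tellers and 100,000 accounts per branch) are the standard pgbench scaling constants:

```python
ntellers = 10          # tellers per branch (pgbench default)
naccounts = 100_000    # accounts per branch (pgbench default)

def branch_row(curr):
    # "filler" column defaults to NULL
    return f"{curr + 1}\t0\t\n"

def teller_row(curr):
    return f"{curr + 1}\t{curr // ntellers + 1}\t0\t\n"

def account_row(curr):
    # "filler" column defaults to blank-padded empty string
    return f"{curr + 1}\t{curr // naccounts + 1}\t0\t\n"

# One table-driven loop replaces three near-identical COPY loops:
TABLES = [
    ("pgbench_branches", 1, branch_row),
    ("pgbench_tellers", ntellers, teller_row),
    ("pgbench_accounts", naccounts, account_row),
]

def populate(scale):
    """Yield (table, copy_payload) pairs the way initPopulateTable would."""
    for table, base, row_fn in TABLES:
        yield table, "".join(row_fn(i) for i in range(base * scale))
```

Michael's observation holds here too: the row formatters never touch a connection, so the PGconn parameter in the C version is unnecessary.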
On Tue Jul 11, 2023 at 12:03 AM CDT, Michael Paquier wrote: > On Wed, Jun 14, 2023 at 10:58:06AM -0500, Tristan Partin wrote: > static void > -initGenerateDataClientSide(PGconn *con) > +initBranch(PGconn *con, PQExpBufferData *sql, int64 curr) > +{ > + /* "filler" column defaults to NULL */ > + printfPQExpBuffer(sql, > + INT64_FORMAT "\t0\t\n", > + curr + 1); > +} > + > +static void > +initTeller(PGconn *con, PQExpBufferData *sql, int64 curr) > +{ > + /* "filler" column defaults to NULL */ > + printfPQExpBuffer(sql, > + INT64_FORMAT "\t" INT64_FORMAT "\t0\t\n", > + curr + 1, curr / ntellers + 1); > +} > + > +static void > +initAccount(PGconn *con, PQExpBufferData *sql, int64 curr) > +{ > + /* "filler" column defaults to blank padded empty string */ > + printfPQExpBuffer(sql, > + INT64_FORMAT "\t" INT64_FORMAT "\t0\t\n", > + curr + 1, curr / naccounts + 1); > +} > > These routines don't need a PGconn argument, by the way. Nice catch! > /* use COPY with FREEZE on v14 and later without partitioning */ > if (partitions == 0 && PQserverVersion(con) >= 140000) > - copy_statement = "copy pgbench_accounts from stdin with (freeze on)"; > + copy_statement_fmt = "copy %s from stdin with (freeze on)"; > else > - copy_statement = "copy pgbench_accounts from stdin"; > + copy_statement_fmt = "copy %s from stdin"; > > This seems a bit incorrect because partitioning only applies to > pgbench_accounts, no? This change means that the teller and branch > tables would not benefit from FREEZE under a pgbench compiled with > this patch and a backend version of 14 or newer if --partitions is > used. So, perhaps "partitions" should be an argument of > initPopulateTable() specified for each table? Whoops, looks like I didn't read the comment for what the partitions variable means. It only applies to pgbench_accounts as you said. I don't think passing it in as an argument would make much sense. 
Let me know what you think about the change in this new version, which only hits the first portion of the `if` statement if the table is pgbench_accounts. -- Tristan Partin Neon (https://neon.tech)
Attachment
On Tue, Jul 11, 2023 at 09:46:43AM -0500, Tristan Partin wrote:
> On Tue Jul 11, 2023 at 12:03 AM CDT, Michael Paquier wrote:
>> This seems a bit incorrect because partitioning only applies to
>> pgbench_accounts, no? This change means that the teller and branch
>> tables would not benefit from FREEZE under a pgbench compiled with
>> this patch and a backend version of 14 or newer if --partitions is
>> used. So, perhaps "partitions" should be an argument of
>> initPopulateTable() specified for each table?
>
> Whoops, looks like I didn't read the comment for what the partitions
> variable means. It only applies to pgbench_accounts as you said. I don't
> think passing it in as an argument would make much sense. Let me know
> what you think about the change in this new version, which only hits the
> first portion of the `if` statement if the table is pgbench_accounts.

-	/* use COPY with FREEZE on v14 and later without partitioning */
-	if (partitions == 0 && PQserverVersion(con) >= 140000)
-		copy_statement = "copy pgbench_accounts from stdin with (freeze on)";
+	if (partitions == 0 && strcmp(table, "pgbench_accounts") == 0 && PQserverVersion(con) >= 140000)
+		copy_statement_fmt = "copy %s from stdin with (freeze on)";

This would use the freeze option only on pgbench_accounts when no partitioning is defined, but my point was a bit different. We could use the FREEZE option on the teller and branch tables as well, no? Okay, the impact is limited compared to accounts in terms of the amount of data loaded, but perhaps some people like playing with large scaling factors, where this could show a benefit in the initial data loading.

In passing, I have noticed the following sentence in pgbench.sgml:

    Using <literal>g</literal> causes logging to print one message
    every 100,000 rows while generating data for the
    <structname>pgbench_accounts</structname> table.

It would become incorrect, as the same code path would be used for the teller and branch tables.
--
Michael
On Wed Jul 12, 2023 at 1:06 AM CDT, Michael Paquier wrote:
> On Tue, Jul 11, 2023 at 09:46:43AM -0500, Tristan Partin wrote:
> > Whoops, looks like I didn't read the comment for what the partitions
> > variable means.  It only applies to pgbench_accounts as you said.  I don't
> > think passing it in as an argument would make much sense.  Let me know
> > what you think about the change in this new version, which only hits the
> > first portion of the `if` statement if the table is pgbench_accounts.
>
> -	/* use COPY with FREEZE on v14 and later without partitioning */
> -	if (partitions == 0 && PQserverVersion(con) >= 140000)
> -		copy_statement = "copy pgbench_accounts from stdin with (freeze on)";
> +	if (partitions == 0 && strcmp(table, "pgbench_accounts") == 0 && PQserverVersion(con) >= 140000)
> +		copy_statement_fmt = "copy %s from stdin with (freeze on)";
>
> This would use the freeze option only on pgbench_accounts when no
> partitioning is defined, but my point was a bit different.  We could
> use the FREEZE option on the teller and branch tables as well, no?
> Okay, the impact is limited compared to accounts in terms of amount of
> data loaded, but perhaps some people like playing with large scaling
> factors where this could show a benefit in the initial data loading.

Perhaps they should all be keyed off the same option?  It seemed like in
your previous comment you wanted multiple options.  Sorry for not reading
your comment correctly.

> In passing, I have noticed the following sentence in pgbench.sgml:
>     Using <literal>g</literal> causes logging to print one message
>     every 100,000 rows while generating data for the
>     <structname>pgbench_accounts</structname> table.
> It would become incorrect as the same code path would be used for the
> teller and branch tables.

I will amend the documentation to mention all tables rather than being
specific to pgbench_accounts in the next patch revision, pending what
you want to do about the partition CLI argument.
--
Tristan Partin
Neon (https://neon.tech)
On Wed, Jul 12, 2023 at 09:29:35AM -0500, Tristan Partin wrote:
> On Wed Jul 12, 2023 at 1:06 AM CDT, Michael Paquier wrote:
>> This would use the freeze option only on pgbench_accounts when no
>> partitioning is defined, but my point was a bit different.  We could
>> use the FREEZE option on the teller and branch tables as well, no?
>> Okay, the impact is limited compared to accounts in terms of amount of
>> data loaded, but perhaps some people like playing with large scaling
>> factors where this could show a benefit in the initial data loading.
>
> Perhaps they should all be keyed off the same option?  It seemed like in
> your previous comment you wanted multiple options.  Sorry for not reading
> your comment correctly.

I would have thought that --partitions should only apply to the
pgbench_accounts table, while FREEZE should apply wherever it is
possible to use it, aka all the COPY queries except when
pgbench_accounts is a partition.  Would you do something different,
like not applying FREEZE to pgbench_tellers and pgbench_branches as
these have far fewer tuples than pgbench_accounts?
--
Michael
On Wed Jul 12, 2023 at 10:52 PM CDT, Michael Paquier wrote:
> I would have thought that --partitions should only apply to the
> pgbench_accounts table, while FREEZE should apply wherever it is
> possible to use it, aka all the COPY queries except when
> pgbench_accounts is a partition.  Would you do something different,
> like not applying FREEZE to pgbench_tellers and pgbench_branches as
> these have far fewer tuples than pgbench_accounts?

Michael, I think I completely misunderstood you.  From what I can tell,
you want to use FREEZE wherever possible.  I think the new patch covers
your comment and fixes the documentation.  I am hoping that I have
finally gotten what you are looking for.
--
Tristan Partin
Neon (https://neon.tech)
Didn't actually include the changes in the previous patch. -- Tristan Partin Neon (https://neon.tech)
On Wed, Jul 19, 2023 at 01:03:21PM -0500, Tristan Partin wrote:
> Didn't actually include the changes in the previous patch.

-initGenerateDataClientSide(PGconn *con)
+initBranch(PQExpBufferData *sql, int64 curr)
 {
-	PQExpBufferData sql;
+	/* "filler" column defaults to NULL */
+	printfPQExpBuffer(sql,
+					  INT64_FORMAT "\t0\t\n",
+					  curr + 1);
+}
+
+static void
+initTeller(PQExpBufferData *sql, int64 curr)
+{
+	/* "filler" column defaults to NULL */
+	printfPQExpBuffer(sql,
+					  INT64_FORMAT "\t" INT64_FORMAT "\t0\t\n",
+					  curr + 1, curr / ntellers + 1);

Hmm.  Something's not right here, see:

=# select count(*) has_nulls from pgbench_accounts where filler is null;
 has_nulls
-----------
         0
(1 row)
=# select count(*) > 0 has_nulls from pgbench_tellers where filler is null;
 has_nulls
-----------
 f
(1 row)
=# select count(*) > 0 has_nulls from pgbench_branches where filler is null;
 has_nulls
-----------
 f
(1 row)

Note as well this comment in initCreateTables():

	/*
	 * Note: TPC-B requires at least 100 bytes per row, and the "filler"
	 * fields in these table declarations were intended to comply with that.
	 * The pgbench_accounts table complies with that because the "filler"
	 * column is set to blank-padded empty string.  But for all other tables
	 * the columns default to NULL and so don't actually take any space.  We
	 * could fix that by giving them non-null default values.  However, that
	 * would completely break comparability of pgbench results with prior
	 * versions.  Since pgbench has never pretended to be fully TPC-B compliant
	 * anyway, we stick with the historical behavior.
	 */

So this patch causes pgbench to not stick with its historical
behavior, and the change is incompatible with the comments because the
tellers and branches tables don't use NULL for their filler attribute
anymore.
--
Michael
On Wed Jul 19, 2023 at 10:07 PM CDT, Michael Paquier wrote:
> So this patch causes pgbench to not stick with its historical
> behavior, and the change is incompatible with the comments because the
> tellers and branches tables don't use NULL for their filler attribute
> anymore.

Great find.  This was a problem of me just not understanding the COPY
command properly.  The relevant documentation snippet:

> Specifies the string that represents a null value.  The default is \N
> (backslash-N) in text format, and an unquoted empty string in CSV
> format.  You might prefer an empty string even in text format for cases
> where you don't want to distinguish nulls from empty strings.  This
> option is not allowed when using binary format.

This new revision populates the column with the NULL value.

> psql (17devel)
> Type "help" for help.
>
> tristan957=# select count(1) from pgbench_branches;
>  count
> -------
>      1
> (1 row)
>
> tristan957=# select count(1) from pgbench_branches where filler is null;
>  count
> -------
>      1
> (1 row)

Thanks for your testing, Michael.  I went ahead and added a test to make
sure that this behavior doesn't regress accidentally, but I am
struggling to get the test to fail using the previous version of this
patch.  Do you have any advice?  This is my first time writing a test
for Postgres.  I can recreate the issue outside of the test script, but
not within it for whatever reason.
--
Tristan Partin
Neon (https://neon.tech)
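The fix can be sketched in C. This is a simplified stand-in for the patch's printfPQExpBuffer call (formatBranchRow is a hypothetical name): in COPY's text format the default null string is \N, so the filler field must be written as a literal \N rather than left empty — an empty field loads as an empty string, not NULL.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Sketch: format one pgbench_branches row for COPY in text format.
 * Fields are tab-separated; the third field is the "filler" column.
 * Writing "\N" (COPY's default null string) keeps it NULL, whereas an
 * empty field (a bare trailing tab) would load an empty string.
 */
static int
formatBranchRow(char *buf, size_t buflen, int64_t curr)
{
	return snprintf(buf, buflen, "%" PRId64 "\t0\t\\N\n", curr + 1);
}
```

For the first branch (curr = 0) this produces the line `1<TAB>0<TAB>\N`, which COPY loads as bid = 1, bbalance = 0, filler = NULL.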
On Thu, Jul 20, 2023 at 02:22:51PM -0500, Tristan Partin wrote:
> Thanks for your testing, Michael.  I went ahead and added a test to make
> sure that this behavior doesn't regress accidentally, but I am struggling
> to get the test to fail using the previous version of this patch.  Do you
> have any advice?  This is my first time writing a test for Postgres.  I
> can recreate the issue outside of the test script, but not within it for
> whatever reason.

We're all here to learn, and writing TAP tests is important these days
for a lot of patches.

+# Check that the pgbench_branches and pgbench_tellers filler columns are filled
+# with NULLs
+foreach my $table ('pgbench_branches', 'pgbench_tellers') {
+	my ($ret, $out, $err) = $node->psql(
+		'postgres',
+		"SELECT COUNT(1) FROM $table;
+		 SELECT COUNT(1) FROM $table WHERE filler is NULL;",
+		extra_params => ['--no-align', '--tuples-only']);
+
+	is($ret, 0, "psql $table counts status is 0");
+	is($err, '', "psql $table counts stderr is empty");
+	if ($out =~ m/^(\d+)\n(\d+)$/g) {
+		is($1, $2, "psql $table filler column filled with NULLs");
+	} else {
+		fail("psql $table stdout m/^(\\d+)\n(\\d+)$/g");
+	}
+}

This is IMO hard to parse, and I'd rather add some checks for the
accounts and history tables as well.  Let's use four simple SQL queries
like what I posted upthread (no data for history inserted after
initialization), as of the attached.  I'd be tempted to apply that first
as a separate patch, because we've never added coverage for that and we
have specific expectations in the code from this filler column for
tpc-b.  And this needs to cover both client-side and server-side data
generation.  Note that the indentation was a bit incorrect, so fixed
while on it.

Attached is a v7, with these tests (should be a patch on its own but
I'm lazy to split this morning) and some more adjustments that I have
done while going through the patch.  What do you think?
--
Michael
On Thu Jul 20, 2023 at 9:14 PM CDT, Michael Paquier wrote: > Attached is a v7, with these tests (should be a patch on its own but > I'm lazy to split this morning) and some more adjustments that I have > done while going through the patch. What do you think? v7 looks good from my perspective. Thanks for working through this patch with me. Much appreciated. -- Tristan Partin Neon (https://neon.tech)
On Fri, Jul 21, 2023 at 12:22:06PM -0500, Tristan Partin wrote: > v7 looks good from my perspective. Thanks for working through this patch > with me. Much appreciated. Cool. I have applied the new tests for now to move on with this thread. -- Michael
On Sun, Jul 23, 2023 at 08:21:51PM +0900, Michael Paquier wrote:
> Cool.  I have applied the new tests for now to move on with this
> thread.

I have done a few more things on this patch today, including
measurements on a local host with large scaling numbers.  For instance,
one of my hosts took around 36.8s with the INSERTs and 36.5s with the
COPY when loading data at scale=500 (average of 4 runs) with -I dtg.
Greg's upthread point is true as well that server-side generation is
the best-suited option for high-latency connections, but I don't see a
reason not to switch to a COPY, as this option is hidden behind -I,
just for the sake of improving the default option set.  So, at the end,
applied.
--
Michael
Michael, Once again I appreciate your patience with this patchset. Thanks for your help and reviews. -- Tristan Partin Neon (https://neon.tech)