Thread: CPU costs of random_zipfian in pgbench
Hi,

I'm trying to use random_zipfian() for benchmarking of skewed data sets, and I ran head-first into an issue with rather excessive CPU costs. Consider an example like this:

    pgbench -i -s 10000 test
    pgbench -s 10000 -f zipf.sql -T 30 test

where zipf.sql does this:

    \set id random_zipfian(1, 100000 * :scale, 0.1)
    BEGIN;
    SELECT * FROM pgbench_accounts WHERE aid = :id;
    END;

Unfortunately, this produces something like this:

    transaction type: zipf.sql
    scaling factor: 10000
    query mode: simple
    number of clients: 1
    number of threads: 1
    duration: 30 s
    number of transactions actually processed: 1
    latency average = 43849.143 ms
    tps = 0.022805 (including connections establishing)
    tps = 0.022806 (excluding connections establishing)

which is somewhat ... not great, I guess. This happens because generalizedHarmonicNumber() does this:

    for (i = n; i > 1; i--)
        ans += pow(i, -s);

where n happens to be 1000000000 (the range passed to random_zipfian), so the loop takes quite a bit of time. So much that we only ever complete a single transaction, because this work happens in the context of the first transaction, and so it counts towards the 30-second limit.

The docs actually do mention performance of this function:

    The function's performance is poor for parameter values close and
    above 1.0 and on a small range.

But that does not seem to cover the example I just posted, because 0.1 is not particularly close to 1.0 (or above 1.0), and 1e9 values hardly counts as a "small range".

I see this part of random_zipfian comes from "Quickly Generating Billion-Record Synthetic Databases", which deals with generating data sets, where wasting a bit of time is not a huge deal. But when running benchmarks it matters much more. So maybe there's a disconnect here?

Interestingly enough, I don't see this issue for values above 1.0, no matter how close to 1.0 they are. Although the throughput seems lower than with uniform access, so this part of random_zipfian is not quite cheap either.
Now, I don't know what to do about this. Ideally, we'd have a faster algorithm to generate zipfian distributions - I don't know if such a thing exists, though. Or maybe we could precompute the distribution first, not counting it towards the benchmark duration. But I guess the very least we can/should do is improve the docs to make it more obvious which cases are expected to be slow.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello Tomas,

> I'm trying to use random_zipfian() for benchmarking of skewed data sets,
> and I ran head-first into an issue with rather excessive CPU costs.
> [...] This happens because generalizedHarmonicNumber() does this:
>
>     for (i = n; i > 1; i--)
>         ans += pow(i, -s);
>
> where n happens to be 1000000000 (range passed to random_zipfian), so
> the loop takes quite a bit of time.

If you find a better formula for the harmonic number, you are welcome and will probably get your name on it:-)

> So much that we only ever complete a single transaction, because this
> work happens in the context of the first transaction, and so it counts
> into the 30-second limit.

That is why the value is cached, so it is computed once per (size, parameter) pair.

If you want skewed but not especially zipfian, use exponential, which is quite cheap. Also, zipfian with a parameter > 1.0 does not have to compute the harmonic number, so it depends on the parameter.

Please also beware that non-uniform keys are correlated (the most frequent values are close to one another), which is somewhat unrepresentative of what you would expect in real life. This is why I submitted a pseudo-random permutation function, which alas has not gotten much momentum from committers.

> The docs actually do mention performance of this function:
>
>     The function's performance is poor for parameter values close and
>     above 1.0 and on a small range.
>
> But that does not seem to cover the example I just posted, because 0.1
> is not particularly close to 1.0 (or above 1.0), and 1e9 values hardly
> counts as "small range".

Yep. The warning is about the repeated cost of calling random_zipfian, which is not good when the parameter is close to and above 1.0. It is not about the setup cost when the parameter is between 0 and 1. This could indeed deserve a warning.

Now if you do benchmarking on a database, you probably want to run for hours to level out checkpointing and other background tasks, so the setup cost should be negligible in the end...
> I see this part of random_zipfian comes from "Quickly Generating
> Billion-Record Synthetic Databases" which deals with generating data
> sets, where wasting a bit of time is not a huge deal. But when running
> benchmarks it matters much more. So maybe there's a disconnect here?

Hmmm. The first author of this paper got a Turing award:-) The disconnect is just that the setup cost is neglected, or computed offline.

> Interestingly enough, I don't see this issue on values above 1.0, no
> matter how close to 1.0 those are. Although the throughput seems lower
> than with uniform access, so this part of random_zipfian is not quite
> cheap either.

Indeed. Pg provides an underlying uniform pseudo-random function, so generating uniform values is cheap. Other distributions need more or less expensive transformations.

> Now, I don't know what to do about this. Ideally, we'd have a faster
> algorithm to generate zipfian distributions

You are welcome to find one, and to get famous (hmmm... among some specialized mathematicians at least:-) for it.

> - I don't know if such thing exists though. Or maybe we could precompute
> the distribution first, not counting it into the benchmark duration.

Could be done, but this would require a significant partial-evaluation effort, and it would only work when the size and parameter are constants (although using the function with variable parameters would be a very bad idea anyway).

> But I guess the very least we can/should do is improving the docs to
> make it more obvious which cases are expected to be slow.

Yep. Attached is a doc & comment improvement.

--
Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> I'm trying to use random_zipfian() for benchmarking of skewed data sets,
>> and I ran head-first into an issue with rather excessive CPU costs.

> If you want skewed but not especially zipfian, use exponential which is
> quite cheap. Also zipfian with a > 1.0 parameter does not have to compute
> the harmonic number, so it depends in the parameter.

Maybe we should drop support for parameter values < 1.0, then. The idea that pgbench is doing something so expensive as to require caching seems flat-out insane from here. That cannot be seen as anything but a foot-gun for unwary users. Under what circumstances would an informed user use that random distribution rather than another far-cheaper-to-compute one?

> ... This is why I submitted a pseudo-random permutation
> function, which alas does not get much momentum from committers.

TBH, I think pgbench is now much too complex; it does not need more features, especially not ones that need large caveats in the docs. (What exactly is the point of having zipfian at all?)

			regards, tom lane
On Sun, Feb 17, 2019 at 11:09:27AM -0500, Tom Lane wrote:
> Fabien COELHO <coelho@cri.ensmp.fr> writes:
>>> I'm trying to use random_zipfian() for benchmarking of skewed data sets,
>>> and I ran head-first into an issue with rather excessive CPU costs.
>
>> If you want skewed but not especially zipfian, use exponential which is
>> quite cheap. Also zipfian with a > 1.0 parameter does not have to compute
>> the harmonic number, so it depends in the parameter.
>
> Maybe we should drop support for parameter values < 1.0, then. The idea
> that pgbench is doing something so expensive as to require caching seems
> flat-out insane from here. That cannot be seen as anything but a foot-gun
> for unwary users. Under what circumstances would an informed user use
> that random distribution rather than another far-cheaper-to-compute one?
>
>> ... This is why I submitted a pseudo-random permutation
>> function, which alas does not get much momentum from committers.
>
> TBH, I think pgbench is now much too complex; it does not need more
> features, especially not ones that need large caveats in the docs.
> (What exactly is the point of having zipfian at all?)

Taking a statistical perspective, Zipfian distributions violate the assumptions we bake in when we assume uniform distributions. This matters because Zipf-distributed data sets are quite common in real life.

Best,
David.

--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On 2/17/19 6:33 PM, David Fetter wrote:
> On Sun, Feb 17, 2019 at 11:09:27AM -0500, Tom Lane wrote:
>> Fabien COELHO <coelho@cri.ensmp.fr> writes:
>>>> I'm trying to use random_zipfian() for benchmarking of skewed data sets,
>>>> and I ran head-first into an issue with rather excessive CPU costs.
>>
>>> If you want skewed but not especially zipfian, use exponential which is
>>> quite cheap. Also zipfian with a > 1.0 parameter does not have to compute
>>> the harmonic number, so it depends in the parameter.
>>
>> Maybe we should drop support for parameter values < 1.0, then. The idea
>> that pgbench is doing something so expensive as to require caching seems
>> flat-out insane from here. That cannot be seen as anything but a foot-gun
>> for unwary users. Under what circumstances would an informed user use
>> that random distribution rather than another far-cheaper-to-compute one?
>>
>>> ... This is why I submitted a pseudo-random permutation
>>> function, which alas does not get much momentum from committers.
>>
>> TBH, I think pgbench is now much too complex; it does not need more
>> features, especially not ones that need large caveats in the docs.
>> (What exactly is the point of having zipfian at all?)
>
> Taking a statistical perspective, Zipfian distributions violate some
> assumptions we make by assuming uniform distributions. This matters
> because Zipf-distributed data sets are quite common in real life.

I don't think there's any disagreement about the value of non-uniform distributions. The question is whether it has to be a zipfian one, when the best algorithm we know of is this expensive in some cases. Or would an exponential distribution be enough?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2/17/19 5:09 PM, Tom Lane wrote:
> Fabien COELHO <coelho@cri.ensmp.fr> writes:
>>> I'm trying to use random_zipfian() for benchmarking of skewed data sets,
>>> and I ran head-first into an issue with rather excessive CPU costs.
>
>> If you want skewed but not especially zipfian, use exponential which is
>> quite cheap. Also zipfian with a > 1.0 parameter does not have to compute
>> the harmonic number, so it depends in the parameter.
>
> Maybe we should drop support for parameter values < 1.0, then. The idea
> that pgbench is doing something so expensive as to require caching seems
> flat-out insane from here.

Maybe. It's not quite clear to me why we support the two modes at all. We use one algorithm for values < 1.0 and another one for values > 1.0 -- what's the difference there? Are those distributions materially different?

Also, I wonder if just dropping support for parameters < 1.0 would be enough, because the docs say:

    The function's performance is poor for parameter values close and
    above 1.0 and on a small range.

which seems to suggest it might be slow even for values > 1.0 in some cases. Not sure.

> That cannot be seen as anything but a foot-gun
> for unwary users. Under what circumstances would an informed user use
> that random distribution rather than another far-cheaper-to-compute one?
>
>> ... This is why I submitted a pseudo-random permutation
>> function, which alas does not get much momentum from committers.
>
> TBH, I think pgbench is now much too complex; it does not need more
> features, especially not ones that need large caveats in the docs.
> (What exactly is the point of having zipfian at all?)

I wonder about the growing complexity of pgbench too ...

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Feb 17, 2019 at 8:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> TBH, I think pgbench is now much too complex; it does not need more
> features, especially not ones that need large caveats in the docs.
> (What exactly is the point of having zipfian at all?)

I agree that pgbench is too complex, given its mandate and design. While I found Zipfian useful once or twice, I probably would have done just as well with an exponential distribution.

I have been using BenchmarkSQL as a fair-use TPC-C implementation for my indexing project, with great results. pgbench just isn't very useful when validating the changes to B-Tree page splits that I propose, because the insertion pattern cannot be modeled probabilistically. Besides, I really think that things like latency graphs are table stakes for this kind of work, which BenchmarkSQL offers out of the box. It isn't practical to make pgbench into a framework, which is what I'd really like to see. There just isn't that much more that can be done there.

BenchmarkSQL seems to have recently become abandonware, though it's abandonware that I rely on. OpenSCG were acquired by Amazon, and the Bitbucket repository vanished without explanation. I would be very pleased if Fabien or somebody else considered maintaining it going forward -- it still has a lot of rough edges, and could still stand to be improved in a number of ways (I know that Fabien is interested in both indexing and benchmarking, which is why I thought of him). TPC-C is a highly studied benchmark, and I'm sure that there is more we can learn from it. I maintain a mirror of BenchmarkSQL here:

https://github.com/petergeoghegan/benchmarksql/

There are at least 2 or 3 other fair-use implementations of TPC-C that work with Postgres that I'm aware of, all of which seem to have several major problems. BenchmarkSQL is a solid basis for an externally maintained, de facto standard Postgres benchmarking tool that comes with "batteries included".

--
Peter Geoghegan
Hello Peter,

My 0.02€: I'm not quite interested in maintaining a tool for *one* benchmark, whatever the benchmark, its standardness or quality.

What I like in "pgbench" is that it is both versatile and simple, so that people can benchmark their own data with their own load and their own queries by writing a few lines of trivial SQL and psql-like slash commands and adjusting a few options, and extract meaningful statistics out of it. I've been, but not only me, improving it so that it keeps its usage simplicity but provides the key features needed so that anyone can write a simple but realistic benchmark. Those features, which happen to be nearly all there now, are:

- some expressions (thanks Robert for the initial push)

- non-uniform randoms (ok, some are more expensive, too bad); however, using non-uniform randoms generates a correlation issue, hence the permutation function submission, which took time because this is a non-trivial problem.

- conditionals (\if, taken from psql's implementation)

- getting a result out and being able to do something with it (\gset, and the associated \cset that Tom does not like).

- improved reporting (including around latency, per script/command/...)

- realistic loads (--rate vs only pedal-to-the-metal runs, --latency-limit)

I have not encountered other tools with this versatility and simplicity. The TPC-C implementation you point out, and others I have seen, are structurally targeted at TPC-C and nothing else. I do not care about TPC-C per se, I care about people being able to run relevant benchmarks with minimal effort.

I'm not planning to submit many things in the future (currently: a strict-tpcb implementation which is really a showcase of the existing features, faster server-side initialization, simple refactoring to simplify/clarify the code structure here and there, and maybe some stuff may migrate to fe_utils if useful to psql), and to review what other people find useful, because I know the code base quite well.
I do think that the maintainability of the code has globally improved recently, because (1) the process-based implementation has been dropped, and (2) the FSA implementation makes the code easier to understand and check, compared to the lengthy plenty-of-ifs, many-variables function used beforehand. Bugs have been identified and fixed.

> I agree that pgbench is too complex, given its mandate and design.
> While I found Zipfian useful once or twice, I probably would have done
> just as well with an exponential distribution.

Yep, I agree that exponential is mostly okay for most practical benchmarking uses, but some benchmarks/people seem to really want zipf, so zipf and its intrinsic underlying complexity were submitted and finally included.

> I have been using BenchmarkSQL as a fair-use TPC-C implementation for
> my indexing project, with great results. pgbench just isn't very
> useful when validating the changes to B-Tree page splits that I
> propose, because the insertion pattern cannot be modeled
> probabilistically.

I do not understand the use case, and why pgbench could not be used for this purpose.

> Besides, I really think that things like latency graphs are table stakes
> for this kind of work, which BenchmarkSQL offers out of the box. It
> isn't practical to make pgbench into a framework, which is what I'd
> really like to see. There just isn't that much more than can be done
> there.

Yep. Pgbench only does "simple stats". I script around the per-second progress output for graphical display and additional stats (eg 5 number summary…).

--
Fabien.
On Sun, Feb 17, 2019 at 11:02:37PM +0100, Tomas Vondra wrote:
> On 2/17/19 6:33 PM, David Fetter wrote:
>> On Sun, Feb 17, 2019 at 11:09:27AM -0500, Tom Lane wrote:
>>> Fabien COELHO <coelho@cri.ensmp.fr> writes:
>>>>> I'm trying to use random_zipfian() for benchmarking of skewed data sets,
>>>>> and I ran head-first into an issue with rather excessive CPU costs.
>>>
>>>> If you want skewed but not especially zipfian, use exponential which is
>>>> quite cheap. Also zipfian with a > 1.0 parameter does not have to compute
>>>> the harmonic number, so it depends in the parameter.
>>>
>>> Maybe we should drop support for parameter values < 1.0, then. The idea
>>> that pgbench is doing something so expensive as to require caching seems
>>> flat-out insane from here. That cannot be seen as anything but a foot-gun
>>> for unwary users. Under what circumstances would an informed user use
>>> that random distribution rather than another far-cheaper-to-compute one?
>>>
>>>> ... This is why I submitted a pseudo-random permutation
>>>> function, which alas does not get much momentum from committers.
>>>
>>> TBH, I think pgbench is now much too complex; it does not need more
>>> features, especially not ones that need large caveats in the docs.
>>> (What exactly is the point of having zipfian at all?)
>>
>> Taking a statistical perspective, Zipfian distributions violate some
>> assumptions we make by assuming uniform distributions. This matters
>> because Zipf-distributed data sets are quite common in real life.
>
> I don't think there's any disagreement about the value of non-uniform
> distributions. The question is whether it has to be a zipfian one, when
> the best algorithm we know about is this expensive in some cases? Or
> would an exponential distribution be enough?

I suppose people who care about the difference between Zipf and exponential would appreciate having the former around to test.
Whether pgbench should support this is a different question, and it's sounding a little like the answer to that one is "no."

Best,
David.

--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Tue, Feb 19, 2019 at 7:14 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> What I like in "pgbench" is that it is both versatile and simple so that
> people can benchmark their own data with their own load and their own
> queries by writing a few lines of trivial SQL and psql-like slash command
> and adjusting a few options, and extract meaningful statistics out of it.

That's also what I like about it. However, I don't think that pgbench is capable of helping me answer questions that are not relatively simple. That is going to become less and less interesting over time.

> I have not encountered other tools with this versatility and simplicity.
> The TPC-C implementation you point out and others I have seen are
> structurally targetted at TPC-C and nothing else. I do not care about
> TPC-C per se, I care about people being able to run relevant benchmarks
> with minimal effort.

Lots and lots of people care about TPC-C. Far more than care about TPC-B, which has been officially obsolete for a long time. I don't doubt that there are some bad reasons for the interest that you see from vendors, but the TPC-C stuff has real merit (just read Jim Gray, whom you referenced in relation to the Zipfian generator). Lots of smart people worked for a couple of years on the original specification of TPC-C. There are a lot of papers on TPC-C. It *is* complicated in various ways, which is a good thing, as it approximates a real-world workload and exercises a bunch of code paths that TPC-B does not. TPC-A and TPC-B were early attempts, and managed to be better than nothing at a time when performance validation was not nearly as advanced as it is today.

>> I have been using BenchmarkSQL as a fair-use TPC-C implementation for
>> my indexing project, with great results. pgbench just isn't very
>> useful when validating the changes to B-Tree page splits that I
>> propose, because the insertion pattern cannot be modeled
>> probabilistically.
> I do not understand the use case, and why pgbench could not be used for
> this purpose.

TPC-C is characterized by *localized* monotonically increasing insertions in most of its indexes. By far the biggest index is the order lines table primary key, which is on '(ol_w_id, ol_d_id, ol_o_id, ol_number)'. You get pathological performance with this currently, because you should really split at the point that new items are inserted, but we do a standard 50/50 page split. The left half of the page isn't inserted into again (except by rare non-HOT updates), so you end up *reliably* wasting about half of all space in the index. IOW, there are cases where we should behave as if we're doing a rightmost page split (kind of) that don't happen to involve the rightmost page. The problem was described but not diagnosed in this blog post:

https://www.commandprompt.com/blog/postgres_autovacuum_bloat_tpc-c/

If you had random insertions (or insertions that were characterized or defined in terms of a probability distribution and range), then you would not see this problem. Instead, you'd get something like 70% space utilization -- not 50% utilization. I think that it would be difficult if not impossible to reproduce the pathological performance with pgbench, even though it's a totally realistic scenario. There needs to be explicit overall ordering/phases across co-operating backends, or backends that are in some sense localized (e.g. associated with a particular warehouse inputting a particular order). TPC-C offers several variations of this same pathological case.

This is just an example. The point is that there is a lot to be said for investing significant effort in coming up with a benchmark that is a distillation of a real workload, with realistic though still somewhat adversarial bottlenecks. I wouldn't have become aware of the page split problem without TPC-C, which suggests to me that the TPC people know what they're doing.
Also, there is an advantage to having something that is a known quantity, enabling comparisons across systems. I also think that TPC-E is interesting, since it stresses OLTP systems in a way that is quite different from TPC-C. It's much more read-heavy, and has many more secondary indexes.

> Yep. Pgbench only does "simple stats". I script around the per-second
> progress output for graphical display and additional stats (eg 5 number
> summary…).

It's far easier to spot regressions over time and other such surprises if you have latency graphs that break down latency by transaction. When you're benchmarking queries with joins, then you need to be vigilant about planner issues over time. The complexity has its pluses as well as its minuses.

I'm hardly in a position to tell you what to work on. I think that there may be another perspective on this that you could take something away from, though.

--
Peter Geoghegan
On Sun, Feb 17, 2019 at 10:52 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> I'm trying to use random_zipfian() for benchmarking of skewed data sets,
>> and I ran head-first into an issue with rather excessive CPU costs.
>> [...] This happens because generalizedHarmonicNumber() does this:
>>
>>     for (i = n; i > 1; i--)
>>         ans += pow(i, -s);
>>
>> where n happens to be 1000000000 (range passed to random_zipfian), so
>> the loop takes quite a bit of time.
>
> If you find a better formula for the harmonic number, you are welcome
> and probably get your name on it:-)
There are pretty good approximations for s > 1.0 using the Riemann zeta function, and Euler derived a formula for the s = 1 case.
I also noticed that i is int in this function, but n is int64. That seems like an oversight.
Regards,
Ants Aasma
On 2/22/19 11:22 AM, Ants Aasma wrote:
> On Sun, Feb 17, 2019 at 10:52 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>>> I'm trying to use random_zipfian() for benchmarking of skewed data sets,
>>> and I ran head-first into an issue with rather excessive CPU costs.
>>> [...] This happens because generalizedHarmonicNumber() does this:
>>>
>>>     for (i = n; i > 1; i--)
>>>         ans += pow(i, -s);
>>>
>>> where n happens to be 1000000000 (range passed to random_zipfian), so
>>> the loop takes quite a bit of time.
>>
>> If you find a better formula for the harmonic number, you are welcome
>> and probably get your name on it:-)
>
> There are pretty good approximations for s > 1.0 using the Riemann zeta
> function and Euler derived a formula for the s = 1 case.

I believe that's what random_zipfian() already uses, because for s > 1.0 it refers to "Non-Uniform Random Variate Generation" by Luc Devroye, and that text references the zeta function. Also, I have not observed serious issues with the s > 1.0 case (though the docs seem to suggest there may be some).

> I also noticed that i is int in this function, but n is int64. That
> seems like an oversight.

Indeed.

cheers

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>> There are pretty good approximations for s > 1.0 using Riemann zeta
>> function and Euler derived a formula for the s = 1 case.
>
> I believe that's what random_zipfian() already uses, because for s > 1.0
> it refers to "Non-Uniform Random Variate Generation" by Luc Devroye, and
> the text references the zeta function.

Yep.

> Also, I have not observed serious issues with the s > 1.0 case (despite
> the docs seem to suggest there may be some).

The performance issue is for s > 1.0 but very close to 1.0, e.g. things like s = 1.000001.

>> I also noticed that i is int in this function, but n is int64. That
>> seems like an oversight.

Indeed, that is a bug! Using it with a genuinely int64-sized value would be very bad, though. Maybe there should be an error if the value is too large, because calling pow() billions of times is bad for the computer's health.

--
Fabien.
>>> I also noticed that i is int in this function, but n is int64. That
>>> seems like an oversight.
>
> Indeed, that is a bug!

Here is a v2 with hopefully better wording, comments, and a fix for the bug you pointed out.

--
Fabien.
The following review has been posted through the commitfest application:

    make installcheck-world: not tested
    Implements feature:      not tested
    Spec compliant:          not tested
    Documentation:           not tested

For whatever it is worth, the patch looks good to me. A minor nitpick would be to use a verb in the part:

    `cost when the parameter in (0, 1)`

maybe:

    `cost when the parameter's value is in (0, 1)`

or similar. Apart from that, I would suggest that the patch be moved to Waiting for Author state.
> For whatever it is worth, the patch looks good to me.
>
> A minor nitpick would be to use a verb in the part:
>
>     `cost when the parameter in (0, 1)`
>
> maybe:
>
>     `cost when the parameter's value is in (0, 1)` or similar.

Looks ok.

> Apart from that, I would suggest it that the patch could be moved to
> Waiting for Author state.

Attached is an updated version.

--
Fabien.
The following review has been posted through the commitfest application:

    make installcheck-world: not tested
    Implements feature:      not tested
    Spec compliant:          not tested
    Documentation:           not tested

Version 3 of the patch looks ready for committer. Thank you for taking the time to code!

The new status of this patch is: Ready for Committer
Fabien COELHO <coelho@cri.ensmp.fr> writes:
> [ pgbench-zipf-doc-3.patch ]

I started to look through this, and the more I looked the more unhappy I got that we're having this discussion at all. The zipfian support in pgbench is seriously over-engineered and under-documented. As an example, I was flabbergasted to find out that the end-of-run summary statistics now include this:

    /* Report zipfian cache overflow */
    for (i = 0; i < nthreads; i++)
    {
        totalCacheOverflows += threads[i].zipf_cache.overflowCount;
    }
    if (totalCacheOverflows > 0)
    {
        printf("zipfian cache array overflowed %d time(s)\n", totalCacheOverflows);
    }

What is the point of that, and if there is a point, why is it nowhere mentioned in pgbench.sgml? What would a user do with this information, and how would they know what to do?

I remain of the opinion that we ought to simply rip out support for zipfian with s < 1. It's not useful for benchmarking purposes to have a random-number function with such poor computational properties. I think leaving it in there is just a foot-gun: we'd be a lot better off throwing an error that tells people to use some other distribution. Or if we do leave it in there, we for sure have to have documentation that *actually* explains how to use it, which this patch still doesn't. There's nothing suggesting that you'd better not use a large number of different (n,s) combinations.

			regards, tom lane
Hello Tom,

> I started to look through this, and the more I looked the more unhappy
> I got that we're having this discussion at all. The zipfian support
> in pgbench is seriously over-engineered and under-documented. As an
> example, I was flabbergasted to find out that the end-of-run summary
> statistics now include this:
>
>     /* Report zipfian cache overflow */
>     for (i = 0; i < nthreads; i++)
>     {
>         totalCacheOverflows += threads[i].zipf_cache.overflowCount;
>     }
>     if (totalCacheOverflows > 0)
>     {
>         printf("zipfian cache array overflowed %d time(s)\n", totalCacheOverflows);
>     }
>
> What is the point of that, and if there is a point, why is it nowhere
> mentioned in pgbench.sgml?

Indeed, it should be.

> What would a user do with this information, and how would they know what
> to do?

Sure, but it was unclear what to do about it. Extending the cache to avoid the overflow would have looked like over-engineering.

> I remain of the opinion that we ought to simply rip out support for
> zipfian with s < 1.

Some people really want zipfian because it reflects their data access pattern, possibly with s < 1. We cannot help it: real life seems zipfianly distributed:-)

> It's not useful for benchmarking purposes to have a random-number
> function with such poor computational properties.

This is mostly a startup cost; the generation cost while a bench is running is reasonable. How best to implement the precomputation is an open question. As a reviewer I was not thrilled by the cache stuff, but I had no better idea that would not fall under "over-over engineering" or the like. Maybe it could error out and say "recompile me", but then someone would have said "that is unacceptable". Maybe it could auto-extend the cache, but that is still more unnecessary over-engineering, IMHO.
Maybe there could be some mandatory declarations or partial eval that could precompute the needed parameters out/before the bench is started, with a clear message "precomputing stuff...", but that would be over over over engineering again... and that would mean restricting random_zipfian parameters to near-constants, which would require some explaining, but maybe it is an option. I guess that in the paper's original context, the parameters (s & n) are known before the bench is started, so that the needed values are computed offline once and for all. > I think leaving it in there is just a foot-gun: we'd be a lot better off > throwing an error that tells people to use some other distribution. When s < 1, the startup cost is indeed a pain. However, it is a pain prescribed by a Turing Award. > Or if we do leave it in there, we for sure have to have documentation > that *actually* explains how to use it, which this patch still doesn't. I'm not sure what explanation there could be about how to use it: one calls the function to obtain pseudo-random integers with the desired distribution? > There's nothing suggesting that you'd better not use a large number of > different (n,s) combinations. Indeed, there is no caveat about this point, as noted above. Please find an updated patch for the documentation, pointing out the existence of the cache and advice not to overdo it. It does not solve the underlying problem which raised your complaint, but at least it is documented. -- Fabien.
Hello again, >> I started to look through this, and the more I looked the more unhappy >> I got that we're having this discussion at all. The zipfian support >> in pgbench is seriously over-engineered and under-documented. As an >> example, I was flabbergasted to find out that the end-of-run summary >> statistics now include this: >> >> /* Report zipfian cache overflow */ >> for (i = 0; i < nthreads; i++) >> { >> totalCacheOverflows += threads[i].zipf_cache.overflowCount; >> } >> if (totalCacheOverflows > 0) >> { >> printf("zipfian cache array overflowed %d time(s)\n", >> totalCacheOverflows); >> } >> >> What is the point of that, and if there is a point, why is it nowhere >> mentioned in pgbench.sgml? The attached patch simplifies the code by erroring on cache overflow, instead of the LRU replacement strategy and unhelpful final report. The above lines are removed. -- Fabien.
Attachment
>>> What is the point of that, and if there is a point, why is it nowhere >>> mentioned in pgbench.sgml? > > The attached patch simplifies the code by erroring on cache overflow, instead > of the LRU replacement strategy and unhelpful final report. The above lines > are removed. Same, but without the compiler warning about an unused variable. Sorry for the noise. -- Fabien.
Attachment
On 3/23/19 7:45 PM, Fabien COELHO wrote: > >>>> What is the point of that, and if there is a point, why is it nowhere >>>> mentioned in pgbench.sgml? >> >> The attached patch simplifies the code by erroring on cache overflow, >> instead of the LRU replacement strategy and unhelpful final report. >> The above lines are removed. > Eh? Do I understand correctly that pgbench might start failing after some random amount of time, instead of reporting the overflow at the end? I'm not sure that's really an improvement ... Why is the cache fixed-size at all? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 3/23/19 6:44 PM, Fabien COELHO wrote: > > Hello Tom, > >> I started to look through this, and the more I looked the more unhappy >> I got that we're having this discussion at all. The zipfian support >> in pgbench is seriously over-engineered and under-documented. As an >> example, I was flabbergasted to find out that the end-of-run summary >> statistics now include this: >> >> /* Report zipfian cache overflow */ >> for (i = 0; i < nthreads; i++) >> { >> totalCacheOverflows += threads[i].zipf_cache.overflowCount; >> } >> if (totalCacheOverflows > 0) >> { >> printf("zipfian cache array overflowed %d time(s)\n", >> totalCacheOverflows); >> } >> >> What is the point of that, and if there is a point, why is it nowhere >> mentioned in pgbench.sgml? > > Indeed, there should. > >> What would a user do with this information, and how would they know >> what to do? > > Sure, but it was unclear what to do. Extending the cache to avoid that > would look like over-engineering. > That seems like a rather strange argument. What exactly is so complex about resizing the cache that it qualifies as over-engineering? If the choice is between reporting the failure to the user, and addressing the failure, surely the latter would be the default option? Particularly if the user can't really address the issue easily (recompiling pgbench is not a very practical solution). >> I remain of the opinion that we ought to simply rip out support for >> zipfian with s < 1. > +1 to that > Some people really want zipfian because it reflects their data access > pattern, possibly with s < 1. > > We cannot help it: real life seems zipfianly distributed:-) > Sure. But that hardly means we ought to provide algorithms that we know are not suitable for benchmarking tools, which I'd argue is this case. Also, we have two algorithms for generating zipfian distributions. Why wouldn't it be sufficient to keep just one of them? 
>> It's not useful for benchmarking purposes to have a random-number >> function with such poor computational properties. > > This is mostly a startup cost, the generation cost when a bench is > running is reasonable. How to best implement the precomputation is an > open question. > ... which means it's not a startup cost. IMHO this simply shows pgbench does not have the necessary infrastructure to provide this feature. > As a reviewer I was not thrilled by the cache stuff, but I had no better > idea that would not fall under "over-over engineering" or the like. > > Maybe it could error out and say "recompile me", but then someone > would have said "that is unacceptable". > > Maybe it could auto extend the cache, but that is still more unnecessary > over-engineering, IMHO. > I'm puzzled. Why would that be over-engineering? > Maybe there could be some mandatory declarations or partial eval that > could precompute the needed parameters out/before the bench is started, > with a clear message "precomputing stuff...", but that would be over > over over engineering again... and that would mean restricting > random_zipfian parameters to near-constants, which would require some > explaining, but maybe it is an option. I guess that in the paper's > original context, the parameters (s & n) are known before the bench is > started, so that the needed values are computed offline once and for all. > >> I think leaving it in there is just a foot-gun: we'd be a lot better >> off throwing an error that tells people to use some other distribution. > > When s < 1, the startup cost is indeed a pain. However, it is a pain > prescribed by a Turing Award. > Then we shouldn't have it, probably. Or we should at least implement a proper startup phase, so that the costly precomputation is not included in the test interval. >> Or if we do leave it in there, we for sure have to have documentation >> that *actually* explains how to use it, which this patch still doesn't. 
> > I'm not sure what explanation there could be about how to use it: one > calls the function to obtain pseudo-random integers with the desired > distribution? > Well, I'd argue the current description "performance is poor" is not particularly clear. >> There's nothing suggesting that you'd better not use a large number of >> different (n,s) combinations. > > Indeed, there is no caveat about this point, as noted above. > > Please find an updated patch for the documentation, pointing out the > existence of the cache and advice not to overdo it. > > It does not solve the underlying problem which raised your complaint, > but at least it is documented. > I think you forgot to attach the patch ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>>> What is the point of that, and if there is a point, why is it nowhere >>>>> mentioned in pgbench.sgml? >>> >>> The attached patch simplifies the code by erroring on cache overflow, >>> instead of the LRU replacement strategy and unhelpful final report. >>> The above lines are removed. > > Eh? Do I understand correctly that pgbench might start failing after > some random amount of time, instead of reporting the overflow at the > end? Indeed, that is what this patch would induce, although very probably within a *short* random amount of time. > I'm not sure that's really an improvement ... Depends. If the cache is full it means repeating the possibly expensive constant computations, which looks like a very bad idea for the user anyway. > Why is the cache fixed-size at all? The cache can diverge and the search is linear, so it does not seem a good idea to let it grow very large: \set size random(100000, 1000000) \set i random_zipfian(1, :size, ...) The implementation only makes some sense if there are very few values (param & size pairs with param < 1) used in the whole script. Tom is complaining of over-engineering, and he has a point, so I'm trying to simplify (eg dropping LRU and erroring) for cases where the feature is not really appropriate anyway. -- Fabien.
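[Editor's note] To make the shape of the problem concrete, here is a simplified sketch of such a fixed-size parameter cache with a linear scan. The names and sizes are illustrative, not the actual pgbench structures; the point is that every distinct (n, s) pair takes a slot, so a script with a varying n, like the one above, fills the array quickly:

```c
#include <stddef.h>

#define ZIPF_CACHE_SIZE 15		/* fixed capacity, hence the overflow issue */

typedef struct
{
	long		n;				/* range of the distribution */
	double		s;				/* skew parameter */
	double		harmonicn;		/* precomputed H(n, s) */
} ZipfCell;

typedef struct
{
	int			nb_cells;		/* number of occupied cells */
	ZipfCell	cells[ZIPF_CACHE_SIZE];
} ZipfCache;

/* O(cache size) linear scan; returns NULL when (n, s) is absent. */
static ZipfCell *
zipf_cache_lookup(ZipfCache *cache, long n, double s)
{
	int			i;

	for (i = 0; i < cache->nb_cells; i++)
		if (cache->cells[i].n == n && cache->cells[i].s == s)
			return &cache->cells[i];
	return NULL;
}

/* Returns NULL on overflow: the caller must then replace an entry or error. */
static ZipfCell *
zipf_cache_insert(ZipfCache *cache, long n, double s, double harmonicn)
{
	ZipfCell   *cell;

	if (cache->nb_cells >= ZIPF_CACHE_SIZE)
		return NULL;
	cell = &cache->cells[cache->nb_cells++];
	cell->n = n;
	cell->s = s;
	cell->harmonicn = harmonicn;
	return cell;
}
```

With a bounded array there are only two choices on the 16th distinct pair: evict (the LRU behavior Tom disliked) or fail (what the patch does), which is the trade-off being argued here.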
Hello Tomas, >>> What would a user do with this information, and how would they know >>> what to do? >> >> Sure, but it was unclear what to do. Extending the cache to avoid that >> would look like over-engineering. > > That seems like a rather strange argument. What exactly is so complex about > resizing the cache that it qualifies as over-engineering? Because the cache size can diverge on a bad script, and the search is currently linear. Even with a log search it does not look attractive. > If the choice is between reporting the failure to the user, and > addressing the failure, surely the latter would be the default option? > Particularly if the user can't really address the issue easily > (recompiling pgbench is not a very practical solution). > >>> I remain of the opinion that we ought to simply rip out support for >>> zipfian with s < 1. > > +1 to that If this is done, some people with zipfian distributions that currently work might be unhappy. >> Some people really want zipfian because it reflects their data access >> pattern, possibly with s < 1. >> >> We cannot help it: real life seems zipfianly distributed:-) > > Sure. But that hardly means we ought to provide algorithms that we know > are not suitable for benchmarking tools, which I'd argue is this case. Hmmm. It really depends on the values actually used. > Also, we have two algorithms for generating zipfian distributions. Why > wouldn't it be sufficient to keep just one of them? Because the two methods work for different values of the parameter, so it depends on the distribution which is targeted. If you want s < 1, the exclusion method does not help (precisely, the real "infinite" zipfian distribution is not mathematically defined when s < 1 because the underlying sum diverges. Having s < 1 can only be done on a finite set). >>> It's not useful for benchmarking purposes to have a random-number >>> function with such poor computational properties. 
>> >> This is mostly a startup cost, the generation cost when a bench is >> running is reasonable. How to best implement the precomputation is an >> open question. > > ... which means it's not a startup cost. IMHO this simply shows pgbench > does not have the necessary infrastructure to provide this feature. Well, a quick one has been proposed with a cache. Now I can imagine alternatives, but I'm wondering how much it is worth it. Eg, restrict random_zipfian to more or less constant parameters when s < 1, so that a partial evaluation could collect the needed pairs and perform the pre-computations before the bench is started. Now, that looks like over-engineering, and then someone would complain of the restrictions. >> As a reviewer I was not thrilled by the cache stuff, but I had no better >> idea that would not fall under "over-over engineering" or the like. >> >> Maybe it could error out and say "recompile me", but then someone >> would have said "that is unacceptable". >> >> Maybe it could auto extend the cache, but that is still more unnecessary >> over-engineering, IMHO. > > I'm puzzled. Why would that be over-engineering? As stated above, cache size divergence and O(n) search. Even a log2(n) search would be bad, IMO. >>> I think leaving it in there is just a foot-gun: we'd be a lot better >>> off throwing an error that tells people to use some other distribution. >> >> When s < 1, the startup cost is indeed a pain. However, it is a pain >> prescribed by a Turing Award. > > Then we shouldn't have it, probably. Or we should at least implement a > proper startup phase, so that the costly precomputation is not included > in the test interval. That can be done, but I'm not sure of a very elegant design. And I was just the reviewer on this one. >>> Or if we do leave it in there, we for sure have to have documentation >>> that *actually* explains how to use it, which this patch still doesn't. 
>> >> I'm not sure what explanation there could be about how to use it: one >> calls the function to obtain pseudo-random integers with the desired >> distribution? > > Well, I'd argue the current description "performance is poor" is not > particularly clear. > >>> There's nothing suggesting that you'd better not use a large number of >>> different (n,s) combinations. >> >> Indeed, there is no caveat about this point, as noted above. >>> Please find an updated patch for the documentation, pointing out the >> existence of the cache and advice not to overdo it. >>> It does not solve the underlying problem which raised your complaint, >> but at least it is documented. >> > > I think you forgot to attach the patch ... Indeed, this is becoming a habit:-( Attached. Hopefully. -- Fabien.
Attachment
Hello Tom & Tomas, >> If the choice is between reporting the failure to the user, and >> addressing the failure, surely the latter would be the default option? >> Particularly if the user can't really address the issue easily >> (recompiling pgbench is not a very practical solution). >> >>>> I remain of the opinion that we ought to simply rip out support for >>>> zipfian with s < 1. >> >> +1 to that > > If this is done, some people with zipfian distributions that currently > work might be unhappy. After giving it some thought, I think that this cannot be fully fixed for 12. The attached patch removes the code for param in (0, 1), and slightly improves the documentation about the performance, if you want to proceed. For s > 1, there is no such constraint, and it works fine, there is no reason to remove it. Given the constraint of Jim Gray's approximated method for s in (0, 1), which really does zipfian for the first two integers and then uses an exponential approximation, the only approach is that the parameters must be computed in a partial eval preparation phase before the bench code is run. This means that only (mostly) constants would be allowed as parameters when s is in (0, 1), but I think that this is acceptable because anyway the method fundamentally requires it. I think that it can be implemented reasonably well (meaning not too much code), but would require a few rounds of review if someone implements it (for a reminder, I was only the reviewer on this one). An added benefit would be that the parameter cache could be shared between threads, which would be a good thing. 
The other attached patch illustrates what I call poor performance for stupid parameters (no point in doing zipfian on 2 integers…) : ./pgbench -T 3 -D n=2 -D s=1.01 -f zipf_perf.sql # 46981 tps ./pgbench -T 3 -D n=2 -D s=1.001 -f zipf_perf.sql # 6187 tps ./pgbench -T 3 -D n=2 -D s=1.0001 -f zipf_perf.sql # 710 tps ./pgbench -T 3 -D n=100 -D s=1.01 -f zipf_perf.sql # 142910 tps ./pgbench -T 3 -D n=100 -D s=1.001 -f zipf_perf.sql # 21214 tps ./pgbench -T 3 -D n=100 -D s=1.0001 -f zipf_perf.sql # 2466 tps ./pgbench -T 3 -D n=1000000 -D s=1.01 -f zipf_perf.sql # 376453 tps ./pgbench -T 3 -D n=1000000 -D s=1.001 -f zipf_perf.sql # 57441 tps ./pgbench -T 3 -D n=1000000 -D s=1.0001 -f zipf_perf.sql # 6780 tps Maybe the implementation could impose that s is at least 1.001 to avoid the lower performance? -- Fabien.
Attachment
Fabien COELHO <coelho@cri.ensmp.fr> writes: >>>> I remain of the opinion that we ought to simply rip out support for >>>> zipfian with s < 1. >>> +1 to that >> If this is done, some people with zipfian distributions that currently >> work might be unhappy. > After giving it some thought, I think that this cannot be fully fixed for > 12. Just to clarify --- my complaint about "over engineering" referred to the fact that a cache exists at all; fooling around with the overflow behavior isn't really going to answer that. The bigger picture here is that a benchmarking tool that contains its own performance surprises is not a nice thing to have. It's not very hard to imagine somebody wasting a lot of time trying to explain weird results, only to find out that the cause is unstable performance of random_zipfian. Or worse, people might draw totally incorrect conclusions if they fail to drill down enough to realize that there's a problem in pgbench rather than on the server side. > Given the constraint of Jim Gray's approximated method for s in (0, 1), > which really does zipfian for the first two integers and then uses an > exponential approximation, the only approach is that the parameters must > be computed in a partial eval preparation phase before the bench code is > run. This means that only (mostly) constants would be allowed as > parameters when s is in (0, 1), but I think that this is acceptable > because anyway the method fundamentally requires it. Yeah, if we could store all the required harmonic numbers before the test run timing starts, that would address the concern about stable performance. But I have to wonder whether zipfian with s < 1 is useful enough to justify so much code. > The other attached patch illustrates what I call poor performance > for stupid parameters (no point in doing zipfian on 2 integers…) : > ... > Maybe the implementation could impose that s is at least 1.001 to avoid > the lower performance? I was wondering about that too. 
It seems like it'd be a wise idea to further constrain s and/or n to ensure that the s > 1 code path doesn't do anything too awful ... unless someone comes up with a better implementation that has stable performance without such constraints. regards, tom lane
Hello Tom, >>> If this is done, some people with zipfian distributions that currently >>> work might be unhappy. > >> After giving it some thought, I think that this cannot be fully fixed for >> 12. > > Just to clarify --- my complaint about "over engineering" referred to > the fact that a cache exists at all; fooling around with the overflow > behavior isn't really going to answer that. Having some cache really makes sense for s < 1, AFAICS. > The bigger picture here is that a benchmarking tool that contains its > own performance surprises is not a nice thing to have. Hmmm. It seems that it depends. Sometimes self-inflicted wounds are the soldier's responsibility, sometimes they must be prevented. > It's not very hard to imagine somebody wasting a lot of time trying to > explain weird results, only to find out that the cause is unstable > performance of random_zipfian. Or worse, people might draw totally > incorrect conclusions if they fail to drill down enough to realize that > there's a problem in pgbench rather than on the server side. Yep, benchmarking is tougher than it looks: it is very easy to get it wrong without knowing, whatever tool you use. >> Given the constraint of Jim Gray's approximated method for s in (0, 1), >> which really does zipfian for the first two integers and then uses an >> exponential approximation, the only approach is that the parameters must >> be computed in a partial eval preparation phase before the bench code is >> run. This means that only (mostly) constants would be allowed as >> parameters when s is in (0, 1), but I think that this is acceptable >> because anyway the method fundamentally requires it. > > Yeah, if we could store all the required harmonic numbers before the > test run timing starts, that would address the concern about stable > performance. But I have to wonder whether zipfian with s < 1 is useful > enough to justify so much code. I do not know about that. 
I do not think that Jim Gray chose to invent an approximated method for s < 1 because he thought it was fun. I think he did it because his benchmark data required it. If you need it, you need it. If you do not need it, you do not care. >> The other attached patch illustrates what I call poor performance >> for stupid parameters (no point in doing zipfian on 2 integers…) : >> ... >> Maybe the implementation could impose that s is at least 1.001 to avoid >> the lower performance? > > I was wondering about that too. It seems like it'd be a wise idea to > further constrain s and/or n to ensure that the s > 1 code path doesn't do > anything too awful ... Yep. The attached version enforces s >= 1.001, which avoids the worst cost of iterating, according to my small tests. > unless someone comes up with a better implementation that has stable > performance without such constraints. Hmmm… -- Fabien.
Attachment
Fabien COELHO <coelho@cri.ensmp.fr> writes: >> I was wondering about that too. It seems like it'd be a wise idea to >> further constrain s and/or n to ensure that the s > 1 code path doesn't do >> anything too awful ... > Yep. The attached version enforces s >= 1.001, which avoids the worst cost > of iterating, according to my small tests. Seems reasonable. Pushed with minor documentation editing. regards, tom lane
On 2019-Apr-01, Tom Lane wrote: > Fabien COELHO <coelho@cri.ensmp.fr> writes: > >> I was wondering about that too. It seems like it'd be a wise idea to > >> further constrain s and/or n to ensure that the s > 1 code path doesn't do > >> anything too awful ... > > > Yep. The attached version enforces s >= 1.001, which avoids the worst cost > > of iterating, according to my small tests. > > Seems reasonable. Pushed with minor documentation editing. Ah, so now we can get rid of the TState * being passed around separately for expression execution, too? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > On 2019-Apr-01, Tom Lane wrote: >> Seems reasonable. Pushed with minor documentation editing. > Ah, so now we can get rid of the TState * being passed around > separately for expression execution, too? I didn't really look for follow-on simplifications, but if there are some to be made, great. (That would be further evidence for my opinion that this feature was overcomplicated ...) regards, tom lane
> Ah, so now we can get rid of the TState * being passed around > separately for expression execution, too? Indeed. I would have thought that the compiler would warn if it were unused, but because it is passed along in the recursion it counts as used, even though it serves no purpose. Ok, maybe the sentence above is not very clear. Attached a patch which simplifies further. -- Fabien.
Attachment
Fabien COELHO <coelho@cri.ensmp.fr> writes: >> Ah, so now we can get rid of the TState * being passed around >> separately for expression execution, too? > Indeed. Indeed. Pushed with some additional cleanup. regards, tom lane