Thread: PSA: we lack TAP test coverage on NetBSD and OpenBSD
Although we've got a few NetBSD and OpenBSD buildfarm critters, none of
them are doing --enable-tap-tests.  If they were, we'd have noticed the
pgbench regression tests falling over:

not ok 3 - pgbench option error: bad option stderr /(?^:(unrecognized|illegal) option)/

#   Failed test 'pgbench option error: bad option stderr /(?^:(unrecognized|illegal) option)/'
#   at t/002_pgbench_no_server.pl line 190.
#                   'pgbench: unknown option -- bad-option
# Try "pgbench --help" for more information.
# '
#     doesn't match '(?^:(unrecognized|illegal) option)'

Sure enough, manual testing confirms that on these platforms that error
message is spelled differently:

$ pgbench --bad-option
pgbench: unknown option -- bad-option
Try "pgbench --help" for more information.

I am, TBH, inclined to fix this by removing that test case rather than
teaching it another spelling to accept.  I think it's very hard to make
the case that tests like this one are anything but a waste of developer
and buildfarm time.  When they are also a portability hazard, it's time
to cut our losses.  (I also note that this test has caused us problems
before, cf 869aa40a2 and 933851033.)

			regards, tom lane
Hello Tom,

> Although we've got a few NetBSD and OpenBSD buildfarm critters,
> none of them are doing --enable-tap-tests.  If they were, we'd
> have noticed the pgbench regression tests falling over:
>
> [...]
>
> I am, TBH, inclined to fix this by removing that test case rather
> than teaching it another spelling to accept.  I think it's very
> hard to make the case that tests like this one are anything but
> a waste of developer and buildfarm time.  When they are also a
> portability hazard, it's time to cut our losses.  (I also note
> that this test has caused us problems before, cf 869aa40a2 and
> 933851033.)

I'd rather keep it by simply adding the "|unknown" alternative.  30 years
of programming have taught me that testing limit & error cases is useful,
although you never know when it will be proven so.

Client application coverage is currently abysmal, especially for "psql"
despite the many scripts used for testing (39% of lines, 42% of
functions!); pgbench is under 90%.  Generally we really need more tests,
not less.  TAP tests are a good compromise because they are not always
run, and ISTM that sometimes (i.e. when you asked for them) is enough.

I agree that some tests can be useless, but I do not think that it applies
to this one.  This test also checks that under a bad option pgbench stops
with an appropriate 1 exit status.  Recently a patch updated the exit
status of pgbench in many cases to distinguish between different kinds of
errors, so having non-regression coverage in this area was shown to be a
good idea.  Moreover, knowing that the exit status on getopt errors is
consistent across platforms has value, and knowing that there is some
variability is not uninteresting.

-- 
Fabien.
On 2019-01-17 06:04, Tom Lane wrote:

> Although we've got a few NetBSD and OpenBSD buildfarm critters,
> none of them are doing --enable-tap-tests.  If they were, we'd
> have noticed the pgbench regression tests falling over:

For what it's worth I've enabled tap-tests for my OpenBSD 5.9 (curculio)
and NetBSD 7 (sidewinder) animals now.

/Mikael
Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes:
> On 2019-01-17 06:04, Tom Lane wrote:
>> Although we've got a few NetBSD and OpenBSD buildfarm critters,
>> none of them are doing --enable-tap-tests.  If they were, we'd
>> have noticed the pgbench regression tests falling over:

> For what it's worth I've enabled tap-tests for my OpenBSD 5.9 (curculio)
> and NetBSD 7 (sidewinder) animals now.

Oh, thanks!  I'm guessing they'll fail their next runs, but I'll wait to
see confirmation of that before I do anything about the test bug.

			regards, tom lane
On 2019-01-17 22:16, Tom Lane wrote:

>> For what it's worth I've enabled tap-tests for my OpenBSD 5.9 (curculio)
>> and NetBSD 7 (sidewinder) animals now.
>
> Oh, thanks!  I'm guessing they'll fail their next runs, but I'll
> wait to see confirmation of that before I do anything about the
> test bug.

They should run the next time within the hour or hour and a half so I
guess we will find out soon enough.

/Mikael
On 2019-01-17 22:19, Mikael Kjellström wrote:
> On 2019-01-17 22:16, Tom Lane wrote:
>
>>> For what it's worth I've enabled tap-tests for my OpenBSD 5.9 (curculio)
>>> and NetBSD 7 (sidewinder) animals now.
>>
>> Oh, thanks!  I'm guessing they'll fail their next runs, but I'll
>> wait to see confirmation of that before I do anything about the
>> test bug.
>
> They should run the next time within the hour or hour and a half so I
> guess we will find out soon enough.

Hm, that didn't go so well.  It says:

configure: error: Additional Perl modules are required to run TAP tests

so how do I find out which Perl modules are required?

/Mikael
Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes:
> It says:
> configure: error: Additional Perl modules are required to run TAP tests
> so how do I find out which Perl modules are required?

If you look into the configure log it should say just above that, but
I'm betting you just need p5-IPC-Run.

			regards, tom lane
On 2019-01-17 22:42, Tom Lane wrote:

> Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes:
>> It says:
>> configure: error: Additional Perl modules are required to run TAP tests
>> so how do I find out which Perl modules are required?
>
> If you look into the configure log it should say just above that,
> but I'm betting you just need p5-IPC-Run.

Yes, it seems to be IPC::Run that is missing.

I've installed it manually through CPAN.

Let's see if it works better this time.

/Mikael
On 2019-01-17 22:47, Mikael Kjellström wrote:

> On 2019-01-17 22:42, Tom Lane wrote:
>
>> Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes:
>>> It says:
>>> configure: error: Additional Perl modules are required to run TAP tests
>>> so how do I find out which Perl modules are required?
>>
>> If you look into the configure log it should say just above that,
>> but I'm betting you just need p5-IPC-Run.
>
> Yes, it seems to be IPC::Run that is missing.
>
> I've installed it manually through CPAN.
>
> Let's see if it works better this time.

Hmmm, nope:

================== pgsql.build/src/bin/pg_ctl/tmp_check/log/003_promote_standby.log ===================
2019-01-17 23:09:20.343 CET [9129] LOG:  listening on Unix socket "/tmp/g66P1fpMFK/.s.PGSQL.64980"
2019-01-17 23:09:20.343 CET [9129] FATAL:  could not create semaphores: No space left on device
2019-01-17 23:09:20.343 CET [9129] DETAIL:  Failed system call was semget(64980002, 17, 03600).
2019-01-17 23:09:20.343 CET [9129] HINT:  This error does *not* mean that you have run out of disk space.  It occurs when either the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be exceeded.  You need to raise the respective kernel parameter.  Alternatively, reduce PostgreSQL's consumption of semaphores by reducing its max_connections parameter.
	The PostgreSQL documentation contains more information about configuring your system for PostgreSQL.
2019-01-17 23:09:20.345 CET [9129] LOG:  database system is shut down

Will try and increase SEMMNI and see if that helps.

/Mikael
Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes:
>> Let's see if it works better this time.

> Hmmm, nope:

> 2019-01-17 23:09:20.343 CET [9129] FATAL:  could not create semaphores:
> No space left on device

Yeah, you might've been able to get by with OpenBSD/NetBSD's default
semaphore settings before, but they really only let one postmaster run
at a time; and the TAP tests want to start more than one.  For me it
seems to work to append this to /etc/sysctl.conf:

kern.seminfo.semmni=100
kern.seminfo.semmns=2000

and either reboot, or install those settings manually with sysctl.

			regards, tom lane
On 2019-01-17 23:23, Tom Lane wrote:

> Yeah, you might've been able to get by with OpenBSD/NetBSD's default
> semaphore settings before, but they really only let one postmaster
> run at a time; and the TAP tests want to start more than one.
> For me it seems to work to append this to /etc/sysctl.conf:
>
> kern.seminfo.semmni=100
> kern.seminfo.semmns=2000
>
> and either reboot, or install those settings manually with sysctl.

Looks that way.

I've increased the values and rebooted the machines.

Let's hope 5th time is the charm :-)

/Mikael
On 2019-01-17 23:37, Mikael Kjellström wrote:

> On 2019-01-17 23:23, Tom Lane wrote:
>
>> Yeah, you might've been able to get by with OpenBSD/NetBSD's default
>> semaphore settings before, but they really only let one postmaster
>> run at a time; and the TAP tests want to start more than one.
>> For me it seems to work to append this to /etc/sysctl.conf:
>>
>> kern.seminfo.semmni=100
>> kern.seminfo.semmns=2000
>>
>> and either reboot, or install those settings manually with sysctl.
>
> Looks that way.
>
> I've increased the values and rebooted the machines.
>
> Let's hope 5th time is the charm :-)

Nope!

But it looks like in NetBSD the options are called:

netbsd7-pgbf# sysctl -a | grep semmn
kern.ipc.semmni = 10
kern.ipc.semmns = 60
kern.ipc.semmnu = 30

so I will try and set that in /etc/sysctl.conf and reboot and see what
happens.

/Mikael
On 2019-01-17 23:54, Mikael Kjellström wrote:

> But it looks like in NetBSD the options are called:
>
> netbsd7-pgbf# sysctl -a | grep semmn
> kern.ipc.semmni = 10
> kern.ipc.semmns = 60
> kern.ipc.semmnu = 30
>
> so I will try and set that in /etc/sysctl.conf and reboot and see what
> happens.

That seems to have done the trick:

netbsd7-pgbf# sysctl -a | grep semmn
kern.ipc.semmni = 100
kern.ipc.semmns = 2000
kern.ipc.semmnu = 30

I just started another run on sidewinder (NetBSD 7), let's see how that
goes.

But the OpenBSD machine went further and now fails on:

pgbenchCheck

instead.  Is that the failure you expected to get?

/Mikael
On 2019-01-18 00:00, Mikael Kjellström wrote:

> I just started another run on sidewinder (NetBSD 7), let's see how that
> goes.
>
> but the OpenBSD machine went further and now fails on:
>
> pgbenchCheck instead.
>
> Is that the failure you expected to get?

And now also the NetBSD machine failed on pgbenchCheck.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2019-01-17%2022%3A57%3A14

Should I leave it as it is for now?

/Mikael
Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes:
>> But it looks like in NetBSD the options are called:

Sorry about that, I copied-and-pasted from the openbsd machine I was
looking at without remembering that netbsd is just a shade different.

> but the OpenBSD machine went further and now fails on:
> pgbenchCheck instead.
> Is that the failure you expected to get?

Yup, sure is.  Thanks!

			regards, tom lane
Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes:
> And now also the NetBSD machine failed on pgbenchCheck.

Indeed, as expected.

> Should I leave it as it is for now?

Please.  I'll push a fix for the broken test case in a bit --- I just
wanted to confirm that somebody else's machines agreed that it's broken.

			regards, tom lane
On 2019-01-18 00:31, Tom Lane wrote:

> Mikael Kjellström <mikael.kjellstrom@mksoft.nu> writes:
>> And now also the NetBSD machine failed on pgbenchCheck.
>
> Indeed, as expected.

Ok.

>> Should I leave it as it is for now?
>
> Please.  I'll push a fix for the broken test case in a bit --- I
> just wanted to confirm that somebody else's machines agreed that
> it's broken.

Ok, I will leave it on then.

/Mikael
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> I am, TBH, inclined to fix this by removing that test case rather
>> than teaching it another spelling to accept.  I think it's very
>> hard to make the case that tests like this one are anything but
>> a waste of developer and buildfarm time.  When they are also a
>> portability hazard, it's time to cut our losses.  (I also note
>> that this test has caused us problems before, cf 869aa40a2 and
>> 933851033.)

> I'd rather keep it by simply adding the "|unknown" alternative.  30 years
> of programming have taught me that testing limit & error cases is useful,
> although you never know when it will be proven so.

Sorry, I don't buy this line of argument.  Reasonable test design requires
making cost/benefit tradeoffs: the cost to run the test over and over,
and the cost to maintain the test itself (e.g. fix portability issues in
it) have to be balanced against the probability of it finding something
useful.  I judge that the chance of this particular test finding something
is small, and I've had quite enough of the maintenance costs.

Just to point up that we're still not clearly done with the maintenance
costs of supposing that we know how every version of getopt_long will
spell this error message, I note that my Linux box seems to have two
variants of it:

$ pgbench -z
pgbench: invalid option -- 'z'
Try "pgbench --help" for more information.
$ pgbench --z
pgbench: unrecognized option '--z'
Try "pgbench --help" for more information.

of which the "invalid" alternative is also not in our list right now.
Who's to say how many more versions of getopt_long are out there,
or what the maintainers thereof might do in the future?

> I agree that some tests can be useless, but I do not think that it applies
> to this one.  This test also checks that under a bad option pgbench stops
> with an appropriate 1 exit status.

It's possible that it's worth the trouble to check for exit status 1,
but I entirely fail to see the point of checking exactly what is the
spelling of a message that is issued by code not under our control.

Looking closer at the test case:

	[
		'bad option',
		'-h home -p 5432 -U calvin -d --bad-option',
		[ qr{(unrecognized|illegal) option}, qr{--help.*more information} ]
	],

ISTM that just removing the first qr{} pattern, and checking only that
we get the text that *is* under our control, is a reasonable compromise
here.

			regards, tom lane
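To make the split concrete, here is a bare-bones sketch (hypothetical, not
pgbench's actual code): the wording of the unrecognized-option complaint
comes from whatever libc supplies getopt_long(), while only the final hint
and the exit status are chosen by the application.

/* sketch.c -- hypothetical minimal example, not pgbench itself */
#include <stdio.h>
#include <getopt.h>

int
main(int argc, char **argv)
{
	static const struct option long_opts[] = {
		{NULL, 0, NULL, 0}
	};
	int			c;

	while ((c = getopt_long(argc, argv, "h:p:U:d", long_opts, NULL)) != -1)
	{
		if (c == '?')
		{
			/* getopt_long() has already printed its own complaint to stderr,
			 * in whatever wording this platform's C library chose */
			fprintf(stderr, "Try \"%s --help\" for more information.\n", argv[0]);
			return 1;			/* the exit status, by contrast, is under our control */
		}
	}
	return 0;
}

Running such a program with --bad-option yields the libc line followed by
the hint, i.e. exactly the part that would still be checked after dropping
the first qr{} pattern.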
On Thu, Jan 17, 2019 at 07:21:08PM -0500, Tom Lane wrote:
> Sorry, I don't buy this line of argument.  Reasonable test design requires
> making cost/benefit tradeoffs: the cost to run the test over and over,
> and the cost to maintain the test itself (e.g. fix portability issues in
> it) have to be balanced against the probability of it finding something
> useful.  I judge that the chance of this particular test finding something
> is small, and I've had quite enough of the maintenance costs.

Yes, I agree with Tom's line of thought here.  It seems to me that just
dropping this part of the test is fine.
-- 
Michael
BTW, if you're wondering why curculio is still failing the pgbench test,
all is explained here:

https://man.openbsd.org/srandom

Or at least most is explained there.  While curculio is unsurprisingly
failing all four seeded_random tests, when I try it locally on an
OpenBSD 6.4 installation, only the uniform, exponential, and gaussian
cases reliably "fail".  zipfian usually doesn't.

It looks like the zipfian code almost always produces 4000 regardless of
the seed value, though occasionally it produces 4001.  Bad parameters
for that algorithm, perhaps?

			regards, tom lane
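The behavior described in that man page can be reproduced with a tiny
stand-alone sketch (a hypothetical test program, not anything in the tree):
on platforms with the usual POSIX semantics the two lines print the same
number, while on OpenBSD, where srandom() ignores its argument unless
srandom_deterministic() is used, they generally differ.

#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
	/* seed twice with the same value and compare the first draw */
	srandom(42);
	printf("%ld\n", random());
	srandom(42);
	printf("%ld\n", random());	/* equal to the first line only if srandom() honors the seed */
	return 0;
}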
>> I'd rather keep it by simply adding the "|unknown" alternative.  30 years
>> of programming have taught me that testing limit & error cases is useful,
>> although you never know when it will be proven so.
>
> Sorry, I don't buy this line of argument.
> Reasonable test design requires making cost/benefit tradeoffs: the cost
> to run the test over and over, and the cost to maintain the test itself
> (e.g. fix portability issues in it) have to be balanced against the
> probability of it finding something useful.  I judge that the chance of
> this particular test finding something is small, and I've had quite
> enough of the maintenance costs.
>
> Just to point up that we're still not clearly done with the maintenance
> costs of supposing that we know how every version of getopt_long will
> spell this error message, I note that my Linux box seems to have two
> variants of it:
>
> $ pgbench -z
> pgbench: invalid option -- 'z'
> Try "pgbench --help" for more information.
> $ pgbench --z
> pgbench: unrecognized option '--z'
> Try "pgbench --help" for more information.
>
> of which the "invalid" alternative is also not in our list right now.
> Who's to say how many more versions of getopt_long are out there,
> or what the maintainers thereof might do in the future?

ISTM that the getopt implementers' imagination should run out in the
end:-) invalid, unknown, unrecognized, unexpected, incorrect...  Ok,
English has many words:-)

>> I agree that some tests can be useless, but I do not think that it applies
>> to this one.  This test also checks that under a bad option pgbench stops
>> with an appropriate 1 exit status.
>
> It's possible that it's worth the trouble to check for exit status 1,
> but I entirely fail to see the point of checking exactly what is the
> spelling of a message that is issued by code not under our control.
>
> Looking closer at the test case:
>
> 	[
> 		'bad option',
> 		'-h home -p 5432 -U calvin -d --bad-option',
> 		[ qr{(unrecognized|illegal) option}, qr{--help.*more information} ]
> 	],
>
> ISTM that just removing the first qr{} pattern, and checking only that
> we get the text that *is* under our control, is a reasonable compromise
> here.

Possibly.  I'd be a little happier if it checks for a non-empty error
message, eg qr{...} or qr{option} (the message should say something about
the option).

-- 
Fabien.
> BTW, if you're wondering why curculio is still failing the pgbench
> test,

Hmmm, that is interesting!  It shows that at least some TAP tests are
useful.

> all is explained here:
>
> https://man.openbsd.org/srandom
>
> Or at least most is explained there.

Yep.  They try to be more serious than other systems about PRNG, which is
not bad in itself.

> While curculio is unsurprisingly failing all four seeded_random tests,
> when I try it locally on an OpenBSD 6.4 installation, only the uniform,
> exponential, and gaussian cases reliably "fail".  zipfian usually
> doesn't.

> It looks like the zipfian code almost always produces 4000 regardless of
> the seed value, though occasionally it produces 4001.  Bad parameters
> for that algorithm, perhaps?

Welcome to the zipfian highly skewed distribution!  I'll check the
parameters used in the test, maybe it should use something less extreme.

srandom is only used for initializing the state of various internal
rand48 LCG PRNGs in pgbench.

Maybe on OpenBSD pg should switch srandom to srandom_deterministic?

-- 
Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> all is explained here:
>> https://man.openbsd.org/srandom
>> Or at least most is explained there.

> Yep.  They try to be more serious than other systems about PRNG, which is
> not bad in itself.

> Maybe on OpenBSD pg should switch srandom to srandom_deterministic?

Dunno.  I'm fairly annoyed by their idea that they're smarter than POSIX.
However, for most of our uses of srandom, this behavior isn't awful;
it's only pgbench that has an expectation that the platform random()
can be made to behave deterministically.  And TBH I think that's just
an expectation that's going to bite us.

I'd suggest that maybe we should get rid of the use of both random()
and srandom() in pgbench, and go over to letting set_random_seed()
fill the pg_erand48 state directly.  In the integer-seed case you
could use something equivalent to pg_srand48.  (In the other cases
probably you could do better, certainly the strong-random case could
just fill all 6 bytes directly.)  That would get us to a place where
the behavior of --random-seed=N is not only deterministic but
platform-independent, which seems like an improvement.

			regards, tom lane
>> Maybe on OpenBSD pg should switch srandom to srandom_deterministic?
>
> Dunno.  I'm fairly annoyed by their idea that they're smarter than POSIX.
> However, for most of our uses of srandom, this behavior isn't awful;
> it's only pgbench that has an expectation that the platform random()
> can be made to behave deterministically.  And TBH I think that's just
> an expectation that's going to bite us.
>
> I'd suggest that maybe we should get rid of the use of both random()
> and srandom() in pgbench, and go over to letting set_random_seed()
> fill the pg_erand48 state directly.  In the integer-seed case you
> could use something equivalent to pg_srand48.  (In the other cases
> probably you could do better, certainly the strong-random case could
> just fill all 6 bytes directly.)  That would get us to a place where
> the behavior of --random-seed=N is not only deterministic but
> platform-independent, which seems like an improvement.

That's a point.  Although I'm not fond of rand48, indeed having something
platform-independent for testing makes definite sense.

I'll look into it.

-- 
Fabien.
Hello Tom,

>>> Maybe on OpenBSD pg should switch srandom to srandom_deterministic?
>>
>> Dunno.  I'm fairly annoyed by their idea that they're smarter than POSIX.

Hmmm.  I'm afraid that is not that hard.

>> However, for most of our uses of srandom, this behavior isn't awful;
>> it's only pgbench that has an expectation that the platform random()
>> can be made to behave deterministically.  And TBH I think that's just
>> an expectation that's going to bite us.
>>
>> I'd suggest that maybe we should get rid of the use of both random()
>> and srandom() in pgbench, and go over to letting set_random_seed()
>> fill the pg_erand48 state directly.  In the integer-seed case you
>> could use something equivalent to pg_srand48.  (In the other cases
>> probably you could do better, certainly the strong-random case could
>> just fill all 6 bytes directly.)  That would get us to a place where
>> the behavior of --random-seed=N is not only deterministic but
>> platform-independent, which seems like an improvement.
>
> That's a point.  Although I'm not fond of rand48, indeed having something
> platform-independent for testing makes definite sense.
>
> I'll look into it.

Here is a POC which defines an internal interface for a PRNG, and use it
within pgbench, with several possible implementations which default to
rand48.

I must admit that I have a grudge against standard rand48:

 - it is a known poor PRNG which was designed at a time when LCGs were
   basically the only low-cost PRNGs available.  Newer designs were very
   recent when the standard was set.
 - it is an LCG, i.e. its low bits cycle quickly, so they should not be used.
 - so the 48-bit state size is relevant for generating 32-bit ints and floats.
 - however it is used to generate more bits...
 - the double function uses all 48 bits, whereas 52 need to be filled...
 - and it is used to generate integers, which means that for large ranges
   some values are inaccessible.
 - a 3 * 16-bit integer state looks silly on 32/64-bit architectures.
 - ...

Given that postgres needs doubles (52-bit mantissa) and possibly 64-bit
integers, IMO the internal state should be 64 bits as a bare minimum,
which anyway is also the minimal word size on 64-bit architectures, which
is what is encountered in practice.

-- 
Fabien.
Attachment
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>>> I'd suggest that maybe we should get rid of the use of both random()
>>> and srandom() in pgbench, and go over to letting set_random_seed()
>>> fill the pg_erand48 state directly.

> Here is a POC which defines an internal interface for a PRNG, and use it
> within pgbench, with several possible implementations which default to
> rand48.

I seriously dislike this patch.  pgbench's random support is quite
overengineered already IMO, and this proposes to add a whole batch of
new code and new APIs to fix a very small bug.

> I must admit that I have a grudge against standard rand48:

I think this is nonsense, particularly the claim that anything in PG
cares about the lowest-order bits of random doubles.  I'm aware that
there are applications where that does matter, but people aren't doing
high-precision weather simulations in pgbench.

BTW, did you look at the question of the range of zipfian?  I confirmed
here that as used in the test case, it's generating a range way smaller
than the other ones: repeating the insertion snippet 1000x produces
stats like this:

regression=# select seed,rand,min(val),max(val),count(distinct val) from seeded_random group by 1,2 order by 2,1;
    seed    |    rand     | min  | max  | count 
------------+-------------+------+------+-------
 1957482663 | exponential | 2000 | 2993 |   586
 1958556409 | exponential | 2000 | 2995 |   569
 1959867462 | exponential | 2000 | 2997 |   569
 1957482663 | gaussian    | 3009 | 3997 |   493
 1958556409 | gaussian    | 3027 | 3956 |   501
 1959867462 | gaussian    | 3018 | 3960 |   511
 1957482663 | uniform     | 1001 | 1999 |   625
 1958556409 | uniform     | 1000 | 1999 |   642
 1959867462 | uniform     | 1001 | 1999 |   630
 1957482663 | zipfian     | 4000 | 4081 |    19
 1958556409 | zipfian     | 4000 | 4022 |    18
 1959867462 | zipfian     | 4000 | 4156 |    23

I have no idea whether that indicates an actual bug, or just poor
choice of parameter in the test's call.  But the very small number
of distinct outputs is disheartening at least.

			regards, tom lane
Hello Tom,

>> Here is a POC which defines an internal interface for a PRNG, and use it
>> within pgbench, with several possible implementations which default to
>> rand48.
>
> I seriously dislike this patch.  pgbench's random support is quite
> overengineered already IMO, and this proposes to add a whole batch of
> new code and new APIs to fix a very small bug.

My intention is rather to discuss postgres' PRNG, in passing.  Full
success on this point:-)

>> I must admit that I have a grudge against standard rand48:
>
> I think this is nonsense, particularly the claim that anything in PG
> cares about the lowest-order bits of random doubles.  I'm aware that
> there are applications where that does matter, but people aren't doing
> high-precision weather simulations in pgbench.

Sure.  My point is not that it is an actual issue for pgbench, but as the
same PRNG is used more or less everywhere in postgres, I think that it
should be a good one rather than a known bad one.

Eg, about double:

  \set i debug(random(1, POWER(2,49)) % 2)

always returns 1 because of the 48-bit precision, i.e. the output is
never even.

  \set i debug(random(1, POWER(2,48)) % 2)

returns 0 1 0 1 0 1 0 1 0 1 0 1 0 1 ... because it is an LCG.

  \set i debug(random(1, POWER(2,48)) % 4)

cycles over (3 2 1 0)*

  \set i debug(random(1, power(2, 47)) % 4)

cycles over (0 0 1 1 2 2 3 3)*, and so on.

> BTW, did you look at the question of the range of zipfian?

Yep.

> I confirmed here that as used in the test case, it's generating a range
> way smaller than the other ones: repeating the insertion snippet 1000x
> produces stats like this: [...]

> I have no idea whether that indicates an actual bug, or just poor
> choice of parameter in the test's call.  But the very small number
> of distinct outputs is disheartening at least.

Zipf distribution is highly skewed, somehow close to an exponential.  To
reduce the decreasing probability the parameter must be closer to 1, eg
1.05 or something.  However as far as the test is concerned I do not see
this as a significant issue.  I was rather planning to submit a
documentation improvement to provide more precise hints about how the
distribution behaves depending on the parameter, and possibly reduce the
parameter used in the test in passing, but I see this as not very urgent.

-- 
Fabien.
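The alternating low bit is easy to see outside pgbench as well; here is a
minimal sketch of the bare rand48 recurrence (assuming only the standard
constants a = 0x5DEECE66D and c = 0xB; this is illustration, not pgbench
code):

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
	uint64_t	x = 12345;		/* arbitrary seed */
	const uint64_t a = UINT64_C(0x5DEECE66D);
	const uint64_t c = UINT64_C(0xB);
	const uint64_t mask = (UINT64_C(1) << 48) - 1;
	int			i;

	for (i = 0; i < 16; i++)
	{
		x = (a * x + c) & mask;			/* X' = a*X + c mod 2^48 */
		printf("%d ", (int) (x & 1));	/* prints 0 1 0 1 0 1 ... */
	}
	printf("\n");
	return 0;
}

Since a and c are both odd, bit 0 of the state flips on every step, and in
general bit k of such a power-of-two LCG cycles with period 2^(k+1), which
is the behavior the \set examples above surface.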
Hello Tom,

>> BTW, did you look at the question of the range of zipfian?
>
> Yep.
>
>> I confirmed here that as used in the test case, it's generating a range way
>> smaller than the other ones: repeating the insertion snippet 1000x produces
>> stats like this: [...]
>
>> I have no idea whether that indicates an actual bug, or just poor
>> choice of parameter in the test's call.  But the very small number
>> of distinct outputs is disheartening at least.
>
> Zipf distribution is highly skewed, somehow close to an exponential.  To
> reduce the decreasing probability the parameter must be closer to 1, eg 1.05
> or something.  However as far as the test is concerned I do not see this as a
> significant issue.  I was rather planning to submit a documentation
> improvement to provide more precise hints about how the distribution behaves
> depending on the parameter, and possibly reduce the parameter used in the
> test in passing, but I see this as not very urgent.

Attached are a documentation patch and scripts to check the distribution
(here for N = 10 & s = 2.5), the kind of thing I used when checking the
initial patch:

  sh> psql < zipf_init.sql
  sh> pgbench -t 500000 -c 2 -M prepared -f zipf_test.sql -P 1
  -- close to 29000 tps on my laptop
  sh> psql < zipf_end.sql

  ┌────┬────────┬────────────────────┬────────────────────────┐
  │ i  │  cnt   │       ratio        │        expected        │
  ├────┼────────┼────────────────────┼────────────────────────┤
  │  1 │ 756371 │                  • │                      • │
  │  2 │ 133431 │ 5.6686302283577280 │ 5.65685424949238019521 │
  │  3 │  48661 │ 2.7420521567579787 │     2.7556759606310754 │
  │  4 │  23677 │ 2.0552012501583816 │     2.0528009571186693 │
  │  5 │  13534 │ 1.7494458401063987 │     1.7469281074217107 │
  │  6 │   8773 │ 1.5426877920893651 │     1.5774409656148784 │
  │  7 │   5709 │ 1.5366964442108951 │     1.4701680288054869 │
  │  8 │   4247 │ 1.3442429950553332 │     1.3963036312159316 │
  │  9 │   3147 │ 1.3495392437241818 │     1.3423980299088363 │
  │ 10 │   2450 │ 1.2844897959183673 │     1.3013488313450120 │
  └────┴────────┴────────────────────┴────────────────────────┘

  sh> psql < zipf_clean.sql

Given these results, I do not think that it is useful to change
random_zipfian TAP test parameter from 2.5 to something else.

-- 
Fabien.
Attachment
Fabien COELHO <coelho@cri.ensmp.fr> writes:
> Given these results, I do not think that it is useful to change
> random_zipfian TAP test parameter from 2.5 to something else.

I'm not following this argument.  The test case is basically useless
for its intended purpose with that parameter, because it's highly
likely that the failure mode it's supposedly checking for will be
masked by the "random" function's tendency to spit out the same
value all the time.  We might as well drop zipfian from the test
altogether and save ourselves some buildfarm cycles.

			regards, tom lane
Hello Tom,

>> Given these results, I do not think that it is useful to change
>> random_zipfian TAP test parameter from 2.5 to something else.
>
> I'm not following this argument.  The test case is basically useless
> for its intended purpose with that parameter, because it's highly
> likely that the failure mode it's supposedly checking for will be
> masked by the "random" function's tendency to spit out the same
> value all the time.

The first value is taken about 75% of the time for N=1000 and s=2.5,
which means that a non-deterministic implementation would succeed about
0.75² ~ 56% of the time on that one.  Then there are other
lower-probability random successes.  ISTM that if a test fails every
three runs it would be detected, so the purpose of testing
random_zipfian determinism is somewhat served.

Also, the drawing procedure is less efficient when the parameter is
close to 1 because it is more likely to loop, and there are other values
tested, 0.5 and 1.3 (note that the code has two methods, depending on
whether the parameter is below or above 1), so I think that having
something different is better.

If you want something more drastic, using 1.5 instead of 2.5 would
reduce the probability of accidentally passing the test by chance to
about 20%, so it would fail 80% of the time.

> We might as well drop zipfian from the test altogether and save
> ourselves some buildfarm cycles.

All 4 random functions are tested together in the same run; removing a
particular one does not seem desirable to me.

-- 
Fabien.
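For reference, the estimate used here can be spelled out, under the
assumption that an accidental pass means two independent zipfian draws
happening to agree:

  P(accidental match) = sum_{k=1..N} p_k^2,  where p_k = (1/k^s) / sum_{j=1..N} (1/j^s)

With N = 1000 this comes out to roughly 0.58 for s = 2.5 (dominated by
p_1^2 ≈ 0.75² ≈ 0.56) and roughly 0.18-0.20 for s = 1.5, which is where
the figures quoted above come from.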
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>>> Here is a POC which defines an internal interface for a PRNG, and use it
>>> within pgbench, with several possible implementations which default to
>>> rand48.

>> I seriously dislike this patch.  pgbench's random support is quite
>> overengineered already IMO, and this proposes to add a whole batch of
>> new code and new APIs to fix a very small bug.

> My intention is rather to discuss postgres' PRNG, in passing.  Full success
> on this point:-)

Our immediate problem is to fix a portability failure, which we need to
back-patch into at least one released branch, ergo conservatism is
warranted.  I had in mind something more like the attached.

			regards, tom lane

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index b5c75ce..d5b543f 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -185,7 +185,7 @@ int64		latency_limit = 0;
 char	   *tablespace = NULL;
 char	   *index_tablespace = NULL;
 
-/* random seed used when calling srandom() */
+/* random seed used to initialize base_random_sequence */
 int64		random_seed = -1;
 
 /*
@@ -287,6 +287,9 @@ typedef struct RandomState
 	unsigned short xseed[3];
 } RandomState;
 
+/* Various random sequences are initialized from this one. */
+static RandomState base_random_sequence;
+
 /*
  * Connection state machine states.
  */
@@ -598,6 +601,8 @@ static const BuiltinScript builtin_script[] =
 
 /* Function prototypes */
+static void initRandomState(RandomState *random_state);
+static int64 getrand(RandomState *random_state, int64 min, int64 max);
 static void setNullValue(PgBenchValue *pv);
 static void setBoolValue(PgBenchValue *pv, bool bval);
 static void setIntValue(PgBenchValue *pv, int64 ival);
@@ -833,16 +838,28 @@ strtodouble(const char *str, bool errorOK, double *dv)
 
 /*
  * Initialize a random state struct.
+ *
+ * We derive the seed from base_random_sequence, which must be set up already.
  */
 static void
 initRandomState(RandomState *random_state)
 {
-	random_state->xseed[0] = random();
-	random_state->xseed[1] = random();
-	random_state->xseed[2] = random();
+	random_state->xseed[0] = (unsigned short)
+		getrand(&base_random_sequence, 0, 0xFFFF);
+	random_state->xseed[1] = (unsigned short)
+		getrand(&base_random_sequence, 0, 0xFFFF);
+	random_state->xseed[2] = (unsigned short)
+		getrand(&base_random_sequence, 0, 0xFFFF);
 }
 
-/* random number generator: uniform distribution from min to max inclusive */
+/*
+ * Random number generator: uniform distribution from min to max inclusive.
+ *
+ * Although the limits are expressed as int64, you can't generate the full
+ * int64 range in one call, because the difference of the limits mustn't
+ * overflow int64.  In practice it's unwise to ask for more than an int32
+ * range, because of the limited precision of pg_erand48().
+ */
 static int64
 getrand(RandomState *random_state, int64 min, int64 max)
 {
@@ -5126,12 +5143,14 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 	}
 }
 
-/* call srandom based on some seed. NULL triggers the default behavior. */
+/*
+ * Set up a random seed according to seed parameter (NULL means default),
+ * and initialize base_random_sequence for use in initializing other sequences.
+ */
 static bool
 set_random_seed(const char *seed)
 {
-	/* srandom expects an unsigned int */
-	unsigned int iseed;
+	uint64		iseed;
 
 	if (seed == NULL || strcmp(seed, "time") == 0)
 	{
@@ -5139,7 +5158,7 @@ set_random_seed(const char *seed)
 		instr_time	now;
 
 		INSTR_TIME_SET_CURRENT(now);
-		iseed = (unsigned int) INSTR_TIME_GET_MICROSEC(now);
+		iseed = (uint64) INSTR_TIME_GET_MICROSEC(now);
 	}
 	else if (strcmp(seed, "rand") == 0)
 	{
@@ -5155,7 +5174,7 @@ set_random_seed(const char *seed)
 		/* parse seed unsigned int value */
 		char		garbage;
 
-		if (sscanf(seed, "%u%c", &iseed, &garbage) != 1)
+		if (sscanf(seed, UINT64_FORMAT "%c", &iseed, &garbage) != 1)
 		{
 			fprintf(stderr,
 					"unrecognized random seed option \"%s\": expecting an unsigned integer, \"time\" or \"rand\"\n",
@@ -5165,10 +5184,14 @@ set_random_seed(const char *seed)
 	}
 
 	if (seed != NULL)
-		fprintf(stderr, "setting random seed to %u\n", iseed);
-	srandom(iseed);
-	/* no precision loss: 32 bit unsigned int cast to 64 bit int */
+		fprintf(stderr, "setting random seed to " UINT64_FORMAT "\n", iseed);
 	random_seed = iseed;
+
+	/* Fill base_random_sequence with low-order bits of seed */
+	base_random_sequence.xseed[0] = iseed & 0xFFFF;
+	base_random_sequence.xseed[1] = (iseed >> 16) & 0xFFFF;
+	base_random_sequence.xseed[2] = (iseed >> 32) & 0xFFFF;
+
 	return true;
 }
 
@@ -5862,10 +5885,9 @@ main(int argc, char **argv)
 	/* set default seed for hash functions */
 	if (lookupVariable(&state[0], "default_seed") == NULL)
 	{
-		uint64		seed = ((uint64) (random() & 0xFFFF) << 48) |
-			((uint64) (random() & 0xFFFF) << 32) |
-			((uint64) (random() & 0xFFFF) << 16) |
-			(uint64) (random() & 0xFFFF);
+		uint64		seed =
+			((uint64) getrand(&base_random_sequence, 0, 0xFFFFFFFF)) |
+			(((uint64) getrand(&base_random_sequence, 0, 0xFFFFFFFF)) << 32);
 
 		for (i = 0; i < nclients; i++)
 			if (!putVariableInt(&state[i], "startup", "default_seed", (int64) seed))
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> I'm not following this argument.  The test case is basically useless
>> for its intended purpose with that parameter, because it's highly
>> likely that the failure mode it's supposedly checking for will be
>> masked by the "random" function's tendency to spit out the same
>> value all the time.

> The first value is taken about 75% of the time for N=1000 and s=2.5,
> which means that a non-deterministic implementation would succeed about
> 0.75² ~ 56% of the time on that one.

Right, that's about what we've been seeing on OpenBSD.

> Also, the drawing procedure is less efficient when the parameter is
> close to 1 because it is more likely to loop,

That might be something to fix, but I agree it's a reason not to go
overboard trying to flatten the test case's distribution right now.

> If you want something more drastic, using 1.5 instead of 2.5 would
> reduce the probability of accidentally passing the test by chance to
> about 20%, so it would fail 80% of the time.

I think your math is off; 1.5 works quite well here.  I saw one failure
to produce distinct values in 20 attempts.  It's not demonstrably slower
than 2.5 either.  (1.1 is measurably slower; probably not by enough for
anyone to care, but 1.5 is good enough for me.)

			regards, tom lane
>> The first value is taken about 75% of the time for N=1000 and s=2.5,
>> which means that a non-deterministic implementation would succeed about
>> 0.75² ~ 56% of the time on that one.
>
> Right, that's about what we've been seeing on OpenBSD.
>
>> Also, the drawing procedure is less efficient when the parameter is
>> close to 1 because it is more likely to loop,
>
> That might be something to fix, but I agree it's a reason not to go
> overboard trying to flatten the test case's distribution right now.

Probably you would have to invent a new method to draw a zipfian
distribution for that, which would be nice.

>> If you want something more drastic, using 1.5 instead of 2.5 would
>> reduce the probability of accidentally passing the test by chance to
>> about 20%, so it would fail 80% of the time.
>
> I think your math is off;

Argh.  Although I confirm my computation: ISTM that with 1.5 the first
value has a 39% chance of getting out, so a collision in 15% of cases,
the second value 14%, so a collision in 2%, ... for a total cumulated
probability of about 18%.

> 1.5 works quite well here.  I saw one failure to produce distinct values
> in 20 attempts.

For 3 expected failures, that is possible.

> It's not demonstrably slower than 2.5 either.  (1.1 is measurably
> slower; probably not by enough for anyone to care, but 1.5 is good
> enough for me.)

Good if it fails quick enough for you.

-- 
Fabien.
>> It's not demonstrably slower than 2.5 either.  (1.1 is measurably slower;
>> probably not by enough for anyone to care, but 1.5 is good enough for me.)
>
> Good if it fails quick enough for you.

Attached a patch with the zipf doc update & the TAP test parameter change.

-- 
Fabien.
Attachment
Hello Tom,

>>>> Here is a POC which defines an internal interface for a PRNG, and use it
>>>> within pgbench, with several possible implementations which default to
>>>> rand48.
>
>>> I seriously dislike this patch.  pgbench's random support is quite
>>> overengineered already IMO, and this proposes to add a whole batch of
>>> new code and new APIs to fix a very small bug.
>
>> My intention is rather to discuss postgres' PRNG, in passing.  Full success
>> on this point:-)
>
> Our immediate problem is to fix a portability failure, which we need to
> back-patch into at least one released branch, ergo conservatism is
> warranted.

Sure, the patch I sent is definitely not for backpatching, it is for
discussion.

> I had in mind something more like the attached.

Yep.

I'm not too happy that it mixes API levels, and about the int/double/int
path.

Attached an updated version which relies on pg_jrand48 instead.  Also, as
the pseudo-random state is fully controlled, seeded test results are
deterministic so the expected value can be fully checked.

I did a few sanity tests which were all ok.

I think that this version is appropriate for backpatching.

I also think that it would be appropriate to consider having a better
PRNG to replace rand48 in a future release.

-- 
Fabien.
Attachment
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> I had in mind something more like the attached.

> Yep.

> I'm not too happy that it mixes API levels, and about the int/double/int
> path.

> Attached an updated version which relies on pg_jrand48 instead.

Hm, I'm not sure that's really an improvement, but I pushed it like that
(and the other change along with it).

> Also, as
> the pseudo-random state is fully controlled, seeded test results are
> deterministic so the expected value can be fully checked.

I found that the "expected value" was different in v11 than HEAD,
which surprised me.  It looks like the reason is that HEAD sets up
more/different RandomStates from the same seed than v11 did.  Not
sure if it's a good thing for this behavior to change across versions.

			regards, tom lane
On 2019-Jan-24, Tom Lane wrote:

> > Also, as
> > the pseudo-random state is fully controlled, seeded test results are
> > deterministic so the expected value can be fully checked.
>
> I found that the "expected value" was different in v11 than HEAD,
> which surprised me.  It looks like the reason is that HEAD sets up
> more/different RandomStates from the same seed than v11 did.  Not
> sure if it's a good thing for this behavior to change across versions.

The rationale behind this was that some internal uses of random numbers
messed up the determinism of user-invoked random functions; 409231919443's
commit message says

    While at it, use separate random state for thread administratrivia
    such as deciding which script to run, how long to delay for throttling,
    or whether to log a message when sampling; this not only makes these
    tasks independent of each other, but makes the actual thread run
    deterministic.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services