Re: [PATCH] random_normal function - Mailing list pgsql-hackers

From:            Tom Lane
Subject:         Re: [PATCH] random_normal function
Date:
Msg-id:          67201.1673307519@sss.pgh.pa.us
In response to:  Re: [PATCH] random_normal function (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses:       Re: [PATCH] random_normal function
List:            pgsql-hackers
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On Mon, 9 Jan 2023 at 18:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> (We could probably go further
>> than this, like trying to verify distribution properties. But
>> it's been too long since college statistics for me to want to
>> write such tests myself, and I'm not real sure we need them.)

> I played around with the Kolmogorov–Smirnov test, to test random() for
> uniformity. The following passes roughly 99.9% of the time:

Ah, cool, thanks for this code.

> With a one-in-a-thousand chance of failing, if you wanted something
> with around a one-in-a-billion chance of failing, you could just try
> it 3 times:

> SELECT ks_test_uniform_random() OR
>        ks_test_uniform_random() OR
>        ks_test_uniform_random();

> but it feels pretty hacky, and probably not really necessary.

That seems like a good way to do it, because I'm not satisfied with
one-in-a-thousand odds if we want to remove the "ignore" marker.
It's still plenty fast enough: on my machine, the v2 patch below
takes about 19ms, versus 22ms for the script as it stands in HEAD.

> Rigorous tests for other distributions are harder, but also probably
> not necessary if we have confidence that the underlying PRNG is
> uniform.

Agreed.

>> BTW, if this does bring the probability of failure down to the
>> one-in-a-billion range, I think we could also nuke the whole
>> "ignore:" business, simplifying pg_regress and allowing the
>> random test to be run in parallel with others.

> I didn't check the one-in-a-billion claim, but +1 for that.

I realized that we do already run random in a parallel group; the
"ignore: random" line in parallel_schedule just marks it as
failure-ignorable, it doesn't schedule it. (The comment is a bit
misleading about this, but I want to remove it, not rewrite it.)
Nonetheless, nuking the whole ignore-failures mechanism seems like
good cleanup to me.

Also, I tried this on some 32-bit big-endian hardware (NetBSD on
macppc) to verify my thesis that the results of random() are now
machine-independent. That part works, but the random_normal() tests
didn't: I saw low-order-bit differences from the results on x86_64
Linux. Presumably that's because one or more of sqrt()/log()/sin()
round off a bit differently there. v2 attached deals with this by
backing off to "extra_float_digits = 0" for that test. Once it hits
the buildfarm we might find we have to reduce extra_float_digits
some more, but that was enough to make NetBSD/macppc happy.

                        regards, tom lane
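For the record, both 1e-9 claims are easy to sanity-check in SQL
itself: the birthday-problem collision bound for n draws from 2^52
distinct float8 values is roughly n(n-1)/2 / 2^52, and three
independent KS runs at the 99.9% level only all fail together with
probability 0.001^3. This query is just arithmetic, not part of the
patch:

SELECT (1000 * 999 / 2.0 / 2.0 ^ 52)::float8  AS dup_risk_1000,   -- ~1.1e-10
       (3000 * 2999 / 2.0 / 2.0 ^ 52)::float8 AS dup_risk_3000,   -- ~1.0e-9
       ((1.0 - 0.999) ^ 3)::float8            AS ks_triple_fail;  -- exactly 1e-9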
diff --git a/src/test/regress/expected/random.out b/src/test/regress/expected/random.out
index 30bd866138..8785c88ad2 100644
--- a/src/test/regress/expected/random.out
+++ b/src/test/regress/expected/random.out
@@ -1,81 +1,146 @@
 --
 -- RANDOM
--- Test the random function
+-- Test random() and allies
 --
--- count the number of tuples originally, should be 1000
-SELECT count(*) FROM onek;
- count
--------
- 1000
-(1 row)
-
--- pick three random rows, they shouldn't match
-(SELECT unique1 AS random
-  FROM onek ORDER BY random() LIMIT 1)
-INTERSECT
-(SELECT unique1 AS random
-  FROM onek ORDER BY random() LIMIT 1)
-INTERSECT
-(SELECT unique1 AS random
-  FROM onek ORDER BY random() LIMIT 1);
- random
--------- 
+-- Tests in this file may have a small probability of failure,
+-- since we are dealing with randomness. Try to keep the failure
+-- risk for any one test case under 1e-9.
+--
+-- There should be no duplicates in 1000 random() values.
+-- (Assuming 52 random bits in the float8 results, we could
+-- take as many as 3000 values and still have less than 1e-9 chance
+-- of failure, per https://en.wikipedia.org/wiki/Birthday_problem)
+SELECT r, count(*)
+FROM (SELECT random() r FROM generate_series(1, 1000)) ss
+GROUP BY r HAVING count(*) > 1;
+ r | count
+---+-------
 (0 rows)
 
--- count roughly 1/10 of the tuples
-CREATE TABLE RANDOM_TBL AS
-  SELECT count(*) AS random
-  FROM onek WHERE random() < 1.0/10;
--- select again, the count should be different
-INSERT INTO RANDOM_TBL (random)
-  SELECT count(*)
-  FROM onek WHERE random() < 1.0/10;
--- select again, the count should be different
-INSERT INTO RANDOM_TBL (random)
-  SELECT count(*)
-  FROM onek WHERE random() < 1.0/10;
--- select again, the count should be different
-INSERT INTO RANDOM_TBL (random)
-  SELECT count(*)
-  FROM onek WHERE random() < 1.0/10;
--- now test that they are different counts
-SELECT random, count(random) FROM RANDOM_TBL
-  GROUP BY random HAVING count(random) > 3;
- random | count
---------+-------
-(0 rows)
+-- The range should be [0, 1). We can expect that at least one out of 2000
+-- random values is in the lowest or highest 1% of the range with failure
+-- probability less than about 1e-9.
+SELECT count(*) FILTER (WHERE r < 0 OR r >= 1) AS out_of_range,
+       (count(*) FILTER (WHERE r < 0.01)) > 0 AS has_small,
+       (count(*) FILTER (WHERE r > 0.99)) > 0 AS has_large
+FROM (SELECT random() r FROM generate_series(1, 2000)) ss;
+ out_of_range | has_small | has_large
+--------------+-----------+-----------
+            0 | t         | t
+(1 row)
 
-SELECT AVG(random) FROM RANDOM_TBL
-  HAVING AVG(random) NOT BETWEEN 80 AND 120;
- avg
------ 
-(0 rows)
+-- Check for uniform distribution using the Kolmogorov–Smirnov test.
+CREATE FUNCTION ks_test_uniform_random()
+RETURNS boolean AS
+$$
+DECLARE
+  n int := 1000;        -- Number of samples
+  c float8 := 1.94947;  -- Critical value for 99.9% confidence
+  ok boolean;
+BEGIN
+  ok := (
+    WITH samples AS (
+      SELECT random() r FROM generate_series(1, n) ORDER BY 1
+    ), indexed_samples AS (
+      SELECT (row_number() OVER())-1.0 i, r FROM samples
+    )
+    SELECT max(abs(i/n-r)) < c / sqrt(n) FROM indexed_samples
+  );
+  RETURN ok;
+END
+$$
+LANGUAGE plpgsql;
+-- As written, ks_test_uniform_random() returns true about 99.9%
+-- of the time. To get down to a roughly 1e-9 test failure rate,
+-- just run it 3 times and accept if any one of them passes.
+SELECT ks_test_uniform_random() OR
+       ks_test_uniform_random() OR
+       ks_test_uniform_random() AS uniform;
+ uniform
+---------
+ t
+(1 row)
 
 -- now test random_normal()
-TRUNCATE random_tbl;
-INSERT INTO random_tbl (random)
-  SELECT count(*)
-  FROM onek WHERE random_normal(0, 1) < 0;
-INSERT INTO random_tbl (random)
-  SELECT count(*)
-  FROM onek WHERE random_normal(0) < 0;
-INSERT INTO random_tbl (random)
-  SELECT count(*)
-  FROM onek WHERE random_normal() < 0;
-INSERT INTO random_tbl (random)
-  SELECT count(*)
-  FROM onek WHERE random_normal(stddev => 1, mean => 0) < 0;
--- expect similar, but not identical values
-SELECT random, count(random) FROM random_tbl
-  GROUP BY random HAVING count(random) > 3;
- random | count
---------+-------
+-- As above, there should be no duplicates in 1000 random_normal() values.
+SELECT r, count(*)
+FROM (SELECT random_normal() r FROM generate_series(1, 1000)) ss
+GROUP BY r HAVING count(*) > 1;
+ r | count
+---+-------
 (0 rows)
 
--- approximately check expected distribution
-SELECT AVG(random) FROM random_tbl
-  HAVING AVG(random) NOT BETWEEN 400 AND 600;
- avg
------ 
-(0 rows)
+-- ... unless we force the range (standard deviation) to zero.
+-- This is a good place to check that the mean input does something, too.
+SELECT r, count(*)
+FROM (SELECT random_normal(10, 0) r FROM generate_series(1, 100)) ss
+GROUP BY r;
+ r  | count
+----+-------
+ 10 |   100
+(1 row)
+
+SELECT r, count(*)
+FROM (SELECT random_normal(-10, 0) r FROM generate_series(1, 100)) ss
+GROUP BY r;
+  r  | count
+-----+-------
+ -10 |   100
+(1 row)
+
+-- setseed() should produce a reproducible series of random() values.
+SELECT setseed(0.5);
+ setseed
+---------
+
+(1 row)
+
+SELECT random() FROM generate_series(1, 10);
+       random
+---------------------
+ 0.9851677175347999
+ 0.825301858027981
+ 0.12974610012450416
+ 0.16356291958601088
+ 0.6476186144084
+ 0.8822771983038762
+ 0.1404566845227775
+ 0.15619865764623442
+ 0.5145227426983392
+ 0.7712969548127826
+(10 rows)
+
+-- Likewise for random_normal(); however, since its implementation relies
+-- on libm functions that have different roundoff behaviors on different
+-- machines, we have to round off the results a bit to get consistent output.
+SET extra_float_digits = 0;
+SELECT random_normal() FROM generate_series(1, 10);
+   random_normal
+--------------------
+  0.208534644938377
+  0.264530240540963
+ -0.606752467900428
+  0.825799427852654
+  1.70111611735357
+ -0.223445463716189
+  0.249712419190998
+ -1.2494722990669
+  0.125627152043677
+  0.475391614544013
+(10 rows)
+
+SELECT random_normal(mean => 1, stddev => 0.1) r FROM generate_series(1, 10);
+         r
+-------------------
+ 1.00605972811732
+ 1.09685453015002
+ 1.02869206132007
+ 0.909475676712336
+ 0.983724763134265
+ 0.939344549577623
+ 1.18713500206363
+ 0.962257684292933
+ 0.914441206800407
+ 0.964031055575433
+(10 rows)
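Side note, not part of the patch: if you want to watch the actual test
statistic while fiddling with n or c, the guts of ks_test_uniform_random()
run fine standalone. Something like this prints the observed KS distance
next to the rejection cutoff:

WITH samples AS (
  SELECT random() r FROM generate_series(1, 1000) ORDER BY 1
), indexed_samples AS (
  SELECT (row_number() OVER())-1.0 i, r FROM samples
)
SELECT max(abs(i/1000 - r)) AS d_statistic,   -- observed KS distance
       1.94947 / sqrt(1000) AS cutoff_99_9    -- c / sqrt(n)
FROM indexed_samples;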
diff --git a/src/test/regress/sql/random.sql b/src/test/regress/sql/random.sql
index 3104af46b7..ce1cc37176 100644
--- a/src/test/regress/sql/random.sql
+++ b/src/test/regress/sql/random.sql
@@ -1,68 +1,85 @@
 --
 -- RANDOM
--- Test the random function
+-- Test random() and allies
+--
+-- Tests in this file may have a small probability of failure,
+-- since we are dealing with randomness. Try to keep the failure
+-- risk for any one test case under 1e-9.
 --
--- count the number of tuples originally, should be 1000
-SELECT count(*) FROM onek;
-
--- pick three random rows, they shouldn't match
-(SELECT unique1 AS random
-  FROM onek ORDER BY random() LIMIT 1)
-INTERSECT
-(SELECT unique1 AS random
-  FROM onek ORDER BY random() LIMIT 1)
-INTERSECT
-(SELECT unique1 AS random
-  FROM onek ORDER BY random() LIMIT 1);
-
--- count roughly 1/10 of the tuples
-CREATE TABLE RANDOM_TBL AS
-  SELECT count(*) AS random
-  FROM onek WHERE random() < 1.0/10;
-
--- select again, the count should be different
-INSERT INTO RANDOM_TBL (random)
-  SELECT count(*)
-  FROM onek WHERE random() < 1.0/10;
-
--- select again, the count should be different
-INSERT INTO RANDOM_TBL (random)
-  SELECT count(*)
-  FROM onek WHERE random() < 1.0/10;
-
--- select again, the count should be different
-INSERT INTO RANDOM_TBL (random)
-  SELECT count(*)
-  FROM onek WHERE random() < 1.0/10;
-
--- now test that they are different counts
-SELECT random, count(random) FROM RANDOM_TBL
-  GROUP BY random HAVING count(random) > 3;
-
-SELECT AVG(random) FROM RANDOM_TBL
-  HAVING AVG(random) NOT BETWEEN 80 AND 120;
+-- There should be no duplicates in 1000 random() values.
+-- (Assuming 52 random bits in the float8 results, we could
+-- take as many as 3000 values and still have less than 1e-9 chance
+-- of failure, per https://en.wikipedia.org/wiki/Birthday_problem)
+SELECT r, count(*)
+FROM (SELECT random() r FROM generate_series(1, 1000)) ss
+GROUP BY r HAVING count(*) > 1;
+
+-- The range should be [0, 1). We can expect that at least one out of 2000
+-- random values is in the lowest or highest 1% of the range with failure
+-- probability less than about 1e-9.
+
+SELECT count(*) FILTER (WHERE r < 0 OR r >= 1) AS out_of_range,
+       (count(*) FILTER (WHERE r < 0.01)) > 0 AS has_small,
+       (count(*) FILTER (WHERE r > 0.99)) > 0 AS has_large
+FROM (SELECT random() r FROM generate_series(1, 2000)) ss;
+
+-- Check for uniform distribution using the Kolmogorov–Smirnov test.
+
+CREATE FUNCTION ks_test_uniform_random()
+RETURNS boolean AS
+$$
+DECLARE
+  n int := 1000;        -- Number of samples
+  c float8 := 1.94947;  -- Critical value for 99.9% confidence
+  ok boolean;
+BEGIN
+  ok := (
+    WITH samples AS (
+      SELECT random() r FROM generate_series(1, n) ORDER BY 1
+    ), indexed_samples AS (
+      SELECT (row_number() OVER())-1.0 i, r FROM samples
+    )
+    SELECT max(abs(i/n-r)) < c / sqrt(n) FROM indexed_samples
+  );
+  RETURN ok;
+END
+$$
+LANGUAGE plpgsql;
+
+-- As written, ks_test_uniform_random() returns true about 99.9%
+-- of the time. To get down to a roughly 1e-9 test failure rate,
+-- just run it 3 times and accept if any one of them passes.
+SELECT ks_test_uniform_random() OR
+       ks_test_uniform_random() OR
+       ks_test_uniform_random() AS uniform;
 
 -- now test random_normal()
-TRUNCATE random_tbl;
-INSERT INTO random_tbl (random)
-  SELECT count(*)
-  FROM onek WHERE random_normal(0, 1) < 0;
-INSERT INTO random_tbl (random)
-  SELECT count(*)
-  FROM onek WHERE random_normal(0) < 0;
-INSERT INTO random_tbl (random)
-  SELECT count(*)
-  FROM onek WHERE random_normal() < 0;
-INSERT INTO random_tbl (random)
-  SELECT count(*)
-  FROM onek WHERE random_normal(stddev => 1, mean => 0) < 0;
+-- As above, there should be no duplicates in 1000 random_normal() values.
+SELECT r, count(*)
+FROM (SELECT random_normal() r FROM generate_series(1, 1000)) ss
+GROUP BY r HAVING count(*) > 1;
 
--- expect similar, but not identical values
-SELECT random, count(random) FROM random_tbl
-  GROUP BY random HAVING count(random) > 3;
+-- ... unless we force the range (standard deviation) to zero.
+-- This is a good place to check that the mean input does something, too.
+SELECT r, count(*)
+FROM (SELECT random_normal(10, 0) r FROM generate_series(1, 100)) ss
+GROUP BY r;
+SELECT r, count(*)
+FROM (SELECT random_normal(-10, 0) r FROM generate_series(1, 100)) ss
+GROUP BY r;
 
--- approximately check expected distribution
-SELECT AVG(random) FROM random_tbl
-  HAVING AVG(random) NOT BETWEEN 400 AND 600;
+-- setseed() should produce a reproducible series of random() values.
+
+SELECT setseed(0.5);
+
+SELECT random() FROM generate_series(1, 10);
+
+-- Likewise for random_normal(); however, since its implementation relies
+-- on libm functions that have different roundoff behaviors on different
+-- machines, we have to round off the results a bit to get consistent output.
+SET extra_float_digits = 0;
+
+SELECT random_normal() FROM generate_series(1, 10);
+SELECT random_normal(mean => 1, stddev => 0.1) r FROM generate_series(1, 10);
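For anyone puzzled by the libm reference: given the sqrt()/log()/sin()
dependence, random_normal() evidently draws its values with a
Box–Muller-style transform, i.e. something morally equivalent to the
sketch below (an illustration, not the server's actual code path). Each
of those functions can legitimately differ in the last bit across
platforms, which is why the low-order digits wobble and why
"extra_float_digits = 0" hides the discrepancy:

SELECT 0.0 + 1.0 * sqrt(-2 * ln(1 - u1)) * sin(2 * pi() * u2) AS z  -- mean + stddev * N(0,1)
FROM (SELECT random() AS u1, random() AS u2) s;
-- Using 1 - u1 keeps the log argument in (0, 1], since random()
-- can return 0 but never 1.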