Thread: BUG #13453: PostgreSQL 9.5dev pgbench exponential distribution bug? (when threshold is small)

The following bug has been logged on the website:

Bug reference:      13453
Logged by:          digoal
Email address:      digoal@126.com
PostgreSQL version: Unsupported/Unknown
Operating system:   CentOS 6.x x64
Description:

PostgreSQL 9.5 improve pgbench's random data generation.
but when i test the exp distribution, and use R drawn the probability,
picture in :
http://blog.163.com/digoal@126/blog/static/163877040201551891133433/
When threshold big, the distribute is exp like.
When threshold small, the distribute is not exp like.
Is it a bug?

EXP:
BIG threshold:
postgres=# truncate test;
TRUNCATE TABLE

pg95@db-172-16-3-150-> vi test.sql
\setrandom id 1 50000 exponential 50000
insert into test values (:id);

pg95@db-172-16-3-150-> pgbench -M prepared -n -r -f ./test.sql -P 2 -c 16 -j
16 -T 10
progress: 2.0 s, 94095.8 tps, lat 0.162 ms stddev 0.112
progress: 4.0 s, 95611.9 tps, lat 0.165 ms stddev 0.108
progress: 6.0 s, 94929.7 tps, lat 0.166 ms stddev 0.121
progress: 8.0 s, 96159.4 tps, lat 0.164 ms stddev 0.102
progress: 10.0 s, 95586.2 tps, lat 0.165 ms stddev 0.109
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 16
number of threads: 16
duration: 10 s
number of transactions actually processed: 952791
latency average: 0.164 ms
latency stddev: 0.111 ms
tps = 95271.583072 (including connections establishing)
tps = 95879.681335 (excluding connections establishing)
statement latencies in milliseconds:
        -0.002375       \setrandom id 1 50000 exponential 1.0
        0.160051        insert into test values (:id);

postgres=# analyze test;
ANALYZE

postgres=# select * from pg_stats where attname='id' and tablename='test';
-[ RECORD 1 ]----------+-----------------------------------------------
schemaname             | public
tablename              | test
attname                | id
inherited              | f
null_frac              | 0
avg_width              | 4
n_distinct             | 11
most_common_vals       | {1,2,3,4,5}
most_common_freqs      | {0.632533,0.231967,0.0867,0.0306333,0.0112333}
histogram_bounds       | {6,6,6,6,7,12}
correlation            | 0.458732
most_common_elems      |
most_common_elem_freqs |
elem_count_histogram   |


SMALL threshold:
pg95@db-172-16-3-150-> cat test.sql
\setrandom id 1 50000 exponential 0.0000001
insert into test values (:id);

pg95@db-172-16-3-150-> pgbench -M prepared -n -r -f ./test.sql -P 2 -c 16 -j
16 -T 10
progress: 2.0 s, 92637.1 tps, lat 0.163 ms stddev 0.113
progress: 4.0 s, 95241.1 tps, lat 0.166 ms stddev 0.108
progress: 6.0 s, 96215.2 tps, lat 0.164 ms stddev 0.100
progress: 8.0 s, 94753.8 tps, lat 0.167 ms stddev 0.113
progress: 10.0 s, 95137.5 tps, lat 0.166 ms stddev 0.110
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 16
number of threads: 16
duration: 10 s
number of transactions actually processed: 947987
latency average: 0.165 ms
latency stddev: 0.109 ms
tps = 94789.969844 (including connections establishing)
tps = 95580.465170 (excluding connections establishing)
statement latencies in milliseconds:
        -0.002310       \setrandom id 1 50000 exponential 0.0000001
        0.160256        insert into test values (:id);

postgres=# select count(*) from test;
 count
--------
 950281
(1 row)

postgres=# select id,count(*) from test group by id order by id limit 10
offset 0;
 id | count
----+-------
  1 |    22
  2 |    26
  3 |    23
  4 |    20
  5 |    12
  6 |    20
  7 |    21
  8 |    17
  9 |    20
 10 |    15
(10 rows)

postgres=# select id,count(*) from test group by id order by id limit 10
offset 49990;
  id   | count
-------+-------
 49991 |    17
 49992 |    16
 49993 |    16
 49994 |    25
 49995 |    22
 49996 |     9
 49997 |    19
 49998 |    21
 49999 |    20
 50000 |    26
(10 rows)

postgres=# select * from pg_stats where attname='id' and tablename='test';
-[ RECORD 1

]----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
schemaname             | public
tablename              | test
attname                | id
inherited              | f
null_frac              | 0
avg_width              | 4
n_distinct             | 48629
most_common_vals       |

{17193,26277,48402,5776,7702,12134,14558,17516,24763,25803,29786,31677,42740,46640,48128,216,282,759,1095,1143,1320,3580,4056,4280,4795,5147,5278,5347,5571,5924,6087,6497,7764,7938,8044,8157,8223,8452,9311,9667,10014,10192,10225,10409,10425,10464,10776,10893,11353,12763,13080,13557,13838,14068,14210,14416,14692,14758,15166,15279,15873,16540,16639,16997,17373,17809,17889,18319,19440,19754,20284,20679,20719,21090,21161,22446,22852,23249,23473,23810,24302,24674,24750,24926,25142,25660,25858,26025,26520,27380,27420,28124,28292,28408,28749,29285,30710,31808,31931,33011}
most_common_freqs      |

{0.0002,0.0002,0.0002,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333}
histogram_bounds       |

{1,512,991,1497,1972,2477,2886,3365,3890,4414,4896,5399,5909,6438,6921,7408,7876,8403,8926,9414,9912,10430,10904,11410,11959,12426,12956,13508,13991,14505,15013,15527,16005,16529,17007,17544,18068,18545,19036,19521,20081,20580,21044,21559,22088,22591,23126,23603,24091,24565,25089,25650,26139,26613,27088,27545,28053,28547,28999,29514,30025,30575,31083,31551,32050,32518,32989,33468,33984,34468,34955,35517,36034,36515,37016,37501,37967,38467,38988,39482,39932,40441,40943,41404,41921,42433,42952,43421,43913,44398,44883,45336,45889,46378,46852,47337,47860,48417,48928,49447,49999}
correlation            | 0.000980858
most_common_elems      |
most_common_elem_freqs |
elem_count_histogram   |
Hello,

> \setrandom id 1 50000 exponential 0.0000001
> insert into test values (:id);

Yes, a very small coefficient on a truncated exponential is pretty close
to a uniform distribution. I think that this is a mathematical fact, not a
bug.

--
Fabien.
Fabien COELHO wrote:
>
> Hello,
>
> >\setrandom id 1 50000 exponential 0.0000001
> >insert into test values (:id);
>
> Yes, a very small coefficient on a truncated exponential is pretty close to
> a uniform distribution. I think that this is a mathematical fact, not a bug.

It's a fact of life also ... you don't notice exponentially-growing
stuff until it's way out of hand :-)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
At 2015-06-18 23:32:21, "Alvaro Herrera" <alvherre@2ndquadrant.com> wrote:
>Fabien COELHO wrote:
>> 
>> Hello,
>> 
>> >\setrandom id 1 50000 exponential 0.0000001
>> >insert into test values (:id);
>> 
>> Yes, a very small coefficient on a truncated exponential is pretty close to
>> a uniform distribution. I think that this is a mathematical fact, not a bug.
>
>It's a fact of life also ... you don't notice exponentially-growing
>stuff until it's way out of hand :-)
Thanks, great, it's my miss-understand. I use threshold=1,10,100,1000,10000 and test again, add plot pictures.
http://blog.163.com/digoal@126/blog/static/163877040201551891133433/
(The closer to 0 the threshold, the flatter (more uniform) the access distribution.)
> >-- >Álvaro Herrera http://www.2ndQuadrant.com/ >PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services