Re: Next Steps with Hash Indexes - Mailing list pgsql-hackers
From | Sadhuprasad Patro |
---|---|
Subject | Re: Next Steps with Hash Indexes |
Date | |
Msg-id | CAFF0-CEB1U27wmStSsexzsgf=X0RRHNjNA-PmUSdQxcSQhqcwQ@mail.gmail.com Whole thread Raw |
In response to | Re: Next Steps with Hash Indexes (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Next Steps with Hash Indexes
|
List | pgsql-hackers |
> > That's a significant difference. Have you checked via perf or some > other way what causes this difference? I have seen that sometimes > single client performance with pgbench is not stable, so can you > please once check with 4 clients or so and possibly with a larger > dataset as well. I have verified manually, without the PGBENCH tool also. I can see a significant difference for each query fired in both the versions of patch implemented. We can see as mentioned below, I have run the SAME query on the SAME dataset on both patches. We have a significant performance impact with Separate Hash values for multiple key columns. SingleHash_MultiColumn: postgres=# create table perftest(a int, b int, c int, d int, e int, f int); CREATE TABLE postgres=# insert into perftest values (generate_series(1, 10000000), generate_series(1, 10000000), generate_series(1, 10000000), 9, 7); INSERT 0 10000000 postgres=# create index idx on perftest using hash(a, b, c); CREATE INDEX postgres=# select * from perftest where a=5999 and b=5999 and c=5999; a | b | c | d | e | f ------+------+------+---+---+--- 5999 | 5999 | 5999 | 9 | 7 | (1 row) Time: 2.022 ms postgres=# select * from perftest where a=597989 and b=597989 and c=597989; a | b | c | d | e | f --------+--------+--------+---+---+--- 597989 | 597989 | 597989 | 9 | 7 | (1 row) Time: 0.867 ms postgres=# select * from perftest where a=6297989 and b=6297989 and c=6297989; a | b | c | d | e | f ---------+---------+---------+---+---+--- 6297989 | 6297989 | 6297989 | 9 | 7 | (1 row) Time: 1.439 ms postgres=# select * from perftest where a=6290798 and b=6290798 and c=6290798; a | b | c | d | e | f ---------+---------+---------+---+---+--- 6290798 | 6290798 | 6290798 | 9 | 7 | (1 row) Time: 1.013 ms postgres=# select * from perftest where a=6290791 and b=6290791 and c=6290791; a | b | c | d | e | f ---------+---------+---------+---+---+--- 6290791 | 6290791 | 6290791 | 9 | 7 | (1 row) Time: 0.903 ms postgres=# select * from perftest where a=62907 and b=62907 and c=62907; a | b | c | d | e | f -------+-------+-------+---+---+--- 62907 | 62907 | 62907 | 9 | 7 | (1 row) Time: 0.894 ms SeparateHash_MultiColumn: postgres=# create table perftest(a int, b int, c int, d int, e int, f int); CREATE TABLE postgres=# insert into perftest values (generate_series(1, 10000000), generate_series(1, 10000000), generate_series(1, 10000000), 9, 7); INSERT 0 10000000 postgres=# create index idx on perftest using hash(a, b, c); CREATE INDEX postgres=# select * from perftest where a=5999 and b=5999 and c=5999; a | b | c | d | e | f ------+------+------+---+---+--- 5999 | 5999 | 5999 | 9 | 7 | (1 row) Time: 2.915 ms postgres=# select * from perftest where a=597989 and b=597989 and c=597989; a | b | c | d | e | f --------+--------+--------+---+---+--- 597989 | 597989 | 597989 | 9 | 7 | (1 row) Time: 1.129 ms postgres=# select * from perftest where a=6297989 and b=6297989 and c=6297989; a | b | c | d | e | f ---------+---------+---------+---+---+--- 6297989 | 6297989 | 6297989 | 9 | 7 | (1 row) Time: 2.454 ms postgres=# select * from perftest where a=6290798 and b=6290798 and c=6290798; a | b | c | d | e | f ---------+---------+---------+---+---+--- 6290798 | 6290798 | 6290798 | 9 | 7 | (1 row) Time: 2.327 ms postgres=# select * from perftest where a=6290791 and b=6290791 and c=6290791; a | b | c | d | e | f ---------+---------+---------+---+---+--- 6290791 | 6290791 | 6290791 | 9 | 7 | (1 row) Time: 1.676 ms postgres=# select * from perftest where a=62907 and b=62907 and c=62907; a | b | c | d | e | f -------+-------+-------+---+---+--- 62907 | 62907 | 62907 | 9 | 7 | (1 row) Time: 2.614 ms If I do a test with 4 clients, then there is not much visible difference. I think this is because of contentions. And here our focus is single thread & single operation performance. > > One more thing to consider is that it seems that the planner requires > a condition for the first column of an index before considering an > indexscan plan. See Tom's email [1] in this regard. I think it would > be better to see what kind of work is involved there if you want to > explore a single hash value for all columns idea. > > [1] - https://www.postgresql.org/message-id/29263.1506483172%40sss.pgh.pa.us About this point, I will analyze further and update. Thanks & Regards SadhuPrasad EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: