Thread: [patch] gsoc, improving hash index v2
Hi, hackers.
I've post a hash patch in a previous thread http://archives.postgresql.org/pgsql-hackers/2008-07/msg00794.php
I do apologize for the bad readability of previous patch. Thank you all for your comments.
Here is a new patch which fixed some bugs in the previous one.
I post it here to get some feedback and further suggestion. Any comment is welcome.
Changes since v1:
- fix bug that it crashed in _h_spool when test big data set
- adjust the target-fillfactor calculation in _hash_metapinit
- remove the HASHVALUE_ONLY macro
- replace _create_hash_desc with _get_hash_desc to get a hard-coded hash index tuple.
- replace index_getattr with _hash_get_datum to get the hash key datum and avoid too many calls to _get_hash_desc and index_getattr
Here is what I intend to do.
Todo:
- get the statistics of block access i/o
- write unit tests using pgunitest to test the following:
(Josh Berkus suggested in this thread http://archives.postgresql.org/pgsql-hackers/2008-05/msg00535.php )
I'll make some test on bigger data set later.
using a word list of 3628800 unique words
The table size is 139MB.
Index BuildTime IndexSize
---- ---- ----
btree 51961.123 ms 93MB
hash 411069.264 ms 2048MB
hash-patch 36288.931 ms 128MB
dict=# SELECT * from hash-dict where word = '0234567891' ;
word
------------
0234567891
(1 row)
Time: 33.960 ms
dict=# SELECT * from btree-dict where word = '0234567891' ;
word
------------
0234567891
(1 row)
Time: 1.662 ms
dict=# SELECT * from hash2-dict where word = '0234567891' ;
word
------------
0234567891
(1 row)
Time: 1.457 ms
At last, there is a problem I encounter.
I'm confused by the function _hash_checkqual.
IMHO, the index tuple only store one column here and key->sk_attno should always be 1 here.
And scanKeySize should be 1 since we didn't support multi-column hash yet.
Do I make some misunderstanding?
/*
* _hash_checkqual -- does the index tuple satisfy the scan conditions?
*/
bool
_hash_checkqual(IndexScanDesc scan, IndexTuple itup)
{
TupleDesc tupdesc = RelationGetDescr(scan->indexRelation);
ScanKey key = scan->keyData;
int scanKeySize = scan->numberOfKeys;
IncrIndexProcessed();
while (scanKeySize > 0)
{
Datum datum;
bool isNull;
Datum test;
datum = index_getattr(itup,
key->sk_attno,
tupdesc,
&isNull);
/* assume sk_func is strict */
if (isNull)
return false;
if (key->sk_flags & SK_ISNULL)
return false;
test = FunctionCall2(&key->sk_func, datum, key->sk_argument);
if (!DatumGetBool(test))
return false;
key++;
scanKeySize--;
}
return true;
}
Hope to hear from you.
--
Best Regards,
Xiao Meng
DKERC, Harbin Institute of Technology, China
Gtalk: mx.cogito@gmail.com
MSN: cnEnder@live.com
http://xiaomeng.yo2.cn
I've post a hash patch in a previous thread http://archives.postgresql.org/pgsql-hackers/2008-07/msg00794.php
I do apologize for the bad readability of previous patch. Thank you all for your comments.
Here is a new patch which fixed some bugs in the previous one.
I post it here to get some feedback and further suggestion. Any comment is welcome.
Changes since v1:
- fix bug that it crashed in _h_spool when test big data set
- adjust the target-fillfactor calculation in _hash_metapinit
- remove the HASHVALUE_ONLY macro
- replace _create_hash_desc with _get_hash_desc to get a hard-coded hash index tuple.
- replace index_getattr with _hash_get_datum to get the hash key datum and avoid too many calls to _get_hash_desc and index_getattr
Here is what I intend to do.
Todo:
- get the statistics of block access i/o
- write unit tests using pgunitest to test the following:
(Josh Berkus suggested in this thread http://archives.postgresql.org/pgsql-hackers/2008-05/msg00535.php )
bulk load, both COPY and INSERT
single-row updates, inserts and deletes
batch update by key
batch update by other index
batch delete by key
batch delete by other index
concurrent index updates (64 connections insert/deleting concurrently)
I makes some simple test mentioned here (http://archives.postgresql.org/pgsql-hackers/2007-09/msg00208.php)single-row updates, inserts and deletes
batch update by key
batch update by other index
batch delete by key
batch delete by other index
concurrent index updates (64 connections insert/deleting concurrently)
I'll make some test on bigger data set later.
using a word list of 3628800 unique words
The table size is 139MB.
Index BuildTime IndexSize
---- ---- ----
btree 51961.123 ms 93MB
hash 411069.264 ms 2048MB
hash-patch 36288.931 ms 128MB
dict=# SELECT * from hash-dict where word = '0234567891' ;
word
------------
0234567891
(1 row)
Time: 33.960 ms
dict=# SELECT * from btree-dict where word = '0234567891' ;
word
------------
0234567891
(1 row)
Time: 1.662 ms
dict=# SELECT * from hash2-dict where word = '0234567891' ;
word
------------
0234567891
(1 row)
Time: 1.457 ms
At last, there is a problem I encounter.
I'm confused by the function _hash_checkqual.
IMHO, the index tuple only store one column here and key->sk_attno should always be 1 here.
And scanKeySize should be 1 since we didn't support multi-column hash yet.
Do I make some misunderstanding?
/*
* _hash_checkqual -- does the index tuple satisfy the scan conditions?
*/
bool
_hash_checkqual(IndexScanDesc scan, IndexTuple itup)
{
TupleDesc tupdesc = RelationGetDescr(scan->indexRelation);
ScanKey key = scan->keyData;
int scanKeySize = scan->numberOfKeys;
IncrIndexProcessed();
while (scanKeySize > 0)
{
Datum datum;
bool isNull;
Datum test;
datum = index_getattr(itup,
key->sk_attno,
tupdesc,
&isNull);
/* assume sk_func is strict */
if (isNull)
return false;
if (key->sk_flags & SK_ISNULL)
return false;
test = FunctionCall2(&key->sk_func, datum, key->sk_argument);
if (!DatumGetBool(test))
return false;
key++;
scanKeySize--;
}
return true;
}
Hope to hear from you.
--
Best Regards,
Xiao Meng
DKERC, Harbin Institute of Technology, China
Gtalk: mx.cogito@gmail.com
MSN: cnEnder@live.com
http://xiaomeng.yo2.cn
Attachment
<div dir="ltr">Hi, hackers. Here is some test I run on a bigger set.<br /><br />Use a word list of 39916800 unique words<br/>The table size is 1529MB.<br />Index BuildTime IndexSize<br />---- ---- ----<br /> btree 874470.339ms 1027MB<br /> hash-patch 513026.381 ms 1024MB<br /><br />I use pgbench to test the time of a customquery script.<br />There are 2000 statements in the script.<br />It looks like this:<br />select * from dict whereword='123456789a0'<br /> ...<br />The time of the two index is<br />btree: 1/0.174700=5.00250125<br />hash-patch: 1/0.199900=5.724098<br/><br />---------------btree------------------<br /> $ pgbench -n -f /tmp/query.sql dict<br />transactiontype: Custom query<br /> scaling factor: 1<br />query mode: simple<br />number of clients: 1<br />number oftransactions per client: 10<br />number of transactions actually processed: 10/10<br />tps = 0.174694 (including connectionsestablishing)<br />tps = 0.174700 (excluding connections establishing)<br clear="all" /><br />---------------hash patch-------------<br />$ pgbench -n -f /tmp/query.sql dict<br />transaction type: Custom query<br/>scaling factor: 1<br />query mode: simple<br />number of clients: 1<br />number of transactions per client: 10<br/> number of transactions actually processed: 10/10<br />tps = 0.199892 (including connections establishing)<br />tps= 0.199900 (excluding connections establishing)<br /><br />-- <br />Best Regards,<br />Xiao Meng<br /><br />DKERC, HarbinInstitute of Technology, China<br /> Gtalk: <a href="mailto:mx.cogito@gmail.com" target="_blank">mx.cogito@gmail.com</a><br/>MSN: <a href="mailto:cnEnder@live.com" target="_blank">cnEnder@live.com</a><br/><a href="http://xiaomeng.yo2.cn" target="_blank">http://xiaomeng.yo2.cn</a><br /></div>
<div dir="ltr">sorry, I made some mistake here.<br />The time of the script on two indexes should be<br />btree: 1/0.174700=5.724098s<br/>hash-patch: 1/0.199900=5.00250125s<br /><br /><div class="gmail_quote">On Wed, Aug 6, 2008 at 9:33AM, Xiao Meng <span dir="ltr"><<a href="mailto:mx.cogito@gmail.com">mx.cogito@gmail.com</a>></span> wrote:<br /><blockquoteclass="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left:1ex;"><div dir="ltr">Hi, hackers. Here is some test I run on a bigger set.<br /><br />Use a word list of 39916800unique words<br /> The table size is 1529MB.<br />Index BuildTime IndexSize<br />---- ---- ----<br /> btree 874470.339ms 1027MB<br /> hash-patch 513026.381 ms 1024MB<br /><br />I use pgbenchto test the time of a custom query script.<br />There are 2000 statements in the script.<br />It looks like this:<br/>select * from dict where word='123456789a0'<br /> ...<br />The time of the two index is<br />btree: 1/0.174700=5.00250125<br/>hash-patch: 1/0.199900=5.724098<br /><br />---------------btree------------------<br /> $ pgbench-n -f /tmp/query.sql dict<br />transaction type: Custom query<br /> scaling factor: 1<br />query mode: simple<br />numberof clients: 1<br />number of transactions per client: 10<br />number of transactions actually processed: 10/10<br/>tps = 0.174694 (including connections establishing)<br />tps = 0.174700 (excluding connections establishing)<brclear="all" /><br />---------------hash patch-------------<br />$ pgbench -n -f /tmp/query.sql dict<br />transactiontype: Custom query<br />scaling factor: 1<br />query mode: simple<br />number of clients: 1<br />number of transactionsper client: 10<br /> number of transactions actually processed: 10/10<br />tps = 0.199892 (including connectionsestablishing)<br />tps = 0.199900 (excluding connections establishing)<div class="Ih2E3d"><br /><br />-- <br />BestRegards,<br />Xiao Meng<br /><br />DKERC, Harbin Institute of Technology, China<br /> Gtalk: <a href="mailto:mx.cogito@gmail.com"target="_blank">mx.cogito@gmail.com</a><br />MSN: <a href="mailto:cnEnder@live.com" target="_blank">cnEnder@live.com</a><br/><a href="http://xiaomeng.yo2.cn" target="_blank">http://xiaomeng.yo2.cn</a><br /></div></div></blockquote></div><br/><br clear="all" /><br />-- <br />Best Regards,<br />Xiao Meng<br /><br />DKERC, HarbinInstitute of Technology, China<br />Gtalk: <a href="mailto:mx.cogito@gmail.com">mx.cogito@gmail.com</a><br />MSN: <ahref="mailto:cnEnder@live.com">cnEnder@live.com</a><br /><a href="http://xiaomeng.yo2.cn">http://xiaomeng.yo2.cn</a><br/></div>
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Xiao Meng wrote: > Hi, hackers. Here is some test I run on a bigger set. > > The time of the two index is > btree: 1/0.174700=5.00250125 > hash-patch: 1/0.199900=5.724098 Just to bring it to attention of everybody: btree: 1/0.174700=5.724098 hash-patch: 1/0.199900=5.00250125 Hence the hash _is_ actually faster. > ---------------btree------------------ > $ pgbench -n -f /tmp/query.sql dict > ... > tps = 0.174694 (including connections establishing) > tps = 0.174700 (excluding connections establishing) > > ---------------hash patch------------- > $ pgbench -n -f /tmp/query.sql dict > transaction type: Custom query > ... > tps = 0.199892 (including connections establishing) > tps = 0.199900 (excluding connections establishing) Jens -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFImQEZzhchXT4RR5ARAi2nAJ98ujYi+ZOHZybSQaOw11JFpkilIACg5DGu 0Mo+UPGsdd2ZFTGirMplFm4= =Qj5C -----END PGP SIGNATURE-----