Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2) - Mailing list pgsql-hackers

From Kisung Kim
Subject Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)
Date
Msg-id CABF0Rr3zGQbLM4v7ahyZdUDQ-+fNOpHrpv7Ea+Q2WMts7Uc1Lg@mail.gmail.com
Whole thread Raw
In response to Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)  (Mark Kirkwood <mark.kirkwood@catalyst.net.nz>)
List pgsql-hackers
You're right. Reindex improves the performance of the benchmark workloads dramatically.
I'm gathering results and will announce them.

But I think we should notice that the results before Reindexing is poorer than MongoDB.
It seems that this is because of Btree bloating (not exact expression). 
The lookup performance for the Btree is most crucial for the results 
because the workload is select for primary key.
So larger Btree could mean less cache hits and slower query performance.
I think that PG doesn't pack leaf node to 90% but half for future insertion
and because of this PG's btree is larger than MongoDB 
(I turned off prefix compression of WiredTiger index and block compression for storage.)
But MongoDB (actually WiredTiger) seems to do differently.

Is my speculation is right? I'm not sure because I didn't review the btree code of PG yet.

And I want to point that the loading performance of MongoDB is better than PG.
If PG leaves more space for future insertion, then could we get at least faster loading performance?  
Then can we conclude that we have more chances to improve Btree of PG?

Best Regards, 



On Fri, Aug 12, 2016 at 5:40 PM, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
After examining the benchmark design - I see we are probably not being helped by the repeated insertion of keys all of form 'userxxxxxxx' leading to some page splitting.

However your index rebuild gets you from 5 to 3 GB - does that really help performance significantly?

regards

Mark


On 11/08/16 16:08, Kisung Kim wrote:
Thank you for your information.
Here is the result:

After insertions:

ycsb=# select * from pgstatindex('usertable_pkey');
 version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
       2 |          3 | 5488721920 |         44337 |     4464 |     665545 |           0 |             0 |       52 |                 11
(1 row)

After rebuild:


ycsb=# select * from pgstatindex('usertable_pkey');
 version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
       2 |          3 | 3154296832 |         41827 |         1899 |     383146 |           0 |             0 |            90.08 |                  0


It seems like that rebuild has an effect to reduce the number of internal and leaf_pages and make more dense leaf pages.







--

                                                                                                                                                       

Bitnine Global Inc., Kisung Kim, Ph.D
https://sites.google.com/site/kisungresearch/
E-mail : kskim@bitnine.net
Office phone : 070-4800-5890, 408-606-8602
US Mobile phone : 408-805-2192

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Add hint for function named "is"
Next
From: amul sul
Date:
Subject: Re: Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().