Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2) - Mailing list pgsql-hackers

From Mark Kirkwood
Subject Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)
Date
Msg-id ae92d93a-e00d-4fe7-ece3-f452bab73979@catalyst.net.nz
Whole thread Raw
In response to Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On 13/08/16 05:44, Jeff Janes wrote:
> On Fri, Aug 12, 2016 at 1:40 AM, Mark Kirkwood
>> However your index rebuild gets you from 5 to 3 GB - does that really help
>> performance significantly?
> It can make a big difference, depending on how much RAM you have.
>

Yeah - I suspect this is the issue - loading up a similar type of schema 
with records with a primary key of form 'userxxxxx' for (uniformly) 
randomly distributed xxxxx... (I was gonna use the Yahoo benchmark but 
it is soooo slow...). Also I'm using 10000000 rows instead of 100000000 
to avoid waiting a long time (10000000 should be enough to show the point):

prefix=# \d prefix           Table "public.prefix" Column |         Type          | Modifiers
--------+-----------------------+----------- uid    | character varying(30) | not null filler | character(255)
|
Indexes:    "prefix_pkey" PRIMARY KEY, btree (uid)

Doing an uncached indexed read by forcing a buffer cache clear:

# echo 3 > /proc/sys/vm/drop_caches

prefix=# SELECT 
relfilenode,relname,reltuples,pg_relation_size(oid)/1024/1024 AS mb
FROM pg_class WHERE relname LIKE 'prefix%'; relfilenode |   relname   | reltuples | mb
-------------+-------------+-----------+-----     6017817 | prefix      |     1e+07 | 422     6017819 | prefix_pkey |
 1e+07 | 391
 
(2 rows)

prefix=# EXPLAIN ANALYZE SELECT count(*)         FROM prefix WHERE uid='user10000';
                    QUERY PLAN
 

--------------------------------------------------------------------------------
----------------------------------------------- Aggregate  (cost=8.46..8.46 rows=1 width=0) (actual time=3.408..3.408 
rows=1 lo
ops=1)   ->  Index Only Scan using prefix_pkey on prefix (cost=0.43..8.45 
rows=1 widt
h=0) (actual time=3.406..3.406 rows=0 loops=1)         Index Cond: (uid = 'user10000'::text)         Heap Fetches: 0
Planningtime: 19.362 ms Execution time: 3.429 ms
 
(6 rows)

Repeating this after REINDEX:

# echo 3 > /proc/sys/vm/drop_caches

prefix=# SELECT 
relfilenode,relname,reltuples,pg_relation_size(oid)/1024/1024 AS mb
FROM pg_class WHERE relname LIKE 'prefix%'; relfilenode |   relname   | reltuples | mb
-------------+-------------+-----------+-----     6017817 | prefix      |     1e+07 | 422     6017819 | prefix_pkey |
 1e+07 | 300
 
(2 rows)

prefix=# EXPLAIN ANALYZE SELECT count(*)         FROM prefix WHERE uid='user10000';
                    QUERY PLAN
 

--------------------------------------------------------------------------------
----------------------------------------------- Aggregate  (cost=8.46..8.46 rows=1 width=0) (actual time=3.868..3.868 
rows=1 lo
ops=1)   ->  Index Only Scan using prefix_pkey on prefix (cost=0.43..8.45 
rows=1 widt
h=0) (actual time=3.866..3.866 rows=0 loops=1)         Index Cond: (uid = 'user10000'::text)         Heap Fetches: 0
Planningtime: 19.366 ms Execution time: 3.889 ms
 
(6 rows)

So certainly not significantly *slower* with the physically bigger 
index. This suggests that Jeff's analysis was spot on - likely that the 
larger index didn't fix in RAM.

Cheers

Mark



pgsql-hackers by date:

Previous
From: Venkata B Nagothi
Date:
Subject: patch proposal
Next
From: Artur Zakirov
Date:
Subject: Re: to_date_valid()