Re: Is it possible to have a "fast-write" Index? - Mailing list pgsql-hackers

From deavid
Subject Re: Is it possible to have a "fast-write" Index?
Date
Msg-id CAFR-75tcCnAKZsJ=6KfEJUt+_E1_GYkFOnqn9w_ng9Oomt0JRw@mail.gmail.com
Whole thread Raw
In response to Re: Is it possible to have a "fast-write" Index?  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Is it possible to have a "fast-write" Index?  (deavid <deavidsedice@gmail.com>)
Re: Is it possible to have a "fast-write" Index?  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List pgsql-hackers
So I just ran a test case for hash, btree, gin_btree and brin indexes. Also without indexes, and without primary keys.
* Testing "deliverynotes" table. 
  - Definition and use case:
It is a table contaning real delivery note headers of several years
It consists of 300k rows, 128 columns, 63 indexes, 243Mb of data 
excluding indexes. Since is a table visible for users, almost every 
column can be searched so we need lots of indexes. We do not need 
searches to be the fastest possible, we only need to accelerate a 
bit our user searches; without harming too much writes.
  - Things to test:
- measure index creation times.
- measure index space.
- with indexes but without primary key
- with everything
- Create fully, delete everything and Insert again data in blocks
- Test updates for recent data

I attached the logs for every test, if anyone wants to see what i'm exactly testing.
This was tested on my i5 laptop with 4Gb of RAM and a 120Gb SSD (OCZ Agility 3). I'm trying to measure CPU time, not I/O time, so some configurations and tests are specific to avoid as much as IO as I can. 
I'm using a dev build for Postgresql 9.5 downloaded from git sources.

Conclusions:
- Gin_btree seems slower in almost every case. It's writes are marginally better than regular btrees even when using work_mem=160MB. (May be 20% faster than btree). They are smaller than I thought.
- BRIN indexes seem very fast for writes. For selects maybe is a blend between having indexes and don't having them. They don't recognize that some values are simply out of range of indexed values, and that's a pity. If the values we want are packed together I guess I would get even better results.
- Primary keys and uniqueness checks doesn't seem to make any difference here.
- Having no indexes at all is faster than I imagined. (Sometimes it beats BRIN or Btree) Maybe because the IO here is faster than usual.
- Hash indexes: i tried to do something, but they take too much time to build and i don't know why. If creates are slow, updates should be slow too. I'm not going to test them again.

And finally, don't know why but i couldn't vacuum or analyze tables. It always get stalled without doing anything; so i had to comment every vacuum. Maybe there is a bug in this dev version or i misconfigured something.

El vie., 12 jun. 2015 a las 7:27, Simon Riggs (<simon@2ndquadrant.com>) escribió:
On 5 June 2015 at 18:07, deavid <deavidsedice@gmail.com> wrote:
There are several use cases where I see useful an index, but adding it will slow too much inserts and updates.
For example, when we have 10 million rows on a table, and it's a table which has frequent updates, we need several index to speed up selects, but then we'll slow down updates a lot, specially when we have 10 or more indexes.
Other cases involve indexes for text search, which are used only for user search and aren't that important, so we want to have them, but we don't want the overload they put whenever we write on the table. 
I know different approaches that already solve some of those problems in some ways (table partitioning, partial indexes, etc), but i don't feel they are the solution to every problem of this kind.

Some people already asked for "delayed write" indexes, but the idea gets discarded because the index could get out of sync, so it can omit results and this is unacceptable. But i think maybe that could be fixed in several ways and we can have a fast and reliable index (but maybe not so fast on selects). 

This is exactly the use case and mechanism for BRIN indexes.
 
--
Simon Riggs                http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Time to fully remove heap_formtuple() and friends?
Next
From: Bruce Momjian
Date:
Subject: Re: 9.5 release notes