Re: Is it possible to have a "fast-write" Index? - Mailing list pgsql-hackers

From deavid
Subject Re: Is it possible to have a "fast-write" Index?
Date
Msg-id CAFR-75tQZL4xUL1fSFMYU-dygRuxT0bKq0fXsbKo0U-RxR-w7Q@mail.gmail.com
Whole thread Raw
In response to Re: Is it possible to have a "fast-write" Index?  ("ktm@rice.edu" <ktm@rice.edu>)
Responses Re: Is it possible to have a "fast-write" Index?  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-hackers
Hi again. I tried to do some test on my office computer, but after spending 2-3 hours I gave up. I'm going to need a real SSD disk to try these things. 100k rows of my "delivery notes" table use 100MB of disk; and 2Gb of RAM may be not enough to emulate a fast IO. (I was disabling fsync, activating write caches, etc)

I downloaded Postgresql 9.5-dev from git sources, compiled everything and restored there two client databases (10Gb each one). My goal was to test gin_btree and brin indexes as well, but i gave up before doing a complete test of gin.

Now I have a better plan, I'm going to use my laptop (intel i5, 4Gb of ram) and i will put here a spare SSD i wasn't using (OCZ Agility 2 120Gb). Hope this time I could get some figures closer to production.

By now, my results were a bit disappointing: (comparing gin_btree against regular btree for a column with very low cardinality) 
- create index and updates: about 10-20% faster (i had a primary key, so btree unique checks may be here blurring the results)
- selects: about 2-5 times slower
- index size: about 2 times smaller 
 
What i've found is, I was wrong on fillfactor. (Maybe something has changed here since postgresql 8.1). I believed a fillfactor lower than 80 will do more harm than good. At least that was the case 5 years ago. Now I could get a noticeable speedup with fillfactor=50 in the case of updating the whole table.

Hope i could setup this laptop soon and get those tests done.

El sáb., 6 jun. 2015 a las 13:07, ktm@rice.edu (<ktm@rice.edu>) escribió:
On Fri, Jun 05, 2015 at 11:54:01PM +0000, deavid wrote:
> Thanks to everybody for answering. I wasn't expecting this attention; this
> is a great community :-)
>
> Jim asked me about something real. Well, the problem is this showed up more
> than five years ago, and keeps popping from time to time since in different
> circumstances. I solved them in different ways each time, depending the
> exact use-case. I wanted to generalize, because seems a good feature for
> several situations; and I don't expect a solution for me as each time I hit
> with this I found some way to sort it out.
> As Jim said, we need here are figures for real examples, and i don't have
> yet. I'll do my "homework" and email back with exact problems with exact
> timing. Give me a week or two.
>
> Also, some of you are talking about IO. Well, it's hard to say without the
> figures here, but I'm pretty sure I'm hitting CPU time only. We use SSD on
> those big databases, and also in my tests i tried setting fsync=off.
>
> So the problem is: i see a low iowait, and CPU time for one core is at
> 80-90% most of the time. I can buy more ram, better disks, or cpu's with
> more cores. But one cpu core would have more-or-less the same speed no
> matter how much money you invest.
>
> When someone wants a delayed-write index is similar to setting
>  "synchronous_commit = off". We want to give an OK to the backend as soon
> as is possible and do this work in background. But we also want some
> reliability against crashes.
>
> Also, if the task is done in background it may be done from other backend,
> so probably several indexes could be packed at once using different backend
> processes. We could use the entire cpu if our index writes aren't tied to
> the session who wrote the row.
>
> PD: I'm very interested on existent approaches like GIN or BRIN (this one
> is new to me). Thanks a lot; i'll try them in my tests.

Hi David,

Here is an interesting read comparing LSM and Fractal Tree indexing:

http://highscalability.com/blog/2014/8/6/tokutek-white-paper-a-comparison-of-log-structured-merge-lsm.html

Regards,
Ken

pgsql-hackers by date:

Previous
From: Gurjeet Singh
Date:
Subject: Re: replication slot restart_lsn initialization
Next
From: Claudio Freire
Date:
Subject: Re: Is it possible to have a "fast-write" Index?