Re: Is it possible to have a "fast-write" Index? - Mailing list pgsql-hackers

From deavid
Subject Re: Is it possible to have a "fast-write" Index?
Date
Msg-id CAFR-75vVeU=fZrOf7kZEVyZcm67pq7oL6MmBks_Oy8S0hanH_w@mail.gmail.com
Whole thread Raw
In response to Re: Is it possible to have a "fast-write" Index?  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers


El vie., 19 jun. 2015 a las 15:06, Simon Riggs (<simon@2ndquadrant.com>) escribió:
It doesn't say anything about their being only one index buffer per table, nor do I think it would make sense to do it that way. So ISTM that the foreground process still has to insert serially into N index buffers, with each insert being WAL logged.

So the only saving for the foreground process is the random I/O from inserting into the indexes, which means the value of the technique is in the case where you have many very large secondary indexes - which is now covered by BRIN.

I'm still learning how postgresql, but, you're assuming when inserting in bulk into an insert would require the same amount of CPU cycles and the same amount of kB written compared to doing it row-by-row.

Most memory-based indexes have a bulk load technique that relies in having the data pre-sorted. Sorting pure random data and then bulk-inserting it into the index is faster than the classic insertion. (less CPU time, no idea about IO)

Database indexes are disk-based and there are some points (regarding IO performance) that are hard for me to fully understand. But seems logic that would be faster to scan the index only once from begin to end and do something like a "merge sort" between pre-sorted input and the index.

So I guess I missed something. Maybe is WAL logging the problem? If so, could this work for TEMP/UNLOGGED tables?

Lots of tables that are heavily written are materialized views (or they perform more or less the same), so they could be refreshed in case of server failure. I hope bulk inserts could double the performance; otherwise, this idea may not be worth it.

About BRIN indexes, i'm really impressed. They are several times faster than I could imagine. Also, on select they perform very well. I have to test them more, with more complex queries (they would work when used on JOIN clauses?). If select times are good enough even in those cases, then there's no need for doing bulk-inserts with btree.


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Insufficient locking for ALTER DEFAULT PRIVILEGES
Next
From: Robert Haas
Date:
Subject: Re: pg_stat_*_columns?