Re: GIN improvements part 1: additional information - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: GIN improvements part 1: additional information
Date
Msg-id CAPpHfdsj-hg9knfD+FXPS-+330qjXZKiuR-_a15EFCKiNVi7pQ@mail.gmail.com
Whole thread Raw
In response to Re: GIN improvements part 1: additional information  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: GIN improvements part 1: additional information  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
On Thu, Nov 14, 2013 at 3:00 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Thu, Nov 14, 2013 at 2:17 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Tue, Nov 5, 2013 at 9:49 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 04.11.2013 23:44, Alexander Korotkov wrote:
On Mon, Oct 21, 2013 at 11:12 PM, Alexander Korotkov
<aekorotkov@gmail.com>wrote:

Attached version of patch is debugged. I would like to note that number of
bugs was low and it wasn't very hard to debug. I've rerun tests on it. You
can see that numbers are improved as the result of your refactoring.

          event         |     period
-----------------------+-----------------
  index_build           | 00:01:45.416822
  index_build_recovery  | 00:00:06
  index_update          | 00:05:17.263606
  index_update_recovery | 00:01:22
  search_new            | 00:24:07.706482
  search_updated        | 00:26:25.364494
(6 rows)

      label      | blocks_mark
----------------+-------------
  search_new     |   847587636
  search_updated |   881210486
(2 rows)

      label     |   size
---------------+-----------
  new           | 419299328
  after_updates | 715915264
(2 rows)

Beside simple bugs, there was a subtle bug in incremental item indexes
update. I've added some more comments including ASCII picture about how
item indexes are shifted.

Now, I'm trying to implement support of old page format. Then we can
decide which approach to use.


Attached version of patch has support of old page format. Patch still needs
more documentation and probably refactoring, but I believe idea is clear
and it can be reviewed. In the patch I have to revert change of null
category placement for compatibility with old posting list format.

Thanks, just glanced at this quickly.

If I'm reading the patch correctly, old-style non-compressed posting tree leaf pages are not vacuumed at all; that's clearly not right.

Fixed. Now separate function handles uncompressed posting lists and compress them if as least one TID is deleted.
  
I argued earlier that we don't need to handle the case that compressing a page makes it larger, so that the existing items don't fit on the page anymore. I'm having some second thoughts on that; I didn't take into account the fact that the "mini-index" in the new page format takes up some space. I think it's still highly unlikely that there isn't enough space on a 8k page, but it's not totally crazy with a non-standard small page size. So at least that situation needs to be checked for with an ereport(), rather than Assert.
 
Now this situation is ereported before any change in page.

To handle splitting a non-compressed page, it seems that you're relying on the fact that before splitting, we try to insert, which compresses the page. The problem with that is that it's not correctly WAL-logged. The split record that follows includes a full copy of both page halves, but if the split fails for some reason, e.g you run out of disk space, there is no WAL record at all of the the compression. I'd suggest doing the compression in the insert phase on a temporary copy of the page, leaving the original page untouched if there isn't enough space for the insertion to complete. (You could argue that this case can't happen because the compression must always create enough space to insert one more item. maybe so, but at least there should be an explicit check for that.)

Good point. Yes, if we don't handle specially inserting item indexes, I see no point to do special handling for single TID which is much smaller. In the attached patch dataCompressLeafPage just reserves space for single TID. 

Also, many changes in comments and README.

Unfortunally, I didn't understand what is FIXME about in ginVacuumEntryPage. So, it's not fixed :)

Sorry, I posted buggy version of patch. Attached version is correct.

Another bug fix by report from Rod Taylor.

------
With best regards,
Alexander Korotkov.

Attachment

pgsql-hackers by date:

Previous
From: Sameer Thakur
Date:
Subject: Re: pg_stat_statements: calls under-estimation propagation
Next
From: Alexander Korotkov
Date:
Subject: Re: GIN improvements part2: fast scan