Re: WIP: Access method extendability - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: WIP: Access method extendability
Date
Msg-id 54730DFD.2060703@vmware.com
Whole thread Raw
In response to Re: WIP: Access method extendability  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: WIP: Access method extendability
List pgsql-hackers
On 11/10/2014 10:30 PM, Alexander Korotkov wrote:
> Don't allowing CREATE ACCESS METHOD command seems problematic for me. How
> could it work with pg_upgrade? pg_dump wouldn't dump extra pg_am records.
> So, pg_upgrade would break at creating operator classes on new cluster. So,
> I agree with dropping create am command only if we let pg_dump to dump
> extra pg_am records...

pg_dump would dump the CREATE EXTENSION command, and the extension's 
installation script inserts the row to pg_am. pg_dump doesn't dump 
objects that are part of an extension, so that's how this would work 
with the CREATE ACCESS METHOD command, too.

Backtracking a bit, to summarize the discussion so far:

* It would be nice to have indexes that are not WAL-logged, but are 
automatically re-created after a crash. But it's a completely different 
and orthogonal feature, so there's no need to discuss that further in 
this thread.

* If an extension is buggy, it can crash and corrupt the whole database. 
There isn't really anything we can do about that, and this patch doesn't 
make that any better or worse.

* CREATE ACCESS METHOD command isn't worth it.

Looking at the patch itself:

* It has bitrotted, thanks to my WAL format changes.

* The memcpy/memmove based API seems difficult to use correctly. Looking 
at the bloom filter example, it seems that you need a lot more code to 
implement WAL-logging with this, than you would with a rmgr-specific 
redo function. I was hoping this to make it simpler.

I think the API has to be more high-level. Or at least also provide a 
higher-level API, in addition to the low level one. Looking at the 
indexams we have, almost all pages use the standard page layout, with 
PageAddItem etc., plus a metapage, plus some indexam-specific metadata 
in the special area. The proposed API makes it pretty complicated to 
deal with that common case. After calling PageAddItem, you need intimate 
knowledge of what PageAddItem did, to create a correct WAL record for 
it. For PageIndexMultiDelete, it's next to impossible.

I think we'll have to accept that the WAL records with this generic API 
are going to be much bulkier than ones tailored for the AM. We could 
provide a compact representation for common operations like PageAddItem, 
but for any more complicated operations, just WAL-log a full page image, 
as it's too fiddly to track the changes at a finer level. For any 
serious performance critical stuff, you'll just have to write an 
old-fashioned rmgr.

(marking this as "returned with feedback" in the commitfest)

- Heikki




pgsql-hackers by date:

Previous
From: Jirayut Nimsaeng
Date:
Subject: BDR duplicate key value violates unique constraint error
Next
From: Heikki Linnakangas
Date:
Subject: Re: Sequence Access Method WIP