Re: [HACKERS] How to get the 'ctid' from a record type? - Mailing list pgsql-hackers

From Eric Ridge
Subject Re: [HACKERS] How to get the 'ctid' from a record type?
Date
Msg-id CANcm6wY4OFhD9s9j=Egt3Vrcm49_qMPVQj3nKDWp1gkE4YffNQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] How to get the 'ctid' from a record type?  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Sat, Mar 11, 2017 at 2:14 PM Andres Freund <andres@anarazel.de> wrote:
On 2017-03-11 04:31:16 +0000, Eric Ridge wrote:
> Well shoot.  That kinda spoils my plans.

I think you should elaborate on what you're trying to achieve -
otherwise our advice will be affected by the recent, widely reported,
crystal ball scarcity.

What I'm trying to do is port https://github.com/zombodb/zombodb to Postgres 9.6+.  It's an Access Method that stores full rows, encoded as JSON, in Elasticsearch instead of in local storage.  It was fairly straightforward to do the mechanical work to convert it to use 9.6's new AM API (which is very nice, btw!), but the fact that 9.6 also disallows including system columns (specifically ctid) has me turned upside down.

With <9.6, I was able to cook-up a scheme where it was able to answer queries from the remote Elasticsearch index even when Postgres decided to plan a sequential scan.  That hinged, mostly, on being able to create a multi-column index where the first column was a function call that included as an argument (among other things) the ctid system column.

The ability to answer sequential scans (and filters) using the remote ES index is pretty important as the knowledge of how to do that exists in Elasticsearch, not my custom operator function in Postgres.

Anyways, I've been trying to find a way to intuit the ctid system column value with 9.6 and it's clear now that that just isn't possible.  The closest I got was digging through ActivePortal->queryDesc->estate->es_tuple, but that only works when it's a real tuple, not one that's virtual or minimal.

I'm pretty sure that I need to be implementing a Custom Scan Provider instead, and I've been spending time with that API too.  There's a pretty steep learning curve for me, but I'll eventually get over that hump.

I could probably bore you with greater detail but basically, I want to take:
   CREATE INDEX idxfoo ON table USING zombodb (zdb(table), zdb_to_json(table)) WITH (url='http://remote.ip.addr:9200/'); 
   SELECT * FROM table WHERE zdb(table) ==> 'some full text query' OR id = 42;

And have the "zdb(table) ==> 'some full text query'" bit be answered by my extension, regardless of how PG wants to plan the query.  While I was able to hack something together for <9.6, I think that means a Custom Scan Provider now?

eric

pgsql-hackers by date:

Previous
From: Joe Conway
Date:
Subject: Re: [HACKERS] [bug fix] dblink leaks unnamed connections
Next
From: Julien Rouhaud
Date:
Subject: [HACKERS] Preserving param location