Thread: [HACKERS] How to get the 'ctid' from a record type?
This is about Postgres 9.6...
I have a very simple 1-arg function, in C, that I want to return the ctid of the record passed in. Example:
CREATE OR REPLACE FUNCTION foo(record) RETURNS tid LANGUAGE c IMMUTABLE STRICT AS 'my_extension';
Its implementation is simply:
Datum foo(PG_FUNCTION_ARGS) {
HeapTupleHeader td = PG_GETARG_HEAPTUPLEHEADER(0);
PG_RETURN_POINTER(&td->t_ctid);
}
What I'm seeing is that the ctid returned from this function isn't always correct:
# select ctid, foo(table) from table limit 10;
ctid | foo
-------+-----------
(0,1) | (19195,1) -- not correct!
(0,2) | (0,2)
(0,3) | (0,3)
(0,4) | (0,4)
(0,5) | (0,5)
(0,6) | (0,6)
(0,7) | (0,7)
(1,1) | (1,1)
(1,2) | (1,2)
(1,3) | (1,3)
(10 rows)
I've spent hours tracing through the PG sources trying to figure out why and/or find a different way to do this, but I've got nothing.
Of the various examples in the PG sources that use PG_GETARG_HEAPTUPLEHEADER, or otherwise deal with a HeapTupleHeader, I can't find any that want to access a system column, so I'm starting to think I'm just doing it wrong entirely.
Can anyone point me in the right direction?
Thanks for your time!
eric
Eric Ridge <eebbrr@gmail.com> writes: > What I'm seeing is that the ctid returned from this function isn't always > correct: > # select ctid, foo(table) from table limit 10; > ctid | foo > -------+----------- > (0,1) | (19195,1) -- not correct! > (0,2) | (0,2) > (0,3) | (0,3) I suspect the tuple at (0,1) has been the subject of a failed update. Your problem here is that you're mistaking the t_ctid field of a tuple header for the tuple's address. It is not that; it's really just garbage normally, and is only useful to link forward to the next version of the row from an outdated tuple. I think we do initialize it to the tuple's own address during an INSERT, but either a completed or failed UPDATE would change it. I do not think there is any way to get the true address of a heap tuple out of a composite Datum manufactured from the tuple. Most of the other system columns can't be gotten from a composite Datum either, because of the field overlay in HeapTupleHeaderData's union t_choice. regards, tom lane
I suspect the tuple at (0,1) has been the subject of a failed update.
Yep.
Your problem here is that you're mistaking the t_ctid field of a tuple
header for the tuple's address. It is not that; it's really just garbage
normally, and is only useful to link forward to the next version of the
row from an outdated tuple. I think we do initialize it to the tuple's
own address during an INSERT, but either a completed or failed UPDATE
would change it.
Thanks. That helps clarify the comments in htup_details.h, actually.
I do not think there is any way to get the true address of a heap tuple
out of a composite Datum manufactured from the tuple. Most of the other
system columns can't be gotten from a composite Datum either, because of
the field overlay in HeapTupleHeaderData's union t_choice.
Well shoot. That kinda spoils my plans.
What about this? Is the tuple currently being evaluated (I suppose in the case of a sequential scan) available in the context of a function call?
Thanks again for your time! It's much appreciated.
eric
On 3/10/17 10:31 PM, Eric Ridge wrote: > What about this? Is the tuple currently being evaluated (I suppose in > the case of a sequential scan) available in the context of a function call? AFAIK that *very* specific case would work, because the executor would be handing you the raw tuple. Not a great bet to make though. Also, there should be a macro somewhere that will tell you whether you have a full tuple or not. You'd want to make sure to check that an throw an error if you weren't handed a full tuple. -- Jim Nasby, Chief Data Architect, OpenSCG http://OpenSCG.com
On 2017-03-11 04:31:16 +0000, Eric Ridge wrote: > Well shoot. That kinda spoils my plans. I think you should elaborate on what you're trying to achieve - otherwise our advice will be affected by the recent, widely reported, crystal ball scarcity. - Andres
On Sat, Mar 11, 2017 at 2:14 PM Andres Freund <andres@anarazel.de> wrote:
On 2017-03-11 04:31:16 +0000, Eric Ridge wrote:
> Well shoot. That kinda spoils my plans.
I think you should elaborate on what you're trying to achieve -
otherwise our advice will be affected by the recent, widely reported,
crystal ball scarcity.
What I'm trying to do is port https://github.com/zombodb/zombodb to Postgres 9.6+. It's an Access Method that stores full rows, encoded as JSON, in Elasticsearch instead of in local storage. It was fairly straightforward to do the mechanical work to convert it to use 9.6's new AM API (which is very nice, btw!), but the fact that 9.6 also disallows including system columns (specifically ctid) has me turned upside down.
With <9.6, I was able to cook-up a scheme where it was able to answer queries from the remote Elasticsearch index even when Postgres decided to plan a sequential scan. That hinged, mostly, on being able to create a multi-column index where the first column was a function call that included as an argument (among other things) the ctid system column.
The ability to answer sequential scans (and filters) using the remote ES index is pretty important as the knowledge of how to do that exists in Elasticsearch, not my custom operator function in Postgres.
Anyways, I've been trying to find a way to intuit the ctid system column value with 9.6 and it's clear now that that just isn't possible. The closest I got was digging through ActivePortal->queryDesc->estate->es_tuple, but that only works when it's a real tuple, not one that's virtual or minimal.
I'm pretty sure that I need to be implementing a Custom Scan Provider instead, and I've been spending time with that API too. There's a pretty steep learning curve for me, but I'll eventually get over that hump.
I could probably bore you with greater detail but basically, I want to take:
CREATE INDEX idxfoo ON table USING zombodb (zdb(table), zdb_to_json(table)) WITH (url='http://remote.ip.addr:9200/');
SELECT * FROM table WHERE zdb(table) ==> 'some full text query' OR id = 42;
And have the "zdb(table) ==> 'some full text query'" bit be answered by my extension, regardless of how PG wants to plan the query. While I was able to hack something together for <9.6, I think that means a Custom Scan Provider now?
eric