Thread: CTIDs invalidations and dropping columns.

CTIDs invalidations and dropping columns.

From
Tzahi Fadida
Date:
Hi,
First, i use CTIDs to immensely speed up my function which is inherently slow
because of the problem itself.

I have a question about CTID invalidation when you open a read only cursor
using SPI. Why does it at all happens? Why is it so important to invalidate a
ctid of a read only query (for example when using indices, casting,etc...)?

Specifically, i encountered something unexpected. i created a table:
(a2 int4, a0 int4) then i did alter table add column a5 int4, then update set
a5=a0, update set a0=a0+1, alter table drop column a0.

Now that i run a simple select * from SPI cursor query on this table and
look at the  t_data->t_ctid i see that the ctids are invalidated for some
unknown reason?
previously before the alter table it was ok.

I am using 8.1.4, can you tell me if it is a bug/feature/don't care about
ctids in spi... reason.

Is it such a difficult thing to return ctids if the query is read only. where
is it invalidated anyway?
I highly prefer not to use CTID as an attribute since it is going to greatly
lower the performance since it is sitting on a bottleneck.

--
Regards,
        Tzahi.
--
Tzahi Fadida
Blog: http://tzahi.blogsite.org | Home Site: http://tzahi.webhop.info
WARNING TO SPAMMERS:  see at
http://members.lycos.co.uk/my2nis/spamwarning.html


Re: CTIDs invalidations and dropping columns.

From
Martijn van Oosterhout
Date:
On Mon, Jul 10, 2006 at 11:47:23PM +0300, Tzahi Fadida wrote:
> Hi,
> First, i use CTIDs to immensely speed up my function which is inherently slow
> because of the problem itself.
>
> I have a question about CTID invalidation when you open a read only cursor
> using SPI. Why does it at all happens? Why is it so important to invalidate a
> ctid of a read only query (for example when using indices, casting,etc...)?

You're talking about "invalidation" as if it's something someone
deliberately does. That's incorrect. The t_ctid field is filled in if
and only if the tuple is exactly the on disk tuple. Otherwise it's a
new tuple, which by definition does not have a ctid (it doesn't exist
on disk).

> Specifically, i encountered something unexpected. i created a table:
> (a2 int4, a0 int4) then i did alter table add column a5 int4, then update set
> a5=a0, update set a0=a0+1, alter table drop column a0.
>
> Now that i run a simple select * from SPI cursor query on this table and
> look at the  t_data->t_ctid i see that the ctids are invalidated for some
> unknown reason?
> previously before the alter table it was ok.

This doesn't make any sense. What is invalidated? Is it blank or what?
I think you're going to have to provide some example code.

What do you mean by "invalidation" anyway?

> I highly prefer not to use CTID as an attribute since it is going to greatly
> lower the performance since it is sitting on a bottleneck.

You've measured this performance difference?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: CTIDs invalidations and dropping columns.

From
Tzahi Fadida
Date:
On Tuesday 11 July 2006 00:35, Martijn van Oosterhout wrote:
> On Mon, Jul 10, 2006 at 11:47:23PM +0300, Tzahi Fadida wrote:
> > Hi,
> > First, i use CTIDs to immensely speed up my function which is inherently
> > slow because of the problem itself.
> >
> > I have a question about CTID invalidation when you open a read only
> > cursor using SPI. Why does it at all happens? Why is it so important to
> > invalidate a ctid of a read only query (for example when using indices,
> > casting,etc...)?
>
> You're talking about "invalidation" as if it's something someone
> deliberately does. That's incorrect. The t_ctid field is filled in if
> and only if the tuple is exactly the on disk tuple. Otherwise it's a
> new tuple, which by definition does not have a ctid (it doesn't exist
> on disk).

As i understand rowids, i.e ctids, are supposed to allow for fast access to
the tables. I don't see the rational, for example, when casting some
attributes, to blank the ctid. So it is not exactly the same, but it still
came from the same tuple. What will happen if for read only SPI queries
it will not be blank?

>
> > Specifically, i encountered something unexpected. i created a table:
> > (a2 int4, a0 int4) then i did alter table add column a5 int4, then update
> > set a5=a0, update set a0=a0+1, alter table drop column a0.
> >
> > Now that i run a simple select * from SPI cursor query on this table and
> > look at the  t_data->t_ctid i see that the ctids are invalidated for some
> > unknown reason?
> > previously before the alter table it was ok.
>
> This doesn't make any sense. What is invalidated? Is it blank or what?
> I think you're going to have to provide some example code.

blank.

i am attaching a code. it is not supposed to run to completion but to print to
screen using elog.
i used this sql to declare the function but you'll need to alter it where it
says fdfuncs:
CREATE OR REPLACE FUNCTION ctest(text) RETURNS SETOF RECORD
AS 'fdfuncs','ctest'
LANGUAGE C STABLE STRICT;

anyway, run these commands:
create table ctest (a2 int4, a0 int4);
insert into ctest values (10,11);
insert into ctest values (12,13);
then run the function.
it should print 1 in posid and 0 in hi and lo.

Now run the following commands:
alter table ctest add column a5 int4;
alter table ctest DROP COLUMN a0;

and run the function.
it should print 0 on all three fields which means the ctid is blank.

>
> What do you mean by "invalidation" anyway?
>
> > I highly prefer not to use CTID as an attribute since it is going to
> > greatly lower the performance since it is sitting on a bottleneck.
>
> You've measured this performance difference?

Yes, i played with this in the past. since i can pass over a relation
potentially hundreds or thousands of times, this can be a bottleneck.
Full disjunctions is a difficult problem that in order to speed up uses the
tuple set concept where a set of tuples are represented as a set of CTIDs.

>
> Have a nice day,

--
Regards,
��������Tzahi.
--
Tzahi Fadida
Blog: http://tzahi.blogsite.org | Home Site: http://tzahi.webhop.info
WARNING TO SPAMMERS: �see at
http://members.lycos.co.uk/my2nis/spamwarning.html

Re: CTIDs invalidations and dropping columns.

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> You're talking about "invalidation" as if it's something someone
> deliberately does. That's incorrect. The t_ctid field is filled in if
> and only if the tuple is exactly the on disk tuple. Otherwise it's a
> new tuple, which by definition does not have a ctid (it doesn't exist
> on disk).

The only way that t_ctid would be "valid" in the result of a SELECT * FROM
is if the "physical tlist" optimization triggers to make the scan skip
the usual ExecProject call and just return a direct pointer to the
on-disk tuple.  That optimization never existed before 8.0 or 8.1
(I forget exactly, but it's pretty recent).  So the OP is depending
on an undocumented, recently added behavior that only applies in one
special case.  To be blunt, the OP's code is broken.  If you want to
know the on-disk tuple's CTID, select it explicitly:SELECT ctid, whatever-else-you-need FROM foo ...
        regards, tom lane


Re: CTIDs invalidations and dropping columns.

From
Martijn van Oosterhout
Date:
On Tue, Jul 11, 2006 at 01:50:40AM +0300, Tzahi Fadida wrote:
> As i understand rowids, i.e ctids, are supposed to allow for fast access to
> the tables. I don't see the rational, for example, when casting some
> attributes, to blank the ctid. So it is not exactly the same, but it still
> came from the same tuple. What will happen if for read only SPI queries
> it will not be blank?

Did you read the email Tom sent?

I worked out the exact issue with your example btw. It's because of the
DROP COLUMN. After dropping the column the tuples on disk have 3
columns and you only asked for 2, so an extra step has to be taken.
This extra step copies the two values, creating a new tuple, which has
no CTID.

If you're tying yourself this tightly to the backend, maybe you should
just use index_beginscan/heap_beginscan/etc which return actual tuples.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: CTIDs invalidations and dropping columns.

From
Tzahi Fadida
Date:
On Tuesday 11 July 2006 17:27, Martijn van Oosterhout wrote:
> On Tue, Jul 11, 2006 at 01:50:40AM +0300, Tzahi Fadida wrote:
> > As i understand rowids, i.e ctids, are supposed to allow for fast access
> > to the tables. I don't see the rational, for example, when casting some
> > attributes, to blank the ctid. So it is not exactly the same, but it
> > still came from the same tuple. What will happen if for read only SPI
> > queries it will not be blank?
>
> Did you read the email Tom sent?

yes, if it is potentially broken, i think i should better use the attribute
CTID even if there would be a performance drop.

>
> I worked out the exact issue with your example btw. It's because of the
> DROP COLUMN. After dropping the column the tuples on disk have 3
> columns and you only asked for 2, so an extra step has to be taken.
> This extra step copies the two values, creating a new tuple, which has
> no CTID.

10x. i c what you mean.

>
> If you're tying yourself this tightly to the backend, maybe you should
> just use index_beginscan/heap_beginscan/etc which return actual tuples.

I am considering it, however, it will also be accompanied with areas that
SPI/engine previously handled that i will have to manage myself.
I'll have to experiment a bit and see what is more important the Reuse of spi/
or the performance gains of heap_beginscan...

>
> Have a nice day,

--
Regards,
��������Tzahi.
--
Tzahi Fadida
Blog: http://tzahi.blogsite.org | Home Site: http://tzahi.webhop.info
WARNING TO SPAMMERS: �see at
http://members.lycos.co.uk/my2nis/spamwarning.html