Thread: CTIDs invalidations and dropping columns.
Hi, First, i use CTIDs to immensely speed up my function which is inherently slow because of the problem itself. I have a question about CTID invalidation when you open a read only cursor using SPI. Why does it at all happens? Why is it so important to invalidate a ctid of a read only query (for example when using indices, casting,etc...)? Specifically, i encountered something unexpected. i created a table: (a2 int4, a0 int4) then i did alter table add column a5 int4, then update set a5=a0, update set a0=a0+1, alter table drop column a0. Now that i run a simple select * from SPI cursor query on this table and look at the t_data->t_ctid i see that the ctids are invalidated for some unknown reason? previously before the alter table it was ok. I am using 8.1.4, can you tell me if it is a bug/feature/don't care about ctids in spi... reason. Is it such a difficult thing to return ctids if the query is read only. where is it invalidated anyway? I highly prefer not to use CTID as an attribute since it is going to greatly lower the performance since it is sitting on a bottleneck. -- Regards, Tzahi. -- Tzahi Fadida Blog: http://tzahi.blogsite.org | Home Site: http://tzahi.webhop.info WARNING TO SPAMMERS: see at http://members.lycos.co.uk/my2nis/spamwarning.html
On Mon, Jul 10, 2006 at 11:47:23PM +0300, Tzahi Fadida wrote: > Hi, > First, i use CTIDs to immensely speed up my function which is inherently slow > because of the problem itself. > > I have a question about CTID invalidation when you open a read only cursor > using SPI. Why does it at all happens? Why is it so important to invalidate a > ctid of a read only query (for example when using indices, casting,etc...)? You're talking about "invalidation" as if it's something someone deliberately does. That's incorrect. The t_ctid field is filled in if and only if the tuple is exactly the on disk tuple. Otherwise it's a new tuple, which by definition does not have a ctid (it doesn't exist on disk). > Specifically, i encountered something unexpected. i created a table: > (a2 int4, a0 int4) then i did alter table add column a5 int4, then update set > a5=a0, update set a0=a0+1, alter table drop column a0. > > Now that i run a simple select * from SPI cursor query on this table and > look at the t_data->t_ctid i see that the ctids are invalidated for some > unknown reason? > previously before the alter table it was ok. This doesn't make any sense. What is invalidated? Is it blank or what? I think you're going to have to provide some example code. What do you mean by "invalidation" anyway? > I highly prefer not to use CTID as an attribute since it is going to greatly > lower the performance since it is sitting on a bottleneck. You've measured this performance difference? Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
On Tuesday 11 July 2006 00:35, Martijn van Oosterhout wrote: > On Mon, Jul 10, 2006 at 11:47:23PM +0300, Tzahi Fadida wrote: > > Hi, > > First, i use CTIDs to immensely speed up my function which is inherently > > slow because of the problem itself. > > > > I have a question about CTID invalidation when you open a read only > > cursor using SPI. Why does it at all happens? Why is it so important to > > invalidate a ctid of a read only query (for example when using indices, > > casting,etc...)? > > You're talking about "invalidation" as if it's something someone > deliberately does. That's incorrect. The t_ctid field is filled in if > and only if the tuple is exactly the on disk tuple. Otherwise it's a > new tuple, which by definition does not have a ctid (it doesn't exist > on disk). As i understand rowids, i.e ctids, are supposed to allow for fast access to the tables. I don't see the rational, for example, when casting some attributes, to blank the ctid. So it is not exactly the same, but it still came from the same tuple. What will happen if for read only SPI queries it will not be blank? > > > Specifically, i encountered something unexpected. i created a table: > > (a2 int4, a0 int4) then i did alter table add column a5 int4, then update > > set a5=a0, update set a0=a0+1, alter table drop column a0. > > > > Now that i run a simple select * from SPI cursor query on this table and > > look at the t_data->t_ctid i see that the ctids are invalidated for some > > unknown reason? > > previously before the alter table it was ok. > > This doesn't make any sense. What is invalidated? Is it blank or what? > I think you're going to have to provide some example code. blank. i am attaching a code. it is not supposed to run to completion but to print to screen using elog. i used this sql to declare the function but you'll need to alter it where it says fdfuncs: CREATE OR REPLACE FUNCTION ctest(text) RETURNS SETOF RECORD AS 'fdfuncs','ctest' LANGUAGE C STABLE STRICT; anyway, run these commands: create table ctest (a2 int4, a0 int4); insert into ctest values (10,11); insert into ctest values (12,13); then run the function. it should print 1 in posid and 0 in hi and lo. Now run the following commands: alter table ctest add column a5 int4; alter table ctest DROP COLUMN a0; and run the function. it should print 0 on all three fields which means the ctid is blank. > > What do you mean by "invalidation" anyway? > > > I highly prefer not to use CTID as an attribute since it is going to > > greatly lower the performance since it is sitting on a bottleneck. > > You've measured this performance difference? Yes, i played with this in the past. since i can pass over a relation potentially hundreds or thousands of times, this can be a bottleneck. Full disjunctions is a difficult problem that in order to speed up uses the tuple set concept where a set of tuples are represented as a set of CTIDs. > > Have a nice day, -- Regards, ��������Tzahi. -- Tzahi Fadida Blog: http://tzahi.blogsite.org | Home Site: http://tzahi.webhop.info WARNING TO SPAMMERS: �see at http://members.lycos.co.uk/my2nis/spamwarning.html
Martijn van Oosterhout <kleptog@svana.org> writes: > You're talking about "invalidation" as if it's something someone > deliberately does. That's incorrect. The t_ctid field is filled in if > and only if the tuple is exactly the on disk tuple. Otherwise it's a > new tuple, which by definition does not have a ctid (it doesn't exist > on disk). The only way that t_ctid would be "valid" in the result of a SELECT * FROM is if the "physical tlist" optimization triggers to make the scan skip the usual ExecProject call and just return a direct pointer to the on-disk tuple. That optimization never existed before 8.0 or 8.1 (I forget exactly, but it's pretty recent). So the OP is depending on an undocumented, recently added behavior that only applies in one special case. To be blunt, the OP's code is broken. If you want to know the on-disk tuple's CTID, select it explicitly:SELECT ctid, whatever-else-you-need FROM foo ... regards, tom lane
On Tue, Jul 11, 2006 at 01:50:40AM +0300, Tzahi Fadida wrote: > As i understand rowids, i.e ctids, are supposed to allow for fast access to > the tables. I don't see the rational, for example, when casting some > attributes, to blank the ctid. So it is not exactly the same, but it still > came from the same tuple. What will happen if for read only SPI queries > it will not be blank? Did you read the email Tom sent? I worked out the exact issue with your example btw. It's because of the DROP COLUMN. After dropping the column the tuples on disk have 3 columns and you only asked for 2, so an extra step has to be taken. This extra step copies the two values, creating a new tuple, which has no CTID. If you're tying yourself this tightly to the backend, maybe you should just use index_beginscan/heap_beginscan/etc which return actual tuples. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
On Tuesday 11 July 2006 17:27, Martijn van Oosterhout wrote: > On Tue, Jul 11, 2006 at 01:50:40AM +0300, Tzahi Fadida wrote: > > As i understand rowids, i.e ctids, are supposed to allow for fast access > > to the tables. I don't see the rational, for example, when casting some > > attributes, to blank the ctid. So it is not exactly the same, but it > > still came from the same tuple. What will happen if for read only SPI > > queries it will not be blank? > > Did you read the email Tom sent? yes, if it is potentially broken, i think i should better use the attribute CTID even if there would be a performance drop. > > I worked out the exact issue with your example btw. It's because of the > DROP COLUMN. After dropping the column the tuples on disk have 3 > columns and you only asked for 2, so an extra step has to be taken. > This extra step copies the two values, creating a new tuple, which has > no CTID. 10x. i c what you mean. > > If you're tying yourself this tightly to the backend, maybe you should > just use index_beginscan/heap_beginscan/etc which return actual tuples. I am considering it, however, it will also be accompanied with areas that SPI/engine previously handled that i will have to manage myself. I'll have to experiment a bit and see what is more important the Reuse of spi/ or the performance gains of heap_beginscan... > > Have a nice day, -- Regards, ��������Tzahi. -- Tzahi Fadida Blog: http://tzahi.blogsite.org | Home Site: http://tzahi.webhop.info WARNING TO SPAMMERS: �see at http://members.lycos.co.uk/my2nis/spamwarning.html