Thread: tid_le comparison for tuple id (ctid) values?
[PostgreSQL 8.1.0 on i686-pc-linux-gnu] I would like to suggest that there be a less-than (or greater-than) operator for the 'tid' type. I used to use oid's for finding and distinguishing duplicate data. Now that oid's are not included by default (and I do not quarrel with that change), I thought I could use ctid's instead. Suppose I have a table steps: create table steps(x text, y text, z text) but I want there to be a primary key(x,y). If I try to do: create table temp_steps(x text, y text, z text, primary key(x,y)) insert into temp_steps select * from steps; drop table steps; alter table temp_steps rename to steps; I get an error that "duplicate key violates unique constraint". Some of the rows in steps differ only in value of z. OK,I'll just fix the data... I thought I could force values of x to be distinct with: (I've done this several times in the past with oid's) update steps set x=x||'X' from steps s where steps.key1=s.key1 and steps.key2=s.key2 and step.ctid<s.ctid; But this fails because there is no less-than operator (or function) on type "tid". I tried casting the ctid to string butthat fails too. Using "not s.ctid=step.ctid" doesn't get me what I need(BTW, there's no != operator for tid either). I don't actually care which row of a pair gets changed. I just need a way to choose *one* in the update. [sometimes I doa delete with similar where clause] If there was some asymmetrical comparison defined on ctids, as there had been on oids, this would all work fine. A castto some type that has less-than would also work fine. Any suggestions? -- George -- "Are the gods not just?" "Oh no, child. What would become of us if they were?" (CSL)
george young <gry@ll.mit.edu> writes: > update steps set x=x||'X' from steps s where steps.key1=s.key1 and steps.key2=s.key2 and step.ctid<s.ctid; > But this fails because there is no less-than operator (or function) on > type "tid". Probably a good thing, too, since if there was it wouldn't have anything reliable to do with the age of the tuple. regards, tom lane
On Mon, 21 Nov 2005 16:19:28 -0500 Tom Lane <tgl@sss.pgh.pa.us> threw this fish to the penguins: > george young <gry@ll.mit.edu> writes: > > update steps set x=x||'X' from steps s where steps.key1=s.key1 and steps.key2=s.key2 and step.ctid<s.ctid; > > > But this fails because there is no less-than operator (or function) on > > type "tid". > > Probably a good thing, too, since if there was it wouldn't have anything > reliable to do with the age of the tuple. Well, I don't have any need for it to correlate with the age of the tuple. My use of step.ctid<s.ctid was not to get the earliest or latest row, but just to *choose* one. Perhaps there's some other query that would modify only one of each pair of equal-keyed rows? -- George -- "Are the gods not just?" "Oh no, child. What would become of us if they were?" (CSL)
george young wrote: > Well, I don't have any need for it to correlate with the age of the > tuple. My use of step.ctid<s.ctid was not to get the earliest or > latest row, but just to *choose* one. Perhaps there's some other > query that would modify only one of each pair of equal-keyed rows? How do you know there is only 1 duplicate? Anyway, if (x,y) are the same but (z) is not then you can compare against max(z) or min(z). Something like: SELECT t1.x AS update_me_x, t1.y AS update_me_y, t1.z AS update_me_z FROM test_tbl AS t1, ( SELECT x,y,max(z) AS max_z FROM test_tbl GROUP BY x,y ) AS t2 WHERE t1.x = t2.x AND t1.y = t2.y AND t1.z = t2.max_z -- Richard Huxton Archonet Ltd
Richard Huxton wrote: > george young wrote: > >> Well, I don't have any need for it to correlate with the age of the >> tuple. My use of step.ctid<s.ctid was not to get the earliest or >> latest row, but just to *choose* one. Perhaps there's some other >> query that would modify only one of each pair of equal-keyed rows? > > > How do you know there is only 1 duplicate? > > Anyway, if (x,y) are the same but (z) is not then you can compare > against max(z) or min(z). Something like: > > SELECT t1.x AS update_me_x, t1.y AS update_me_y, t1.z AS update_me_z > FROM > test_tbl AS t1, > ( > SELECT x,y,max(z) AS max_z > FROM test_tbl > GROUP BY x,y > ) AS t2 > WHERE > t1.x = t2.x AND t1.y = t2.y AND t1.z = t2.max_z Oh, of course the easiest way to do it is to add an additional column of type SERIAL to your temp-table. That way you have your rows nicely numbered as you import them. -- Richard Huxton Archonet Ltd