Thread: OIDs, CTIDs, updateable cursors and friends

OIDs, CTIDs, updateable cursors and friends

From
Shachar Shemesh
Date:
Hi all,

I'm in the process of writing an OLE DB provider for postgres. I am, 
right now, at the point where updating an entry becomes an issue. 
Ideally, I would open an updateable cursor for a table/view, and use 
that. Unfortunetly, Postgres doesn't seem to support those at all.

As an alternative, I was thinking of using some unique field to 
identify, with certanity, the line in question. This will allow me to 
use an "update" command to modify said line, in case there is a need 
with the OLE DB client. My quetion is this - what can I use?

I have tried to find some docs regarding OIDs and CTIDs, but the docs 
seem scarce. If I understand this correctly - OID is meant to identify a 
row in a table, though it is not guarenteed to be unique across tables, 
or even inside a given table. A CTID is meant to identify the physical 
location at which a row is stored. Are these correct?

Would adding "OID" to the rows returned by each "Select" call, and then 
doing "update blah where oid=xxx" when I'm requested to update the row 
sound like a reasonable stategy, in lieu of updateable cursors? Can 
anyone suggest a better way?
               Shachar

-- 
Shachar Shemesh
Lingnu Open Systems Consulting
http://www.lingnu.com/



Re: OIDs, CTIDs, updateable cursors and friends

From
"Dave Page"
Date:

> -----Original Message-----
> From: Shachar Shemesh [mailto:psql@shemesh.biz]
> Sent: 18 February 2004 13:18
> To: Hackers; PostgreSQL OLE DB development
> Subject: [HACKERS] OIDs, CTIDs, updateable cursors and friends
>
> Would adding "OID" to the rows returned by each "Select"
> call, and then doing "update blah where oid=xxx" when I'm
> requested to update the row sound like a reasonable stategy,
> in lieu of updateable cursors? Can anyone suggest a better way?
>

Ignoring potential OID wraparound problems (which we do in pgAdmin) this
should work, assuming there is an OID column. I would suggest trying the
following methods in sequence:

1) Use the tables primary key.
2) Use the OID (and check that only one record will be affected).
3) Build a where clause based on all known original values (and check
that only one record will be affected).
4) Fail with an appropriate error.

2 & 3 can potentially affect more than one record, but even Microsoft
code runs into that problem from time to time and fails with an
appropriate error message. In pgAdmin II we used to ask the user if they
wanted to update all matching rows, but of course that is not
appropriate in a driver.

Regards, Dave.


Re: OIDs, CTIDs, updateable cursors and friends

From
Shachar Shemesh
Date:
Dave Page wrote:

> 
>
>  
>
>>-----Original Message-----
>>From: Shachar Shemesh [mailto:psql@shemesh.biz] 
>>Sent: 18 February 2004 13:18
>>To: Hackers; PostgreSQL OLE DB development
>>Subject: [HACKERS] OIDs, CTIDs, updateable cursors and friends
>>
>>Would adding "OID" to the rows returned by each "Select" 
>>call, and then doing "update blah where oid=xxx" when I'm 
>>requested to update the row sound like a reasonable stategy, 
>>in lieu of updateable cursors? Can anyone suggest a better way?
>>
>>    
>>
>
>Ignoring potential OID wraparound problems (which we do in pgAdmin) this
>should work, assuming there is an OID column. I would suggest trying the
>following methods in sequence:
>
>1) Use the tables primary key.
>  
>
I would, except I'm not sure how many queries I would need in order to 
find what the primary key is. Also, what happens if the primary key is 
not a part of the fields returned by the query?

>2) Use the OID (and check that only one record will be affected).
>  
>
That may work. Do a query for "how many would be affected". Then again, 
I'm currently not inside a transaction. The plan was not to be inside a 
transaction unless I needed to. I'm not sure how safe this is to perform 
many queries.

>3) Build a where clause based on all known original values (and check
>that only one record will be affected).
>  
>
Again - what happens when I'm not inside a transaction?

>4) Fail with an appropriate error.
>
>2 & 3 can potentially affect more than one record, but even Microsoft
>code runs into that problem from time to time and fails with an
>appropriate error message. In pgAdmin II we used to ask the user if they
>wanted to update all matching rows, but of course that is not
>appropriate in a driver.
>
>Regards, Dave.
>  
>
The doc mentions something about making the OID column unique. Would 
that not cause other problems? What happens if I define the OID field as 
unique, and I get a wraparound and an attempt to put a new field in with 
existing value? Would the OID skip to the next unique per table, or 
would the insert fail?
               Shachar

-- 
Shachar Shemesh
Lingnu Open Systems Consulting
http://www.lingnu.com/



Re: OIDs, CTIDs, updateable cursors and friends

From
"Dave Page"
Date:

> -----Original Message-----
> From: Shachar Shemesh [mailto:psql@shemesh.biz]
> Sent: 18 February 2004 14:10
> To: Dave Page
> Cc: Hackers; PostgreSQL OLE DB development
> Subject: Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends
>
>
> I would, except I'm not sure how many queries I would need in
> order to find what the primary key is.

Well this is the only safe way to update a specific record. To find the
pkey, look for an index on the table in pg_index with indisprimary =
true. The indkey column holds an array of pg_attribute.attnum's that are
in the index iirc.

> Also, what happens if
> the primary key is not a part of the fields returned by the query?

Add them as you proprosed to do with the OID, or fall back to the next
method. ADO etc. normally fail to update rows if the programmer hasn't
included a suitable key in the recordset.

> >2) Use the OID (and check that only one record will be affected).
> >
> >
> That may work. Do a query for "how many would be affected".
> Then again, I'm currently not inside a transaction. The plan
> was not to be inside a transaction unless I needed to. I'm
> not sure how safe this is to perform many queries.

Should be perfectly safe.

> >3) Build a where clause based on all known original values
> (and check
> >that only one record will be affected).
> >
> >
> Again - what happens when I'm not inside a transaction?

You might find a new row that wasn;t there before but is now, or vice
versa.

> The doc mentions something about making the OID column
> unique. Would that not cause other problems? What happens if
> I define the OID field as unique, and I get a wraparound and
> an attempt to put a new field in with existing value? Would
> the OID skip to the next unique per table, or would the insert fail?

It is not the drivers place to mess with peoples schemas, but yes, it
could cause an insert to fail following wraparound.

Regards, Dave.


Re: OIDs, CTIDs, updateable cursors and friends

From
Shachar Shemesh
Date:
Dave Page wrote:

> 
>
>  
>
>>-----Original Message-----
>>From: Shachar Shemesh [mailto:psql@shemesh.biz] 
>>Sent: 18 February 2004 14:10
>>To: Dave Page
>>Cc: Hackers; PostgreSQL OLE DB development
>>Subject: Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends
>>
>>
>>I would, except I'm not sure how many queries I would need in 
>>order to find what the primary key is. 
>>    
>>
>
>Well this is the only safe way to update a specific record. To find the
>pkey, look for an index on the table in pg_index with indisprimary =
>true. The indkey column holds an array of pg_attribute.attnum's that are
>in the index iirc.
>
>  
>
I'll have a look at that. How would updateable cursors do it? By locking 
the row?

>>Also, what happens if 
>>the primary key is not a part of the fields returned by the query?
>>    
>>
>
>Add them as you proprosed to do with the OID, or fall back to the next
>method. ADO etc. normally fail to update rows if the programmer hasn't
>included a suitable key in the recordset.
>  
>
So, basically, I would not be able to update a table that has no primary 
key?

>  
>
>>>2) Use the OID (and check that only one record will be affected).
>>> 
>>>
>>>      
>>>
>>That may work. Do a query for "how many would be affected". 
>>Then again, I'm currently not inside a transaction. The plan 
>>was not to be inside a transaction unless I needed to. I'm 
>>not sure how safe this is to perform many queries.
>>    
>>
>
>Should be perfectly safe.
>
>  
>
What happens if I check how many would be updated, and get "1" as a 
result. I then actually do it, but between asking and performing, 
someone added a second row that matches the criteria?

>>>3) Build a where clause based on all known original values 
>>>      
>>>
>>(and check 
>>    
>>
>>>that only one record will be affected).
>>> 
>>>
>>>      
>>>
>>Again - what happens when I'm not inside a transaction?
>>    
>>
>
>You might find a new row that wasn;t there before but is now, or vice
>versa.
>
>  
>
But what if someone else changes some of the known values of my row?

>>The doc mentions something about making the OID column 
>>unique. Would that not cause other problems? What happens if 
>>I define the OID field as unique, and I get a wraparound and 
>>an attempt to put a new field in with existing value? Would 
>>the OID skip to the next unique per table, or would the insert fail?
>>    
>>
>
>It is not the drivers place to mess with peoples schemas, but yes, it
>could cause an insert to fail following wraparound.
>
>  
>
Then it's not a good enough solution, even if the driver did have the 
prorogative to change the table.

>Regards, Dave.
>  
>
Ok, it seems to me there are several options here.
1. Find out which is the primary key for the table. What happens if the 
primary key is a multi-row thing? What happens if there is no primary key?
2. If I'm in a transaction, use OID for the insert after checking with a 
select that I'm only affecting one row. If I'm not in a transaction - 
perform the update in a generated transaction, and roll it back if there 
is more than one row affected.

I like 1 better, frankly. Dillemas dillemas dillemas.

-- 
Shachar Shemesh
Lingnu Open Systems Consulting
http://www.lingnu.com/



Re: OIDs, CTIDs, updateable cursors and friends

From
"Dave Page"
Date:

> -----Original Message-----
> From: Shachar Shemesh [mailto:psql@shemesh.biz]
> Sent: 18 February 2004 14:56
> To: Dave Page
> Cc: Hackers; PostgreSQL OLE DB development
> Subject: Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends
>
> I'll have a look at that. How would updateable cursors do it?
> By locking the row?

Dunno, we don't have them!

> So, basically, I would not be able to update a table that has
> no primary key?

Yes, unless you feel back to the value matching type update.

Realistically though, how can anyone expect to edit data successfully
unless they have defined a key to identify rows with? Whilst it's nice
to get it to work 100% of the time no matter how brain dead the schema
it's not that practical.

> What happens if I check how many would be updated, and get
> "1" as a result. I then actually do it, but between asking
> and performing, someone added a second row that matches the criteria?

It'll update both rows unless you do it in one transaction.

> But what if someone else changes some of the known values of my row?

The update will fail to find any rows. This is almost certainly what
happens when MS Access starts pinting #DELETED# in rows of a linked
table.

> Ok, it seems to me there are several options here.
> 1. Find out which is the primary key for the table. What
> happens if the primary key is a multi-row thing? What happens
> if there is no primary key?

I guess you mean multicolumn? No different, you just need all columns in
your WHERE clause. If there is no pkey (and I would be inclined to say
if there is none in the user's query and not try to add it yourself)
then you fail with an error.

> 2. If I'm in a transaction, use OID for the insert after
> checking with a select that I'm only affecting one row. If
> I'm not in a transaction - perform the update in a generated
> transaction, and roll it back if there is more than one row affected.
>
> I like 1 better, frankly. Dillemas dillemas dillemas.

1 is definitely better and is the only way that is guaranteed to be
safe. Thinking about it more, that is almost certainly the position a
driver should take. In pgAdmin we can afford a little artistic licence
(no pun intended) because no one will be using pgAdmin as a driver to
connect another program to a database, plus we can ask the user what
action to take if we don't know if the result will be exactly what was
intended. You do not have that luxury in a driver of course.

Regards, Dave


Re: OIDs, CTIDs, updateable cursors and friends

From
Brett Schwarz
Date:
> > Ok, it seems to me there are several options here.
> > 1. Find out which is the primary key for the
> table. What 
> > happens if the primary key is a multi-row thing?
> What happens 
> > if there is no primary key?
> 
> I guess you mean multicolumn? No different, you just
> need all columns in
> your WHERE clause. If there is no pkey (and I would
> be inclined to say
> if there is none in the user's query and not try to
> add it yourself)
> then you fail with an error.
> 
> > 2. If I'm in a transaction, use OID for the insert
> after 
> > checking with a select that I'm only affecting one
> row. If 
> > I'm not in a transaction - perform the update in a
> generated 
> > transaction, and roll it back if there is more
> than one row affected.
> > 
> > I like 1 better, frankly. Dillemas dillemas
> dillemas.
> 
> 1 is definitely better and is the only way that is
> guaranteed to be
> safe. Thinking about it more, that is almost
> certainly the position a
> driver should take. In pgAdmin we can afford a
> little artistic licence
> (no pun intended) because no one will be using
> pgAdmin as a driver to
> connect another program to a database, plus we can
> ask the user what
> action to take if we don't know if the result will
> be exactly what was
> intended. You do not have that luxury in a driver of
> course.

Just as another datapoint, pgaccess does the same
thing (finding the pkey(s), and using those to
uniquely identify the row). It is kind of a PITA, but
as Dave says, it's the best way to do this.

HTH,
   --brett


__________________________________
Do you Yahoo!?
Yahoo! Mail SpamGuard - Read only the mail you want.
http://antispam.yahoo.com/tools


Re: OIDs, CTIDs, updateable cursors and friends

From
Tom Lane
Date:
"Dave Page" <dpage@vale-housing.co.uk> writes:
> Ignoring potential OID wraparound problems (which we do in pgAdmin) this
> should work, assuming there is an OID column. I would suggest trying the
> following methods in sequence:

> 1) Use the tables primary key.
> 2) Use the OID (and check that only one record will be affected).
> 3) Build a where clause based on all known original values (and check
> that only one record will be affected).
> 4) Fail with an appropriate error.

I don't think it's a good idea for driver-level code to depend on OIDs
for this; to do that you need the knowledge and cooperation of the
database designer.  The OID column may not exist at all (CREATE TABLE
... WITHOUT OIDS).  If it does exist, it's not guaranteed to be unique
unless someone put a unique index on it (and I agree with Dave that a
driver has no business installing such an index).  Furthermore, if
there's not an index on OID then an update specifying "WHERE oid = nnn"
is going to be very slow because it will have to seqscan the whole
table.

I believe the ODBC driver uses CTID for this sort of problem.  CTID is
guaranteed to exist and to be fast to access (since it's a physical
locator).  Against this you have the problem that concurrent updates
of the record will move it, leaving your CTID invalid.  However, that
could be a good thing, as it's debatable that you want to blindly apply
your update in such a case anyway.

If you are willing to hold open a transaction while the user edits the
record, you can lock the record with SELECT FOR UPDATE, and then your
CTID is guaranteed good for the duration of the transaction.

If you don't want to do that, I'd suggest reading both CTID and XMIN
when you initially read the tuple.  When you are ready to commit
changes, do this:
    BEGIN;    SELECT xmin FROM table WHERE ctid = whatever FOR UPDATE;    -- check that you get a record and its xmin
matches   -- what you had; if so, you can go ahead and do    UPDATE table SET ... WHERE ctid = whatever;    COMMIT;
 

If some other client changed the record behind your back, you'll know it
because the xmin changes, and you can then do whatever seems best to
cope.
        regards, tom lane


Re: OIDs, CTIDs, updateable cursors and friends

From
"Zeugswetter Andreas SB SD"
Date:
> I believe the ODBC driver uses CTID for this sort of problem.  CTID is
> guaranteed to exist and to be fast to access (since it's a physical
> locator).  Against this you have the problem that concurrent updates
> of the record will move it, leaving your CTID invalid.  However, that

IIRC the ctid access follows the chain up to the currently valid
tuple ? I thought the only enemy of ctid access was "vacuum" ?

Andreas


Re: OIDs, CTIDs, updateable cursors and friends

From
Tom Lane
Date:
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
> IIRC the ctid access follows the chain up to the currently valid 
> tuple ?

No.

I think Hiroshi or someone put in a function you can use to follow the
chain, but a simple "WHERE ctid = whatever" won't do it.  In any case,
if you're not holding an open transaction then you have to be prepared
to have the dead tuple vacuumed out from under you, in which case you'd
not be able to follow the chain anyway.
        regards, tom lane


Re: OIDs, CTIDs, updateable cursors and friends

From
Greg Stark
Date:
Shachar Shemesh <psql@shemesh.biz> writes:

> Would adding "OID" to the rows returned by each "Select" call, and then doing
> "update blah where oid=xxx" when I'm requested to update the row sound like a
> reasonable stategy, in lieu of updateable cursors? Can anyone suggest a better
> way?

If you're in control of the database schema and can ensure that all tables
will have OIDs enabled and you can add a unique index on OID on all these
tables then yes. But it's not ideal. If OID wraps around you'll get errors
from unique key violations.

A better strategy is to pull the primary key columns from information_schema
and use those columns. This would be more work but would work on any table
with a primary key.

This won't work for tables without primary keys, but in that case, arguably,
updating records doesn't really make sense anyways.

Something like this, though I'm not really very familiar with the
information_schema. 

db=> SELECT ordinal_position,column_name       FROM information_schema.table_constraints AS a       JOIN
information_schema.key_column_usageAS b USING (constraint_schema,constraint_name)      WHERE a.constraint_type =
'PRIMARYKEY'        AND a.table_schema = 'public'        AND a.table_name = 'country'      ORDER BY
ordinal_position;ordinal_position| column_name  
 
------------------+--------------               1 | country_code
(1 row)


-- 
greg