Thread: Updates on updates

Updates on updates

From
Shachar Shemesh
Date:
Hi list,


I had a meeting with my client (the one who wanted update support), and
here are the conclusions: It's a long way ahead.


It seems that much more important to the client than being able to
update from the result set is the ability to get most up to date queries
for each fetch. This is a requirement of a "dynaset" view
(http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore/html/_core_Dynaset.asp).
Unfortunately, PostgreSQL does not support cursors that are not of type
"sensitive". Even if it did, it does not support cursors that do not
live inside a transaction. It follows, then, that there is no choice for
an ODBC that wishes to implement a dynaset record but to perform a new
query for each and every fetch operation.


Things are slightly worse, in fact. Because the ordering of the records
in a dynaset are guarenteed to remain the same as when they were first
fetched, some way of determining whether a specific row has changed (or
was deleted) is required. It is at this point that I have to appologize
to Hiroshi if any tone of dismissivness could have been understood from
my previous email regarding his implementation. Making a dynaset work
without using oids is going to be damn tough.


There is one glimmer of hope. The link I gave above claims that we are
allowed to keep, in essence, local copy of the data from the dataset
until a "requery" is done. This means that implementing a dynaset
without huge performance losses is going to be possible (though not easy).


This email is mostly so I can update everyone on the current status. I
have not, yet, formed a concerete plan on what my personal actions on
this matter are going to be. I mostly interested to see whether an
interesting discussion forms as a result of these new discoveries.


Shachar


Re: Updates on updates

From
Marc Herbert
Date:
First thanks for your message, highly interesting (despite its title ;-)


Shachar Shemesh <psql@shemesh.biz> writes:

> Things are slightly worse, in fact. Because the ordering of the records
> in a dynaset are guarenteed to remain the same as when they were first
> fetched,

PostgreSQL transaction isolation implementation is based on snapshots,
so this looks sensible. Except I don't get the meaning of: "the
ordering" (so i just suppressed it from the sentence in order to
understand it :-)


But what happens if you lower the transaction isolation level to "read
committed"? do the assertions above still hold? Do later fetches
still see old data? If yes, is it because of the way the
driver/protocol is implemented or because of the engine itself? (the
latter would be surprising).




Re: Updates on updates

From
"Campbell, Greg"
Date:
You say
"It seems that much more important to the client than being able to
update from the result set is the ability to get most up to date queries
for each fetch."

Specify "fetch".
Is it the returning of one or more records at a time of a cursor?
Or is it, the returning of the entire recordset?

Also what client environment are in (VB6, Access with DAO, RDO, ADO, Delphi, C++...)(It can effect how I
understand you to be using the term Dynaset)?

Classically I think of a cursor as a pointer to the recordset I want. I open it and do FETCHes using it,
then close it. Then your underlying question is will you see changes made by other users while you have
this cursor open. The return question is why is your cursor open so long?

More casually one might speak of fetching a set of records. This is nowhere near as specific as the FETCH
mentioned previously. If perhaps you have a grid in an user application, and the client wants the data to
be "fresh". The underlying question is do you have to run the query again, repeatedly? Or is the question,
if you run the query again, will you get the "most up to date" information?

Are you treating you application like a local Access database, where you can use a bound form to attach to
a table of data, and keep it open and automatically refreshing as long as you like? Or are you treating
your application like a client and server application, where you expect to send requests, and get the data
in the most expeditions manner possible, keeping transactions short and  atomic,  almost treating the data
transfer as a highly expensive proposition? Perhaps I am out of date on my view of application design.
When is the issue ODBC driver capability, and when is it application design?





Shachar Shemesh wrote:
> Hi list,
>
>
> I had a meeting with my client (the one who wanted update support), and
> here are the conclusions: It's a long way ahead.
>
>
> It seems that much more important to the client than being able to
> update from the result set is the ability to get most up to date queries
> for each fetch. This is a requirement of a "dynaset" view
> (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore/html/_core_Dynaset.asp).
> Unfortunately, PostgreSQL does not support cursors that are not of type
> "sensitive". Even if it did, it does not support cursors that do not
> live inside a transaction. It follows, then, that there is no choice for
> an ODBC that wishes to implement a dynaset record but to perform a new
> query for each and every fetch operation.
>
>
> Things are slightly worse, in fact. Because the ordering of the records
> in a dynaset are guarenteed to remain the same as when they were first
> fetched, some way of determining whether a specific row has changed (or
> was deleted) is required. It is at this point that I have to appologize
> to Hiroshi if any tone of dismissivness could have been understood from
> my previous email regarding his implementation. Making a dynaset work
> without using oids is going to be damn tough.
>
>
> There is one glimmer of hope. The link I gave above claims that we are
> allowed to keep, in essence, local copy of the data from the dataset
> until a "requery" is done. This means that implementing a dynaset
> without huge performance losses is going to be possible (though not easy).
>
>
> This email is mostly so I can update everyone on the current status. I
> have not, yet, formed a concerete plan on what my personal actions on
> this matter are going to be. I mostly interested to see whether an
> interesting discussion forms as a result of these new discoveries.
>
>
> Shachar
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq

Attachment

Re: Updates on updates

From
"Campbell, Greg"
Date:
Thanks for continuing the dialog. I want to say I was not trying to argue with you, just getting
clarification of the semantics of your inquiry.

As I understand it now,
You have a client that wants the most update-to-date information at each (row) fetch. This means
implementing a cursor that at any fetch is up-to-date with any changes since cursor formation, made by
either that user or other users. And this cursor can span transactions of the user and transactions of
others. This updateable cursor is the very definition of the ODBC dynamic cursor.

I do not know whether or not this is possible with pgODBC.

I sought clarification, because my work for the past few years has been entirely with disconnected
recordsets. This "liveness" of data is only simulated by requerying. I believe the truly dynamic cursor
(dynaset) behavior requires a connected recordset. This runs somewhat contrary to the types of scalabity
and pooled resource behavior that I thought were becoming standard.

But you are correct, the ODBC standard defines the dynamic cursor in a certain way, and a compliant driver
should support it, or correctly respond to ::SQLGetInfo w/SQL_SCROLL_OPTIONS, and ::SQLGetFunctions with
SQL_API_SQLEXTENDEDFETCH regarding keyset and extended fetching ability.


Should an OID-less table, with a valid Primary Key, be viable for dynamic updating?



Shachar Shemesh wrote:

> Sorry for the late reply. Please CC me in the future.
>
>
> Campbell, Greg wrote:
>
>
>>You say
>>"It seems that much more important to the client than being able to
>>update from the result set is the ability to get most up to date queries
>>for each fetch."
>>
>>Specify "fetch".
>
>
> All definitions are taken from the ODBC manual by Microsoft. They define
> fetch as a row fetch.
>
>
>>Is it the returning of one or more records at a time of a cursor?
>
>
> Yes.
>
>
>>Also what client environment are in (VB6, Access with DAO, RDO, ADO,
>>Delphi, C++...)(It can effect how I understand you to be using the
>>term Dynaset)?
>
>
> Direct ODBC connection, no wrappers.
>
>
>>Classically I think of a cursor as a pointer to the recordset I want.
>
>
> That's nice of you, but that's a static cursor. A dynaset cursor doesn't
> act like that. Again, do not argue with me. Argue with the ODBC definition.
>
>
>>The underlying question is do you have to run the query again,
>>repeatedly? Or is the question, if you run the query again, will you
>>get the "most up to date" information?
>
>
> My reading of the standard is this. If you do not run the query again,
> you have to see all changes made by your application (not necessarily
> through this particular query). If you do run the query again (a command
> called "requery"), you will also see changes made by other users.
>
>
>>Are you treating you application like a local Access database,
>
>
> that's what the client is doing, and it seems to be supported by the
> ODBC definition, so we have to support that if we support dynasets.
>
>
>>where you can use a bound form to attach to a table of data, and keep
>>it open and automatically refreshing as long as you like?
>
>
> If the standard is all we have to implement, we may simply be able to
> keep the pg_query_results open, and define "overrides". It's more
> complicated then that, as it seems that when a view has been updated, we
> need to update ALL views of the same record.
>
> Hiroshi's solution was to silently add the tid and oid fields to all
> queries. Upon a new fetch, he would requery the database for that line.
> This was slow as hell, as it meant that scanning a cursor required a
> single query per each and every row of the database. It also meant that
> opening a dynaset for "select * from view" would fail, as a view doesn't
> have an oid and a tid field.
>
> What I thought about doing was this. When you run a query, we use the
> pqlib commands to get the oid of the table from which the results came
> (which is defined also for views). If they came from more then one view,
> we refuse to open it in dynaset mode. If they all come from the same
> table, we query the schema table and find out whether the table has a
> primary key, and whether it is prsent inside the columns returned by the
> table. This means one extra query per user query, which is a much
> smaller price to pay.
>
> If the user performs an update, we send this update back to the
> database. We also mark, to ourselves, the primary key for the table that
> changed, and the meaning of the update. Whenever the user performs a
> fetch, we check whether the specific table and primary key has a local
> update registered, and if does, we override the results we get from the
> result set. This way, we can implement dynaset with reasonable
> performance, while still being conforming to the standard.
>
> My only problem is that I usually use C++ (and STL) for keeping such
> hashes. I'll have to brush up my C database skills.
>
>
>>Or are you treating your application like a client and server
>>application, where you expect to send requests, and get the data in
>>the most expeditions manner possible, keeping transactions short and
>>atomic,  almost treating the data transfer as a highly expensive
>>proposition?
>
>
> Dynasets live outside of ANY transaction. Lifetime is, therefor, a
> non-issue for them.
>
>
>>Perhaps I am out of date on my view of application design. When is the
>>issue ODBC driver capability, and when is it application design?
>
>
> We work (I think - I have not gained enough credit on this particular
> project to speak on the project's behalf, so a formal view from Ludek or
> Dave would be appreciated) according to the specs. If the specs say we
> have to do something a certain way, that's how we have to do it.
>
> There is one disturbing thing, though. The client says (and I find it
> hard to believe, but it will be checked) that other ODBC providers show
> changes done by OTHER people running the client on another machine
> without doing a "requery". I simply cannot see how such a feature can be
> implemented without sacrificing performance, which is the reason I find
> it so hard to believe.
>
>           Shachar

Attachment

Re: Updates on updates

From
Marc Herbert
Date:
On Thu, Jan 19, 2006 at 05:49:30PM +0200, Shachar Shemesh wrote:
>
> >But what happens if you lower the transaction isolation level to "read
> >committed"? do the assertions above still hold?
> >
> Dynaset views should remain updated outside of ANY transaction. In any
> case, we do not control the transaction isolation level. We are just a
> driver.

Sorry in advance if I misunderstand you, but a high isolation level
and a dynamic cursor are incompatible with each other. You cannot be
both isolated and aware of every update that's going on at the same
time. So you cannot just let the user enable snapshot isolation if you
want to implement a dynamic cursor.





Re: Updates on updates

From
Ludek Finstrle
Date:
> Should an OID-less table, with a valid Primary Key, be viable for dynamic
> updating?

As I know Hiroshi add this behaviour to the 07.03.258. I don't try it.
BTW dynamic cursors isn't supported (are buggy - bad pointers) in 08.01.

Regards,

Luf