Thread: data integrity
Hi I'm using postgresql to manage a web site. Problem: How are people dealing with data integrity issues such as stale data when writing web-based apps? For instance: The value of X is 7. Step (1) User A selects X. Step (2) User B selects X. Step (3) User A updates X to 8. Step (4) User B updates X to 10 under the assumption that X is still 7. In some scenarios, user B should be prevented from updating. Solutions (that I'm considering): (1) Timestamp: Have a timestamp for every row in every table, which automatically changes when a row is inserted or updated, or (2) Cache: Similarly, cache the user's selected data on the server side (every tuple from the original select) and compare when the user attempts to update, or (3) Squash: modify only those fields in the row that need updating (instead of the entire row) and whatever happens happens. Has anyone solved this 'stale data' problem before? Does this fall into a family of problems that has an existing (trivial) solution? Also, is it the responsibility of the application server to deal with the scenario above or should the application itself be responsible? Thanks in advance ...Jelle
On Lun 05 Nov 2001 18:06, you wrote: > Hi > > I'm using postgresql to manage a web site. > > Problem: How are people dealing with data integrity issues such as stale > data when writing web-based apps? For instance: > > The value of X is 7. > > Step (1) User A selects X. > Step (2) User B selects X. > Step (3) User A updates X to 8. > Step (4) User B updates X to 10 under the assumption that X is still 7. Lock it with a SELECT ...... FOR UPDATE. The rows that are going to be updated will be locked, but not the entire table. :-) > In some scenarios, user B should be prevented from updating. > > Solutions (that I'm considering): > > (1) Timestamp: Have a timestamp for every row in every table, which > automatically changes when a row is inserted or updated, or > > (2) Cache: Similarly, cache the user's selected data on the server side > (every tuple from the original select) and compare when the user attempts > to update, or > > (3) Squash: modify only those fields in the row that need updating > (instead of the entire row) and whatever happens happens. > > Has anyone solved this 'stale data' problem before? > > Does this fall into a family of problems that has an existing (trivial) > solution? Also, is it the responsibility of the application server to deal > with the scenario above or should the application itself be responsible? saludos... ;-) -- Porqué usar una base de datos relacional cualquiera, si podés usar PostgreSQL? ----------------------------------------------------------------- Martín Marqués | mmarques@unl.edu.ar Programador, Administrador, DBA | Centro de Telematica Universidad Nacional del Litoral -----------------------------------------------------------------
On Mar 06 Nov 2001 08:24, Janning Vygen wrote: > Am Dienstag, 6. November 2001 00:17 schrieb Martín Marqués: > > On Lun 05 Nov 2001 18:06, you wrote: > > > Hi > > > > > > I'm using postgresql to manage a web site. > > > > > > Problem: How are people dealing with data integrity issues such > > > as stale data when writing web-based apps? For instance: > > > > > > The value of X is 7. > > > > > > Step (1) User A selects X. > > > Step (2) User B selects X. > > > Step (3) User A updates X to 8. > > > Step (4) User B updates X to 10 under the assumption that X is > > > still 7. > > > > Lock it with a SELECT ...... FOR UPDATE. The rows that are going to > > be updated will be locked, but not the entire table. :-) > > I dont think that FOR UPDATE helps in this situation, but it might be > a misunderstanding. > > FOR UPDATE locks the rows of the table for an update. right. but it > removes the lock if the transaction commits. > > In a web based application you cant enforce a transaction between the > sending of a form and the exceuting of a submit. > > Its an application problem: the data i see in my frontend are never > realtime. they might have change in the last seconds. But who cares. > If i want to have a value of 8 and another one wants a value of 10 > its a question of whom granting write access. Web based? Great! PHP? If you already have 7 saved, then make a CASE statment in your query, and depending on the IF you update or not. You may want to think about using plpgsql. Simple. :-) -- Porqué usar una base de datos relacional cualquiera, si podés usar PostgreSQL? ----------------------------------------------------------------- Martín Marqués | mmarques@unl.edu.ar Programador, Administrador, DBA | Centro de Telematica Universidad Nacional del Litoral -----------------------------------------------------------------
Am Dienstag, 6. November 2001 00:17 schrieb Martín Marqués: > On Lun 05 Nov 2001 18:06, you wrote: > > Hi > > > > I'm using postgresql to manage a web site. > > > > Problem: How are people dealing with data integrity issues such > > as stale data when writing web-based apps? For instance: > > > > The value of X is 7. > > > > Step (1) User A selects X. > > Step (2) User B selects X. > > Step (3) User A updates X to 8. > > Step (4) User B updates X to 10 under the assumption that X is > > still 7. > > Lock it with a SELECT ...... FOR UPDATE. The rows that are going to > be updated will be locked, but not the entire table. :-) I dont think that FOR UPDATE helps in this situation, but it might be a misunderstanding. FOR UPDATE locks the rows of the table for an update. right. but it removes the lock if the transaction commits. In a web based application you cant enforce a transaction between the sending of a form and the exceuting of a submit. Its an application problem: the data i see in my frontend are never realtime. they might have change in the last seconds. But who cares. If i want to have a value of 8 and another one wants a value of 10 its a question of whom granting write access. janning > > In some scenarios, user B should be prevented from updating. > > > > Solutions (that I'm considering): > > > > (1) Timestamp: Have a timestamp for every row in every table, > > which automatically changes when a row is inserted or updated, or > > > > (2) Cache: Similarly, cache the user's selected data on the > > server side (every tuple from the original select) and compare > > when the user attempts to update, or > > > > (3) Squash: modify only those fields in the row that need > > updating (instead of the entire row) and whatever happens > > happens. > > > > Has anyone solved this 'stale data' problem before? > > > > Does this fall into a family of problems that has an existing > > (trivial) solution? Also, is it the responsibility of the > > application server to deal with the scenario above or should the > > application itself be responsible? > > saludos... ;-) -- Planwerk 6 /websolutions Herzogstraße 86 40215 Düsseldorf fon 0211-6015919 fax 0211-6015917 http://www.planwerk6.de
I deal with it in an application specific manner. In one case- row contains FAQ style item - I mark the row as locked by a particular user/session. If other users try to update the same row, a copy is made. So the end users have to combine their work. In another case - rows are given an application level transaction ID, only rows from committed transactions are visible (did this in MySQL some years back). The transactions are only for a very limited controlled scope. Trying to fully implement application level MVCC/transaction might be possible but difficult coping with cases where a user opens multiple browser windows or jumps from state to state (history/back button), or goes for lunch... Can an application server really deal with this problem? If anyone can show how, I'd like to know! Keeping a database connection open between webpages won't work well (multiple windows = self conflicting transaction) or scale (many users/sessions per timeout period= run out of DB connections) . As a kludge you can hold open a web connection to keep track of stuff - a download that takes a very long time- displaying very little data at a time. But it's ugly and sometimes certain browsers just randomly disconnect after a while.. Cheerio, Link. At 04:06 PM 11/5/01 -0500, Jelle Ouwerkerk wrote: >Problem: How are people dealing with data integrity issues such as stale >data when writing web-based apps? For instance: > >The value of X is 7. > >Step (1) User A selects X. >Step (2) User B selects X. >Step (3) User A updates X to 8. >Step (4) User B updates X to 10 under the assumption that X is still 7. > >In some scenarios, user B should be prevented from updating. > >Solutions (that I'm considering): > >(1) Timestamp: Have a timestamp for every row in every table, which >automatically changes when a row is inserted or updated, or > >(2) Cache: Similarly, cache the user's selected data on the server side >(every tuple from the original select) and compare when the user attempts >to update, or > >(3) Squash: modify only those fields in the row that need updating >(instead of the entire row) and whatever happens happens. > >Has anyone solved this 'stale data' problem before? > >Does this fall into a family of problems that has an existing (trivial) >solution? Also, is it the responsibility of the application server to deal >with the scenario above or should the application itself be responsible?
On Mon, 5 Nov 2001, Jelle Ouwerkerk wrote: > Problem: How are people dealing with data integrity issues such as stale > data when writing web-based apps? For instance: [...] > In some scenarios, user B should be prevented from updating. Maybe you can do something like this: user A SELECTs the data, and the web app caches the fields. user B does the same thing. user B changes some fields and commits. user A changes some fields and tries to commit. His webapp SELECTs the data again and checks against its cache. Since the data is different, the webapp presents him the three versions and lets him choose. The idea is to check every time you are going to commit the data to see if it changed while you weren't looking. If it did, tell him so and let him decide. -- Alvaro Herrera (<alvherre[@]atentus.com>) "Ni aun el genio muy grande llegaria muy lejos si tuviera que sacarlo todo de su propio interior" (Goethe)
Keep reading, I have commented below. --- Janning Vygen <vygen@gmx.de> wrote: > Am Dienstag, 6. November 2001 00:17 schrieb Mart�n > Marqu�s: > > On Lun 05 Nov 2001 18:06, you wrote: > > > Hi > > > > > > I'm using postgresql to manage a web site. > > > > > > Problem: How are people dealing with data > integrity issues such > > > as stale data when writing web-based apps? For > instance: > > > > > > The value of X is 7. > > > > > > Step (1) User A selects X. > > > Step (2) User B selects X. > > > Step (3) User A updates X to 8. > > > Step (4) User B updates X to 10 under the > assumption that X is > > > still 7. > > > > Lock it with a SELECT ...... FOR UPDATE. The rows > that are going to > > be updated will be locked, but not the entire > table. :-) > > I dont think that FOR UPDATE helps in this > situation, but it might be > a misunderstanding. > > FOR UPDATE locks the rows of the table for an > update. right. but it > removes the lock if the transaction commits. > > In a web based application you cant enforce a > transaction between the > sending of a form and the exceuting of a submit. And even if you could, you probably wouldn't want to. If you did, it would allow folks to lock up rows in your database indefinitely simply by requestin a form and closing their web browser. Not only that, but it would require that each database user get their own PostgreSQL connection, and your middle ware layer would have to map requests to the correct connection. In the end it is *much* easier to simply make each request a transaction. If an error is raised, then everything is rolled back, if not then everything is committed and the page is served up. If you *really* need something like this it is better not to use PostgreSQL's transaction mechanism and instead create an extra table that holds the primary key of the records you want locked and a timestamp (and probably the user id or session id of the person who has created the lock). That way you can easily check this table for which records are being updated and you can also easily time out locks. This is especially useful in those cases where updating a record takes a lot of time. If all you are doing is updating a simple integer then a locking table is not likely to help. Each individual will hold the lock for such a short period of time that the chances of collision are low. > Its an application problem: the data i see in my > frontend are never > realtime. they might have change in the last > seconds. But who cares. > If i want to have a value of 8 and another one wants > a value of 10 > its a question of whom granting write access. This sort of a setup is certainly easiest. However, sometimes it is useful to alert the user to the fact that someone else is also working on this particular row. Creating a separate table to track which rows are being accessed is fairly straightforward and not particular expensive. Trying to use PostgreSQL's locking mechanisms (on the other hand) is a short road to disaster. Janning is certainly correct, however, when he says that ultimately it comes down to who has write access. Even using a locking table there is nothing to stop someone changing that row right after I finish changing it. They won't be able to change it *while* I am working on it, but they'll be able to change it the second I hit "submit." Jason __________________________________________________ Do You Yahoo!? Find a job, post your resume. http://careers.yahoo.com