Thread: Unanswered questions about Postgre
Sorry if this is a repost, but it appears the postgre mailing list filtered my original mail without warning. I have a bunch of questions, and I've looked through the documentation without finding any answers. I also looked in the FAQ. I'm guessing that most of these questions -are- FAQs, so maybe they should be put there once some answers are assembled. Also, have you guys considered a searchable version of your documentation, or a version of your documentation all on one page, so it can be searched in the browser?

1) In psql, if I make a typo, the transaction is automatically aborted. For example:

----
apollodemo=# \d
  List of relations
 Name | Type  | Owner
------+-------+-------
 boo  | table | kislo
(1 row)

apollodemo=# begin;
BEGIN
apollodemo=# create table moo (i int);
CREATE
apollodemo=# Oops this is a typeo;
ERROR:  parser: parse error at or near "oops"
apollodemo=# insert into moo values (1);
NOTICE:  mdopen: couldn't open moo: No such file or directory
NOTICE:  RelationIdBuildRelation: smgropen(moo): No such file or directory
NOTICE:  mdopen: couldn't open moo: No such file or directory
NOTICE:  mdopen: couldn't open moo: No such file or directory
NOTICE:  mdopen: couldn't open moo: No such file or directory
NOTICE:  current transaction is aborted, queries ignored until end of transaction block
*ABORT STATE*
apollodemo=# \d
NOTICE:  current transaction is aborted, queries ignored until end of transaction block
No relations found.
apollodemo=# \d
NOTICE:  current transaction is aborted, queries ignored until end of transaction block
No relations found.
apollodemo=# commit;
COMMIT
apollodemo=# \d
  List of relations
 Name | Type  | Owner
------+-------+-------
 boo  | table | kislo
(1 row)

apollodemo=#
---

Arr, that's really annoying when you're in psql and you just trashed your entire transaction because you made a typo. It's even more ridiculous when your application is actually doing a transaction and, because an error occurred, the entire transaction is now trashed. Why exactly does Postgre abort the transaction if there is an error? It doesn't have to, and no other RDBMS does it AFAIK (Interbase/Oracle). For example: if you start a transaction, do some work, then try to insert a record which violates a data-integrity constraint, the ENTIRE transaction is now wiped out. Instead, the application should be able to handle the integrity-constraint problem and keep on going, or -choose- to roll back the transaction. I assume this -must- be user settable; how do I do this? And what was the rationale behind automatically aborting at the first sign of imperfection?

2) In the above example, when I issue the "insert into moo values (1);", Postgre spews out all manner of errors. Shouldn't one of the following happen?:

a) Given that the transaction is ABORTed, shouldn't it just entirely ignore the insert request (as advertised), and thus NOT even bother querying the operating system for the database file?
b) Given that the transaction has been ABORTed, it shouldn't bother querying the OS for the table, since it knows it doesn't exist.
c) Given that the table does not exist, and this is what you would expect, it shouldn't spew operating system errors to the screen.

3) You have some nice documentation describing transaction isolation, but you fail to follow through in saying how to CHANGE the transaction isolation mode. The docs are here:

http://postgresql.readysetnet.com/users-lounge/docs/v7.0/user/mvcc4564.htm

How do you change the isolation mode?
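For reference on question 3: the isolation mode is changed per transaction with the SET TRANSACTION ISOLATION LEVEL command. A minimal JDBC sketch, assuming a driver that honors the standard setTransactionIsolation() call; the connection itself is taken as given, and either route shown should end up doing the same thing:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Two ways to change the isolation mode for the current transaction.
// Assumes an already-open Connection; how it was obtained doesn't matter here.
public class IsolationSketch {
    static void runSerializable(Connection conn) throws SQLException {
        conn.setAutoCommit(false);

        // Route 1: the standard JDBC call, which the driver is expected to
        // translate into the equivalent SET command.
        conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);

        // Route 2: issue the SQL command yourself as the first statement of
        // the transaction.
        Statement st = conn.createStatement();
        st.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");

        // ... queries here now run under SERIALIZABLE isolation ...

        conn.commit();
        st.close();
    }
}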
4) I was doing some testing of my database abstraction layer, and I was running the same code against MySQL, PostgreSQL, and Interbase. I made an interesting discovery. After running the test sequence once on Postgre, each successive run of the test suite (or "benchmark", since I was timing it, but it really wasn't a benchmark) got progressively worse. In fact, on -each- successive run, Postgre's time-to-completion -doubled-. This was across the board: selects by key, selects not by key, inserts, updates... Dropping the table and recreating it got Postgre back to square one again. I assume this has to do with the fact that about 1000 records were inserted, and about 1000 rows were deleted, over the course of the test. Is this -normal- for Postgre? I realize that maybe some sort of transaction audit needs to be stored, but these transactions are committed... do we need that audit anymore? Plus, should that "audit" so -massively- affect the entire database's operational speed? Why should running a query on a table be -so- -so- much slower if there have been repeated inserts and deletes in it? We're talking about a massive slowdown here: this test originally ran in about 7 seconds, and after a few runs it was taking 2 minutes. There are no rows left in the table at the end... Shouldn't pgsql just overwrite the dead space, and why is the dead space affecting performance? (I don't care about disk space concerns.) I did notice that vacuuming the database between runs kept Postgre working at a reasonable pace. Do people need to vacuum their databases hourly? Can you vacuum while a database is in use? Any discussion of this curious phenomenon would be appreciated. It still boggles me.

5) BLOB support. Ugh? I cannot find anything in any of the manuals about BLOBs, yet I know pgsql supports them because I've found lots of people on the mailing list yelling at people to use BLOBs when they complain that the max row size is 8k. Yet no docs (that I can find!). I did find -one- piece of documentation; however, it only described how to load a BLOB from -the hard drive of the server-, and how to retrieve a BLOB -directly onto the hard drive of the server-. This, of course, is entirely useless in a client/server application. Does pgsql have BLOB support which can work over the wire? If so, where are the docs? If not, isn't this an incredibly massive limitation? How do people work around this?

6) Two things about the JDBC driver for Postgre. First, if you create a PreparedStatement and are populating the fields, the JDBC driver will crash if you try to use setObject to store an object that is null. So if you write something like this:

void sillyFunction (Integer i, Integer ii) {
    [..]
    preparedStatement.setObject(1, i);
    preparedStatement.setObject(2, ii);
}

and some wise ass passes in a null Integer, the JDBC driver will crash with a NullPointerException. Instead you must first test whether the object is null and, if not, insert it using setObject; otherwise insert it using setNull. Is this right? I've used quite a few other JDBC drivers, and none of them crash in this manner if you pass in a null. I've never had to use setNull explicitly before. (A sketch of this workaround follows this message.)

Second, if you have a ResultSet and try to fetch an int (or anything) from column 0, the entire JVM goes down. I am entirely baffled by how the entire JVM could crash (with a seg fault). Does your JDBC code use native code? I tried two different JVMs (IBM JDK 1.3 and Blackdown 1.2.2), both with and without JITs active. I also tried the JDBC driver fresh off the site, as well as the one shipped with Postgre 7.0.3. All combinations crash. Do you guys know why this might be? What error checking do you have for this case?

Thanks guys,
-Joe
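The null-handling workaround described in question 6, sketched as a small helper; the class and method names are made up for illustration:

import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

// Workaround for drivers that throw NullPointerException from setObject(null):
// test for null yourself and call setNull() with an explicit SQL type.
public class NullSafeSetter {
    static void setNullableInt(PreparedStatement ps, int index, Integer value)
            throws SQLException {
        if (value == null) {
            ps.setNull(index, Types.INTEGER);
        } else {
            ps.setObject(index, value);
        }
    }
}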
[re: question #4, speed/vacuuming]

> Do people need to vacuum their databases hourly? Can you vacuum while
> a database is in use? Any discussion of this curious phenomenon would
> be appreciated. It still boggles me.

I vacuum twice a day, once in the dead of night, once around lunch. Yes, you can vacuum while the db is in use, but many locks (for updates, inserts, etc.) will hold up the vacuum.

> 5) BLOB Support.

Keep in mind that pgsql 7.1 (about to go beta tomorrow, if I heard right) will support much longer row sizes than 8k. That doesn't remove the need for blobs for many of us, but it fixed my problems.

I believe the docs discussing the C-level interfaces talk about lo creation and such. Have you looked in the low-level docs in the programmer/developer manuals?

I have only played w/blobs; others can speak better about their use/limitations, but if I have it correct:

  . blobs cannot be dumped
  . blobs are not normally vacuumed

So, for most of us, I think the TOAST feature of 7.1 that allows >8k row sizes is much nicer. (Unless, of course, you really want to store binary data, not just long text fields.)

Good luck,
--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)
on Nov 29, 2000, 19:17, Joel Burton std::cout'ed:

[snip]

| > 5) BLOB Support.
|
| Keep in mind that pgsql 7.1 (about to go beta tomorrow, if I heard
| right) will support much longer row sizes than 8k. That doesn't remove
| the need for blobs for many of us, but it fixed my problems.

How _much_ longer? (Sorry if it's a FAQ; in that case, I'd appreciate a pointer/URL.)

ivr
--
Intelligence is relative. Compared to T*, it's a genius.
        -- James Kanze on "Smart Pointer"
On 30 Nov 2000, at 1:24, Igor V. Rafienko wrote:

> on Nov 29, 2000, 19:17, Joel Burton std::cout'ed:
>
> [snip]
>
> | > 5) BLOB Support.
> |
> | Keep in mind that pgsql 7.1 (about to go beta tomorrow, if I heard
> | right) will support much longer row sizes than 8k. That doesn't remove
> | the need for blobs for many of us, but it fixed my problems.
>
> How _much_ longer? (Sorry if it's a FAQ; in that case, I'd appreciate
> a pointer/URL.)

Dunno, but I've been using 7.1devel for ~2 months, and so far, longer rows seem to work fine.

More information on the TOAST project is at
http://www.postgresql.org/projects/devel-toast.html

--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)
>> How _much_ longer? (Sorry if it's a FAQ; in that case, I'd appreciate
>> a pointer/URL.)

> Dunno, but I've been using 7.1devel for ~2 months, and so far,
> longer rows seem to work fine.

> More information on the TOAST project is at
> http://www.postgresql.org/projects/devel-toast.html

I think I pontificated about this a month or two back, so check the archives; but the short answer is that the effective limit under TOAST is not on the total amount of data in a row, but just on the number of columns. The master copy of the row still has to fit into a block.

Worst case, suppose every one of your columns is "wide" and so gets pushed out to BLOB storage. The BLOB pointer that still has to fit into the main row takes 32 bytes. With a maximum main row size of 8K, you can have about 250 columns. In practice, probably some of your columns would be ints or floats or booleans or something else that takes up less than 32 bytes, so the effective limit is probably order-of-magnitude-of 1000 columns in a table.

If that seems too small, maybe you need to rethink your database design ;-)

There's also a 1G limit on the size of an individual BLOB that can be part of a row.

			regards, tom lane
> That is what transactions are for. If any errors occur, then the
> transaction is aborted. You are supposed to use transactions when you want
> either everything to occur (the whole transaction), or nothing, if an error
> occurs.

Yes. There are certainly times when a transaction needs to be ABORTed. However, there are many reasons why the database should not abort a transaction if it does not need to. There is obviously no reason why a transaction needs to be aborted for syntax errors. There is obviously no reason why a transaction needs to be aborted for, say, trying to insert a duplicate primary key. The -insert- can fail, report it as such, and the application can determine if a rollback is necessary. If you don't believe me, here are two fully SQL-92 compliant databases, Oracle and Interbase, which do not exhibit this behavior:

-Oracle-

SQL> create table test (i int, primary key (i));

Table created.

SQL> insert into test values (1);

1 row created.

SQL> insert into test values (1);
insert into test values (1)
*
ERROR at line 1:
ORA-00001: unique constraint (TEST.SYS_C001492) violated

SQL> insert into test values (2);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from test;

         I
----------
         1
         2

SQL>

--Interbase--

SQL> create table test (i int not null, primary key (i));
SQL> insert into test values (1);
SQL> insert into test values (1);
Statement failed, SQLCODE = -803
violation of PRIMARY or UNIQUE KEY constraint "INTEG_156" on table "TEST"
SQL> insert into test values (2);
SQL> commit;
SQL> select * from test;

           I
============
           1
           2

SQL>

> If you don't like this behaviour, then use auto-commit, and make every
> separate statement a transaction in itself. That way if any of the
> statements fails, the next one won't be aborted. This, however, depending
> on the error you get, could cause massive irreversible data corruption. But
> then again, if this is a risk, you should be using transactions which abort
> the whole block on any error.

Auto-commit is not the same thing, though. That would make each statement a transaction. I don't want that; I want the ability to group a set of statements and commit them or roll them back as a whole. I do not, however, want the transaction aborted by the server when it does not need to be. Clearly in the above case, neither Interbase nor Oracle decided that the transaction had to be aborted.

This has to be an option, no?

> Several ways. You can set up a shared network area, sort out a unique
> file-naming system (which shouldn't be too hard), and send some sort of a
> "URL" as a pointer to the file.

Ahhh. Unfortunately that is an unacceptable solution :(. So that means there is no large binary storage available in Postgre for me.

> Alternatively, wait for v7.1 (development tree available), which will
> support big record sizes (unlimited, AFAIK). Depending on what you are
> trying to do, BLOBs may or may not be the ideal thing, but sometimes they
> are the only way to store large chunks of data. 7.1 will hopefully sort
> that out, as I have bounced my head off the record size limit a few times
> myself.

Hmm, I really hope the 7.1 implementation of blobs is a true blob, and not just a really long varchar or something. I need to store arbitrarily large binary data and be able to retrieve it over the database connection. I'm really surprised there isn't a facility for this already... Our application couldn't run on Postgre without it!

Thanks for the reply, although I'm disappointed about the lack of blob support :(

-Joe
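The statement-level error handling being asked for here, sketched in plain JDBC against the same test table as the sessions above. This is how it plays out against Oracle or Interbase; against PostgreSQL 7.0 the server has already aborted the transaction by the time the exception is caught, which is exactly the complaint:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// The application catches the failed INSERT and decides for itself whether
// to roll back or carry on with the rest of the transaction.
public class DuplicateKeyDemo {
    static void insertBoth(Connection conn) throws SQLException {
        conn.setAutoCommit(false);
        Statement st = conn.createStatement();
        st.executeUpdate("insert into test values (1)");
        try {
            st.executeUpdate("insert into test values (1)");  // duplicate key
        } catch (SQLException dup) {
            // Application-level decision: note the failure and keep going
            // instead of rolling back the whole transaction.
            System.err.println("duplicate key ignored: " + dup.getMessage());
        }
        st.executeUpdate("insert into test values (2)");
        conn.commit();  // table ends up holding 1 and 2, as in the sessions above
        st.close();
    }
}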
> If that seems too small, maybe you need to rethink your database design
> ;-)
>
> There's also a 1G limit on the size of an individual BLOB that can be
> part of a row.

Hmm, 1G is probably fine :) But is there going to be a blob type with TOAST? If I want to store a large binary object, and have the ability to retrieve it strictly over the Postgre database connection, would I be retrieving a blob column, or a really long varchar column?

Thanks,
-Joe
Joe Kislo <postgre@athenium.com> writes:
> Hmm, 1G is probably fine :) But is there going to be a blob type with
> TOAST? If I want to store a large binary object, and have the ability
> to retrieve it strictly over the Postgre database connection, would I
> be retrieving a blob column, or a really long varchar column?

If you want binary (8-bit-clean) data, you need to use the 'bytea' datatype, not 'varchar'. Our character datatypes don't cope with embedded nulls presently. This is primarily an issue of the external representation as a C string.

Alternatively, you can keep using the old-style large-object support (lo_read, lo_write, etc). This may be handy if you are dealing with blobs large enough that you don't want to read or write the entire value on every access. We need to add that capability to bytea too, by defining some access functions that allow reading and writing portions of a bytea value --- but no one's gotten round to that yet, so I don't suppose it'll happen for 7.1.

			regards, tom lane
> > That is what transactions are for. If any errors occur, then the
> > transaction is aborted. You are supposed to use transactions when you want
> > either everything to occur (the whole transaction), or nothing, if an error
> > occurs.
>
> Yes. There are certainly times when a transaction needs to be
> ABORTed. However, there are many reasons why the database should not
> abort a transaction if it does not need to.

I disagree. You shouldn't be using transactions in the first place if you didn't want the sequence to abort when an error occurs.

> There is obviously no
> reason why a transaction needs to be aborted for syntax errors.

I beg to differ. For a start, invalid SQL is a GREAT way to irreversibly corrupt your data. And you should test your SQL to make sure it doesn't produce syntax errors before you get as far as putting it into a transaction.

Here's an example:

A bank is transferring money from one account to another. Say the money leaves the first account (first update query), and then an error occurs when inserting the money into the second account (second update query). If you have debited the first account and committed the change despite the second error, the money would have left the first account, but it wouldn't have appeared in the second account. This would be irreversible, and it would take lots of man-hours of following the paper trail (if there is one) to find where things went wrong, if it could be found at all. (A plain-JDBC version of this all-or-nothing pattern is sketched after this message.)

That is the whole point of transactions - they are used for an "all-or-nothing" approach.

> There
> is obviously no reason why a transaction needs to be aborted for, say,
> trying to insert a duplicate primary key.

It is not obvious at all. In fact, I can see why it obviously shouldn't be done. Say you want a phone installed. Your phone number should be unique, and it is conceivable that it could be the primary key for the database that stores phone numbers. The engineer types in the wrong number by accident. The transaction succeeds, and you end up with two phones with the same number. BAD thing.

I could sit here and list examples endlessly, but let's not clog up the list with this too much.

> The -insert- can fail, report
> it as such, and the application can determine if a rollback is
> necessary. If you don't believe me, here are two fully SQL-92 compliant
> databases, Oracle and Interbase, which do not exhibit this behavior:

[example snipped]

So, what would you like to be the criteria for aborting or proceeding with a transaction?

> > If you don't like this behaviour, then use auto-commit, and make every
> > separate statement a transaction in itself. That way if any of the
> > statements fails, the next one won't be aborted. This, however, depending
> > on the error you get, could cause massive irreversible data corruption. But
> > then again, if this is a risk, you should be using transactions which abort
> > the whole block on any error.
>
> Auto-commit is not the same thing, though. That would make each
> statement a transaction. I don't want that; I want the ability to
> group a set of statements and commit them or roll them back as
> a whole. I do not, however, want the transaction aborted by the server
> when it does not need to be. Clearly in the above case, neither
> Interbase nor Oracle decided that the transaction had to be aborted.
>
> This has to be an option, no?

Hmm... Fair point. There might be an option for this. I don't know, as I never used transactions this way (or tried to, for that matter).
I agree that it could be useful to have some sort of an "evaluation" stage before committing the transaction, where the application would see what (if any) errors have occurred in the transaction stage, and upon that decide whether it really wants to commit or roll back.

> > Several ways. You can set up a shared network area, sort out a unique
> > file-naming system (which shouldn't be too hard), and send some sort of a
> > "URL" as a pointer to the file.
>
> Ahhh. Unfortunately that is an unacceptable solution :(. So that means
> there is no large binary storage available in Postgre for me.

I am not sure if there is another way. There might be, but I am not aware of it at the moment.

> > Alternatively, wait for v7.1 (development tree available), which will
> > support big record sizes (unlimited, AFAIK). Depending on what you are
> > trying to do, BLOBs may or may not be the ideal thing, but sometimes they
> > are the only way to store large chunks of data. 7.1 will hopefully sort
> > that out, as I have bounced my head off the record size limit a few times
> > myself.
>
> Hmm, I really hope the 7.1 implementation of blobs is a true blob, and
> not just a really long varchar or something. I need to store
> arbitrarily large binary data and be able to retrieve it over the
> database connection. I'm really surprised there isn't a facility for
> this already... Our application couldn't run on Postgre without it!

That's fair enough. I myself got the feeling that BLOBs in 7.0 were a quick kludge rather than a permanent solution (not criticising anyone here!). But then again - what is the difference between an encoded varchar and a big binary type? If you get the data you want in a data object, what difference does it make how it happens? It's all just numbers to a computer anyway. ;-) (Or am I wrong here?)

Regards.

Gordan
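For reference, the all-or-nothing pattern argued for in the bank example reads like this in plain JDBC; the account table and its columns are invented for illustration:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// All-or-nothing transfer: if either UPDATE fails, the whole transaction is
// rolled back and neither account is touched.
public class TransferDemo {
    static void transfer(Connection conn, int from, int to, int amount)
            throws SQLException {
        conn.setAutoCommit(false);
        Statement st = conn.createStatement();
        try {
            st.executeUpdate("update account set balance = balance - "
                    + amount + " where id = " + from);
            st.executeUpdate("update account set balance = balance + "
                    + amount + " where id = " + to);
            conn.commit();   // both changes become visible together
        } catch (SQLException e) {
            conn.rollback(); // money never leaves the first account
            throw e;
        } finally {
            st.close();
        }
    }
}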
> > That is what transactions are for. If any errors occur, then the
> > transaction is aborted. You are supposed to use
> > transactions when you want either everything to occur
> > (the whole transaction), or nothing, if an error occurs.
>
> Yes. There are certainly times when a transaction needs to be
> ABORTed. However, there are many reasons why the database should not
> abort a transaction if it does not need to. There is obviously no
> reason why a transaction needs to be aborted for syntax errors. There
> is obviously no reason why a transaction needs to be aborted for, say,
> trying to insert a duplicate primary key. The -insert- can
> fail, report it as such, and the application can determine if a rollback
> is necessary. If you don't believe me, here are two fully SQL-92
> compliant databases, Oracle and Interbase, which do not exhibit this behavior:

Oracle & Interbase have savepoints. Hopefully PG will also have them in 7.2.

Vadim
On 30 Nov 2000, at 11:58, Joe Kislo wrote:
> If you don't believe me, here are two fully SQL-92
> compliant databases, Oracle and Interbase, which do not exhibit this
> behavior:

Ummm... having lots of experience w/it, I can say many things about Oracle, but "fully SQL-92 compliant" sure isn't one of them. :-)

--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)
Joel Burton wrote:
>
> On 30 Nov 2000, at 11:58, Joe Kislo wrote:
> > If you don't believe me, here are two fully SQL-92
> > compliant databases, Oracle and Interbase, which do not exhibit this
> > behavior:
>
> Ummm... having lots of experience w/it, I can say many things
> about Oracle, but "fully SQL-92 compliant" sure isn't one of them. :-)

Nice! I was just reading some article benchmarking Postgre against Interbase. They said Interbase was fully SQL-92 compliant, so I just, well gosh, assumed its expensive brethren were compliant too :)

Does anybody know of an article benchmarking Interbase against Postgre 7? The article I was reading was comparing it to Postgre 6.5. I assume (hope) there have been some serious speed improvements since then?

-Joe
> Joel Burton wrote:
> >
> > On 30 Nov 2000, at 11:58, Joe Kislo wrote:
> > > If you don't believe me, here are two fully SQL-92
> > > compliant databases, Oracle and Interbase, which do not exhibit
> > > this behavior:
> >
> > Ummm... having lots of experience w/it, I can say many things
> > about Oracle, but "fully SQL-92 compliant" sure isn't one of them.
> > :-)
>
> Nice! I was just reading some article benchmarking Postgre against
> Interbase. They said Interbase was fully SQL-92 compliant, so I
> just, well gosh, assumed its expensive brethren were compliant too
> :)

Well, Oracle is "entry-level SQL92 compliant", but there's still lots of wiggle room for nonstandard, nonfunctional stuff. Not that it's a bad database--I always think of it warmly as a Soviet tank (large, heavy, ugly, and gets the job done by flattening lots of stuff).

IMHO, it's *less* SQL compliant on many points than PostgreSQL. (See the discussion in the last month about NULL vs. empty-string handling, where Oracle clearly fails an important SQL standard that we pass w/flying colors.)

Although I think it's biased in many regards, MySQL's crash-me (www.mysql.com) has lots of direct comparison info about many different databases. They tend to overplay lots of minor things (like billions of obscure functions), and list critical things like, oh, transactions, subselects, procedural languages and such in one line, but, still, it has lots of info.

What's nice about PostgreSQL is that, while it hasn't always had every SQL92 feature (like outer joins, etc.), it seems to have less legacy, nonstandard stuff wired in. :-)

--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)
Hi,

...
> > > That is what transactions are for. If any errors occur, then the
> > > transaction is aborted. You are supposed to use transactions when you want
> > > either everything to occur (the whole transaction), or nothing, if an
> > > error occurs.

And that's wrong! The caller should have a chance to handle the error. For example, if an "insert" fails, you might want to use "update" instead. It should be the caller who decides if the transaction should be aborted ("rollback") or not. As it is now, transactions are _totally_ useless with databases that serve more than one client.

...
> > There is obviously no
> > reason why a transaction needs to be aborted for syntax errors.

Absolutely correct. It should be the caller who decides what he wants to do with the transaction (rollback, or just continue as if nothing happened).

...
> A bank is transferring money from one account to another. Say the money
> leaves the first account (first update query), and then an error occurs
> when inserting the money into the second account (second update query). If
...

Schematic code snippet:

BEGIN;
update account set credit = credit + 100;
if( error ) {
    insert into account (credit, debet) VALUES( 100, 0 );
    if( error ) {
        ROLLBACK;
        return FAILED;
    }
}
update account set debet = debet + 100;
if( error ) {
    insert into account (credit, debet) VALUES( 0, 100 );
    if( error ) {
        ROLLBACK;
        return FAILED;
    }
}
COMMIT;

That is the _correct_ way to do a bank transaction. And that is how transactions should work.

...
> That is the whole point of transactions - they are used for an
> "all-or-nothing" approach.

Correct, but it should be the caller who decides what to do. Not the database.

...
> The transaction succeeds, and you end up with two phones with the same
> number. BAD thing.

You're still wrong about the correct database behaviour. It should be the caller's decision, not the database's.

> > necessary. If you don't believe me, here are two fully SQL-92 compliant
> > databases, Oracle and Interbase, which do not exhibit this behavior:

I do not give a sh** about SQL9_. There is nothing that forbids a database from being better than something.

...
> So, what would you like to be the criteria for aborting or proceeding with
> a transaction?

The database should not try to guess what I want to do with a transaction. It should report all errors to me (the caller) and let me decide what to do with the transaction, period.

...
> > > If you don't like this behaviour, then use auto-commit, and make every

And that's stupid.

...
> > grouping a set of statements and committing them or rolling them back as
> > a whole. I do not, however, want the transaction aborted by the server

That's how it should be.

...
> > when it does not need to be. Clearly in the above case, neither
> > Interbase nor Oracle decided that the transaction had to be aborted.

Neither does Sybase or MSSQL.

// Jarmo
Joel Burton writes:

> What's nice about PostgreSQL is that, while it hasn't always had
> every SQL92 feature (like outer joins, etc.), it seems to have less
> legacy, nonstandard stuff wired in. :-)

Oh man, you have noooo idea. PostgreSQL is legacy headquarters. ;-)

--
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
> > What's nice about PostgreSQL is that, while it hasn't always had
> > every SQL92 feature (like outer joins, etc.), it seems to have less
> > legacy, nonstandard stuff wired in. :-)
>
> Oh man, you have noooo idea. PostgreSQL is legacy headquarters. ;-)

Yes, yes, I know about *some* of them [the 8k limit springs to mind!] (C hackers no doubt know *lots* more.)

But, in terms of what comes out in our SQL syntax, compared to Oracle, we're free and clear.

--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)
On Thu, Nov 30, 2000 at 12:16:39PM -0800, Mikheev, Vadim wrote:
> Oracle & Interbase have savepoints. Hopefully PG will also have them in 7.2.

A feature that I liked from using Faircom's Ctree (granted, not an SQL-based DB or one with built-in relations) was auto-save points. So, if something failed, it could be rolled back to the previous auto-save point.

Just food for thought on how Ctree works. A transaction, by default, will fail on the commit if there were any errors within the transaction (though it would happily process all of your commands after an error without additional failures, so it was less verbose than the original psql demonstration at the beginning of this thread). Also, by default, there are no auto-save points. One could turn on auto-save points. (If one wanted "normal" save-point activities, you would get the save-point counter number and then roll back to that particular save point at some time.)

This was convenient if you wanted to just roll back the last operation that caused the error (this may have had the side effect of unmarking the fact that an error occurred, but I don't think so; there was another command to clear the transaction error, with lots of disclaimers saying that if you did that, you took on your own responsibilities).

Guess, in short, what I'm saying is: if save points are added, might as well add auto-save points while at it, and give the ability to selectively clear the error and allow a transaction to commit anyway (keeping the current method as default, of course).

mrc
--
     Mike Castle          Life is like a clock:  You can work constantly
  dalgoda@ix.netcom.com   and be right all the time, or not work at all
www.netcom.com/~dalgoda/  and be right at least twice a day.  -- mrc

We are all of us living in the shadow of Manhattan.  -- Watchmen
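For illustration, here is what save-point-style recovery looks like through the standard JDBC 3.0 Savepoint API. Neither PostgreSQL nor its driver supported this at the time of this thread; the sketch only shows the behavior being discussed, reusing the test table from the earlier sessions:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Savepoint;
import java.sql.Statement;

// Savepoint-based recovery: only the failed statement is undone, the rest of
// the transaction survives and can still be committed.
public class SavepointDemo {
    static void insertWithRecovery(Connection conn) throws SQLException {
        conn.setAutoCommit(false);
        Statement st = conn.createStatement();
        st.executeUpdate("insert into test values (1)");

        Savepoint sp = conn.setSavepoint("before_dup");
        try {
            st.executeUpdate("insert into test values (1)");  // duplicate key
        } catch (SQLException dup) {
            conn.rollback(sp);  // undo only the failed insert
        }

        st.executeUpdate("insert into test values (2)");
        conn.commit();  // 1 and 2 are committed despite the earlier error
        st.close();
    }
}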
Joe Kislo wrote:

> Yes. There are certainly times when a transaction needs to be
> ABORTed. However, there are many reasons why the database should not
> abort a transaction if it does not need to. There is obviously no
> reason why a transaction needs to be aborted for syntax errors. There
> is obviously no reason why a transaction needs to be aborted for, say,
> trying to insert a duplicate primary key. The -insert- can fail, report
> it as such, and the application can determine if a rollback is
> necessary. If you don't believe me, here are two fully SQL-92 compliant
> databases, Oracle and Interbase, which do not exhibit this behavior:

You're right. But it'd be (up to now) impossible to implement in Postgres. Postgres doesn't record any undo information during the execution of a transaction (like Oracle, for example, does in the rollback segments). The way Postgres works is not to overwrite existing tuples, but to stamp them outdated and insert new ones. In the case of a ROLLBACK, just the stamps made are flagged invalid (in pg_log).

If you do a

    INSERT INTO t1 SELECT * FROM t2;

there could occur a duplicate key error. But if it happens in the middle of all the rows inserted, the first half of the rows is already in t1, stamped by this transaction and waiting to come alive. The only way to not let them show up is to invalidate the entire transaction.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
> Joel Burton writes:
>
> > What's nice about PostgreSQL is that, while it hasn't always had
> > every SQL92 feature (like outer joins, etc.), it seems to have less
> > legacy, nonstandard stuff wired in. :-)
>
> Oh man, you have noooo idea. PostgreSQL is legacy headquarters. ;-)

I had a good laugh at this one. Yes, we are legacy headquarters sometimes, but we don't hesitate to rip things out to improve them.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> Joe Kislo <postgre@athenium.com> writes:
> > Hmm, 1G is probably fine :) But is there going to be a blob type with
> > TOAST? If I want to store a large binary object, and have the ability
> > to retrieve it strictly over the Postgre database connection, would I
> > be retrieving a blob column, or a really long varchar column?
>
> If you want binary (8-bit-clean) data, you need to use the 'bytea'
> datatype, not 'varchar'. Our character datatypes don't cope with
> embedded nulls presently. This is primarily an issue of the
> external representation as a C string.
>
> Alternatively, you can keep using the old-style large-object support
> (lo_read, lo_write, etc). This may be handy if you are dealing with
> blobs large enough that you don't want to read or write the entire
> value on every access. We need to add that capability to bytea too,
> by defining some access functions that allow reading and writing
> portions of a bytea value --- but no one's gotten round to that yet,
> so I don't suppose it'll happen for 7.1.

What I think we _really_ need is a large object interface to TOAST data. We already have a nice API, and even psql local large object handling.

If I have a file that I want loaded in/out of a TOAST column, we really should make a set of functions to do it, just like we do with large objects.

This is an obvious way to load files in/out of TOAST columns, and I am not sure why it has not been done yet. I am afraid we are going to get criticized if we don't have it soon.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> What I think we _really_ need is a large object interface to TOAST data.

Large-object-like, anyway. I dunno if we want to expose TOAST-value OIDs or not. But yes, we need to be able to read and write sections of a large TOASTed data value.

> This is an obvious way to load files in/out of TOAST columns, and I am not
> sure why it has not been done yet.

Because Jan said he'd deal with it, and then he's been distracted by moving and visa problems and so forth. But I expect he'll get it done for 7.2 ...

			regards, tom lane
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > What I think we _really_ need is a large object interface to TOAST data.
>
> Large-object-like, anyway. I dunno if we want to expose TOAST-value
> OIDs or not. But yes, we need to be able to read and write sections
> of a large TOASTed data value.
>
> > This is an obvious way to load files in/out of TOAST columns, and I am not
> > sure why it has not been done yet.
>
> Because Jan said he'd deal with it, and then he's been distracted by
> moving and visa problems and so forth. But I expect he'll get it done
> for 7.2 ...

My concern is that we are introducing a new feature, but not giving people a way to take full advantage of it. In my release message, I will make it clear that TOAST will be a real option for binary files _when_ we get the API working. Without that API, TOAST is really just for long text fields, and bytea for those who can encode their data.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> What I think we _really_ need is a large object interface to TOAST data.
> We already have a nice API, and even psql local large object handling.
>
> If I have a file that I want loaded in/out of a TOAST column, we really
> should make a set of functions to do it, just like we do with large
> objects.
>
> This is an obvious way to load files in/out of TOAST columns, and I am not
> sure why it has not been done yet. I am afraid we are going to get
> criticized if we don't have it soon.

Okay, let me criticize you now then :) (just kidding) Over the past month I've been trying out Postgre for two reasons. I've posted a number of questions to this mailing list, and the Postgre community has been extremely responsive and helpful. Kudos to everybody working on Postgre. Most of my questions have been along the lines of asking why a particular feature works differently than in other databases, or why Postgre seemed to act in an illogical fashion (such as corrupting my database).

First, I was evaluating Postgre for a medium-scale application I will be working on for my current employer. Technically this is re-architecting a current application built on MySQL and Python. I plan to move the application to Java servlets and some database other than MySQL, preferably open source. Postgre, obviously with its reputation, was the beginning of this short list of databases to look at. Unfortunately I quickly discovered this lack of BLOB support. I understand that the C API can read/write -files- off the server's filesystem and load them into the database. Unfortunately we would absolutely require true over-the-wire blob support through JDBC. AFAIK, even with these "toast" columns, it still wouldn't fill that need. The need here is to load binary data from the client, transfer it over the JDBC wire, and store it in the database. Some people earlier suggested a shared NFS partition, then having the server use the existing BLOB support to load the files off disk. That's really not an acceptable solution. So as for using Postgre in this upcoming application, it's really a no-go at this point without that ability. I actually suspect a number of people also have a need to store BLOBs in a database, but maybe it's not as important as I think.

The second reason why I've been working with Postgre is that I'm about to release as open source a Java-based object-database abstraction layer. This layer maps Java objects to a relational database by storing their primitives in database primitives, and using Java reflection to reconstitute objects from the database. This allows you to perform complex joins and such in the -database-, then map to the actual Java objects. When you attach a particular class to a database, you choose the appropriate database adapter (such as one for Oracle or Postgre). These DBAdapters take care of all the DB-specific things, such as native column types, handling auto-incrementing columns (generators or "serial" columns), creating tables, altering tables when class definitions change, database-independent indexing, and blobs. Programmers mostly work at the object layer and don't really worry about the particulars of the underlying database (although they can execute raw SQL if they really need to). So this truly allows an application to be written independently of any particular underlying database (and to my dismay, there appear to be very big differences between these databases!).
This allows you to change your underlying database easily, which means you can choose the database server on its merits, and not because it's been grandfathered into your application :)

Anyway, when implementing the Postgre DBAdapter, I found Postgre to be quite a nice database (and pretty fast too). But there were two issues which keep the Postgre DBAdapter from supporting the full feature set.

1) No blob support. As I described above, it needs to be possible to insert an arbitrarily large (or at least up to, say, 5 megabytes) binary object into the database, and have it accessible by a particular column name in a table. AFAIK, this is not currently possible in Postgre.

2) Postgre does not record rollback segments, which means transactions get ABORTed and rolled back for some odd reasons when they don't normally need to be. For example, if you just send the SQL server some garbage SQL (e.g. ASDF;), your transaction gets aborted and rolled back, even though your garbage SQL didn't touch any rows. At the object layer in the aforementioned database layer, if you try to insert an object into the database and doing so would violate a unique key (such as the primary key), a DuplicateKeyException will be thrown. None of the other database adapters I've implemented, such as MySQL, Interbase or Oracle, will *also* abort the transaction. So if, at the object layer, a DuplicateKeyException is supposed to happen in that case, I would have to look up the database schema for the table before every object is inserted into the database... then confirm, by issuing multiple SQL queries, that no unique keys would be violated by the new record. If they are, throw the DuplicateKeyException; if not, insert the record. But even that won't catch all cases, because a different transaction could have an uncommitted row with which the new record conflicts... in which case all my queries would say things are in the clear, but when I go to insert the record, the insert would block waiting on the other transaction. If that other transaction rolls back, we're in the clear; if it commits, Postgre says there's a key conflict, ABORTs the current transaction, and rolls it back. Eek. In which case the database layer still isn't throwing a DuplicateKeyException, but a TransactionAborted exception. (This check-then-insert pattern is sketched after this message.)

-GRANTED- a transaction can be aborted at any time, and the application programmer should plan for that, but I think this Postgre "feature" will cause transactions to be aborted unnecessarily, especially if people migrate from another database to Postgre. Of course, people really shouldn't be inserting objects which already exist, but it would still be an inconsistency between Postgre and all the other DBAdapters.

Thoughts?

-Joe
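The check-then-insert approach described above, and the race window that defeats it, sketched in plain JDBC. The table, its columns, and the DuplicateKeyException class are all made up for illustration:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Check-then-insert: a SELECT cannot see another transaction's uncommitted
// row with the same key, so the INSERT can still block and then fail (and,
// under PostgreSQL 7.0, abort the whole transaction).
public class CheckThenInsert {
    static class DuplicateKeyException extends Exception {
        DuplicateKeyException(String msg) { super(msg); }
    }

    static void insertIfAbsent(Connection conn, int key, String name)
            throws SQLException, DuplicateKeyException {
        PreparedStatement check = conn.prepareStatement(
                "select 1 from widget where id = ?");
        check.setInt(1, key);
        ResultSet rs = check.executeQuery();
        boolean exists = rs.next();
        rs.close();
        check.close();
        if (exists) {
            throw new DuplicateKeyException("key " + key + " already exists");
        }

        // Race window: a concurrent transaction may already hold an
        // uncommitted row with this key.
        PreparedStatement ins = conn.prepareStatement(
                "insert into widget (id, name) values (?, ?)");
        ins.setInt(1, key);
        ins.setString(2, name);
        ins.executeUpdate();  // can still hit a duplicate-key error here
        ins.close();
    }
}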
> > What I think we _really_ need is a large object interface to TOAST data.
> > We already have a nice API, and even psql local large object handling.
> >
> > If I have a file that I want loaded in/out of a TOAST column, we really
> > should make a set of functions to do it, just like we do with large
> > objects.
> >
> > This is an obvious way to load files in/out of TOAST columns, and I am not
> > sure why it has not been done yet. I am afraid we are going to get
> > criticized if we don't have it soon.
>
> Okay, let me criticize you now then :) (just kidding) Over the past
> month I've been trying out Postgre for two reasons. I've posted a
> number of questions to this mailing list, and the Postgre community has
> been extremely responsive and helpful. Kudos to everybody working on
> Postgre. Most of my questions have been along the lines of asking why a
> particular feature works differently than in other databases, or why
> Postgre seemed to act in an illogical fashion (such as corrupting my
> database).

Yes, this was my point. We now have TOAST, but by not going the extra mile to enable storage of binary files, we really aren't taking full advantage of our new TOAST feature.

I can see people saying, "Wow, you can store rows of unlimited length now. Let me store this jpeg. Oh, I can't because it is binary!"

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Joe Kislo writes:

> First, I was evaluating Postgre for a medium-scale application I will

I'm just wondering what this "Postgre" thing is you keep talking about... ;-)

--
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
Joe Kislo <postgre@athenium.com> writes:
> ... this lack of BLOB support. I understand that the C
> API can read/write -files- off the server's filesystem and load them
> into the database. Unfortunately we would absolutely require true
> over-the-wire blob support through JDBC. AFAIK, even with these "toast"
> columns, it still wouldn't fill that need.

This is a misunderstanding. You can still use the old-style large objects (in fact 7.1 has an improved implementation of them too), and there's always been support for either over-the-wire or server-filesystem read and write of large objects. In fact the former is the preferred way; the latter is deprecated because of security issues. In a standard installation you can't do the server-filesystem bit at all unless you are superuser.

The JDBC support for over-the-wire access to large objects used to have some bugs, but AFAIK those are cleaned up in current sources (right Peter?)

Adding a similar feature for TOAST columns will certainly be a notational improvement, but it won't add any fundamental capability that isn't there already.

> 2) Postgre does not record rollback segments.

We know this is needed. But it will not happen for 7.1, and there's no point in complaining about that; 7.1 is overdue already.

			regards, tom lane
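A sketch of the over-the-wire path described above, using only standard JDBC calls. Whether the driver maps setBinaryStream/getBinaryStream on an oid column onto its large-object support is a driver detail that is assumed here rather than guaranteed, and the table definition (create table images (name text, data oid)) is invented:

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Binary data in and out over the database connection, no server-side files.
public class BlobOverTheWire {
    static void store(Connection conn, String name, byte[] bytes)
            throws SQLException {
        conn.setAutoCommit(false);  // large-object access runs inside a transaction
        PreparedStatement ps = conn.prepareStatement(
                "insert into images (name, data) values (?, ?)");
        ps.setString(1, name);
        ps.setBinaryStream(2, new ByteArrayInputStream(bytes), bytes.length);
        ps.executeUpdate();
        conn.commit();
        ps.close();
    }

    static InputStream load(Connection conn, String name) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(
                "select data from images where name = ?");
        ps.setString(1, name);
        ResultSet rs = ps.executeQuery();
        // Caller should read the stream before closing the statement/connection.
        return rs.next() ? rs.getBinaryStream(1) : null;
    }
}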
What is the tentative date for the 7.1 release? What is the release date for replication?

sandeep
> This is a misunderstanding. You can still use the old-style large
> objects (in fact 7.1 has an improved implementation of them too),
> and there's always been support for either over-the-wire or
> server-filesystem read and write of large objects. In fact the former
> is the preferred way; the latter is deprecated because of security
> issues. In a standard installation you can't do the server-filesystem
> bit at all unless you are superuser.

I know we haven't talked about the TOAST/binary interface, but one idea I had was to load the binary into the large object interface, then automatically somehow transfer it to the TOAST column. Same for extracting large objects.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> Yes, this was my point. We now have TOAST, but by not going the extra
> mile to enable storage of binary files, we really aren't taking full
> advantage of our new TOAST feature.
>
> I can see people saying, "Wow, you can store rows of unlimited length
> now. Let me store this jpeg. Oh, I can't because it is binary!"

Well, to me it seems that, when TOAST becomes available (i.e. when the looooong awaited, most desired, most bloated, world-conquering 7.1 version comes out...), 90% of the work is already done to also support column-style BLOBs... at least for web applications, which are incidentally my focus.

Any web programmer worth his salt could put up a simple layer that does base64 encode/decode and use "CLOBs" (I think TOAST columns could be called that way, right?)... and he should write some interface for file uploading/downloading anyway, since his clients are using a browser as their frontend. Using PHP, it's no more than a few rows of code.

Granted, base64 encoding can waste a LOT of space, but it looks like a Columbus' egg in this scenario. Maybe base64 could also be a quick way to write a binary "patch" for TOAST so it would be binary-compatible "natively"? Or am I saying a lot of bullsh*t? :-)

Just wanted to share some thoughts. Merry Christmas to everybody...

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Fabrizio Ermini                   Alternate E-mail:
C.so Umberto, 7                   faermini@tin.it
loc. Meleto Valdarno              Mail on GSM: (keep it short!)
52020 Cavriglia (AR)              faermini@sms.tin.it
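The base64 workaround sketched in Java rather than PHP, to line up with the JDBC discussion earlier in the thread. It uses the modern java.util.Base64 class purely for illustration (nothing the original posters had available), and the table and column names are invented:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Base64;

// Store binary data in an ordinary (TOASTable) text column by encoding it to
// base64 on the way in and decoding on the way out.  Costs roughly a third
// more space, as noted above.
public class Base64Workaround {
    static void store(Connection conn, String name, byte[] bytes)
            throws SQLException {
        PreparedStatement ps = conn.prepareStatement(
                "insert into files (name, body) values (?, ?)");
        ps.setString(1, name);
        ps.setString(2, Base64.getEncoder().encodeToString(bytes));
        ps.executeUpdate();
        ps.close();
    }

    static byte[] load(Connection conn, String name) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(
                "select body from files where name = ?");
        ps.setString(1, name);
        ResultSet rs = ps.executeQuery();
        byte[] data = rs.next() ? Base64.getDecoder().decode(rs.getString(1)) : null;
        rs.close();
        ps.close();
        return data;
    }
}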
Quoting Tom Lane <tgl@sss.pgh.pa.us>:

> The JDBC support for over-the-wire access to large objects used to
> have some bugs, but AFAIK those are cleaned up in current sources
> (right Peter?)

Yes, except for DatabaseMetaData.getTables(), but that's not directly to do with large objects. As long as things settle down here by Saturday, I'll be sorting out what's outstanding...

Peter

--
Peter Mount peter@retep.org.uk
PostgreSQL JDBC Driver: http://www.retep.org.uk/postgres/
RetepPDF PDF library for Java: http://www.retep.org.uk/pdf/