Thread: Unanswered questions about Postgre

Unanswered questions about Postgre

From
Joe Kislo
Date:
Sorry if this is a repost, but it appears the postgre mailing list
filtered my original mail without warning.

I have a bunch of questions, and I've looked through the documentation
without finding any answers. I also looked in the FAQ. I'm guessing that
most of these questions -are- FAQs, so maybe they should be put there
after some answers are assembled.  Also, have you guys considered a
searchable version of your documentation, or a version of your
documentation all on one page, so it can be searched in the browser?

1)  In psql, if I make a typo, the transaction is automatically
aborted.  For example:

----

apollodemo=# \d
  List of relations
 Name | Type  | Owner
------+-------+-------
 boo  | table | kislo
(1 row)

apollodemo=# begin;
BEGIN
apollodemo=# create table moo (i int);
CREATE
apollodemo=# Oops this is a typeo;
ERROR:  parser: parse error at or near "oops"
apollodemo=# insert into moo values (1);
NOTICE:  mdopen: couldn't open moo: No such file or directory
NOTICE:  RelationIdBuildRelation: smgropen(moo): No such file or
directory
NOTICE:  mdopen: couldn't open moo: No such file or directory
NOTICE:  mdopen: couldn't open moo: No such file or directory
NOTICE:  mdopen: couldn't open moo: No such file or directory
NOTICE:  current transaction is aborted, queries ignored until end of
transaction block
*ABORT STATE*
apollodemo=# \d
NOTICE:  current transaction is aborted, queries ignored until end of
transaction block
No relations found.
apollodemo=# \d
NOTICE:  current transaction is aborted, queries ignored until end of
transaction block
No relations found.
apollodemo=# commit;
COMMIT
apollodemo=# \d
  List of relations
 Name | Type  | Owner
------+-------+-------
 boo  | table | kislo
(1 row)

apollodemo=#

---

        Arr, that's really annoying when you're in psql, and you just
trashed your entire transaction because you made a typo.  It's even
more ridiculous when your application is actually doing a transaction,
and because an error occurred, the entire transaction is now trashed.
Why exactly does Postgre abort the transaction if there is an error?  It
doesn't have to, and no other RDBMSs do it AFAIK (Interbase/Oracle).
For example: if you start a transaction, do some work, then try to
insert a record which violates a data integrity constraint, the ENTIRE
transaction is now wiped out.  Instead, the application should be able
to handle the integrity constraint problem and keep on going, or -choose-
to roll back the transaction.  I assume this -must- be user-settable; how
do I do this?  And what was the rationale behind automatically aborting
at the first sign of imperfection?

2) In the above example, when I issue the "insert into moo
values (1);", Postgre spews out all manner of errors.  Shouldn't one of
the following happen instead?:
   a) Given that the transaction is ABORTed, it should just entirely
ignore the insert request (as advertised), and thus NOT even bother
querying the operating system for the database file.
   b) Given that the transaction has been ABORTed, it shouldn't bother
querying the OS for the table, since it knows the table doesn't exist.
   c) Given that the table does not exist, and this is what you would
expect, it should not spew operating system errors to the screen.

3)  You have some nice documentation describing transaction isolation,
but you fail to follow through in saying how to CHANGE the transaction
isolation mode.  The docs are here:
http://postgresql.readysetnet.com/users-lounge/docs/v7.0/user/mvcc4564.htm

How do you change the isolation mode?
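
For reference, the SQL-standard syntax, which I believe Postgres also
accepts inside a transaction block (I haven't verified this against 7.0;
"boo" is just the table from the example above), would be:

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;  -- or READ COMMITTED, the default
SELECT * FROM boo;
COMMIT;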

4) I was doing some testing of my Database Abstraction layer, and I was
running the same code between MySQL, PostgreSQL, and Interbase.  I made
an interesting discovery.  After running the test sequence once on
Postgre, on each successive run of the test suite (or "benchmark", since
I was timing it... but it really wasn't a benchmark) Postgre's numbers
got progressively worse.  In fact, on -each- successive run, Postgre's
time-to-completion -doubled-.  This was across the board: on selects by
key, selects not by key, inserts, updates...  Dropping the table and
recreating it got Postgre back to square one again.  I assume this has
to do with the fact that about 1000 records were inserted, and about
1000 rows were deleted (over the course of the test)...  Is this
-normal- for Postgre?  I realize that maybe some sort of transaction
audit needs to be stored, but these transactions are committed... Do we
need that audit anymore?  Plus, should that "audit" so -massively-
affect the entire database's operational speed?  Why should running a
query on a table be -so- -so- much slower if records have been
repeatedly inserted and deleted in it?  We're talking about a massive
slowdown here: this test originally ran in about 7 seconds, and after a
few runs it was taking 2 minutes.  There are no rows left in the table
at the end...  Shouldn't pgsql just overwrite the dead space, and why is
the dead space affecting performance?  (I don't care about disk space
concerns.)  I did notice that vacuuming the database between runs kept
Postgre working at a reasonable pace.  Do people need to vacuum their
databases hourly?  Can you vacuum while a database is in use?  Any
discussion on this curious phenomenon would be appreciated.  It still
boggles me.
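
(By "vacuuming between runs" I mean running something like the following
from psql; "boo" is just a placeholder table name, and I believe VACUUM
ANALYZE additionally refreshes the planner statistics:)

VACUUM;              -- reclaim space held by dead (deleted/updated) rows in all tables
VACUUM ANALYZE boo;  -- same for one table, plus refresh its planner statistics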

5) BLOB Support.  Ugh?  I cannot find anything in any of the manuals
about BLOBs, yet I know pgsql supports them, because I've found lots of
people in the mailing list yelling at people to use BLOBs when they
complain that the max row size is 8k.  Yet no docs (that I can find!).
I did find -one- piece of documentation.  However, it only described how
to load a BLOB from -the hard drive of the server-, and how to retrieve
a BLOB -directly onto the hard drive of the server-.  This, of course,
is entirely useless in a client/server application.  Does pgsql have
BLOB support which can work over the wire?  If so, where are the docs?
If not, isn't this an incredibly massive limitation?  How do people work
around this?

6) Two things about the JDBC driver for Postgre.  First, if you create
a PreparedStatement and are populating its fields, the JDBC driver will
crash if you use setObject to store an object that is null.
So if you write something like this:

void sillyFunction(Integer i, Integer ii) {
  [..]
  preparedStatement.setObject(1, i);
  preparedStatement.setObject(2, ii);
}

And some wise ass passes in a null Integer, the JDBC driver will crash
with a NullPointerException.  Instead, you must first test whether the
object is null; if it isn't, insert it using setObject, otherwise insert
it using setNull.  Is this right?  I've used quite a few other JDBC
drivers, and none of them crash in this manner if you pass
in a null.  I've never had to use setNull explicitly before.
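
The workaround I ended up with looks roughly like this (just a sketch;
the helper class name is mine, and Types.INTEGER is the standard
java.sql type code):

import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

public final class NullSafeBind {
    // Null-safe wrapper around setObject, as described above.
    public static void setNullableInteger(PreparedStatement ps, int index,
                                          Integer value) throws SQLException {
        if (value != null) {
            ps.setObject(index, value);        // normal case
        } else {
            ps.setNull(index, Types.INTEGER);  // what the Postgres driver apparently requires
        }
    }
}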

Second, if you have a ResultSet and try to fetch an int (or anything)
from column 0, the entire JVM goes down.  I am entirely baffled by how
the entire JVM could crash (with a seg fault).  Does your JDBC code use
native code?  I tried two different JVMs (IBM JDK 1.3 and Blackdown
1.2.2), both with and without JITs active.  I also tried the JDBC driver
fresh off the site, as well as the one shipped with Postgre 7.0.3.  All
combinations crash.  Do you guys know why this might be?  What error
checking do you have for this case?
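
(For what it's worth, JDBC result set columns are numbered from 1, so
column 0 is always invalid; a conforming driver should throw an
SQLException rather than take the JVM down.  A sketch, with rs being an
existing ResultSet:)

int first = rs.getInt(1);   // first column -- valid
int bad   = rs.getInt(0);   // no column 0 in JDBC -- should raise SQLException, not segfault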


Thanks guys,

-Joe

Re: Unanswered questions about Postgre

From
"Joel Burton"
Date:
[re: question #4, speed/vacuuming]

> Do
> people need to vacuum their databases hourly?  Can you vacuum while
> a database is in use?  Any discussion on this curious phenomenon would
> be appreciated.  It still boggles me.

I vacuum twice a day, once in the dead of night, once around
lunch. Yes, you can vacuum while the db is in use, but many locks
(for updates, inserts, etc.) will hold up the vacuum.

> 5) BLOB Support.

Keep in mind that pgsql 7.1 (about to go beta tomorrow, if I heard
right) will support much longer row sizes than 8k. Doesn't remove
the need for blobs for many of us, but it fixed my problems.

I believe the docs discussing the C-level interfaces talk about lo
creation and such. Have you looked in the low-level docs in the
programmer/developer manuals?
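
From psql you can also get at large objects without touching the
server's filesystem; if I remember the meta-commands right (the file
names, table and OID below are only placeholders), it looks something
like:

CREATE TABLE images (name text, image_oid oid);
\lo_import '/tmp/photo.jpg'
-- psql reports the OID of the new large object; suppose it printed 16677
INSERT INTO images VALUES ('photo', 16677);
\lo_export 16677 '/tmp/photo-copy.jpg'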

I have only played w/blobs; others can speak better about their
use/limitations, but if I have it correct:
. blobs cannot be dumped
. blobs are not normally vacuumed

So, for most of us, I think the TOAST feature of 7.1 that allows >8k
row sizes is much nicer. (Unless, of course, you really want to store
binary data, not just long text fields.)


Good luck,

--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)

Re: Unanswered questions about Postgre

From
"Igor V. Rafienko"
Date:
on Nov 29, 2000, 19:17, Joel Burton std::cout'ed:

[snip]

| > 5) BLOB Support.
|
| Keep in mind the pgsql 7.1 (about to go beta tomorrow, if I heard
| right) will support much longer row sizes than 8k. Doesn't remove
| the needs for blobs for many of us, but fixed my problems.


How _much_ longer? (Sorry if it's a FAQ, in that case, I'd appreciate a
pointer/URL).






ivr
--
Intelligence is relative. Compared to T*, he's a genius.
             --    James Kanze on "Smart Pointer"


Re: Unanswered questions about Postgre

From
"Joel Burton"
Date:

On 30 Nov 2000, at 1:24, Igor V. Rafienko wrote:

> on Nov 29, 2000, 19:17, Joel Burton std::cout'ed:
>
> [snip]
>
> | > 5) BLOB Support.
> |
> | Keep in mind that pgsql 7.1 (about to go beta tomorrow, if I heard
> | right) will support much longer row sizes than 8k. Doesn't remove
> | the need for blobs for many of us, but it fixed my problems.
>
>
> How _much_ longer? (Sorry if it's a FAQ, in that case, I'd appreciate
> a pointer/URL).

Dunno, but I've been using 7.1devel for ~2 months, and so far,
longer rows seem to work fine.

More information on the TOAST project is at
http://www.postgresql.org/projects/devel-toast.html

--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)

Re: Unanswered questions about Postgre

From
Tom Lane
Date:
>> How _much_ longer? (Sorry if it's a FAQ, in that case, I'd appreciate
>> a pointer/URL).

> Dunno, but I've been using 7.1devel for ~2 months, and so far,
> longer rows seem to work fine.

> More information on the TOAST project is at
> http://www.postgresql.org/projects/devel-toast.html

I think I pontificated about this a month or two back, so check the
archives; but the short answer is that the effective limit under TOAST
is not on the total amount of data in a row, but just on the number of
columns.  The master copy of the row still has to fit into a block.
Worst case, suppose every one of your columns is "wide" and so gets
pushed out to BLOB storage.  The BLOB pointer that still has to fit
into the main row takes 32 bytes.  With a maximum main row size of 8K,
you can have about 250 columns.  In practice, probably some of your
columns would be ints or floats or booleans or something else that
takes up less than 32 bytes, so the effective limit is probably
order-of-magnitude-of 1000 columns in a table.

If that seems too small, maybe you need to rethink your database design
;-)

There's also a 1G limit on the size of an individual BLOB that can be
part of a row.

            regards, tom lane

Re: Unanswered questions about Postgre

From
Joe Kislo
Date:
> That is what transactions are for. If any errors occur, then the
> transaction is aborted. You are supposed to use transactions when you want
> either everything to occur (the whole transaction), or nothing, if an error
> occurs.

    Yes.  There are certainly times when a transaction needs to be
ABORTed.  However, there are many reasons why the database should not
abort a transaction if it does not need to.  There is obviously no
reason why a transaction needs to be aborted for syntax errors.  There
is obviously no reason why a transaction needs to be aborted for, say,
trying to insert a duplicate primary key.  The -insert- can fail, report
it as such, and the application can determine if a rollback is
necessary.   If you don't believe me, here's two fully SQL-92 compliant
databases, Oracle and Interbase, which do not exhibit this behavior:

-Oracle-


SQL> create table test (i int, primary key (i));

Table created.

SQL> insert into test values (1);

1 row created.

SQL> insert into test values (1);
insert into test values (1)
*
ERROR at line 1:
ORA-00001: unique constraint (TEST.SYS_C001492) violated

SQL> insert into test values (2);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from test;

     I
----------
     1
     2

SQL>


--Interbase--

SQL> create table test (i int not null, primary key (i));
SQL> insert into test values (1);
SQL> insert into test values (1);
Statement failed, SQLCODE = -803

violation of PRIMARY or UNIQUE KEY constraint "INTEG_156" on table
"TEST"
SQL> insert into test values (2);
SQL> commit;
SQL> select * from test;

           I
============

           1
           2

SQL>


> If you don't like this behaviour, then use auto-commit, and make every
> separate statement a transaction in itself. That way if any of the
> statements fails, the next one won't be aborted. This, however, depending
> on the error you get could cause massive irreversible data corruption. But
> then again, if this is a risk, you should be using transactions which abort
> the whole block on any error.

    Auto-commit is not the same thing though.  That would make each
statement a transaction.  I don't want that; I want the ability to
group a set of statements and commit them or roll them back as
a whole.  I do not, however, want the transaction aborted by the server
when it does not need to be.  Clearly in the above case, neither
Interbase nor Oracle decided that the transaction had to be aborted.

    This has to be an option, no?

> Several ways. You can set up a shared network area, sort out unique
> file-naming system (which shouldn't be too hard), and send some sort of a
> "URL" as a pointer to the file.

    Ahhh.  Unfortunately that is an unacceptable solution :(.  So that means
there is no large binary storage available in Postgre for me.


> Alternatively, wait for v7.1 (development tree available), which will
> support big record sizes (unlimited, AFAIK). Depending on what you are
> trying to do, BLOBS may or may not be the ideal thing, but sometimes they
> are the only way to store large chunks of data. 7.1 will hopefully sort
> that out, as I have bounced my head off the record size limit a few times
> myself.

    Hmm, I really hope the 7.1 implementation of blobs is a true blob, and
not just a really long varchar or something.  I need to store
arbitrarily large binary data, and be able to retrieve it over the
database connection.  I'm really surprised there isn't a facility for
this already...  Our application couldn't run on postgre without it!

Thanks for the reply, although I'm disappointed about the lack of blob
support :(

-Joe

Re: Unanswered questions about Postgre

From
Joe Kislo
Date:
> If that seems too small, maybe you need to rethink your database design
> ;-)
>
> There's also a 1G limit on the size of an individual BLOB that can be
> part of a row.

    Hmm, 1G is probably fine :)   But is there going to be a blob type with
toast?  If I want to store a large binary object, and have the ability
of retrieving it strictly over the postgre database connection, would I
be retrieving a blob column, or a really long varchar column?

Thanks,

-Joe

Re: Unanswered questions about Postgre

From
Tom Lane
Date:
Joe Kislo <postgre@athenium.com> writes:
>     Hmm, 1G is probably fine :)   But is there going to be a blob type with
> toast?  If I want to store a large binary object, and have the ability
> of retrieving it strictly over the postgre database connection, would I
> be retrieving a blob column, or a really long varchar column?

If you want binary (8-bit-clean) data, you need to use the 'bytea'
datatype not 'varchar'.  Our character datatypes don't cope with
embedded nulls presently.  This is primarily an issue of the
external representation as a C string.

Alternatively, you can keep using the old-style large-object support
(lo_read, lo_write, etc).  This may be handy if you are dealing with
blobs large enough that you don't want to read or write the entire
value on every access.  We need to add that capability to bytea too,
by defining some access functions that allow reading and writing
portions of a bytea value --- but no one's gotten round to that yet,
so I don't suppose it'll happen for 7.1.

            regards, tom lane

Re: Unanswered questions about Postgre

From
"Gordan Bobic"
Date:
> > That is what transactions are for. If any errors occur, then the
> > transaction is aborted. You are supposed to use transactions when you
> > want either everything to occur (the whole transaction), or nothing,
> > if an error occurs.
>
> Yes.  There are certainly times when a transaction needs to be
> ABORTed.  However, there are many reasons why the database should not
> abort a transaction if it does not need to.

I disagree. You shouldn't be using transactions in the first place if you
didn't want the sequence to abort when an error occurs.

> There is obviously no
> reason why a transaction needs to be aborted for syntax errors.

I beg to differ. For a start, invalid SQL is a GREAT way to irreversibly
corrupt your data. And you should test your SQL to make sure it doesn't
produce syntax errors before you get as far as putting it into a
transaction.

Here's an example:

A bank is transferring money from one account to another. Say the money
leaves the first account (first update query), and then an error occurs
when inserting the money into the second account (second update query). If
you have debited the first account and committed the change despite the
second error, the money would have left the first account, but it wouldn't
have appeared in the second account. This would be irreversible, and would
take lots of man-hours of following the paper trail (if there is one) to
find where things went wrong, if it could be found at all.

That is the whole point of transactions - they are used for an
"all-or-nothing" approach.

> There
> is obviously no reason why a transaction needs to be aborted for say,
> trying to insert a duplicate primary key.

It is not obvious at all. In fact, I can see why it obviously shouldn't be
done. Say you want a phone installed. Your phone number should be unique,
and it is conceivable that it could be the primary key for the database that
stores phone numbers. The engineer types in the wrong number by accident.
The transaction succeeds, and you end up with two phones with the same
number. BAD thing.

I could sit here and list examples endlessly, but let's not clog up the
list with this too much.

> The -insert- can fail, report
> it as such, and the application can determine if a rollback is
> necessary.   If you don't believe me, here's two fully SQL-92 compliant
> databases, Oracle and interbase, which do not exhibit this behavior:

[example snipped]

So, what would you like to be the criteria for aborting or proceeding with
a transaction?

> > If you don't like this behaviour, then use auto-commit, and make every
> > separate statement a transaction in itself. That way if any of the
> > statements fails, the next one won't be aborted. This, however,
> > depending on the error you get could cause massive irreversible data
> > corruption. But then again, if this is a risk, you should be using
> > transactions which abort the whole block on any error.
>
> Auto-commit is not the same thing though.  That would make each
> statement a transaction.  I don't want that, I want the ability of
> grouping a set of statements and committing them or rolling them back as
> a whole.  I do not, however, want the transaction aborted by the server
> when it does not need to be.  Clearly in the above case, neither
> interbase nor oracle decided that the transaction had to be aborted.
>
> This has to be an option no?

Hmm... Fair point. There might be an option for this. I don't know, as I
never used transactions this way (or tried to, for that matter). I agree
that it could be useful to have some sort of an "evaluation" stage before
committing the transaction, where the application would see what (if any)
errors have occurred in the transaction stage, and based on that decide
whether it really wants to commit or roll back.

> > Several ways. You can set up a shared network area, sort out a unique
> > file-naming system (which shouldn't be too hard), and send some sort
> > of a "URL" as a pointer to the file.
>
> Ahhh.  Unfortunately that is an unacceptable solution :(.  So that means
> there is no large binary storage available in Postgre for me.

I am not sure if there is another way. There might be, but I am not aware
of it at the moment.

> > Alternatively, wait for v7.1 (development tree available), which will
> > support big record sizes (unlimited, AFAIK). Depending on what you are
> > trying to do, BLOBS may or may not be the ideal thing, but sometimes
> > they are the only way to store large chunks of data. 7.1 will hopefully
> > sort that out, as I have bounced my head off the record size limit a
> > few times myself.
>
> Hmm, I really hope the 7.1 implementation of blobs is a true blob, and
> not just a really long varchar or something.  I need to store
> arbitrarily large binary data, and be able to retrieve it over the
> database connection.  I'm really surprised there isn't a facility for
> this already...  Our application couldn't run on postgre without it!

That's fair enough. I myself got a feeling that BLOBs in 7.0 were a quick
kludge rather than a permanent solution (not criticising anyone here!).

But then again - what is the difference between an encoded varchar and a
big binary type? If you get the data you want in a data object, what
difference does it make how it happens? It's all just numbers to a computer
anyway. ;-)
(or am I wrong here?)

Regards.

Gordan


RE: Unanswered questions about Postgre

From
"Mikheev, Vadim"
Date:
> > That is what transactions are for. If any errors occur, then the
> > transaction is aborted. You are supposed to use
> > transactions when you want either everything to occur
> > (the whole transaction), or nothing, if an error occurs.
>
>     Yes.  There are certainly times when a transaction needs to be
> ABORTed.  However, there are many reasons why the database should not
> abort a transaction if it does not need to.  There is obviously no
> reason why a transaction needs to be aborted for syntax errors.  There
> is obviously no reason why a transaction needs to be aborted for say,
> trying to insert a duplicate primary key.  The -insert- can
> fail, report it as such, and the application can determine if a rollback
> is necessary. If you don't believe me, here's two fully SQL-92
> compliant databases, Oracle and interbase, which do not exhibit this
> behavior:

Oracle & Interbase have savepoints. Hopefully PG will also have them in 7.2

Vadim

Re: Unanswered questions about Postgre

From
"Joel Burton"
Date:

On 30 Nov 2000, at 11:58, Joe Kislo wrote:
> If you don't believe me, here's two fully SQL-92
> compliant databases, Oracle and interbase, which do not exhibit this
> behavior:

Ummm... having lots of experience w/it, I can say many things
about Oracle, but "fully SQL-92 compliant" sure isn't one of them. :-)

--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)

Re: Unanswered questions about Postgre

From
Joe Kislo
Date:
Joel Burton wrote:
>
> On 30 Nov 2000, at 11:58, Joe Kislo wrote:
> > If you don't believe me, here's two fully SQL-92
> > compliant databases, Oracle and interbase, which do not exhibit this
> > behavior:
>
> Ummm... having lots of experience w/it, I can say many things
> about Oracle, but "fully SQL-92 compliant" sure isn't one of them. :-)

    Nice!  I was just reading some article benchmarking Postgre against
Interbase.  They said Interbase was fully SQL-92 compliant, so I just,
well gosh, assumed its expensive brethren were compliant too :)

    Does anybody know of an article benchmarking Interbase against Postgre
7?  The article I was reading was comparing it to Postgre 6.5.  I assume
(hope) there have been some serious speed improvements since then?

-Joe

Re: Unanswered questions about Postgre

From
"Joel Burton"
Date:
> Joel Burton wrote:
> >
> > On 30 Nov 2000, at 11:58, Joe Kislo wrote:
> > > If you don't believe me, here's two fully SQL-92
> > > compliant databases, Oracle and interbase, which do not exhibit
> > > this behavior:
> >
> > Ummm... having lots of experience w/it, I can say many things
> > about Oracle, but "fully SQL-92 compliant" sure isn't one of them.
> > :-)
>
>  Nice!  I was just reading some article benchmarking Postgre against
> Interbase.  They said Interbase was fully SQL-92 compliant, so I
> just, well gosh, assumed its expensive brethren were compliant too
> :)

Well, Oracle is "entry-level SQL92 compliant", but there's still lots of
wiggle room for nonstandard, nonfunctional stuff. Not that it's a bad
database--I always think of it warmly as a Soviet Tank (large,
heavy, ugly, and gets the job done by flattening lots of stuff.)
IMHO, it's *less* SQL compliant on many points than PostgreSQL.
(See the discussion in the last month about NULL v empty-string
handling in strings, where Oracle clearly fails an important SQL
standard that we pass w/flying colors.)

Although I think it's biased in many regards, MySQL's crash-me
(www.mysql.com) has lots of direct comparison info about many
different databases. They tend to overplay lots of minor things (like
billions of obscure functions), and list critical things like, oh,
transactions, subselects, procedural languages and such in one line,
but, still, it has lots of info.

What's nice about PostgreSQL is that, while it hasn't always had
every SQL92 feature (like outer joins, etc.), it seems to have less
legacy, nonstandard stuff wired in. :-)

--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)

SV: Unanswered questions about Postgre

From
"Jarmo Paavilainen"
Date:
Hi,

...
> > > That is what transactions are for. If any errors occur, then the
> > > transaction is aborted. You are supposed to use transactions when
> > > you want either everything to occur (the whole transaction), or
> > > nothing, if an error occurs.

And that's wrong!

The caller should have a chance to handle the error. For example, if an
"insert" fails, you might want to use "update" instead. It should be the
caller who decides if the transaction should be aborted ("rollback") or
not.

As it is now, transactions are _totally_ useless with DBMSs that serve
more than one client.

...
> > There is obviously no
> > reason why a transaction needs to be aborted for syntax errors.

Absolutely correct. It should be the caller who decides what he wants to do
with the transaction (roll back, or just continue as if nothing happened).

...
> A bank is transferring money from one acount to another. Say the money
> leaves the first account (first update query), and then an error occurs
> when inserting the money into the second account (second update query). If
...

Schematic code snippet:

BEGIN;
update account set credit = credit + 100;
if( error )
{
    insert into account (credit,debet) VALUES( 100,0 );
    if( error )
    {
        ROLLBACK;
        return FAILED;
    }
}
update account set debet = debet + 100;
if( error )
{
    insert into account (credit, debet) VALUES( 0, 100 );
    if( error )
    {
        ROLLBACK;
        return FAILED;
    }
}
COMMIT;

That is the _correct_ way to do a bank transaction. And that is how
transactions should work.

...
> That is the whole point of transactions - they are used for an
> "all-or-nothing" approach.

Correct, but it should be the caller who decides what to do. Not the DBMS.

...
> The transaction succeeds, and you end up with two phones with the same
> number. BAD thing.

You're still wrong about the correct DBMS behaviour. It should be the
caller's decision, not the DBMS's.

> > necessary.   If you don't believe me, here's two fully SQL-92 compliant
> > databases, Oracle and interbase, which do not exhibit this behavior:

I do not give a sh** about SQL9_. There is nothing that forbids a DBMS
from being better than the standard.

...
> So, what would you like to be the criteria for aborting or proceeding with
> a transaction?

The DBMS should not try to guess what I want to do with a transaction. It
should report all errors to me (the caller) and let me decide what to do
with the transaction, period.

...
> > > If you don't like this behaviour, then use auto-commit, and make every

And that's stupid.

...
> > grouping a set of statements and committing them or rolling them back as
> > a whole.  I do not, however, want the transaction aborted by the server

That's how it should be.

...
> > when it does not need to be.  Clearly in the above case, neither
> > interbase nor oracle decided that the transaction had to be aborted.

Neither does Sybase or MSSQL.

// Jarmo


Re: Unanswered questions about Postgre

From
Peter Eisentraut
Date:
Joel Burton writes:

> What's nice about PostgreSQL is that, while it hasn't always had
> every SQL92 feature (like outer joins, etc.), it seems to have less
> legacy, nonstandard stuff wired in. :-)

Oh man, you have noooo idea.  PostgreSQL is legacy headquarters.  ;-)

--
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/


Re: Unanswered questions about Postgre

From
"Joel Burton"
Date:
> > What's nice about PostgreSQL is that, while it hasn't always had
> > every SQL92 feature (like outer joins, etc.), it seems to have less
> > legacy, nonstandard stuff wired in. :-)
>
> Oh man, you have noooo idea.  PostgreSQL is legacy headquarters.  ;-)

Yes, yes, I know about *some* of them [the 8k limit springs to mind!]
(C hackers no doubt know *lots* more.) But, in terms of what comes
out in our SQL syntax, compared to Oracle, we're free and clear.

--
Joel Burton, Director of Information Systems -*- jburton@scw.org
Support Center of Washington (www.scw.org)

Re: Unanswered questions about Postgre

From
Mike Castle
Date:
On Thu, Nov 30, 2000 at 12:16:39PM -0800, Mikheev, Vadim wrote:
> Oracle & Interbase have savepoints. Hopefully PG will also have them in 7.2

A feature that I liked from using Faircom's Ctree (granted, not an SQL-based
DB or one with built-in relations) was auto-save points.  So, if something
failed, it could be rolled back to the previous auto-save point.

Just food for thought on how Ctree works.

A transaction, by default, will fail on the commit if there were any errors
within the transaction (though it would happily process all of your
commands after an error without additional failures, so it was less verbose
than the original psql demonstration at the beginning of this thread).
Also, by default, no auto-save points.

One could turn on auto-save points.  (If one wanted "normal" save-point
activities, you would get the save point counter number and then roll back
to that particular save point at some time.)  This was convenient if you
wanted to just roll back the last operation that caused the error (this may
have had the side effect of unmarking the fact that an error occurred, but I
don't think so.  There was another command to clear the transaction error,
with lots of disclaimers saying that if you did that, you took on the
responsibility yourself).

I guess, in short, what I'm saying is: if save points are added, might as
well add auto-save points while at it, and give the ability to selectively
clear the error and allow a transaction to commit anyway (keeping the
current method as the default, of course).

mrc
--
       Mike Castle       Life is like a clock:  You can work constantly
  dalgoda@ix.netcom.com  and be right all the time, or not work at all
www.netcom.com/~dalgoda/ and be right at least twice a day.  -- mrc
    We are all of us living in the shadow of Manhattan.  -- Watchmen

Re: Unanswered questions about Postgre

From
Jan Wieck
Date:
Joe Kislo wrote:
>    Yes.  There are certainly times when a transaction needs to be
> ABORTed.  However, there are many reasons why the database should not
> abort a transaction if it does not need to.  There is obviously no
> reason why a transaction needs to be aborted for syntax errors.  There
> is obviously no reason why a transaction needs to be aborted for say,
> trying to insert a duplicate primary key.  The -insert- can fail, report
> it as such, and the application can determine if a rollback is
> necessary.   If you don't believe me, here's two fully SQL-92 compliant
> databases, Oracle and interbase, which do not exhibit this behavior:

    You're right.

    But  it'd be (up to now) impossible to implement in Postgres.
    Postgres doesn't  record  any  undo  information  during  the
    execution  of  a transaction (like Oracle for example does in
    the rollback segments). The way  Postgres  works  is  not  to
    overwrite  existing  tuples,  but  to stamp them outdated and
    insert new ones. In the case of a ROLLBACK, just  the  stamps
    made are flagged invalid (in pg_log).

    If you do a

        INSERT INTO t1 SELECT * FROM t2;

    there could occur a duplicate key error. But if it happens in
    the middle of all the rows inserted, the first half  of  rows
    is  already in t1, with the stamp of this transaction to come
    alive. The only way to not let them show up is to  invalidate
    the entire transaction.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: Unanswered questions about Postgre

From
Bruce Momjian
Date:
> Joel Burton writes:
>
> > What's nice about PostgreSQL is that, while it hasn't always had
> > every SQL92 feature (like outer joins, etc.), it seems to have less
> > legacy, nonstandard stuff wired in. :-)
>
> Oh man, you have noooo idea.  PostgreSQL is legacy headquarters.  ;-)

I had a good laugh on this one.  Yes, we are legacy headquarters
sometimes, but we don't hesitate to rip things out to improve them.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Unanswered questions about Postgre

From
Bruce Momjian
Date:
> Joe Kislo <postgre@athenium.com> writes:
> >     Hmm, 1G is probably fine :)   But is there going to be a blob type with
> > toast?  If I want to store a large binary object, and have the ability
> > of retrieving it strictly over the postgre database connection, would I
> > be retrieving a blob column, or a really long varchar column?
>
> If you want binary (8-bit-clean) data, you need to use the 'bytea'
> datatype not 'varchar'.  Our character datatypes don't cope with
> embedded nulls presently.  This is primarily an issue of the
> external representation as a C string.
>
> Alternatively, you can keep using the old-style large-object support
> (lo_read, lo_write, etc).  This may be handy if you are dealing with
> blobs large enough that you don't want to read or write the entire
> value on every access.  We need to add that capability to bytea too,
> by defining some access functions that allow reading and writing
> portions of a bytea value --- but no one's gotten round to that yet,
> so I don't suppose it'll happen for 7.1.

What I think we _really_ need is a large object interface to TOAST data.
We already have a nice API, and even psql local large object handling.

If I have a file that I want loaded in/out of a TOAST column, we really
should make a set of functions to do it, just like we do with large
objects.

This is an obvious way to load files in/out of TOAST columns, and I am not
sure why it has not been done yet.  I am afraid we are going to get
criticized if we don't have it soon.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Unanswered questions about Postgre

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> What I think we _really_ need is a large object interface to TOAST data.

Large-object-like, anyway.  I dunno if we want to expose TOAST-value
OIDs or not.  But yes, we need to be able to read and write sections
of a large TOASTed data value.

> This an obvious way to load files in/out of TOAST columns, and I am not
> sure why it has not been done yet.

Because Jan said he'd deal with it, and then he's been distracted by
moving and visa problems and so forth.  But I expect he'll get it done
for 7.2 ...

            regards, tom lane

Re: Unanswered questions about Postgre

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > What I think we _really_ need is a large object interface to TOAST data.
>
> Large-object-like, anyway.  I dunno if we want to expose TOAST-value
> OIDs or not.  But yes, we need to be able to read and write sections
> of a large TOASTed data value.
>
> > This an obvious way to load files in/out of TOAST columns, and I am not
> > sure why it has not been done yet.
>
> Because Jan said he'd deal with it, and then he's been distracted by
> moving and visa problems and so forth.  But I expect he'll get it done
> for 7.2 ...

My concern is that we are introducing a new feature, but not giving
people a way to take full advantage of it.  In my release message, I
will make it clear that TOAST will be a real option for binary files
_when_ we get the API working.  Without that API, TOAST is really just
for long text fields, and bytea for those who can encode their data.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Unanswered questions about Postgre

From
Joe Kislo
Date:
> What I think we _really_ need is a large object interface to TOAST data.
> We already have a nice API, and even psql local large object handling.
>
> If I have a file that I want loaded in/out of a TOAST column, we really
> should make a set of functions to do it, just like we do with large
> objects.
>
> This an obvious way to load files in/out of TOAST columns, and I am not
> sure why it has not been done yet.  I am afraid we are going to get
> critisized if we don't have it soon.

    Okay, let me criticize you now then :)  (just kidding)  Over the past
month I've been trying out postgre for two reasons.  I've posted a
number of questions to this mailing list, and the postgre community has
been extremely responsive and helpful.  Kudos to everybody working on
postgre.  Most of my questions have been along the lines of asking why a
particular feature works differently than in other databases, or why
postgre seemed to act in an illogical fashion (such as corrupting my
database).

    First, I was evaluating Postgre for a medium-scale application I will
be working on for my current employer.  Technically this is re-architecting
a current application built on MySQL and Python.  I plan to move the
application to Java servlets and some database other than MySQL,
preferably open source.  Postgre, obviously with its reputation, was at
the top of the short list of databases to look at.  Unfortunately I
quickly discovered this lack of BLOB support.  I understand that the C
API can read/write -files- off the server's filesystem and load them
into the database.  Unfortunately we would absolutely require true
over-the-wire blob support through JDBC.  AFAIK, even with these "toast"
columns, it still wouldn't fill that need.  The need here is to load
binary data from the client, transfer it over the JDBC wire, and store
it in the database.  Some people earlier suggested a shared NFS
partition, with the server using the existing BLOB support to load
the files off disk.  That's really not an acceptable solution.  So as
for using postgre in this upcoming application, it's really a no-go at
this point without that ability.  I actually suspect a number of people
also have a need to store BLOBs in a database, but maybe it's not as
important as I think.

    The second reason why I've been working with Postgre is I'm about to
release as open source a Java-based object-database abstraction
layer.  This layer maps Java objects to a relational database by storing
their primitives in database primitives, and uses Java reflection to
reconstitute objects from the database.  This allows you to perform
complex joins and such in the -database- and then map to the actual Java
objects.
    When you attach a particular class to a database, you choose the
appropriate database adapter (such as one for Oracle or Postgre).  These
DBAdapters take care of all the DB-specific things, such as native
column types, handling auto-incrementing columns (generators or "serial
columns"), creating tables, altering tables when class definitions
change, database-independent indexing, and blobs.  Programmers mostly
work at the object layer, and don't really worry about the particulars
of the underlying database (although they can execute raw SQL if they
really need to).  So this truly allows an application to be written
independent of any particular underlying database (and to my dismay,
there appear to be very big differences between these databases!).  This
allows you to change your underlying database easily, which means you
can choose the database server on its merits, and not because it's been
grandfathered into your application :)

    Anyway, when implementing the Postgre DBAdapter, I found Postgre to be
quite a nice database (and pretty fast too).  But there were two issues
which prevent the Postgre DBAdapter from supporting the full feature
set.

    1) No blob support.  As I described above, it needs to be possible to
insert an arbitrarily large (or at least up to, say, 5 megabytes) binary
object into the database, and have it accessible by a particular column
name in a table.  AFAIK, this is not currently possible in Postgre.

    2) Postgre does not record rollback segments, which means transactions
get ABORTed and rolled back for odd reasons when they don't
need to be.  For example, if you just send the SQL server some
garbage SQL (e.g.: ASDF;), your transaction gets aborted and rolled back,
even though your garbage SQL didn't touch any rows.  At the object layer
in the aforementioned database layer, if you try to insert an object into
the database and doing so would violate a unique key (such as the
primary key), a DuplicateKeyException will be thrown.  None of the other
database adapters I've implemented, such as MySQL, Interbase or Oracle,
will *also* abort the transaction.
    So if, at the object layer, a DuplicateKeyException is supposed to
happen in that case, I would have to, before every object is inserted
into the database, look up the database schema for the table, then
confirm by issuing multiple SQL queries that no unique keys would be
violated by the new record.  If they are, throw the
DuplicateKeyException, and if not, insert the record.  But even that
won't catch all cases, because a different transaction could have an
uncommitted row with which the new record conflicts... In that case all
my queries would say things are in the clear, but when I go to insert
the record, the insert would be blocked waiting on the other
transaction.  If that other transaction rolls back, we're in the clear;
if it commits, Postgre says there's a key conflict, ABORTs the current
transaction, and rolls it back.  Eek.  In which case, the database layer
still isn't throwing a DuplicateKeyException, but a TransactionAborted
exception.
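
    Concretely, the best the Postgre adapter can do today is something
like the following sketch (DuplicateKeyException and
TransactionAbortedException are the abstraction layer's own exception
types mentioned above, and matching on the error message text is
obviously fragile):

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

static void insertOrThrow(Connection conn, String insertSql)
        throws SQLException, DuplicateKeyException, TransactionAbortedException {
    Statement stmt = conn.createStatement();
    try {
        stmt.executeUpdate(insertSql);
    } catch (SQLException e) {
        // Postgre has already aborted the whole transaction at this point,
        // so the only safe thing left to do is roll it back entirely.
        conn.rollback();
        String msg = e.getMessage();
        if (msg != null && msg.indexOf("duplicate") >= 0) {
            throw new DuplicateKeyException(msg);
        }
        throw new TransactionAbortedException(msg);
    } finally {
        stmt.close();
    }
}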

    -GRANTED- that a transaction can be aborted at any time, and the
application programmer should plan for that, but I think this Postgre
"feature" will cause transactions to be aborted unnecessarily,
especially if people migrate from another database to Postgre.
Of course, people really shouldn't be inserting objects which already
exist, but it would still be an inconsistency between Postgre and all
the other DBAdapters.

Thoughts?

-Joe

Re: Unanswered questions about Postgre

From
Bruce Momjian
Date:
> > What I think we _really_ need is a large object interface to TOAST data.
> > We already have a nice API, and even psql local large object handling.
> >
> > If I have a file that I want loaded in/out of a TOAST column, we really
> > should make a set of functions to do it, just like we do with large
> > objects.
> >
> > This an obvious way to load files in/out of TOAST columns, and I am not
> > sure why it has not been done yet.  I am afraid we are going to get
> > critisized if we don't have it soon.
>
>     Okay, let me criticize you now then :)  (just kidding)  Over the past
> month I've been trying out postgre for two reasons.  I've posted a
> number of questions to this mailing list, and the postgre community has
> been extremely responsive and helpful.  Kudos to everybody working on
> postgre.  Most of my questions have been along the line of asking why a
> particular feature works differently then in other databases, or why
> postgre seemed to act in an illogical fashion (such as corrupting my
> database).

Yes, this was my point.  We now have TOAST, but by not going the extra
mile to enable storage of binary files, we really aren't taking full
advantage of our new TOAST feature.

I can see people saying, "Wow, you can store rows of unlimited length
now.  Let me store this jpeg.  Oh, I can't because it is binary!"

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Unanswered questions about Postgre

From
Peter Eisentraut
Date:
Joe Kislo writes:

>     First, I was evaluating Postgre for a medium scale application I will

I'm just wondering what this "Postgre" thing is you keep talking about...
;-)

--
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/


Re: Unanswered questions about Postgre

From
Tom Lane
Date:
Joe Kislo <postgre@athenium.com> writes:
> ... this lack of BLOB support.  I understand that the C
> API can read/write -files- off the server's filesystem and load them
> into the database.  Unfortunately we would absolutely require true
> over-the-wire blob support through JDBC.  AFAIK, even with these "toast"
> columns, it still wouldn't fill that need.

This is a misunderstanding.  You can still use the old-style large
objects (in fact 7.1 has an improved implementation of them too),
and there's always been support for either over-the-wire or
server-filesystem read and write of large objects.  In fact the former
is the preferred way; the latter is deprecated because of security
issues.  In a standard installation you can't do the server-filesystem
bit at all unless you are superuser.

The JDBC support for over-the-wire access to large objects used to
have some bugs, but AFAIK those are cleaned up in current sources
(right Peter?)
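
In JDBC terms, a minimal sketch of the over-the-wire path, assuming the
driver maps the standard setBinaryStream call onto a large object when
the target column is of type 'oid' (table, column and file names are
placeholders):

import java.io.File;
import java.io.FileInputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;

void storeImage(Connection conn) throws Exception {
    conn.setAutoCommit(false);          // large objects must be used inside a transaction
    File f = new File("photo.jpg");
    FileInputStream in = new FileInputStream(f);
    PreparedStatement ps =
        conn.prepareStatement("INSERT INTO images (name, data) VALUES (?, ?)");
    ps.setString(1, "photo");
    ps.setBinaryStream(2, in, (int) f.length());
    ps.executeUpdate();
    conn.commit();
    ps.close();
    in.close();
}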

Adding a similar feature for TOAST columns will certainly be a
notational improvement, but it won't add any fundamental capability
that isn't there already.

>     2) Postgre does not record rollback segments.

We know this is needed.  But it will not happen for 7.1, and there's
no point in complaining about that; 7.1 is overdue already.

            regards, tom lane

Re: Unanswered questions about Postgre

From
Sandeep Joshi
Date:
What is the tentative date for the 7.1 release?
What is the release date for replication?

sandeep


Re: Unanswered questions about Postgre

From
Bruce Momjian
Date:
> This is a misunderstanding.  You can still use the old-style large
> objects (in fact 7.1 has an improved implementation of them too),
> and there's always been support for either over-the-wire or
> server-filesystem read and write of large objects.  In fact the former
> is the preferred way; the latter is deprecated because of security
> issues.  In a standard installation you can't do the server-filesystem
> bit at all unless you are superuser.

I know we haven't talked about the TOAST/binary interface, but one idea
I had was to load the binary into the large object interface, then
automatically somehow transfer it to the TOAST column.  Same for
extracting large objects.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Unanswered questions about Postgre

From
fabrizio.ermini@sysdat.it
Date:
> Yes, this was my point.  We now have TOAST, but by not going the extra
> mile to enable storage of binary files, we really aren't taking full
> advantage of our new TOAST feature.
>
> I can see people saying, "Wow, you can store rows of unlimited length
> now.  Let me store this jpeg.  Oh, I can't because it is binary!"
>
Well, to me it seems that, when TOAST becomes available (i.e. when
the looooong awaited, most desired, more bloated, world-
conquering 7.1 version comes out...), 90% of the work is
already done to support column-style BLOBs as well... at least for
web applications, which are incidentally my focus.
Any web programmer worth his salt could put up a simple layer that
does base64 encoding/decoding and use "CLOBs" (I think TOAST
columns could be called that way, right?)... and he has to write
some interface for file uploading/downloading anyway, since his
clients are using a browser as their frontend. Using PHP, it's no
more than a few lines of code.

Granted, base64 encoding can waste a LOT of space, but it looks
like a Columbus' egg in this scenario.

Maybe base64 could also be a quick way to write a binary "patch"
for TOAST so it would be binary-compatible "natively"?

Or am I talking a lot of bulls**t? :-)

Just wanted to share some thoughts.
Merry Christmas to everybody...


/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

Fabrizio Ermini               Alternate E-mail:
C.so Umberto, 7               faermini@tin.it
loc. Meleto Valdarno          Mail on GSM: (keep it short!)
52020 Cavriglia (AR)          faermini@sms.tin.it

Re: Unanswered questions about Postgre

From
Peter T Mount
Date:
Quoting Tom Lane <tgl@sss.pgh.pa.us>:

> The JDBC support for over-the-wire access to large objects used to
> have some bugs, but AFAIK those are cleaned up in current sources
> (right Peter?)

Yes, except for DatabaseMetaData.getTables(), but that's not directly to do
with large objects.

As long as things settle down here by Saturday, I'll be sorting out what's
outstanding...

Peter

--
Peter Mount peter@retep.org.uk
PostgreSQL JDBC Driver: http://www.retep.org.uk/postgres/
RetepPDF PDF library for Java: http://www.retep.org.uk/pdf/