Thread: Re: [pgsql-advocacy] Need concrete "Why Postgres not MySQL" bullet list

Re: [pgsql-advocacy] Need concrete "Why Postgres not MySQL" bullet list

From
"Shridhar Daithankar"
Date:
On 21 Aug 2003 at 0:22, Ian Barwick wrote:
> * DDL
> - Data definition language (table creation statements etc.) in MySQL
> are not transaction based and cannot be rolled back.

Just wondering, what other databases has transactable DDLs? oracle seems to
have autonomous transactions which is arthogonal.

If we are going to compare it, we are going to need it against other databases
as well.

Personally I find transactable DDL's a big plus of postgresql. It allows real
funcky application design at times..:-)

Bye
 Shridhar

--
drug, n:    A substance that, injected into a rat, produces a scientific paper.


Re: [pgsql-advocacy] Need concrete "Why Postgres not MySQL"

From
Andreas Pflug
Date:
Shridhar Daithankar wrote:

>On 21 Aug 2003 at 0:22, Ian Barwick wrote:
>
>
>>* DDL
>>- Data definition language (table creation statements etc.) in MySQL
>>are not transaction based and cannot be rolled back.
>>
>>
>
>Just wondering, what other databases has transactable DDLs? oracle seems to
>have autonomous transactions which is arthogonal.
>
M$ SQL2000 has (and previous versions had too, I believe)

Regards,
Andreas



Re: [pgsql-advocacy] Need concrete "Why Postgres not MySQL"

From
Greg Stark
Date:
Andreas Pflug <pgadmin@pse-consulting.de> writes:

> Shridhar Daithankar wrote:
> 
> >On 21 Aug 2003 at 0:22, Ian Barwick wrote:
> >
> >>* DDL
> >>- Data definition language (table creation statements etc.) in MySQL
> >>are not transaction based and cannot be rolled back.
> >
> > Just wondering, what other databases has transactable DDLs? oracle seems to
> > have autonomous transactions which is arthogonal.
> >
> M$ SQL2000 has (and previous versions had too, I believe)

In Oracle DDL (including truncate!) was special and wasn't in a transaction.
I always just assumed that was just the way it had to be.

-- 
greg



Re: [pgsql-advocacy] Need concrete "Why Postgres not MySQL"

From
"Shridhar Daithankar"
Date:
On 21 Aug 2003 at 9:21, Greg Stark wrote:

> Andreas Pflug <pgadmin@pse-consulting.de> writes:
> 
> > Shridhar Daithankar wrote:
> > 
> > >On 21 Aug 2003 at 0:22, Ian Barwick wrote:
> > >
> > >>* DDL
> > >>- Data definition language (table creation statements etc.) in MySQL
> > >>are not transaction based and cannot be rolled back.
> > >
> > > Just wondering, what other databases has transactable DDLs? oracle seems to
> > > have autonomous transactions which is arthogonal.
> > >
> > M$ SQL2000 has (and previous versions had too, I believe)

Any pointers to documentation?

ByeShridhar

--
divorce, n:    A change of wife.



Re: [pgsql-advocacy] Need concrete "Why Postgres not MySQL"

From
"Shridhar Daithankar"
Date:
On 21 Aug 2003 at 9:21, Greg Stark wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
> > Shridhar Daithankar wrote:
> > > Just wondering, what other databases has transactable DDLs? oracle seems to
> > > have autonomous transactions which is arthogonal.
> > >
> > M$ SQL2000 has (and previous versions had too, I believe)
> 
> In Oracle DDL (including truncate!) was special and wasn't in a transaction.
> I always just assumed that was just the way it had to be.

That is the autonomous transaction. Right now I am looking at 
interbase/firebird documentation. It looks like it has automous DDL transaction 
as well..

ByeShridhar

--
Collaboration, n.:    A literary partnership based on the false assumption that 
the    other fellow can spell.



Re: [pgsql-advocacy] Need concrete "Why Postgres not MySQL" bullet list

From
Manfred Koizar
Date:
On Thu, 21 Aug 2003 14:45:03 +0530, "Shridhar Daithankar"
<shridhar_daithankar@persistent.co.in> wrote:
>Just wondering, what other databases has transactable DDLs?

Firebird.

Servus
 Manfred

Re: [pgsql-advocacy] Need concrete "Why Postgres not MySQL"

From
Andreas Pflug
Date:
Shridhar Daithankar wrote:

>On 21 Aug 2003 at 9:21, Greg Stark wrote:
>
>  
>
>>Andreas Pflug <pgadmin@pse-consulting.de> writes:
>>
>>    
>>
>>>Shridhar Daithankar wrote:
>>>
>>>      
>>>
>>>>On 21 Aug 2003 at 0:22, Ian Barwick wrote:
>>>>
>>>>        
>>>>
>>>>>* DDL
>>>>>- Data definition language (table creation statements etc.) in MySQL
>>>>>are not transaction based and cannot be rolled back.
>>>>>          
>>>>>
>>>>Just wondering, what other databases has transactable DDLs? oracle seems to
>>>>have autonomous transactions which is arthogonal.
>>>>
>>>>        
>>>>
>>>M$ SQL2000 has (and previous versions had too, I believe)
>>>      
>>>
>
>Any pointers to documentation?
>  
>
No, just looked at the doc and didn't find anything, it's assumed 
implicitely at some locations though.
DDL statement will create a Schema Modification lock (Sch-M), i.e. DDL 
statements pending in a transaction will lock a table exclusively. After 
commit or rollback, schema modification is committed or undone as 
expected, I verified this.

Regards,
Andreas




Re: [GENERAL] [pgsql-advocacy] Need concrete "Why Postgres not MySQL" bullet list

From
Manfred Koizar
Date:
On Thu, 21 Aug 2003 15:05:52 +0200, I wrote:
>>Just wondering, what other databases has transactable DDLs?
>
>Firebird.

Stop!  I withdraw that statement.  I must have mis-read some feature
list :-(

Tests with InterBase 6 showed that you can change metadata within a
transaction, but when you ROLLBACK, metadata changes persist.

Servus
 Manfred

On Thursday 21 August 2003 21:30, Manfred Koizar wrote:
> On Thu, 21 Aug 2003 15:05:52 +0200, I wrote:
> >>Just wondering, what other databases has transactable DDLs?
> >
> >Firebird.
>
> Stop!  I withdraw that statement.  I must have mis-read some feature
> list :-(
>
> Tests with InterBase 6 showed that you can change metadata within a
> transaction, but when you ROLLBACK, metadata changes persist.

Aha. I was just about to ask about that, because I was experimenting
with a 1.5 beta version without success. Doesn't seem to work there
(though as I have little experience and virtually no docs I might
be missing something).


Ian Barwick
barwick@gmx.net

On Thursday 21 August 2003 11:15, Shridhar Daithankar wrote:
> On 21 Aug 2003 at 0:22, Ian Barwick wrote:
> > * DDL
> > - Data definition language (table creation statements etc.) in MySQL
> > are not transaction based and cannot be rolled back.
>
> Just wondering, what other databases has transactable DDLs? oracle seems to
> have autonomous transactions which is arthogonal.

DB2 8.1 seems to support transaction-capable DDL. At least, a rollback
following a CREATE TABLE causes the table to disappear. Haven't gone
into it in any depth.


Ian Barwick
barwick@gmx.net




Re: [GENERAL] [pgsql-advocacy] Need concrete "Why Postgres not MySQL" bullet list

From
"Shridhar Daithankar"
Date:
On 21 Aug 2003 at 21:30, Manfred Koizar wrote:

> On Thu, 21 Aug 2003 15:05:52 +0200, I wrote:
> >>Just wondering, what other databases has transactable DDLs?
> >
> >Firebird.
>
> Stop!  I withdraw that statement.  I must have mis-read some feature
> list :-(
>
> Tests with InterBase 6 showed that you can change metadata within a
> transaction, but when you ROLLBACK, metadata changes persist.

Well, isql documentation mentions that DDLs don't go to database unless you
commit and autoddl parameter defaults to true.

Looks like there definition of transactable does not include a rollback case.
Oops!

BTW any comments on storing an entire database in single file? I don't trust
any file system for performance and data integrity if I have single 100GB file.
I would rather have multiple of them..

Bye
 Shridhar

--
Moore's Constant:    Everybody sets out to do something, and everybody    does
something, but no one does what he sets out to do.


Re: [GENERAL] [pgsql-advocacy] Need concrete "Why Postgres

From
Andrew Dunstan
Date:

Shridhar Daithankar wrote:

>BTW any comments on storing an entire database in single file? I don't trust 
>any file system for performance and data integrity if I have single 100GB file. 
>I would rather have multiple of them..
>

I don't see why not. Entire file systems are stored within a single file 
sometimes. Examples: vmware, and IIRC UserMode Linux.

cheers

andrew



Re: [GENERAL] [pgsql-advocacy] Need concrete "Why Postgres

From
Shridhar Daithankar
Date:
On Friday 22 August 2003 13:59, Andrew Dunstan wrote:
> Shridhar Daithankar wrote:
> >BTW any comments on storing an entire database in single file? I don't
> > trust any file system for performance and data integrity if I have single
> > 100GB file. I would rather have multiple of them..
>
> I don't see why not. Entire file systems are stored within a single file
> sometimes. Examples: vmware, and IIRC UserMode Linux.

Well, half the day that I have spent on interbase documnetation, I didn't see 
any WAL type logs. If transactions directly go to database and entire 
database is file, I seriously doubt about performance and recovery.

UML and VMware are emulators. You don't want to use them in production right?

I would really love if UML allowed access to filesystems. A Jail type feature 
including access control right in memory. That would really rock but it's a 
different story..
Shridhar



Re: [GENERAL] [pgsql-advocacy] Need concrete 'Why Postgres

From
"Andrew Dunstan"
Date:
> On Friday 22 August 2003 13:59, Andrew Dunstan wrote:
>> Shridhar Daithankar wrote:
>> >BTW any comments on storing an entire database in single file? I
>> >don't
>> > trust any file system for performance and data integrity if I have
>> > single 100GB file. I would rather have multiple of them..
>>
>> I don't see why not. Entire file systems are stored within a single
>> file sometimes. Examples: vmware, and IIRC UserMode Linux.
>
> Well, half the day that I have spent on interbase documnetation, I
> didn't see  any WAL type logs. If transactions directly go to database
> and entire  database is file, I seriously doubt about performance and
> recovery.
>

I am not saying I would do it that way, merely that I could see it
working. I agree about logs, though. I could see it working as 2 files,
one for the base db and one for the log.

> UML and VMware are emulators. You don't want to use them in production
> right?
>

I know companies using VMware extensively. It makes some sense in a multi-
platform environment. If it had lots of corruption problems people
wouldn't use it. (Personally I prefer to use VNC in such an environment).

(Interesting as this is it's probably OT for hackers, though).

cheers

andrew




Re: Single-file DBs WAS: Need concrete "Why Postgres

From
Josh Berkus
Date:
Guys,

> >BTW any comments on storing an entire database in single file? I don't
> > trust any file system for performance and data integrity if I have single
> > 100GB file. I would rather have multiple of them..
>
> I don't see why not. Entire file systems are stored within a single file
> sometimes. Examples: vmware, and IIRC UserMode Linux.

Several database systems use a "single file" for data storage.   The problem 
with this is that it's not really a single file .... it's a proprietary file 
system on top of the host file system.   This sort of design makes a couple 
assumptions:

1) That the database is better than the host filesystem/OS and storage system 
at regulating its use of, and storage of, data files;
2) that your data file will not exceed the maximum file size for the host OS.

Both of these assumptions are, IMHO, based on antiquated data (or on Windows).  
Modern *nix filesystems and RAID are very, very efficient at file access and 
only a database with the development budget of Oracle could hope to keep up.  
Additionally, databases larger than 2GB are becoming increasingly common.

Single-file databases also introduce a number of problems:

1) The database file is extremely vulnerable to corruption, and if corruption 
occurs it is usually not localized but destroys the entire database due to 
corruption of the internal file structure.  Recovery of raw data out of a 
damaged single-file database inevitably requires specialized tools if it is 
possible at all.
2) Often DBAs are prevented from using normal file operations on the database 
files to maintain their systems.  For example, try moving a partition on an 
MS SQL Server installation.  Go on, I double-dog dare you.
3) Due to the necessity of maintaining not only data and metadata, but a file 
partitioning structure as well, maintenance on single-file databases is often 
more time-consuming but at the same time more crucial (to prevent #1).

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


Re: Single-file DBs WAS: Need concrete "Why Postgres

From
Jan Wieck
Date:
Some well known database that is very popular amongst people who care 
more for their data than for license fees uses few very big files that 
are statically allocated (if using files instead of raw devices).

Sure does Oracle internally maintain some sort of filesystem. But this 
is more due to other reasons.

If a filesystem contains only very few big files (and nothing else) and 
these files do not grow or shrink during normal operation and are really 
fully allocated in the block tables, then said filesystems metadata does 
not change and that means that the filesystem will never ever be corrupt 
from the OS's point of view (except due to hardware failure). Plus, an 
FSCK on a filesystem with very few huge files is fast, really *fast*. So 
in the case of an OS crash, your system is up in no time again, no 
matter how big your database is.
From there the DB itself maintains it's own metadata and has control 
with it's WAL and other mechanisms over what needs to be redone, undone 
and turned around to get back into a consistent state.


Jan

Josh Berkus wrote:

> Guys,
> 
>> >BTW any comments on storing an entire database in single file? I don't
>> > trust any file system for performance and data integrity if I have single
>> > 100GB file. I would rather have multiple of them..
>>
>> I don't see why not. Entire file systems are stored within a single file
>> sometimes. Examples: vmware, and IIRC UserMode Linux.
> 
> Several database systems use a "single file" for data storage.   The problem 
> with this is that it's not really a single file .... it's a proprietary file 
> system on top of the host file system.   This sort of design makes a couple 
> assumptions:
> 
> 1) That the database is better than the host filesystem/OS and storage system 
> at regulating its use of, and storage of, data files;
> 2) that your data file will not exceed the maximum file size for the host OS.
> 
> Both of these assumptions are, IMHO, based on antiquated data (or on Windows).  
> Modern *nix filesystems and RAID are very, very efficient at file access and 
> only a database with the development budget of Oracle could hope to keep up.  
> Additionally, databases larger than 2GB are becoming increasingly common.
> 
> Single-file databases also introduce a number of problems:
> 
> 1) The database file is extremely vulnerable to corruption, and if corruption 
> occurs it is usually not localized but destroys the entire database due to 
> corruption of the internal file structure.  Recovery of raw data out of a 
> damaged single-file database inevitably requires specialized tools if it is 
> possible at all.
> 2) Often DBAs are prevented from using normal file operations on the database 
> files to maintain their systems.  For example, try moving a partition on an 
> MS SQL Server installation.  Go on, I double-dog dare you.
> 3) Due to the necessity of maintaining not only data and metadata, but a file 
> partitioning structure as well, maintenance on single-file databases is often 
> more time-consuming but at the same time more crucial (to prevent #1).
> 


-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: Single-file DBs WAS: Need concrete "Why Postgres

From
Andrew Rawnsley
Date:
On Friday, August 22, 2003, at 12:07 PM, Josh Berkus wrote:

>
> Single-file databases also introduce a number of problems:
>
> 1) The database file is extremely vulnerable to corruption, and if 
> corruption
> occurs it is usually not localized but destroys the entire database 
> due to
> corruption of the internal file structure.  Recovery of raw data out 
> of a
> damaged single-file database inevitably requires specialized tools if 
> it is
> possible at all.
>
<snip>

Having fallen victim to Oracle crapping in its own nest and doing this 
exact thing, and having to drop some stupid amount of $$ to Oracle for 
them to use their "specialized tool" to try to recover data (which they 
really didn't do much of), I concur with this statement.

Boy, was that a lousy experience.

> -- 
> Josh Berkus
> Aglio Database Solutions
> San Francisco
>
> ---------------------------(end of 
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to 
> majordomo@postgresql.org
>
--------------------

Andrew Rawnsley
President
The Ravensfield Digital Resource Group, Ltd.
(740) 587-0114
www.ravensfield.com



Re: Single-file DBs WAS: Need concrete "Why Postgres

From
Josh Berkus
Date:
Jan,

> If a filesystem contains only very few big files (and nothing else) and
> these files do not grow or shrink during normal operation and are really
> fully allocated in the block tables, then said filesystems metadata does
> not change and that means that the filesystem will never ever be corrupt
> from the OS's point of view (except due to hardware failure). Plus, an
> FSCK on a filesystem with very few huge files is fast, really *fast*. So
> in the case of an OS crash, your system is up in no time again, no
> matter how big your database is.

I'm not talking about problems with the host filesystem.  I'm talking about
problems with the data file itself.   From my perspective, the length of time
it takes to do an FSCK is inconsequential, because I do one maybe once every
two years.

It does you little good, though, to have the host OS reporting that the files
are OK, when the database won't run.

>  From there the DB itself maintains it's own metadata and has control
> with it's WAL and other mechanisms over what needs to be redone, undone
> and turned around to get back into a consistent state.

Yes, but you've just added a significant amount to the work the DB system
needs to do in recovery.   PostgreSQL just needs to check for, and recover
from, issues with LSN headers and transactions.   Single-file DBs, like SQL
Server, need to also check and audit the internal file partitioning.

In my experience (a lot of MS SQL, more MS Access than I want to talk about,
and a little Oracle) corruption failures on single-file databases are more
frequent than databases which depend on the host OS, and such failures are
much more severe when the occur.

--
-Josh BerkusAglio Database SolutionsSan Francisco



Re: Single-file DBs WAS: Need concrete "Why Postgres

From
Mike Mascari
Date:
Josh Berkus wrote:

> Jan,
>
> In my experience (a lot of MS SQL, more MS Access than I want to talk about, 
> and a little Oracle) corruption failures on single-file databases are more 
> frequent than databases which depend on the host OS, and such failures are 
> much more severe when the occur.
>

Vadim seemed to think differently:


http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=00030722102200.00601%40lorc.wgcr.org&rnum=9&prev=/groups%3Fhl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8%26q%3DVadim%2Bsingle%2Bfile%2Bpostgres

In addition to Jan's points, using a single pre-allocated file also
reduces file descriptor consumption, although I don't know what the
costs are regarding maintaining the LRU of file descriptors, the price
of opens and closes, the price of having a high upper limit of file
descriptors, etc.

Just because Oracle and MS do something doesn't necessary make it
wrong. :-)


Mike Mascari
mascarm@mascari.com










Re: Single-file DBs WAS: Need concrete "Why Postgres

From
Josh Berkus
Date:
Mike,
> Vadim seemed to think differently:
>

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=00030722102200.00601%40lorc.wgcr.org&rnum=9&prev=/groups%3Fhl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8%26q%3DVadim%2Bsingle%2Bfile%2Bpostgres

Bad link.  This gives me a post by Lamar Owen talking about usng OIDs to name
files.

--
-Josh BerkusAglio Database SolutionsSan Francisco



Re: Single-file DBs WAS: Need concrete "Why Postgres

From
Lamar Owen
Date:
On Friday 22 August 2003 18:42, Josh Berkus wrote:
> Bad link.  This gives me a post by Lamar Owen talking about usng OIDs to
> name files.

I think he may be referring to the last paragraph.  Vadim had said that the 
tablenames would go to OIDs.  They have always been individual files.  Been a 
long time since I wrote that e-mail....
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute


Re: Single-file DBs WAS: Need concrete "Why Postgres

From
Jeff
Date:
On Fri, 22 Aug 2003, Mike Mascari wrote:

> In addition to Jan's points, using a single pre-allocated file also
> reduces file descriptor consumption, although I don't know what the
> costs are regarding maintaining the LRU of file descriptors, the price
> of opens and closes, the price of having a high upper limit of file
> descriptors, etc.
>

another thought I had - I wonder how much fragmentation affects postgres
(and others).  On Informix when you use "cooked files" it needs to
preallocate all the space.  I figured part of it was to 1. make the file
seem more like a device 2. reduce fragmentation 3. guarantee that 2GB
"chunk" can never have an out of space issue.

On a dedicated pg box there is probably not much fragmentation (and thanks
to today's modern filesystems, it isn'ttoo big of a deal in any case)..
but it would still be interesting to have numbers..

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/




Re: Single-file DBs WAS: Need concrete 'Why Postgres

From
"Andrew Dunstan"
Date:
Is anyone seriously suggesting that postgres should support either raw
devices or use some sort of virtual file system? If not, this whole
discussion is way off topic. And if they are my response would be that it
would at best be a serious waste of time - there is far more important
work to do.

cheers

andrew




Re: Single-file DBs WAS: Need concrete 'Why Postgres

From
Tom Lane
Date:
"Andrew Dunstan" <andrew@dunslane.net> writes:
> Is anyone seriously suggesting that postgres should support either raw
> devices or use some sort of virtual file system? If not, this whole
> discussion is way off topic.

I have zero interest in actually doing it.  However, it'd be nice if the
existing "storage manager" API were clean enough that our response to
this type of question could be "sure, go implement it, and when you're
done let us know what performance improvement you see".  We've allowed
the smgr API to degenerate over the years.  CREATE/DROP DATABASE both
bypass it, and the support for alternate database locations messes up
the API pretty thoroughly (not that there's anything clean about that
feature at all), and I think there are some other issues with specific
commands bypassing the smgr abstractions.

I think it would be reasonable to fix this as part of the "tablespaces"
work that people keep wanting to do.
        regards, tom lane


Re: Single-file DBs WAS: Need concrete 'Why Postgres

From
"Andrew Dunstan"
Date:
This makes sense to me. I sense a TODO item :-)

(My dim and possibly incorrect memory of administering Ingres around 10
years ago was that it supported both raw devices and file system based
databases. We opted for a file system base, for reasons others have
mentioned here, but I seem to recall we used a raw device for the
transaction log for performance reasons. But, as the saying goes, that was
a long time ago, and in another country.)

andrew


Tom wrote
> "Andrew Dunstan" <andrew@dunslane.net> writes:
>> Is anyone seriously suggesting that postgres should support either raw
>> devices or use some sort of virtual file system? If not, this whole
>> discussion is way off topic.
>
> I have zero interest in actually doing it.  However, it'd be nice if
> the existing "storage manager" API were clean enough that our response
> to this type of question could be "sure, go implement it, and when
> you're done let us know what performance improvement you see".  We've
> allowed the smgr API to degenerate over the years.  CREATE/DROP
> DATABASE both bypass it, and the support for alternate database
> locations messes up the API pretty thoroughly (not that there's
> anything clean about that feature at all), and I think there are some
> other issues with specific commands bypassing the smgr abstractions.
>
> I think it would be reasonable to fix this as part of the "tablespaces"
> work that people keep wanting to do.
>
>             regards, tom lane