Thread: Re: [GENERAL] PostgreSQL vs. MySQL

Re: [GENERAL] PostgreSQL vs. MySQL

From
Chris Bitmead
Date:
Louis Bertrand wrote:
> 
> There's a benchmark/review comparing PostgreSQL and MySQL on PHP Builder:
>  http://www.phpbuilder.com/columns/tim20000705.php3

I'm wondering about the comments that postgres is slower in connection
time, could this be related to that libpq always uses asynchronous
sockets to connect? It always turns off blocking and then goes through a
state machine to go through the various stages of connect, instead of
just calling connect() and waiting for the kernel to do its thing. Of
course asynchronous connecting is a benefit when you want it. Or is the
overhead elsewhere, and I'm just being paranoid?


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Bruce Momjian
Date:
> Louis Bertrand wrote:
> > 
> > There's a benchmark/review comparing PostgreSQL and MySQL on PHP Builder:
> >  http://www.phpbuilder.com/columns/tim20000705.php3
> 
> I'm wondering about the comments that postgres is slower in connection
> time, could this be related to that libpq always uses asynchronous
> sockets to connect? It always turns off blocking and then goes through a
> state machine to go through the various stages of connect, instead of
> just calling connect() and waiting for the kernel to do its thing. Of
> course asynchronous connecting is a benefit when you want it. Or is the
> overhead elsewhere, and I'm just being paranoid?

The truth is, we really don't know what it is.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Tom Lane
Date:
Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
> I'm wondering about the comments that postgres is slower in connection
> time, could this be related to that libpq always uses asynchronous
> sockets to connect? It always turns off blocking and then goes through a
> state machine to go through the various stages of connect, instead of
> just calling connect() and waiting for the kernel to do its thing.

I think you'd be wasting your time to "improve" that.  A couple of
kernel calls are not enough to explain the problem.  Moreover, we
had complaints about slow startup even back when libpq had never heard
of async anything.

I believe that the problem is on the backend side: there's an awful lot
of cache-initialization and so forth that happens each time a backend
is started.  It's quick enough to be hard to profile accurately,
however, so getting the info needed to speed it up is not so easy.
        regards, tom lane


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Philip Warner
Date:
At 22:59 9/07/00 -0400, Tom Lane wrote:
>Chris Bitmead <chrisb@nimrod.itg.telstra.com.au> writes:
>> I'm wondering about the comments that postgres is slower in connection
>> time, could this be related to that libpq always uses asynchronous
>> sockets to connect? It always turns off blocking and then goes through a
>> state machine to go through the various stages of connect, instead of
>> just calling connect() and waiting for the kernel to do its thing.
>
>I believe that the problem is on the backend side: there's an awful lot
>of cache-initialization and so forth that happens each time a backend
>is started.  It's quick enough to be hard to profile accurately,
>however, so getting the info needed to speed it up is not so easy.
>

You could pre-start servers (ala Apache), then when a connection request
comes in, the connection should be pretty fast. This would involve
defining, for each database, the number of servers to prestart (default 0),
and perhaps the minimum number of free servers to maintain (ie. when all
free servers are used up, automatically create some new ones). You would
definitely need to make this dynamic to allow for clean database shutdowns.




----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


pg_backup symlink?

From
Philip Warner
Date:
Does anyone have a philosophical objection to a symlink from pg_dump to
(new) pg_backup?

The reason I ask is that with the new BLOB support, to do a proper backup
of the database one has to type:
   pg_dump --blob -Fc ...etc

where --blob tells it to dump BLOBs and -Fc tells it to use the custon file
format, which at the moment is the only one that supports BLOB storage.

The idea would be for pg_dump to look at it's name, and make --blob and -Fc
defaults if it is called as pg_backup. These can of course be overridden
when binary blob load direct into psql is supported (maybe 'LO_COPY from
stdin Length {len}'?)

I know someone (Tom?) objected to symlinked files drastically changing
command behaviour, but this is not a drastic change, so I live in hope.


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> At 22:59 9/07/00 -0400, Tom Lane wrote:
>> I believe that the problem is on the backend side: there's an awful lot
>> of cache-initialization and so forth that happens each time a backend
>> is started.  It's quick enough to be hard to profile accurately,
>> however, so getting the info needed to speed it up is not so easy.

> You could pre-start servers (ala Apache), then when a connection request
> comes in, the connection should be pretty fast. This would involve
> defining, for each database, the number of servers to prestart (default 0),

Yeah, that's been discussed before.  It seems possible if not exactly
simple --- one of the implications is that the postmaster no longer
listens for client connections, but is reduced to being a factory for
new backends.  The prestarted backends themselves have to be listening
for client connections, since there's no portable way for the postmaster
to pass off a client socket to an already-existing backend.

And of course the major problem with *that* is how do you get the
connection request to arrive at a backend that's been prestarted in
the right database?  If you don't commit to a database then there's
not a whole lot of prestarting that can be done.

It occurs to me that this'd get a whole lot more feasible if one
postmaster == one database, which is something we *could* do if we
implemented schemas.  Hiroshi's been arguing that the current hard
separation between databases in an installation should be done away
with in favor of schemas, and I'm starting to see his point...
        regards, tom lane


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Philip Warner
Date:
At 01:02 10/07/00 -0400, Tom Lane wrote:
>
>> You could pre-start servers (ala Apache), then when a connection request
>> comes in, the connection should be pretty fast. This would involve
>> defining, for each database, the number of servers to prestart (default 0),
>
>since there's no portable way for the postmaster
>to pass off a client socket to an already-existing backend.

That's a pain, because you probably don't want to vary the postmaster
behaviour that much. 

Couldn't you modify the connection protocol to request the port of a free
db server, then redo the connect invisibly inside the front end?

The postmaster would have to manage free servers, and mark the db server as
used etc etc.


>It occurs to me that this'd get a whole lot more feasible if one
>postmaster == one database, which is something we *could* do if we
>implemented schemas.  Hiroshi's been arguing that the current hard
>separation between databases in an installation should be done away
>with in favor of schemas, and I'm starting to see his point...

This has other advantages too - I'd like to be able to shutdown *one*
database, and possibly restart it in 'administrator mode' (eg. for a
restore operation). It also means one misbehaving DB doesn't mess up other
DBs. Sounds very good to me.

Does this mean there would be a postmaster-master that told you the
postmaster port to connect to for the desired DB? Or is there a nicer way
of doing this...



----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Benjamin Adida
Date:
This "pre-starting" is already being done by any web application that uses
connection pooling. (I suspect this speed of connection startup is only
important for something like a web system, correct?). Even if you did
"pre-start" these back-ends, you'd end up with one of two possibilities:

- you reuse the back-end processes from one connection to the other. I
suspect this is very hard, and you'd just be recreating connection pooling
at a lower level, which I don't think is that worthwhile an investment...

- you don't reuse the back-end processes, in which case you're still
spawning one process per connection, which remains a bad idea for web
systems, so you're back to the application-layer connection pooling idea.

I admire the entire Postgres's team efforts to fix any and all issues that
come in. You guys show true humility and a real desire to make this product
the best it can be.

It seems to me, though, that this particular issue is better resolved at the
application layer.

-Ben

on 7/10/00 1:02 AM, Tom Lane at tgl@sss.pgh.pa.us wrote:

>> You could pre-start servers (ala Apache), then when a connection request
>> comes in, the connection should be pretty fast. This would involve
>> defining, for each database, the number of servers to prestart (default 0),
> 
> Yeah, that's been discussed before.  It seems possible if not exactly
> simple --- one of the implications is that the postmaster no longer
> listens for client connections, but is reduced to being a factory for
> new backends.  The prestarted backends themselves have to be listening
> for client connections, since there's no portable way for the postmaster
> to pass off a client socket to an already-existing backend.
> 
> And of course the major problem with *that* is how do you get the
> connection request to arrive at a backend that's been prestarted in
> the right database?  If you don't commit to a database then there's
> not a whole lot of prestarting that can be done.
> 
> It occurs to me that this'd get a whole lot more feasible if one
> postmaster == one database, which is something we *could* do if we
> implemented schemas.  Hiroshi's been arguing that the current hard
> separation between databases in an installation should be done away
> with in favor of schemas, and I'm starting to see his point...
> 
> regards, tom lane
> 



Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Bruce Momjian
Date:
> And of course the major problem with *that* is how do you get the
> connection request to arrive at a backend that's been prestarted in
> the right database?  If you don't commit to a database then there's
> not a whole lot of prestarting that can be done.
> 
> It occurs to me that this'd get a whole lot more feasible if one
> postmaster == one database, which is something we *could* do if we
> implemented schemas.  Hiroshi's been arguing that the current hard
> separation between databases in an installation should be done away
> with in favor of schemas, and I'm starting to see his point...

This is interesting.  You believe schema's would allow a pool of
backends to connect to any database?  That would clearly be a win.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Mike Mascari
Date:
Bruce Momjian wrote:
> 
> > And of course the major problem with *that* is how do you get the
> > connection request to arrive at a backend that's been prestarted in
> > the right database?  If you don't commit to a database then there's
> > not a whole lot of prestarting that can be done.
> >
> > It occurs to me that this'd get a whole lot more feasible if one
> > postmaster == one database, which is something we *could* do if we
> > implemented schemas.  Hiroshi's been arguing that the current hard
> > separation between databases in an installation should be done away
> > with in favor of schemas, and I'm starting to see his point...
> 
> This is interesting.  You believe schema's would allow a pool of
> backends to connect to any database?  That would clearly be a win.

I'm just curious, but did a consensus ever develop on schemas? It
seemed that the schemas/tablespace thread just ran out of steam.
For what its worth, I like the idea of:

1. PostgreSQL installation -> SQL cluster of catalogs
2. PostgreSQL database -> SQL catalog
3. PostgreSQL schema -> SQL schema

This correlates nicely with the current representation of
DATABASE. People can run multiple SQL clusters by running
multiple postmasters on different ports. Today, most people
achieve a logical separation of data by issuing multiple CREATE
DATABASE commands. But under the above, most sites would run with
a single PostgreSQL database (SQL catalog), since:

"Catalogs are named collections of schemas in an SQL-environment"

This would mirror the behavior of Oracle, where most people run
with a single Oracle SID.  The logical separation would be
achieved with SCHEMA's a level under the current DATABASE (a.k.a.
catalog). This eliminates the problem of using softlinks and
creating various subdirectories to mirror *logical* parititioning
of data. It also alleviates the problem people currently
encounter when they've built their data model around multiple
DATABASE's but learn later that they need access to more than one
simultaneously. Instead, they'll model their design around
multiple SCHEMA's which exist within a single DATABASE instance. 

It seems that the discussion of tablespaces shouldn't be mixed
with SCHEMA's except to note that a DATABASE (catalog) should
have a default TABLESPACE whose path matches the current one:

../pgsql/data/base/<mydatabase>

Later, users might be able to create a hierarchy of default
TABLESPACE's where the location of the object is found with logic
like:

1. Is there a object-specified tablespace? (ex: CREATE TABLE payroll IN TABLESPACE...)
2. Is there a user-specified default tablespace? (ex: CREATE USER mike DEFAULT TABLESPACE...)
2. Is there a schema-specified default tablespace? (ex: CREATE SCHEMA accounting DEFAULT TABLESPACE..)
3. Use the catalog-default tablespace (ex: CREATE DATABASE postgres DEFAULT LOCATION '/home/pgsql')

with the last example creating the system tablespace,
'system_tablespace', with '/home/pgsql' as the location.

Anyways, it seems a consensus should be developed on the whole
Cluster/Catalog/Schema scenario.

Mike Mascari


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> It occurs to me that this'd get a whole lot more feasible if one
>> postmaster == one database, which is something we *could* do if we
>> implemented schemas.  Hiroshi's been arguing that the current hard
>> separation between databases in an installation should be done away
>> with in favor of schemas, and I'm starting to see his point...

> This is interesting.  You believe schema's would allow a pool of
> backends to connect to any database?  That would clearly be a win.

No, I meant that we wouldn't have physically separate databases anymore
within an installation, but would provide the illusion of it via
schemas.  So, only one pg_class (for example) per installation.
This would simplify life in a number of areas... but there are downsides
to it as well, of course.
        regards, tom lane


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Thomas Lockhart
Date:
> No, I meant that we wouldn't have physically separate databases anymore
> within an installation, but would provide the illusion of it via
> schemas.  So, only one pg_class (for example) per installation.
> This would simplify life in a number of areas... but there are downsides
> to it as well, of course.

Oops. This seems the wrong way to go. Increasing coupling between
databases to support schemas really means that we've traded one feature
for another, not increased our feature set. 

Schemas are intended to help logically partition a work area/database.
We will need to implement the SQL99 path lookup scheme for finding
resources within a schema-divided database. But imho most installations
will still want resource- and permissions-partitioning between different
databases, and schemas should figure out how to fit within a single
database.

I didn't participate in the tablespace discussion because there seems to
be several PoV's well represented, but I'm interested in the schema
issue ;)
                  - Thomas


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Tom Lane
Date:
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> No, I meant that we wouldn't have physically separate databases anymore
>> within an installation, but would provide the illusion of it via
>> schemas.  So, only one pg_class (for example) per installation.
>> This would simplify life in a number of areas... but there are downsides
>> to it as well, of course.

> Oops. This seems the wrong way to go. Increasing coupling between
> databases to support schemas really means that we've traded one feature
> for another, not increased our feature set. 

You could argue it that way, or you could say that we're replacing a
crufty old single-purpose feature with a nice new multi-purpose feature.

I'm not by any means sold on removing the physical separation between
databases --- I can see lots of reasons not to.  But I think we ought
to think hard about the choice, not have a knee-jerk reaction that we
don't want to "eliminate a feature".  Physically separate databases
are an implementation choice, not a user feature.
        regards, tom lane


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Bruce Momjian
Date:
> Bruce Momjian wrote:
> > 
> > > And of course the major problem with *that* is how do you get the
> > > connection request to arrive at a backend that's been prestarted in
> > > the right database?  If you don't commit to a database then there's
> > > not a whole lot of prestarting that can be done.
> > >
> > > It occurs to me that this'd get a whole lot more feasible if one
> > > postmaster == one database, which is something we *could* do if we
> > > implemented schemas.  Hiroshi's been arguing that the current hard
> > > separation between databases in an installation should be done away
> > > with in favor of schemas, and I'm starting to see his point...
> > 
> > This is interesting.  You believe schema's would allow a pool of
> > backends to connect to any database?  That would clearly be a win.
> 
> I'm just curious, but did a consensus ever develop on schemas? It
> seemed that the schemas/tablespace thread just ran out of steam.
> For what its worth, I like the idea of:

You can find the entire thread in the current development tree in
doc/TODO.detail/tablespaces.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> It occurs to me that this'd get a whole lot more feasible if one
> >> postmaster == one database, which is something we *could* do if we
> >> implemented schemas.  Hiroshi's been arguing that the current hard
> >> separation between databases in an installation should be done away
> >> with in favor of schemas, and I'm starting to see his point...
> 
> > This is interesting.  You believe schema's would allow a pool of
> > backends to connect to any database?  That would clearly be a win.
> 
> No, I meant that we wouldn't have physically separate databases anymore
> within an installation, but would provide the illusion of it via
> schemas.  So, only one pg_class (for example) per installation.
> This would simplify life in a number of areas... but there are downsides
> to it as well, of course.

Wow, I can image the complications.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Re: [GENERAL] PostgreSQL vs. MySQL

From
Bruce Momjian
Date:
> I'm not by any means sold on removing the physical separation between
> databases --- I can see lots of reasons not to.  But I think we ought
> to think hard about the choice, not have a knee-jerk reaction that we
> don't want to "eliminate a feature".  Physically separate databases
> are an implementation choice, not a user feature.

If we put tables from different database in the same tablespace
directory, and a database gets hosed, there is no way to delete the
files associated with the hosed database, unless we go around and find
all the table files used by all databases, then remove the ones not
referenced.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: pg_backup symlink?

From
Peter Eisentraut
Date:
Philip Warner writes:

> Does anyone have a philosophical objection to a symlink from pg_dump to
> (new) pg_backup?

Yes. The behaviour of a program should not depend on the name used to
invoke it. You can use shell aliases or scripts for that.


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: pg_backup symlink?

From
Philip Warner
Date:
At 00:24 11/07/00 +0200, Peter Eisentraut wrote:
>Philip Warner writes:
>
>> Does anyone have a philosophical objection to a symlink from pg_dump to
>> (new) pg_backup?
>
>Yes. The behaviour of a program should not depend on the name used to
>invoke it. You can use shell aliases or scripts for that.

OK, I suppose I was thinking of the pg_dump symlink as a tool for
compatibility. 

Is there a good solution? It dumps to text for compatibility with the old
pg_dump, but I will most often use 'pg_dump -Fc --blob'. Is there a
recommended 'correct' approach?


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: pg_backup symlink?

From
Lamar Owen
Date:
Philip Warner wrote:
> At 00:24 11/07/00 +0200, Peter Eisentraut wrote:
> >Philip Warner writes:
> >> Does anyone have a philosophical objection to a symlink from pg_dump to
> >> (new) pg_backup?

> >Yes. The behaviour of a program should not depend on the name used to
> >invoke it. You can use shell aliases or scripts for that.
> OK, I suppose I was thinking of the pg_dump symlink as a tool for
> compatibility.

There is already precedent -- postmaster is a symlink to postgres, but
operates differently due to its invocation name.

--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11


Re: pg_backup symlink?

From
Bruce Momjian
Date:
> At 00:24 11/07/00 +0200, Peter Eisentraut wrote:
> >Philip Warner writes:
> >
> >> Does anyone have a philosophical objection to a symlink from pg_dump to
> >> (new) pg_backup?
> >
> >Yes. The behaviour of a program should not depend on the name used to
> >invoke it. You can use shell aliases or scripts for that.
> 
> OK, I suppose I was thinking of the pg_dump symlink as a tool for
> compatibility. 
> 
> Is there a good solution? It dumps to text for compatibility with the old
> pg_dump, but I will most often use 'pg_dump -Fc --blob'. Is there a
> recommended 'correct' approach?

The BSD way is to define an environment variable that is used to supply
additional arguments to the command.  For example, BLOCKSIZE controls if
blocks are reported in 512 or 1k sizes by commands like du.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: pg_backup symlink?

From
The Hermit Hacker
Date:
On Tue, 11 Jul 2000, Peter Eisentraut wrote:

> Philip Warner writes:
> 
> > Does anyone have a philosophical objection to a symlink from pg_dump to
> > (new) pg_backup?
> 
> Yes. The behaviour of a program should not depend on the name used to
> invoke it. You can use shell aliases or scripts for that.

tell that to *how many* Unix programs? :)  sendmail, of course, being the
first to jump to mind ...



Re: pg_backup symlink?

From
Tom Lane
Date:
Lamar Owen <lamar.owen@wgcr.org> writes:
>>>> Yes. The behaviour of a program should not depend on the name used to
>>>> invoke it. You can use shell aliases or scripts for that.

> There is already precedent -- postmaster is a symlink to postgres, but
> operates differently due to its invocation name.

There are dozens of other examples in any standard Unix system.  Just
to take one example, 'ls' has six different links to it on my Unix box,
and they all act differently (ie, supply different default switches to
the basic 'ls' behavior).

Peter is definitely swimming upstream if he hopes to get anyone to adopt
the above as received wisdom.
        regards, tom lane


Re: pg_backup symlink?

From
Philip Warner
Date:
At 18:58 10/07/00 -0400, Tom Lane wrote:
>
>There are dozens of other examples in any standard Unix system.  Just
>to take one example, 'ls' has six different links to it on my Unix box,
>and they all act differently (ie, supply different default switches to
>the basic 'ls' behavior).
>
>Peter is definitely swimming upstream if he hopes to get anyone to adopt
>the above as received wisdom.
>

Does this mean that using a pg_backup symlink would be deemed acceptable?


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: pg_backup symlink?

From
The Hermit Hacker
Date:
On Tue, 11 Jul 2000, Philip Warner wrote:

> At 18:58 10/07/00 -0400, Tom Lane wrote:
> >
> >There are dozens of other examples in any standard Unix system.  Just
> >to take one example, 'ls' has six different links to it on my Unix box,
> >and they all act differently (ie, supply different default switches to
> >the basic 'ls' behavior).
> >
> >Peter is definitely swimming upstream if he hopes to get anyone to adopt
> >the above as received wisdom.
> >
> 
> Does this mean that using a pg_backup symlink would be deemed acceptable?

yes :)

both 'commands' should be documented in the man pages too ... right? :)




Re: pg_backup symlink?

From
Philip Warner
Date:
At 20:39 10/07/00 -0300, The Hermit Hacker wrote:
>
>both 'commands' should be documented in the man pages too ... right? :)
>

Believe it ot not, I have actually started on this. The SGML sources are a
bit hard on the eyes, even for someone who used to use TeX. Is there a
simpler way than manually editing pg_dump.sgml?


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: pg_backup symlink?

From
Bruce Momjian
Date:
> At 20:39 10/07/00 -0300, The Hermit Hacker wrote:
> >
> >both 'commands' should be documented in the man pages too ... right? :)
> >
> 
> Believe it ot not, I have actually started on this. The SGML sources are a
> bit hard on the eyes, even for someone who used to use TeX. Is there a
> simpler way than manually editing pg_dump.sgml?

Yes, hard on the eyes.  No, no better way.  The only suggestion I have
is to use an editor in HTML colorizer mode so the tags are colored.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: pg_backup symlink?

From
Peter Eisentraut
Date:
The Hermit Hacker writes:

> tell that to *how many* Unix programs? :)  sendmail, of course, being the
> first to jump to mind ...

That doesn't mean it's a good idea. For one, it would prevent anyone to
install them as pg_dump71, etc., which I had hoped to offer sometime. But
I'm just one voice... If you make pg_dump a one-line shell script on the
other hand you don't hurt anyone.

Does Windows 98 have (sym)links? That's a supported client platform.


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: pg_backup symlink?

From
Peter Eisentraut
Date:
Philip Warner writes:

> Is there a good solution? It dumps to text for compatibility with the old
> pg_dump, but I will most often use 'pg_dump -Fc --blob'. Is there a
> recommended 'correct' approach?

IMHO, it's a bad strategy to add symlinks as shortcuts to certain
options. Where would that ever lead? There are tons of options settings I
use "most often" in various programs, but for that you can use shells
aliases or scripts, or the program provides an environment variable for
default options.

The default behaviour of pg_dump (or pg_backup or whatever) should be to
write plain text to stdout. If you want to write format "foo", use the
-Ffoo option. If you want to dump blobs, use the --blob option. That makes
sense.

You're really trying to force certain usage patterns by labeling one
invocation "backup" and another "dump". I can foresee the user problems:
"No, you have to use pg_dump for that, not pg_backup!" -- "Don't they do
the same thing?" -- "Why aren't they the same program then?" We're still
battling that sympton in the createdb vs CREATE DATABASE case.

What's wrong with just having pg_dump, period? After all pg_dump isn't
something you use like `ls' or `cat' where every extra keystroke is a
pain.


-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: pg_backup symlink?

From
Philip Warner
Date:
At 02:23 12/07/00 +0200, Peter Eisentraut wrote:
>
>IMHO, it's a bad strategy to add symlinks as shortcuts to certain
>options. Where would that ever lead? 

I suppose the glib answer is "to a more convenient and easy to use tool" 8-}.

>There are tons of options settings I
>use "most often" in various programs, but for that you can use shells
>aliases or scripts, or the program provides an environment variable for
>default options.

In this case I view pg_dump's default behaviour as an anachronism caused by
compatibility issues, not a feature. Dumping to text without blobs is like
asking ls to only list files whose names are in lower case.


>The default behaviour of pg_dump (or pg_backup or whatever) should be to
>write plain text to stdout. If you want to write format "foo", use the
>-Ffoo option. If you want to dump blobs, use the --blob option. That makes
>sense.

With a symlink, that's what you get. You will still be able to add '-Ffoo'
to pg_dump (or -Fp to pg_backup)


>You're really trying to force certain usage patterns by labeling one
>invocation "backup" and another "dump". I can foresee the user problems:
>"No, you have to use pg_dump for that, not pg_backup!"

The actualy answer to the question is: "either use 'pg_dump -Fc --blob', or
just use pg_backup, whichever you find easiest to remember".

This works both ways: "I used pg_dump to backup my db, but it doesn't
contain the blobs" - I've certainly seen that message a few times. Both
issues are solved by documentation. 

Until a scipt file can import blob data directly from stdin, a text file
can not be used to backup blobs, so the default behaviour of pg_dump is
unsuitable for backups.


>We're still
>battling that sympton in the createdb vs CREATE DATABASE case.

My guess is these issues were also created by legacy code.


>What's wrong with just having pg_dump, period? After all pg_dump isn't
>something you use like `ls' or `cat' where every extra keystroke is a
>pain.

No, but for less commonly used utilities, it's probably more important to
have a simple way invoke a basic, important, function.



----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/