Thread: SQL Manager 2007 for PostgreSQL released

SQL Manager 2007 for PostgreSQL released

From
"EMS Database Management Solutions (SQLManager.net)"
Date:
We, here at EMS Database Management Solutions, are pleased to announce
SQL Manager 2007 for PostgreSQL - the new major version of the
powerful PostgreSQL administration and development tool!

You can download SQL Manager 2007 for PostgreSQL at:
http://www.sqlmanager.net/products/postgresql/manager/download

You can purchase SQL Manager 2007 for PostgreSQL at:
http://www.sqlmanager.net/products/postgresql/manager/buy

What's new in SQL Manager 2007 for PostgreSQL?
1. Support of UTF8 data in SQL Editor, Query Builder, Grid View, Data
Export/Import Wizard.
2. Now you can continue working with SQL Manager while a query is
being executed in SQL Editor or Query Builder or while table or view
data are being loaded in the editor. A query taking too long to
execute can be aborted when using PostgreSQL 8.0 and higher.
3. New Database Statistics form added; it allows viewing various
statistical information on database objects, helps to find weak points
in database performance, indicates whether it's necessary to create
new indices, etc. The corresponding options must be enabled in the
server configuration file (postgresql.conf) for statistics to be
collected.
4. All wizards, including the Data Export/Import, DB Extract, and Copy
DB wizards, can now be run in background mode, so that you can
continue working with the program.
5. New improved SQL Editor in the Visual Studio 2005 style, with code
folding and Unicode support.
6. The improved Data Import Wizard now imports data up to 10
times faster. New import formats have been added: HTML, XML Document,
Open Document Format (OpenOffice), Open XML Format (MS Word 2007, MS
Excel 2007).
7. Support of new PostgreSQL 8.2 features is implemented (FILLFACTOR
parameter for tables and indices, CONNECT permission for databases,
USAGE permission for sequences, support of CREATE INDEX CONCURRENTLY,
ALTER TABLE [ NO ] INHERIT support, creating domains based on other
domains, support of the IF EXISTS clause in DROP statements in the Extract
Database Wizard, reassigning ownership of all objects owned by a
role).
8. Object editors now contain a Permissions tab which allows granting
permissions on the object to users and groups.
9. Common SQL commands can now be generated from DB Explorer (Script
to SQL Editor, Script to Clipboard). Now you can easily generate such
SQL commands as CREATE, ALTER, DROP, SELECT, etc. for each database
object.
10. Now you can save your queries in the Favorite Queries folders in
DB Explorer; these queries can be stored either locally in the
registry (nobody but you will have access to them) or in a special
database table (then all or some of your db users will have access to
it).
11. A Download File wizard has been added, which allows downloading
and uploading files from/to the PostgreSQL server machine using
standard PostgreSQL functions and the FTP/SFTP protocols.
12. Improved Visual database designer; a new object type - comments -
has been added. One can now choose whether only object names or fields
should be displayed. It is now also possible to select the schemas
whose tables should be added to the diagram when performing Reverse
Engineering.
13. Improved Data Export Wizard now works faster. New export formats
have been added: XML Document, Open Document Format (OpenOffice), Open
XML Format (MS Word 2007, MS Excel 2007).
14. A Database Properties window has been added, which displays
database options and allows changing some of them.
15. Improved Table Editor. Now it is possible to view and change table
properties on a new tab.
16. A Tasks menu has been added to the DB Explorer context menu. It
contains common operations which can be performed on the selected
object. Also, hosts are now named in accordance with the selected
connection type (SSH or HTTP tunneling).
17. Improved report building system.
18. Function Debugger. The variable that changed its value in the
last step is displayed in bold in the Debug Information window. The
SQLSTATE and SQLERRM variables are also displayed now.
19. Lots of other improvements and bug-fixes.
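For readers unfamiliar with the PostgreSQL 8.2 features listed in item
7 above, here is a brief SQL sketch of them; the table, sequence, and
role names are illustrative only, not taken from the product:

```sql
-- FILLFACTOR for tables and indices (leaves free space for updates)
CREATE TABLE accounts (id serial PRIMARY KEY, balance numeric)
    WITH (FILLFACTOR = 70);

-- Build an index without blocking concurrent writes
CREATE INDEX CONCURRENTLY accounts_balance_idx ON accounts (balance);

-- CONNECT permission for databases, USAGE permission for sequences
GRANT CONNECT ON DATABASE mydb TO report_user;
GRANT USAGE ON SEQUENCE accounts_id_seq TO report_user;

-- Domains based on other domains
CREATE DOMAIN positive_int AS integer CHECK (VALUE > 0);
CREATE DOMAIN small_positive_int AS positive_int CHECK (VALUE < 100);

-- IF EXISTS clause in DROP statements
DROP TABLE IF EXISTS scratch_table;

-- Reassign ownership of all objects owned by a role
REASSIGN OWNED BY old_owner TO new_owner;
```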

Hope you will enjoy working with our software.


Re: SQL Manager 2007 for PostgreSQL released

From
"L. Berger"
Date:
On May 23, 8:47 am, "EMS Database Management Solutions
(SQLManager.net)" <sqlmana...@gmail.com> wrote:
> We, here at EMS Database Management Solutions, are pleased to announce
> SQL Manager 2007 for PostgreSQL - the new major version of the
> powerful PostgreSQL administration and development tool!
>
> You can download SQL Manager 2007 for PostgreSQL at: http://www.sqlmanager.net/products/postgresql/manager/download
>
> You can purchase SQL Manager 2007 for PostgreSQL at: http://www.sqlmanager.net/products/postgresql/manager/buy
>
> What's new in SQL Manager 2007 for PostgreSQL?

<snip>


Thanks for this, but is there any plan to launch something like this
for use on Linux admin servers? Something that I could install on a
server, and perhaps work with a web interface? I would love some
recommendations.

Many thanks!
LB

--
Conan O' Brien gets it right!
http://blogs.pcworld.com/tipsandtweaks/archives/004369.html


Re: SQL Manager 2007 for PostgreSQL released

From
Raymond O'Donnell
Date:
On 23/05/2007 11:40, L. Berger wrote:

> Thanks for this, but is there any plan to launch something like this
> for use on Linux admin servers? Something that I could install on a
> server, and perhaps work with a web interface? I would love some
> recommendations.

Have you tried phpPgAdmin? I use it all the time and am very happy with it.

Ray.

---------------------------------------------------------------
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
rod@iol.ie
---------------------------------------------------------------

Re: SQL Manager 2007 for PostgreSQL released

From
"Nikolay Samokhvalov"
Date:
On 22 May 2007 17:47:46 -0700, EMS Database Management Solutions
(SQLManager.net) <sqlmanager@gmail.com> wrote:
> We, here at EMS Database Management Solutions, are pleased to announce
> SQL Manager 2007 for PostgreSQL - the new major version of the
> powerful PostgreSQL administration and development tool!

I'd be happy to use EMS products, but I really need Linux versions.
Any plans regarding *nix platforms support?


--
Best regards,
Nikolay

Re: SQL Manager 2007 for PostgreSQL released

From
Michael Glaesemann
Date:
On May 22, 2007, at 19:47 , EMS Database Management Solutions
(SQLManager.net) wrote:

> We, here at EMS Database Management Solutions, are pleased to announce
> SQL Manager 2007 for PostgreSQL - the new major version of the
> powerful PostgreSQL administration and development tool!

Congratulations on the release. In the future, refrain from posting
product announcements to the pgsql-general list. Thanks!

Michael Glaesemann
grzm seespotcode net



Re: SQL Manager 2007 for PostgreSQL released

From
Kenneth Downs
Date:
Nikolay Samokhvalov wrote:
> On 22 May 2007 17:47:46 -0700, EMS Database Management Solutions
> (SQLManager.net) <sqlmanager@gmail.com> wrote:
>> We, here at EMS Database Management Solutions, are pleased to announce
>> SQL Manager 2007 for PostgreSQL - the new major version of the
>> powerful PostgreSQL administration and development tool!
>
> I'd be happy to use EMS products, but I really need Linux versions.
> Any plans regarding *nix platforms support?
>
>

Ditto, when will we see the *nix versions?

--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com    www.andromeda-project.org
631-689-7200   Fax: 631-689-0527
cell: 631-379-0010


Re: SQL Manager 2007 for PostgreSQL released

From
Guillaume Lelarge
Date:
Kenneth Downs a écrit :
> Nikolay Samokhvalov wrote:
>> On 22 May 2007 17:47:46 -0700, EMS Database Management Solutions
>> (SQLManager.net) <sqlmanager@gmail.com> wrote:
>>> We, here at EMS Database Management Solutions, are pleased to announce
>>> SQL Manager 2007 for PostgreSQL - the new major version of the
>>> powerful PostgreSQL administration and development tool!
>>
>> I'd be happy to use EMS products, but I really need Linux versions.
>> Any plans regarding *nix platforms support?
>>
>>
>
> Ditto, when will we see the *nix versions?
>

I asked them some time ago. They answered me this :

As for SQL Manager for PostgreSQL - we regret to inform you that the
development and support of Linux editions of EMS software products has
become impossible now that Borland no longer supports Kylix libraries
for Delphi, on which all Linux versions of EMS software were based.
Hence, EMS has made a decision to discontinue Linux versions of its
products. Linux products will not be available for sale or download anymore.


--
Guillaume.
<!-- http://abs.traduc.org/
     http://lfs.traduc.org/
     http://docs.postgresqlfr.org/ -->

Slightly OT.

From
gonzales@linuxlouis.net
Date:
Group,
I have to admit, I'm a little disappointed.  I'm a HUGE advocate of
PostgreSQL (to state for the record) - in fact I always keep my eyes peeled
for opportunities to recommend it in my day to day business.

So why am I disappointed, and who really cares?

I'm disappointed because SLONY-II has not been released yet to support
multi-master replication!  PostgreSQL is going through all of the releases
- and that's great - BUT, where is the sync-up with the powerhouse of a
component that Slony-II would bring to the table?  Slony-I is pretty
sweet, but if Slony-II were released, I can imagine that this would
introduce some major competition in the enterprise world against the
commercial dynamos.

I've been doing some really cool stuff with the latest versions of
OpenLDAP and PostgreSQL on Solaris 10 (in the past Solaris 9), also
using Linux (Fedora Core - go Redhat).

Anyone have any thoughts/comments to this end?


On Fri, 1 Jun 2007, Guillaume Lelarge wrote:

> Kenneth Downs a écrit :
>> Nikolay Samokhvalov wrote:
>>> On 22 May 2007 17:47:46 -0700, EMS Database Management Solutions
>>> (SQLManager.net) <sqlmanager@gmail.com> wrote:
>>>> We, here at EMS Database Management Solutions, are pleased to announce
>>>> SQL Manager 2007 for PostgreSQL - the new major version of the
>>>> powerful PostgreSQL administration and development tool!
>>>
>>> I'd be happy to use EMS products, but I really need Linux versions.
>>> Any plans regarding *nix platforms support?
>>>
>>>
>>
>> Ditto, when will we see the *nix versions?
>>
>
> I asked them some time ago. They answered me this :
>
> As for SQL Manager for PostgreSQL - we regret to inform you that the
> development and support of Linux editions of EMS software products has
> become impossible now that Borland no longer supports Kylix libraries
> for Delphi, on which all Linux versions of EMS software were based.
> Hence, EMS has made a decision to discontinue Linux versions of its
> products. Linux products will not be available for sale or download anymore.
>
>
>

--
Louis Gonzales
louis.gonzales@linuxlouis.net
http://www.linuxlouis.net

Re: Slightly OT.

From
"Joshua D. Drake"
Date:
gonzales@linuxlouis.net wrote:
> Group,
> I have to admit, I'm a little disappointed.  I'm a HUGE advocate of
> PostgreSQL(to state for the record) - in fact I always keep my eyes
> peeled for opportunities to recommend it in my day to day business.
>
> So why am I disappointed, and who really cares?
>
> I'm disappointed because SLONY-II has not been released yet to support
> multi-master replication!  PostgreSQL is going through all of the
> releases - and that's great - BUT, where is the sync-up with the
> powerhouse of a component, that Slony-II would bring to the table?
> Slony-I is pretty sweet, but if Slony-II would release, I can imagine
> that this would introduce some major competition in the enterprise world
> against the commercial dyno's.

Which databases ship with multi-master replication?

Joshua D. Drake



--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Slightly OT.

From
gonzales@linuxlouis.net
Date:
On Fri, 1 Jun 2007, Joshua D. Drake wrote:

> gonzales@linuxlouis.net wrote:
>> Group,
>> I have to admit, I'm a little disappointed.  I'm a HUGE advocate of
>> PostgreSQL(to state for the record) - in fact I always keep my eyes peeled
>> for opportunities to recommend it in my day to day business.
>>
>> So why am I disappointed, and who really cares?
>>
>> I'm disappointed because SLONY-II has not been released yet to support
>> multi-master replication!  PostgreSQL is going through all of the releases
>> - and that's great - BUT, where is the sync-up with the powerhouse of a
>> component, that Slony-II would bring to the table?  Slony-I is pretty
>> sweet, but if Slony-II would release, I can imagine that this would
>> introduce some major competition in the enterprise world against the
>> commercial dyno's.
>
> Which databases ship with multi-master replication?
I dunno, which ones?

Which ones have robust and fully functional multi-master replication?
(Oracle, MS SQL, not-PostgreSQL).

>
> Joshua D. Drake
>
>
>
>

--
Louis Gonzales
louis.gonzales@linuxlouis.net
http://www.linuxlouis.net


Re: Slightly OT.

From
"Joshua D. Drake"
Date:
gonzales@linuxlouis.net wrote:
> On Fri, 1 Jun 2007, Joshua D. Drake wrote:

>> Which databases ship with multi-master replication?
> I dunno, which ones?
>
> Which ones have robust and fully functional multi-master replication?
> (Oracle, MS SQL, not-PostgreSQL).

You consider Oracle RAC fully functional multi-master?

You do realize that Oracle is the second largest software company in the
world right? With Microsoft being number 1?

Tell ya what, write me a check, we will get right on it ;0

Joshua D. Drake



>
>>
>> Joshua D. Drake
>>
>>
>>
>>
>


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Slightly OT.

From
"Alexander Staubo"
Date:
On 6/1/07, gonzales@linuxlouis.net <gonzales@linuxlouis.net> wrote:
> I'm disappointed because SLONY-II has not been released yet to support
> multi-master replication!

I wouldn't pin all my hopes on a project still under development. (For
me, personally, add the fact that Slony-I still has not solved
single-master replication in a way that doesn't burden the
developer/DBA with lots of unnecessary extra maintenance; I am not
counting on its developers to fix this issue in Slony-II.)

In the meantime, Cybertec (http://www.postgresql.at/, an Austrian
company) just announced a commercial synchronous multimaster
replication product based on 2-phase commit. It's expensive, and I
can't speak for its maturity, and it may or may not scale as well as
the projected Slony-II design, but the setup seems dead simple, and
from the docs I have found it seems to transparently replicate schema
changes, unlike Slony-I. So that's something.
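
The Cybertec product itself is commercial and closed, so its internals
are an assumption here, but the two-phase commit primitives it is said
to build on have been in core PostgreSQL since 8.1. A rough
illustration (transaction identifiers and table names are made up):

```sql
-- On each participating node: do the work, then prepare.
-- After PREPARE TRANSACTION the transaction survives a crash
-- and awaits a final decision from the coordinator.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
PREPARE TRANSACTION 'txn_42_node_a';

-- The coordinator then commits (or aborts) on every node:
COMMIT PREPARED 'txn_42_node_a';
-- or: ROLLBACK PREPARED 'txn_42_node_a';
```

A synchronous multimaster system layers group coordination on top of
these primitives; the primitives alone say nothing about its
performance or maturity.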

Alexander.

Re: Slightly OT.

From
"Joshua D. Drake"
Date:
Alexander Staubo wrote:
> On 6/1/07, gonzales@linuxlouis.net <gonzales@linuxlouis.net> wrote:
>> I'm disappointed because SLONY-II has not been released yet to support
>> multi-master replication!
>
> I wouldn't pin all my hopes on a project still under development. (For
> me, personally, add the fact that Slony-I still has not solved
> single-master replication in a way that doesn't burden the
> developer/DBA with lots of unnecessary extra maintenance; I am not
> counting on its developers to fix this issue in Slony-II.)
>
> In the meantime, Cybertec (http://www.postgresql.at/, an Austrian
> company) just announced a commercial synchronous multimaster
> replication product based on 2-phase commit. It's expensive, and I
> can't speak for its maturity, and it may or may not scale as well as
> the projected Slony-II design, but the setup seems dead simple, and
> from the docs I have found it seems to transparently replicate schema
> changes, unlike Slony-I. So that's something.

I could be completely cranked but I believe that product is based on
PgCluster which is horrendously slow.

To be fair, it is still under heavy development and does show promise.

Joshua D. Drake


>
> Alexander.
>
>


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Slightly OT.

From
"Alexander Staubo"
Date:
On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote:
> > In the meantime, Cybertec (http://www.postgresql.at/, an Austrian
> > company) just announced a commercial synchronous multimaster
> > replication product based on 2-phase commit. It's expensive, and I
[snip]
> I could be completely cranked but I believe that product is based on
> PgCluster which is horrendously slow.

Well, dang, that's disappointing. Last I checked, the PGCluster design
was fundamentally unscalable.

Alexander.

Re: Slightly OT.

From
Dave Page
Date:
Alexander Staubo wrote:
> On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote:
>> > In the meantime, Cybertec (http://www.postgresql.at/, an Austrian
>> > company) just announced a commercial synchronous multimaster
>> > replication product based on 2-phase commit. It's expensive, and I
> [snip]
>> I could be completely cranked but I believe that product is based on
>> PgCluster which is horrendously slow.
>
> Well, dang, that's disappointing. Last I checked, the PGCluster design
> was fundamentally unscalable.

Multimaster replication generally is - that's why Slony-2 will almost
certainly never exist in the form that it was originally imagined.
Although I'm not (and never have been) an Oracle user, I've heard that
RAC has its own issues in this area as well.

Regards, Dave

Re: Slightly OT.

From
"Joshua D. Drake"
Date:
Dave Page wrote:
> Alexander Staubo wrote:
>> On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote:
>>>> In the meantime, Cybertec (http://www.postgresql.at/, an Austrian
>>>> company) just announced a commercial synchronous multimaster
>>>> replication product based on 2-phase commit. It's expensive, and I
>> [snip]
>>> I could be completely cranked but I believe that product is based on
>>> PgCluster which is horrendously slow.
>> Well, dang, that's disappointing. Last I checked, the PGCluster design
>> was fundamentally unscalable.
>
> Multimaster replication generally is - thats why Slony-2 will almost
> certainly never exist in the form that it was originally imagined.
> Although I'm not (and never have been) an Oracle user, I've heard that
> RAC has it's own issues in this area as well.

IMO, the future is application partitioning, not multi-master.

Also, what I find interesting here is that PostgreSQL on modest hardware
does exceedingly well.

I have a client right now that is running an 8 core, 16 gig box with
only 14 spindles.

They are processing 6ktps and it takes 14 tomcat servers to bring the
database down.

Any multi-master solution is going to fall over well before 6ktps.

Sincerely,

Joshua D. Drake

P.S. I should note that we are working toward more than 6ktps and will
report back ;)

>
> Regards, Dave
>


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Slightly OT.

From
gonzales@linuxlouis.net
Date:
It does so well because it KICKS ICE!

On Fri, 1 Jun 2007, Joshua D. Drake wrote:

> Dave Page wrote:
>> Alexander Staubo wrote:
>>> On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote:
>>>>> In the meantime, Cybertec (http://www.postgresql.at/, an Austrian
>>>>> company) just announced a commercial synchronous multimaster
>>>>> replication product based on 2-phase commit. It's expensive, and I
>>> [snip]
>>>> I could be completely cranked but I believe that product is based on
>>>> PgCluster which is horrendously slow.
>>> Well, dang, that's disappointing. Last I checked, the PGCluster design
>>> was fundamentally unscalable.
>>
>> Multimaster replication generally is - thats why Slony-2 will almost
>> certainly never exist in the form that it was originally imagined.
>> Although I'm not (and never have been) an Oracle user, I've heard that
>> RAC has it's own issues in this area as well.
>
> IMO, the future is application partitioning, not multi-master.
>
> Also, what I find interesting here is that PostgreSQL on modest hardware does
> excessively well.
>
> I have a client right now that is running an 8 core, 16 gig box with only 14
> spindles.
>
> They are processing 6ktps and it takes 14 tomcat servers to bring the
> database down.
>
> Any multi-master solution is going to fall over well before 6ktps.
>
> Sincerely,
>
> Joshua D. Drake
>
> P.S. I should note that we are working toward more than 6ktps and will report
> back ;)
>
>>
>> Regards, Dave
>>
>
>
>

--
Louis Gonzales
louis.gonzales@linuxlouis.net
http://www.linuxlouis.net


Re: SQL Manager 2007 for PostgreSQL released

From
Tony Caduto
Date:
Guillaume Lelarge wrote:
> I asked them some time ago. They answered me this :
>
> As for SQL Manager for PostgreSQL - we regret to inform you that the
> development and support of Linux editions of EMS software products has
> become impossible now that Borland no longer supports Kylix libraries
> for Delphi, on which all Linux versions of EMS software were based.
> Hence, EMS has made a decision to discontinue Linux versions of its
> products. Linux products will not be available for sale or download anymore.
>
>
>
Just an FYI: their win32 versions will probably work fine on Linux via WINE.

Lightning Admin does, anyway (it works via WINE), and they program their
products with Delphi as well, so I think it would work fine.

Just one word of advice about WINE: make sure you have the core MS
TrueType fonts installed, or the win32 apps will look funny, especially
when using editors.
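
As a sketch of that font tip, one common route is the third-party
winetricks helper script; the `corefonts` verb name comes from
winetricks itself, not from this thread, and availability varies by
distribution:

```shell
# Install Microsoft's core TrueType fonts into the default WINE prefix
winetricks corefonts
```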

Later,

--
Tony Caduto
AM Software Design
http://www.amsoftwaredesign.com
Home of PG Lightning Admin for Postgresql
Your best option for Postgresql Administration


multimaster (was: Slightly OT.)

From
Andrew Sullivan
Date:
As an aside -- please don't start new topics in old threads.

On Fri, Jun 01, 2007 at 10:42:02AM -0400, gonzales@linuxlouis.net wrote:
>
> I'm disappointed because SLONY-II has not been released yet to support
> multi-master replication!

Well, I wouldn't hold my breath.  Most of the participants in that
project moved on, after concluding either that it wasn't going to
solve their problems, or concluding that it'd cost too much to
develop and support for the likely benefit it would deliver.  As near
as I can tell, development on the project stopped.

The inspiration for the Slony-II project, Postgres-R, has been ported
forward to the 8.x series by Markus Schiltknecht.  Last I heard, he was
looking for people to underwrite his work on that project.  So if you
really want those features, the obvious way to do it is to put a
programmer on it, and there happens to be a programmer who has a demo
as his argument that it can be done, and he can do it.

I think you have to understand, however, that Slony-II or Postgres-R
was not in fact the magic carpet you seem to think it was.
There are some pretty significant limitations to the async
multimaster approach it uses.  To begin with, AFAIK nobody has a
working, production-grade group communication system available for
use by Postgres -- the ones that the prototypes were built on were
pretty hacky, and appeared not to be ready for prime time.  Second,
nobody has come up with any way to make this work with READ COMMITTED
mode, which means you pay a really huge price for the replication.

My real question in all this is, "What is the problem you are trying
to solve?"  Hot failover using combinations of hardware and software,
and a disk array that can be mounted across two machines, is actually
probably good enough for most cases, assuming it is implemented
correctly (see recent discussion on this topic).  So the availability
piece is mostly solved.  What else do you want?

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
When my information changes, I alter my conclusions.  What do you do sir?
        --attr. John Maynard Keynes

Re: multimaster (was: Slightly OT.)

From
gonzales@linuxlouis.net
Date:
Let me clarify - I don't know how/where the thought arose that I need
something per se; I personally like deploying scaled solutions and
playing with software that others have written.  Can I deploy the right
hardware and configurations to meet pretty much anything anyone could
ever need?  I say yes, I can.  BUT it would be so much cooler IF there
were a multi-master environment configured for the sake of doing it.

I'm glad that not everyone has this disposition about 'why do we need it.'
The pursuit of knowledge is the facilitator of innovation.

As I mentioned in my initial post, I've got OpenLDAP integrated with
PostgreSQL & another application, which my friends and I are going to
release soon as a service - we'll see how far that goes.  Nonetheless, I
would like to have a multi-master PostgreSQL cluster handling my backend
content.

Why?  Because I'd like to.  If you're happy with not pursuing it,
great, that works for you.  Sweet! Killer!  Rock on!  My takeaway is,
now besides knowing there's no active pursuit on the matter, perhaps
others in this huge forum now know more of the status, thanks to those
of 'you' who are so up to date!  In all sincerity, thank you for
clarifying that status.

There are still advanced users out here who would still like to see an
opensource product reach that level of stature, such that those who
don't know - i.e. 'most' managers and people who have decision-making
power - would not be deterred from using PostgreSQL because it doesn't
have, quote unquote, 'multi-master' replication.




On Fri, 1 Jun 2007, Andrew Sullivan wrote:

> As an aside -- please don't start new topics in old threads.
>
> On Fri, Jun 01, 2007 at 10:42:02AM -0400, gonzales@linuxlouis.net wrote:
>>
>> I'm disappointed because SLONY-II has not been released yet to support
>> multi-master replication!
>
> Well, I wouldn't hold my breath.  Most of the participants in that
> project moved on, after concluding either that it wasn't going to
> solve their problems, or concluding that it'd cost too much to
> develop and support for the likely benefit it would deliver.  As near
> as I can tell, development on the project stopped.
>
> The inspiration for the Slony-II project, Postgres-R, has been ported
> forward to 8.x series by Markus Schiltknecht.  Last I heard, he was
> looking for people to underwrite his work on that project.  So if you
> really want those features, the obvious way to do it is to put a
> programmer on it, and there happens to be a programmer who has a demo
> as his argument that it can be done, and he can do it.
>
> I think you have to understand, however, that Slony-II or Postgres-R
> was not in fact the magic carpet you seem to think it was to be.
> There are some pretty significant limitations to the async
> multimaster approach it uses.  To begin with, AFAIK nobody has a
> working, production-grade group communication system available for
> use by Postgres -- the ones that the prototypes were built on were
> pretty hacky, and appeared not to be ready for prime time.  Second,
> nobody has come up with any way to make this work with READ COMMITTED
> mode, which means you pay a really huge price for the replication.
>
> My real question in all this is, "What is the problem you are trying
> to solve?"  Hot failover using combinations of hardware and software,
> and a disk array that can be mounted across two machines, is actually
> probably good enough for most cases, assuming it is implemented
> correctly (see recent discussion on this topic).  So the availability
> piece is mostly solved.  What else do you want?
>
> A
>
>

--
Louis Gonzales
louis.gonzales@linuxlouis.net
http://www.linuxlouis.net


Re: multimaster (was: Slightly OT.)

From
Andrew Sullivan
Date:
On Fri, Jun 01, 2007 at 12:39:42PM -0400, gonzales@linuxlouis.net wrote:

> ever need?  I say yes, I can.  BUT it would so much cooler IF,
> there was a multi-master environment configured for the sake of
> doing it.

Yes, it would be.  Unfortunately, when I was in the position where I
had to explain to my bosses that we still didn't have any
demonstrable progress after a couple of years of effort and expense,
they told me they weren't willing to continue funding that work.  I
presume other people had similar experiences with their bosses.
Moreover, because we had something else that solved our _particular_
problem -- i.e. zero-transaction-loss failure recovery within our SLA
-- it was hard to justify the continued work.

> I'm glad that not everyone has this disposition about 'why do we
> need it.' The pursuit of knowledge is the facilitator of
> innovation.

Others have argued that necessity is the mother of invention.  I have
no idea what causes better software in general; what I know is that,
for companies that are likely to spend money to pay people to work on
multimaster replication, a strong need and evidence of progress are
two necessary conditions.  Bruce mentioned (at PgCon in Ottawa) that
one of the UCB people, in handing the project off, said that Postgres
needed a few people with a lot of time rather than a lot of people
with a little time.  Multimaster replication is like that, and
therefore finding someone who will work on it (either because they're
paid to or because they are incapable of leaving the problem alone)
is going to be the requirement to get it done.

> Why?  Because I'd like to.

My point was really to ask, "How much do you want it?"  If the answer
to that is, "Enough to use it when someone else shows up with the
goods," then I'm trying to suggest that that's not enough to cause
the happy result you desire.  Someone needs to do the work, which
means that someone has to want the result badly enough to put the
time and talent into getting the result.

> would not be deterred from using Postgresql because it doesn't have a
> quote unquote, 'multi-mater' replication.

It is certainly true today that if what you really need is RAC, you
have exactly one company to go to, and that's Oracle.  If you need
something else, there might in fact be a Postgres answer for you.
Part of the problem here is that people often say "multimaster"
without stating what they're asking for.

Does MySQL have multimaster replication?  Well, yes and no, depending
on which set of features you might want to use and what data breakage
you're willing to tolerate.

What about SQL Server?  See above.

Ok, what about DB2? Well, no, not really, but it scales well and you
can do HA with it using hot-standby failover clusters at the OS
level, if you believe their literature.  There's also some component
that sort of hangs on the side and does some data synchronisation
between points, allowing something that looks like multimaster but
sounds like it has some nasty gotchas (and I get that feeling just
from reading the brochures.  OTOH, I am not completely rational when
it comes to claims made by IBM).  You can in fact build a Postgres
system that does the same things today, though.

These are all different solutions to different problems, so it's not
surprising that they look different.  This was the reason I asked,
"What is the problem you are trying to solve?"

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
Unfortunately reformatting the Internet is a little more painful
than reformatting your hard drive when it gets out of whack.
        --Scott Morris

Re: Slightly OT.

From
Jeff Davis
Date:
On Fri, 2007-06-01 at 17:00 +0200, Alexander Staubo wrote:
> the projected Slony-II design, but the setup seems dead simple, and
> from the docs I have found it seems to transparently replicate schema
> changes, unlike Slony-I. So that's something.
>

To be fair to Slony-I, the fact that it does not replicate DDL is a
feature, not a bug. It's table-based, which is a very flexible design.

Regards,
    Jeff Davis


Re: multimaster (was: Slightly OT.)

From
"Alexander Staubo"
Date:
On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
> These are all different solutions to different problems, so it's not
> surprising that they look different.  This was the reason I asked,
> "What is the problem you are trying to solve?"

You mean aside from the obvious one, scalability?

The database is becoming a bottleneck for a lot of so-called "Web
2.0" apps, which use a shared-nothing architecture (such as Rails,
Django or PHP) in conjunction with a database. They generate lots of
ad-hoc queries that come not just from web hits but also from somewhat
awkwardly fitting an object model onto a relational database.

These "new" apps are typically intensely personal and contextual,
where every page is personalized for the visiting user, and doing a
whole bunch of crazy multijoin queries to fetch the latest posts, the
most recent recommendations from your friends, the most highly rated
stuff. In fact, merely doing something seemingly simple like
incrementing a row's counter every time a post has been viewed is
eventually going to have a negative performance impact on a
traditional OLTP-optimized relational database.
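As an illustration of the kind of hot-row write meant here (the table
and column names are hypothetical):

```sql
-- Hypothetical schema: every page view becomes a write transaction
-- against a single hot row. Frequently updated rows serialize on
-- row-level locks and leave behind dead tuples for VACUUM to clean.
UPDATE posts
   SET view_count = view_count + 1
 WHERE post_id = 42;
```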

I'm sure some people would disagree with the significance of the above
(possibly by replying that a relational database is the wrong kind of
tool for such apps), or deny that there is an urgent need to scale
beyond the single server, but I would hope that there would, at some
point, appear a solution that could enable a database to scale
horizontally with minimal impact on the application. In light of this
need, I think we could be more productive by rephrasing the question
"how/when can we implement multimaster replication?" as "how/when can
we implement horizontal scaling?".

As it stands today, horizontally partitioning a database into multiple
separate "shards" is incredibly invasive on the application
architecture, and typically relies on brittle and non-obvious hacks
such as configuring sequence generators with staggered starting
numbers, omitting referential integrity constraints, sacrificing
transactional semantics, and moving query aggregation into the app
level. On top of this, dumb caches such as Memcached are typically
layered to avoid hitting the database in the first place.
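The staggered-sequence trick mentioned above looks roughly like this
(a sketch only; the shard count and sequence names are hypothetical):

```sql
-- Two hypothetical shards generating IDs from disjoint streams, so
-- rows can later be moved or merged without primary-key collisions.

-- On shard 1 (odd IDs):
CREATE SEQUENCE users_id_seq START WITH 1 INCREMENT BY 2;

-- On shard 2 (even IDs):
CREATE SEQUENCE users_id_seq START WITH 2 INCREMENT BY 2;
```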

Still, with MySQL and a bit of glue, guys like eBay, Flickr and
MySpace are partitioning their databases relatively successfully using
such tricks. These guys are not average database users, but they are
not the only ones that have suffered from database bottlenecks and
overcome them using clever, if desperate, measures. Cal Henderson (or
was it Stewart Butterfield?) of Flickr has famously said he would
never again start a project that didn't have partitioning from the
start.

I would love to see a discussion about how PostgreSQL could address
these issues.

Alexander.

Re: multimaster

From
"Joshua D. Drake"
Date:
Alexander Staubo wrote:
> On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
>> These are all different solutions to different problems, so it's not
>> surprising that they look different.  This was the reason I asked,
>> "What is the problem you are trying to solve?"
>
> You mean aside from the obvious one, scalability?

Multimaster doesn't give you scalability (at least not like a lot of
people think it does).

>
> The database is becoming a bottleneck for a lot of so-called "Web
> 2.0" apps, which use a shared-nothing architecture (such as Rails,
> Django or PHP) in conjunction with a database. They generate lots of
> ad-hoc queries that come not just from web hits but also from somewhat
> awkwardly fitting an object model onto a relational database.
>

Databases become a bottleneck when you get a bunch of so-called Web
2.0 developers thinking they know an inch about databases.

What you are basically saying below is... Web 2.0 developers such as
Rails developers have so fundamentally broken the way it is supposed
to be done that we should too...

Not too convincing.

>
> As it stands today, horizontally partitioning a database into multiple
> separate "shards" is incredibly invasive on the application
> architecture, and typically relies on brittle and non-obvious hacks
> such as configuring sequence generators with staggered starting
> numbers, omitting referential integrity constraints, sacrificing
> transactional semantics, and moving query aggregation into the app
> level. On top of this, dumb caches such as Memcached are typically
> layered to avoid hitting the database in the first place.
>
> Still, with MySQL and a bit of glue, guys like eBay, Flickr and
> MySpace are partitioning their databases relatively successfully using
> such tricks. These guys are not average database users, but they are
> not the only ones that have suffered from database bottlenecks and
> overcome them using clever, if desperate, measures. Cal Henderson (or
> was it Stewart Butterfield?) of Flickr has famously said he would
> never again start a project that didn't have partitioning from the
> start.
>
> I would love to see a discussion about how PostgreSQL could address
> these issues.
>
> Alexander.


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Slightly OT.

From
"Alexander Staubo"
Date:
On 6/1/07, Jeff Davis <pgsql@j-davis.com> wrote:
> On Fri, 2007-06-01 at 17:00 +0200, Alexander Staubo wrote:
> > the projected Slony-II design, but the setup seems dead simple, and
> > from the docs I have found it seems to transparently replicate schema
> > changes, unlike Slony-I. So that's something.
>
> To be fair to Slony-I, the fact that it does not replicate DDL is a
> feature, not a bug. It's table-based, which is a very flexible design.

I fail to see how that's an excuse not to replicate DDL. If I run
"alter table" on the master, there is no reason whatever that this
command cannot be executed on all the slaves -- which is what I would
expect of a replication system.

To put it differently: A slave's table is a replica of the master's
table; if I alter the master table, and the slave is not updated to
reflect this change, then the slave table is no longer a true replica,
and the system has failed its core purpose, that of *replicating*.

I could be wrong, but I believe Slony fails at this because it is
trigger-based and simply cannot detect DDL changes.

Alexander.

Re: Slightly OT.

From
"Joshua D. Drake"
Date:
Alexander Staubo wrote:
> On 6/1/07, Jeff Davis <pgsql@j-davis.com> wrote:
>> On Fri, 2007-06-01 at 17:00 +0200, Alexander Staubo wrote:
>> > the projected Slony-II design, but the setup seems dead simple, and
>> > from the docs I have found it seems to transparently replicate schema
>> > changes, unlike Slony-I. So that's something.
>>
>> To be fair to Slony-I, the fact that it does not replicate DDL is a
>> feature, not a bug. It's table-based, which is a very flexible design.
>
> I fail to see how that's an excuse not to replicate DDL. If I run
> "alter table" on the master, there is no reason whatever that this
> command cannot be executed on all the slaves -- which is what I would
> expect of a replication system.

As the owner of a company that is actually actively developing a
replication system, and has been for years... I suggest you start
putting your code where your words are.

This is not nearly as simple as it seems. There is a reason that Slony
attempts to do it in user space instead of postgresql space.

Joshua D. Drake


>
> To put it differently: A slave's table is a replica of the master's
> table; if I alter the master table, and the slave is not updated to
> reflect this change, then the slave table is no longer a true replica,
> and the system has failed its core purpose, that of *replicating*.
>
> I could be wrong, but I believe Slony fails at this because it is
> trigger-based and simply cannot detect DDL changes.
>
> Alexander.


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Slightly OT.

From
Andrew Sullivan
Date:
On Fri, Jun 01, 2007 at 08:57:36PM +0200, Alexander Staubo wrote:
> I fail to see how that's an excuse not to replicate DDL. If I run
> "alter table" on the master, there is no reason whatever that this
> command cannot be executed on all the slaves -- which is what I would
> expect of a replication system.

There is a way to replicate the DDL.  A well-documented way.  It's
sort of ugly, I think everyone will admit, but it was the path chosen
because it allowed the entire thing to fit in user space, which meant
it was possible to install it on an unpatched PostgreSQL that was
already deployed in the field.  That's a non-zero benefit.
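For the curious, the documented way is Slony-I's EXECUTE SCRIPT
command, which pushes a DDL file through slonik so it runs on every
node at a consistent point in the replication stream. A sketch only;
the cluster name, connection strings and file path are hypothetical:

```
# slonik script (sketch, not a tested configuration)
cluster name = replcluster;
node 1 admin conninfo = 'dbname=app host=master user=slony';
node 2 admin conninfo = 'dbname=app host=slave user=slony';

EXECUTE SCRIPT (
    SET ID = 1,                        # replication set affected
    FILENAME = '/tmp/ddl_change.sql',  # file containing the DDL
    EVENT NODE = 1                     # node where the DDL originates
);
```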

> I could be wrong, but I believe Slony fails at this because it is
> trigger-based and simply cannot detect DDL changes.

No, there were in fact alternatives (like, for instance, patching the
back end code).  But that was undesirable for the reason I note
above.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism.
                --Brad Holland

Re: multimaster

From
Andrew Sullivan
Date:
On Fri, Jun 01, 2007 at 11:50:09AM -0700, Joshua D. Drake wrote:
> What you are basically saying below is... Web 2.0 developers such as
> Rails developers have so fundamentally broken the way it is supposed
> to be done that we should too...

I don't know that's all there is to it.  After all, we support arrays
in the database, which ought to make any relational purist shudder.
It seems to me that there _are_ legitimate limits to single-back-end
approaches, even for well designed systems, and being able to spread
some of that load around would be a big advantage.  I just don't want
to do it without thinking hard about what the compromises might be,
and figuring out which ones to make.  "Breaks transactional model",
for instance, is right out :)

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
I remember when computers were frustrating because they *did* exactly what
you told them to.  That actually seems sort of quaint now.
        --J.D. Baldwin

Re: multimaster (was: Slightly OT.)

From
Andrew Sullivan
Date:
On Fri, Jun 01, 2007 at 08:40:13PM +0200, Alexander Staubo wrote:
> On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
> >These are all different solutions to different problems, so it's not
> >surprising that they look different.  This was the reason I asked,
> >"What is the problem you are trying to solve?"
>
> You mean aside from the obvious one, scalability?

Why is that the "obvious one"?  If that's your problem, say so.  I
have different problems.  I don't need 30 back end machines to keep
my website running.  Something obvious in one context is a misfeature
of pointless complication in another.

> appear a solution that could enable a database to scale horizontally
> with minimal impact on the application. In light of this need, I think
> we could be more productive by rephrasing the question "how/when can
> we implement multimaster replication?" as "how/when can we implement
> horizontal scaling?".

Indeed, this may well be a different problem.  In fact, if what you
want is "to scale horizontally with minimal impact on the
application", I encourage you to go out and buy the first database
replication system that will actually do that for you.  Not the one
that _tells_ you they can, the one that actually does.

I agree that horizontal scaling is a desirable feature, but I don't
think it obvious that multimaster replication, whatever that means,
is the thing that will solve that problem.

> I would love to see a discussion about how PostgreSQL could address
> these issues.

Well, a good start would be to list what exactly you do and do not
mean by horizontal scaling: what is the behaviour under various
scenarios.  That's a good way to list at least what the problem is.
(Your mail was a good start, but only a start.  Is RI required across
nodes?  Why not?  Why?  What is allowed to break?  &c.)

A
--
Andrew Sullivan  | ajs@crankycanuck.ca
The fact that technology doesn't work is no bar to success in the marketplace.
        --Philip Greenspun

Re: Slightly OT.

From
Tom Lane
Date:
"Alexander Staubo" <alex@purefiction.net> writes:
> On 6/1/07, Jeff Davis <pgsql@j-davis.com> wrote:
>> To be fair to Slony-I, the fact that it does not replicate DDL is a
>> feature, not a bug. It's table-based, which is a very flexible design.

> I fail to see how that's an excuse not to replicate DDL.
> I could be wrong, but I believe Slony fails at this because it is
> trigger-based and simply cannot detect DDL changes.

You are wrong.  The Slony guys say this is intentional, and they have
some good arguments.  They may be making a virtue of necessity, but
automatic replication of DDL is not nearly as open-and-shut a decision
as you paint it.

            regards, tom lane

Re: multimaster

From
"Joshua D. Drake"
Date:
Andrew Sullivan wrote:
> On Fri, Jun 01, 2007 at 11:50:09AM -0700, Joshua D. Drake wrote:
>> What you are basically saying below is... Web 2.0 developers such as
>> Rails developers have so fundamentally broken the way it is supposed
>> to be done that we should too...
>
> I don't know that's all there is to it.  After all, we support arrays
> in the database, which ought to make any relational purist shudder.
> It seems to me that there _are_ legitimate limits to single-back-end
> approaches, even for well designed systems, and being able to spread
> some of that load around would be a big advantage.

I am not arguing against balancing load. I am arguing against the idea
that multi-master is some holy grail.

Sincerely,

Joshua D. Drake

> I just don't want
> to do it without thinking hard about what the compromises might be,
> and figuring out which ones to make.  "Breaks transactional model",
> for instance, is right out :)
>
> A
>


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Slightly OT.

From
Chris Browne
Date:
gonzales@linuxlouis.net writes:
> I'm disappointed because SLONY-II has not been released yet to support
> multi-master replication!  PostgreSQL is going through all of the
> releases - and that's great - BUT, where is the sync-up with the
> powerhouse of a component, that Slony-II would bring to the table?
> Slony-I is pretty sweet, but if Slony-II would release, I can imagine
> that this would introduce some major competition in the enterprise
> world against the commercial dyno's.

There is some effort still going into Postgres-R from a research
perspective, but as far as I know, nobody has been working on Slony-II
for well over a year now.

Unfortunately, a combination of factors went together to make it "not
workable."

- Spread licensing was something of an issue;
- Spread scalability caused quite a few issues (it's apparently
  tough to make it stable when using it under HEAVY load);
- There was a token passing bottleneck in Spread limiting
  its performance;
- New application failure scenarios emerged that wouldn't have
  taken place in non-MM systems.

The issues built up to the point where it was no longer worthwhile to
continue the development effort :-(.

If someone completed a suitable reimplementation of the wheel,
GCS-wise, and produced something more usable for the purpose, such as
the (theorized) Anasazi system, it might be worth proceeding again.

http://www.lethargy.org/~jesus/archives/53-Lets-reimplement-the-wheel...-or-at-least-another-GCS..html

But it's fair to say that reality did not live up to the early hopes.
Supposing we came up with a Wicked Better GCS, that might merely allow
efforts to get a bit further, and then hang up on something else.
Don't hold your breath expecting Slony-II to be around any corners...
--
let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;;
http://linuxfinances.info/info/sap.html
"It seems  certain that much of  the success of Unix  follows from the
readability, modifiability, and portability of its software."
-- Dennis M. Ritchie, September, 1979

Re: multimaster

From
Chris Browne
Date:
alex@purefiction.net ("Alexander Staubo") writes:
> On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
>> These are all different solutions to different problems, so it's not
>> surprising that they look different.  This was the reason I asked,
>> "What is the problem you are trying to solve?"
>
> You mean aside from the obvious one, scalability?

I'd have to call that expectation "obviously WRONG."

It was *CERTAIN* that Slony-II, if it had turned out as good as the
*most optimistic hopes* were going, would have some substantial losses
of performance compared to a single DB instance due to the need to
apply locks across all nodes.

There would be *some* scalability gains to be had, but the primary
reason for looking for multimaster replication is that you need high
availability so badly that you are willing to give up performance to
get it.

> As it stands today, horizontally partitioning a database into multiple
> separate "shards" is incredibly invasive on the application
> architecture, and typically relies on brittle and non-obvious hacks
> such as configuring sequence generators with staggered starting
> numbers, omitting referential integrity constraints, sacrificing
> transactional semantics, and moving query aggregation into the app
> level. On top of this, dumb caches such as Memcached are typically
> layered to avoid hitting the database in the first place.

Question: In what way would you expect an attempt to do
mostly-trying-to-be-transparent multimaster replication to help with
these issues you're bringing up?

Slony-II was trying to provide answers to various of those
"non-obvious hacks"; various of those things point at areas where it
would *have to* be somewhat slow.

> Still, with MySQL and a bit of glue, guys like eBay, Flickr and
> MySpace are partitioning their databases relatively successfully using
> such tricks. These guys are not average database users, but they are
> not the only ones that have suffered from database bottlenecks and
> overcome them using clever, if desperate, measures. Cal Henderson (or
> was it Stewart Butterfield?) of Flickr has famously said he would
> never again start a project that didn't have partitioning from the
> start.
>
> I would love to see a discussion about how PostgreSQL could address
> these issues.

Partitioning isn't multimaster replication; it's something worthy of
having a discussion independent of anything about MMR.
--
(reverse (concatenate 'string "ofni.sesabatadxunil" "@" "enworbbc"))
http://linuxdatabases.info/info/sap.html
"There  are  almost  unlimited  ways  for making  your  programs  more
complicated or bizarre" -- Arthur Norman

Continuous PITR (was Re: multimaster)

From
Ron Johnson
Date:
On 06/01/07 11:16, Andrew Sullivan wrote:
[snip]
>
> My real question in all this is, "What is the problem you are trying
> to solve?"  Hot failover using combinations of hardware and software,
> and a disk array that can be mounted across two machines, is actually
> probably good enough for most cases, assuming it is implemented
> correctly (see recent discussion on this topic).  So the availability
> piece is mostly solved.  What else do you want?

What you need are disk blocks to be mirrored to a machine at the DR
site.  Or "continuous PITR" to the DR machine.

How difficult would it be to modify the process (the postmaster?)
that writes the xlogs(?) to tee them to a listening process across
the cloud on the DR machine, which then applies them to the DR database?

This then begs the question: are CREATE|ALTER TABLESPACE commands
stored in the xlogs?
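For what it's worth, PostgreSQL 8.x's WAL archiving can already
approximate such a tee: each completed xlog segment can be shipped
off-site as soon as it is finished. A sketch; the hostname and paths
are hypothetical:

```
# postgresql.conf on the primary (sketch, not a tested setup).
# %p is the path of the finished WAL segment, %f its file name.
archive_command = 'scp %p dr-host:/var/lib/pgsql/wal_archive/%f'
# The DR machine then stays in recovery mode, restoring each segment
# as it arrives (via restore_command in its recovery configuration).
```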

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


Re: Slightly OT.

From
"Alexander Staubo"
Date:
On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote:
> As the owner of a company that actually actively developing a
> replication system and has for years... I suggest you start putting your
> code where your words are.

That doesn't make any sense. As a database *user* it's my prerogative
to criticize the bits that make my life painful. Intentional or not,
the Slony design compromises its user-friendliness.

I would love for the answer to have been "sorry, we did not have time
or manpower enough to implement fully transparent replication yet,
because it's rather complex, you see"; but it's not, and I balk at
the idea that you cannot strive for something better.

For example, there is clearly an opportunity to implement the
appropriate hooks in PostgreSQL that can be used *if they are
available*; otherwise, on unpatched/older systems, require the use of
the slonik command.

Alexander.

Re: Slightly OT.

From
"Alexander Staubo"
Date:
On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
> > I could be wrong, but I believe Slony fails at this because it is
> > trigger-based and simply cannot detect DDL changes.
>
> No, there were in fact alternatives (like, for instance, patching the
> back end code).  But that was undesirable for the reason I note
> above.

Curiously enough, that does not conflict with anything I wrote. I am,
clearly, not wrong: A deliberate decision was made not to patch
PostgreSQL with the hooks Slony would need to support DDL changes;
therefore, since it relies purely on triggers, it cannot detect DDL
changes.

Alexander.

Re: Continuous PITR (was Re: multimaster)

From
Andrew Sullivan
Date:
On Fri, Jun 01, 2007 at 03:58:01PM -0500, Ron Johnson wrote:
> What you need are disk blocks to be mirrored to a machine at the DR
> site.  Or "continuous PITR" to the DR machine.

I thought you could already do this?  (I'm not, but I was pretty sure
someone reported doing it already.)

> This then begs the question: are CREATE|ALTER TABLESPACE commands
> stored in the xlogs?

(I'll spare the rant about begging the question.)  Since they're
transactional, they must be, no?

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
When my information changes, I alter my conclusions.  What do you do sir?
        --attr. John Maynard Keynes

Re: Slightly OT.

From
Andrew Sullivan
Date:
On Fri, Jun 01, 2007 at 11:08:50PM +0200, Alexander Staubo wrote:
> That doesn't make any sense. As a database *user* it's my prerogative
> to criticize the bits that make my life painful.

Sure.  And as a user of free software, it is your prerogative to
propose a way that the software can be modified to make it do what
you want, and to do the work to so modify it.  Does this mean you are
offering?

> For example, there is clearly an opportunity to implement the
> appropriate hooks in PostgreSQL that can be used *if they are
> available*; otherwise, on unpatched/older systems, require the use of
> the slonik command.

I don't know that that is clear at all.  To begin with, one would
have to get agreement on what those hooks would be.  If you look on
the pgfoundry site, you'll note that I set up a project there to try
to get such a list of hooks defined.  It went nowhere: everyone who
was working on replication said it was premature and impossible to do
this in advance and such like.  Moreover, what you are suggesting is
a _massive_ increase in the complications of the code, because it
suggests to me that you want DDL to happen as easily as it does in
single-node cases.  But it's not that easy, which is another part of
the reason DDL is handled specially.  If you don't know what the hard
parts are, I suggest you go and read the rather detailed original
concept document that Jan put together for the community prior to
starting work on the system.  But just as a teaser: what do you do if
your DDL on the local node has succeeded, and you added additional
data in the same transaction, but the DDL fails for some reason on a
remote node?  Note that this one isn't even one of the actually
tricky cases.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
        --Alexander Hamilton

Re: Slightly OT.

From
Gregory Stark
Date:
"Alexander Staubo" <alex@purefiction.net> writes:

> I would love for the answer to have been "sorry, we did not have time
> or manpower enough to implement fully transparent replication yet,
> because it's rather complex, you see";

Would you still love that if you're one of the people who use replication to
move the data to a reporting database which has a modified schema appropriate
for the different usage? This improvement would make it useless for that
purpose.


--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


Re: Slightly OT.

From
"Alexander Staubo"
Date:
On 6/2/07, Gregory Stark <stark@enterprisedb.com> wrote:
> "Alexander Staubo" <alex@purefiction.net> writes:
>
> > I would love for the answer to have been "sorry, we did not have time
> > or manpower enough to implement fully transparent replication yet,
> > because it's rather complex, you see";
>
> Would you still love that if you're one of the people who use replication to
> move the data to a reporting database which has a modified schema appropriate
> for the different usage? This improvement would make it useless for that
> purpose.

All you would require is a simple boolean flag to enable or disable
automatic DDL propagation, surely. Clearly people use replication for
different purposes; the current system favours people who prefer to
handle DDL propagation manually, and I am not one of them.

Alexander.

Re: Continuous PITR (was Re: multimaster)

From
Ron Johnson
Date:
On 06/01/07 16:25, Andrew Sullivan wrote:
> On Fri, Jun 01, 2007 at 03:58:01PM -0500, Ron Johnson wrote:
>> What you need are disk blocks to be mirrored to a machine at the DR
>> site.  Or "continuous PITR" to the DR machine.
>
> I thought you could already do this?  (I'm not, but I was pretty sure
> someone reported doing it already.)

From within PG, or "manually"?

>> This then begs the question: are CREATE|ALTER TABLESPACE commands
>> stored in the xlogs?
>
> (I'll spare the rant about begging the question.)  Since they're
> transactional, they must be, no?

I wasn't sure whether they were a special case or not.

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


Re: Slightly OT.

From
"Alexander Staubo"
Date:
On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
> On Fri, Jun 01, 2007 at 11:08:50PM +0200, Alexander Staubo wrote:
> > That doesn't make any sense. As a database *user* it's my prerogative
> > to criticize the bits that make my life painful.
>
> Sure.  And as a user of free software, it is your prerogative to
> propose a way that the software can be modified to make it do what
> you want, and to do the work to so modify it.  Does this mean you are
> offering?

It's my prerogative, but not my moral obligation. I have no intention
of becoming a Slony developer. I use Slony because I have no choice,
and I would not have touched it with a bargepole if I did not need to.
(I do contribute to projects that I do enjoy using.) On the other
hand, if my current project goes well, I hope that I will be able to
persuade my partners to set aside some cash to fund improvements to
PostgreSQL and/or Slony.

> > For example, there is clearly an opportunity to implement the
> > appropriate hooks in PostgreSQL that can be used *if they are
> > available*; otherwise, on unpatched/older systems, require the use of
> > the slonik command.
>
> I don't know that that is clear at all.
[snip]

Perhaps not, but all I wrote was that there is an *opportunity*,
technical complexities aside, in an effort to provide *constructive*
criticism.
I said this based on your previous comment, that "the path [was]
chosen because it allowed the entire thing to fit in user space, which
meant it was possible to install it on an unpatched PostgreSQL", which
implies that patching PostgreSQL is a possibility that may be
considered.

> To begin with, one would
> have to get agreement on what those hooks would be.  If you look on
> the pgfoundry site, you'll note that I set up a project there to try
> to get such a list of hooks defined.  It went nowhere:
[snip]

For the record, I really appreciate the effort you made. I wasn't
paying attention to pgsql-hackers at the time, so I missed the
somewhat depressing discussion you had with Markus Schiltknecht.

> Moreover, what you are suggesting is
> a _massive_ increase in the complications of the code

That, incidentally, is the kind of thing I want to hear, as opposed to
"you are wrong", which is neither helpful nor polite.

[snip]
> the reason DDL is handled specially.  If you don't know what the hard
> parts are, I suggest you go and read the rather detailed original
> concept document that Jan put together for the community prior to
> starting work on the system.  But just as a teaser: what do you do if
> your DDL on the local node has succeeded, and you added additional
> data in the same transaction, but the DDL fails for some reason on a
> remote node?  Note that this one isn't even one of the actually
> tricky cases.

Could you not (I ask naively) detect when the first DDL statement is
submitted in a transaction on the master, then start a transaction on
each slave, funnel it and all subsequent statements synchronously to
every node, and then prepare and commit everywhere?
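That "prepare and commit" step presumably maps onto PostgreSQL's
two-phase commit, available since 8.1. A sketch of what a coordinator
might issue on each node; the transaction identifier and the DDL
statement are hypothetical:

```sql
BEGIN;
ALTER TABLE posts ADD COLUMN rating integer;
-- Phase 1: persist the transaction on this node without committing.
PREPARE TRANSACTION 'ddl_change_42';

-- Phase 2, issued only after EVERY node has prepared successfully:
COMMIT PREPARED 'ddl_change_42';

-- If any node failed to prepare, roll back everywhere instead:
-- ROLLBACK PREPARED 'ddl_change_42';
```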

Mind you, I profess virtual ignorance of the numerous border cases
involved, so do go ahead and tell me how wrong I am, and how silly I
am for having the balls to even suggest something like this. :)

This suggestion implies transparent DDL replication would be
synchronous, which seems like a decent compromise when you can ensure
that all nodes commit atomically, or are virtually guaranteed to do
so. That would be an improvement on the current behaviour (section
15 of the Slony manual):

"If there is anything broken about the script, or about how it
executes on a particular node, this will cause the slon daemon for
that node to panic and crash"

Alexander.

Re: multimaster

From
Guy Rouillier
Date:
Alexander Staubo wrote:

> As it stands today, horizontally partitioning a database into multiple
> separate "shards" is incredibly invasive on the application
> architecture, and typically relies on brittle and non-obvious hacks
> such as configuring sequence generators with staggered starting
> numbers, omitting referential integrity constraints, sacrificing
> transactional semantics, and moving query aggregation into the app
> level. On top of this, dumb caches such as Memcached are typically
> layered to avoid hitting the database in the first place.
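For reference, the staggered-sequence hack mentioned above amounts to
this: each shard's ID sequence starts at a different offset and steps by
the shard count, so generated IDs never collide across shards (in
PostgreSQL terms, CREATE SEQUENCE ... START WITH n INCREMENT BY
shard_count). A toy sketch:

```python
# Toy illustration of staggered sequence generators: shard i (of N) hands
# out i+1, i+1+N, i+1+2N, ... so no two shards ever produce the same ID.

def staggered_sequence(shard_index, shard_count):
    value = shard_index + 1            # shards numbered 0..shard_count-1
    while True:
        yield value
        value += shard_count

seq_a = staggered_sequence(0, 2)       # shard A: 1, 3, 5, ...
seq_b = staggered_sequence(1, 2)       # shard B: 2, 4, 6, ...
```

It works, but it is exactly the kind of brittle convention the paragraph
above complains about: add a shard and every generator must be
renumbered.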

Have you looked at raidb?  http://c-jdbc.objectweb.org.  Narrow niche,
but if it happens to be the one you are in, then it's an option.  I took
a quick look at the user's page, and both of them were using PostgreSQL.

--
Guy Rouillier

Re: Slightly OT.

From
Andrew Sullivan
Date:
On Sat, Jun 02, 2007 at 12:05:20AM +0200, Alexander Staubo wrote:
> All you would require is a simple boolean flag to enable or disable
> automatic DDL propagation, surely.

You know, it is just possible that some of the responses you are
getting in this thread have to do with the glib way you say "just a
simple flag", waving away all the corner cases and difficult parts.

What do you do, for instance, when your automatic DDL wedges the
replication system after data-replicating events have come in?  What
do you do when there happens to be a schema mismatch that you didn't
know about?  (How do you even detect such a thing?)  That isn't a
SMOP: it requires design that is not trivial, and you don't seem to
be spending any time thinking about those issues before brushing off
the current design as some sort of nasty thoughtless attempt to make
your life more difficult on the part of those who have worked on the
system.  DDL changes require that every node have the new schema
before any of the node-affecting data gets there.  We have _enough_
problems with DDL failing on target systems without increasing this
problem tenfold by doing it automatically.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
The whole tendency of modern prose is away from concreteness.
        --George Orwell

Re: Slightly OT.

From
Andrew Sullivan
Date:
On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote:
> Could you not (I ask naively) detect when the first DDL statement is
> submitted in a transaction

Maybe.

> on the master, then start a transaction on
> each slave, then funnel this and all subsequent statements
> synchronously to every node, then prepare and commit everyone?

You could if 2PC was ubiquitous, which it certainly wasn't when the
code was designed (remember, it was originally compatible all the way
back to 7.3).  Some people suggested using 2PC "if it's there", but
that just seems to me to be asking for really painful problems.  It
also entails that all DDL has to happen on every node at the same
time, which imposes a bottleneck not actually currently in the
system.

It is probably the case, however, that version 2 of the system will
break some of these backwards compatibility attempts in order to
depend on some new back end features -- putting this entirely in user
space turns out to be awful.  It's how we got the monstrous catalog
corruption hack.

This is getting pretty Slony specific, though, so if we're to
continue this thread, I suggest we do it on the Slony list.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
"The year's penultimate month" is not in truth a good way of saying
November.
        --H.W. Fowler

Re: Slightly OT.

From
Jeff Davis
Date:
On Sat, 2007-06-02 at 00:05 +0200, Alexander Staubo wrote:
> On 6/2/07, Gregory Stark <stark@enterprisedb.com> wrote:
> > "Alexander Staubo" <alex@purefiction.net> writes:
> >
> > > I would love for the answer to have been "sorry, we did not have time
> > > or manpower enough to implement fully transparent replication yet,
> > > because it's a rather complex, you see";
> >
> > Would you still love that if you're one of the people who use replication to
> > move the data to a reporting database which has a modified schema appropriate
> > for the different usage? This improvement would make it useless for that
> > purpose.
>
> All you would require is a simple boolean flag to enable or disable
> automatic DDL propagation, surely. Clearly people use replication for
> different purposes; the current system favours people who prefer to
> handle DDL propagation manually, and I am not one of them.
>

Here is some work going on that looks like what you want:

http://archives.postgresql.org/pgsql-hackers/2007-03/msg00050.php
http://momjian.postgresql.org/cgi-bin/pgtodo?pitr

You might also seriously consider PgPool-II.

There are many, many approaches to replication, high-availability, and
related topics. For almost any combination of log-based, query-based,
trigger-based and synchronous, asynchronous and master/master,
master/slave and shared storage, local storage -- there is some PG
person working on it.

They all have advantages and disadvantages. The one, specific type of
replication that suits you is not necessarily what everyone else wants.

Try to understand that some of these options are built around businesses
out of *need* rather than want. Businesses *need* functionality and
flexibility. Administrative simplicity and conveniences (like replicated
DDL) are obviously not the only goals of something like Slony-I.

Regards,
    Jeff Davis




One last Slony question (was Re: Slightly OT.)

From
Ron Johnson
Date:
On 06/01/07 17:31, Andrew Sullivan wrote:
> On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote:
>> Could you not (I ask naively) detect when the first DDL statement is
>> submitted in a transaction
>
> Maybe.
>
>> on the master, then start a transaction on
>> each slave, then funnel this and all subsequent statements
>> synchronously to every node, then prepare and commit everyone?
>
> You could if 2PC was ubiquitous, which it certainly wasn't when the
> code was designed (remember, it was originally compatible all the way
> back to 7.3).  Some people suggested using 2PC "if it's there", but
> that just seems to me to be asking for really painful problems.  It
> also entails that all DDL has to happen on every node at the same
> time, which imposes a bottleneck not actually currently in the
> system.

Since DDL is infrequent, is that bottleneck an acceptable trade-off?

> It is probably the case, however, that version 2 of the system will
> break some of these backwards compatibility attempts in order to
> depend on some new back end features -- putting this entirely in user
> space turns out to be awful.  It's how we got the monstrous catalog
> corruption hack.
>
> This is getting pretty Slony specific, though, so if we're to
> continue this thread, I suggest we do it on the Slony list.



--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


Re: Slightly OT.

From
"Alexander Staubo"
Date:
On 6/2/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
> On Sat, Jun 02, 2007 at 12:05:20AM +0200, Alexander Staubo wrote:
> > All you would require is a simple boolean flag to enable or disable
> > automatic DDL propagation, surely.
>
> You know, it is just possible that some of the responses you are
> getting in this thread have to do with the glib way you say "just a
> simple flag", waving away all the corner cases and difficult parts.

Or maybe I'm focused on the end-user experience and trying to mentally
fit the technology to that idea rather than the other way around. Too
much software is designed by developers, and too little software is
designed for users.

I don't mean to be glib at all -- the person was asking, in effect,
"but this hypothetical feature X would break case Y", to which I
suggested a flag to turn X on or off. PostgreSQL has plenty of flags
that turn features on and off. Whether or not the hypothetical feature
is hard to implement, or should be implemented, is an orthogonal
concern.

> What do you do, for instance, when your automatic DDL wedges the
> replication system after data-replicating events have come in?

There needs to be a point of synchronization when a DDL transaction
appears that blocks further write transactions from running. As far as
I can tell, the slaves themselves can continue to receive pending
events, but perhaps not. As far as I can tell, table locking on the
master ensures that concurrent, not-yet-committed transactions started
*before* the DDL transaction will block the DDL itself, but I'm sure
there's a gap here that could lead to craziness; it seems to me that
you could detect it reliably and throw an error when it happens.

I admit I am at a disadvantage here. I never intended to enter into a
full-fledged discussion about implementation details, but you teased
me, dammit. I clearly don't have enough knowledge about the way Slony
works, but that does not disqualify me from suggesting that the
current behaviour is unfriendly. Certainly there are technical and
historic facts that explain this behaviour, but I see nothing wrong
with challenging those notions in the name of helping the situation,
at least according to my own aesthetics.

Last I checked, nobody was actually terribly *happy* about having to
pipe schema changes through slonik. Of course, as far as I can see
(especially with regard to this thread), nobody seems to mind all that
much, either, but then again people put up with a lot of crap before
we had, say, sliced bread or psql tab completion.

> [...] before brushing off
> the current design as some sort of nasty thoughtless attempt to make
> your life more difficult on the part of those who have worked on the
> system.

I'm not that paranoid. :)

Alexander.

Re: One last Slony question (was Re: Slightly OT.)

From
"Joshua D. Drake"
Date:
Ron Johnson wrote:
> On 06/01/07 17:31, Andrew Sullivan wrote:
>> On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote:
>>> Could you not (I ask naively) detect when the first DDL statement is
>>> submitted in a transaction
>>
>> Maybe.
>>
>>> on the master, then start a transaction on
>>> each slave, then funnel this and all subsequent statements
>>> synchronously to every node, then prepare and commit everyone?
>>
>> You could if 2PC was ubiquitous, which it certainly wasn't when the
>> code was designed (remember, it was originally compatible all the way
>> back to 7.3).  Some people suggested using 2PC "if it's there", but
>> that just seems to me to be asking for really painful problems.  It
>> also entails that all DDL has to happen on every node at the same
>> time, which imposes a bottleneck not actually currently in the
>> system.
>
> Since DDL is infrequent, is that bottleneck an acceptable trade-off?

Define infrequent? I have customers that do it every day in prod. They
do it willingly and refuse to change that habit.

Joshua D. Drake



--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Slightly OT.

From
"Alexander Staubo"
Date:
On 6/2/07, Jeff Davis <pgsql@j-davis.com> wrote:
> Here is some work going on that looks like what you want:
>
> http://archives.postgresql.org/pgsql-hackers/2007-03/msg00050.php

I had no idea someone was working on WAL-log-based replication; I saw
the TODO entry a while ago, but I missed the thread. I think WAL
replication is a beautiful idea, so I'll gladly throw my support
behind this. Thanks for the pointer.

> You might also seriously consider PgPool-II.

pgpool-II seems like a decent idea. I'm not sure if the partitioning
can support referential integrity though -- would they have to be
declared as CHECK constraints that used dblink()?

Also, it doesn't seem capable of planning a query intelligently, which
means that a query such as "select * from foo where id = 123" is going
to be aggregated across all nodes even though only one node has the
partition covering the id range [0, 1000], say.

Alexander.

Re: One last Slony question (was Re: Slightly OT.)

From
Ron Johnson
Date:
On 06/01/07 18:35, Joshua D. Drake wrote:
> Ron Johnson wrote:
>> On 06/01/07 17:31, Andrew Sullivan wrote:
>>> On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote:
>>>> Could you not (I ask naively) detect when the first DDL statement is
>>>> submitted in a transaction
>>>
>>> Maybe.
>>>
>>>> on the master, then start a transaction on
>>>> each slave, then funnel this and all subsequent statements
>>>> synchronously to every node, then prepare and commit everyone?
>>>
>>> You could if 2PC was ubiquitous, which it certainly wasn't when the
>>> code was designed (remember, it was originally compatible all the way
>>> back to 7.3).  Some people suggested using 2PC "if it's there", but
>>> that just seems to me to be asking for really painful problems.  It
>>> also entails that all DDL has to happen on every node at the same
>>> time, which imposes a bottleneck not actually currently in the
>>> system.
>>
>> Since DDL is infrequent, is that bottleneck an acceptable trade-off?
>
> Define infrequent? I have customers that do it every day in prod. They
> do it willingly and refuse to change that habit.

Even 2 or 3 ALTER TABLE or CREATE INDEX or CREATE TABLE statements
per day is a drop in the bucket compared to the number of I/U/D
statements, no?

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


Re: One last Slony question (was Re: Slightly OT.)

From
"Joshua D. Drake"
Date:
Ron Johnson wrote:
> On 06/01/07 18:35, Joshua D. Drake wrote:
>>>
>>> Since DDL is infrequent, is that bottleneck an acceptable trade-off?
>>
>> Define infrequent? I have customers that do it every day in prod. They
>> do it willingly and refuse to change that habit.
>
> Even 2 or 3 ALTER TABLE or CREATE INDEX or CREATE TABLE statements per
> day is a drop in the bucket compared to the number of I/U/D statements, no?

True.

J




--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: multimaster

From
"Alexander Staubo"
Date:
On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote:
> Alexander Staubo wrote:
> > You mean aside from the obvious one, scalability?
>
> Multimaster doesn't give you scalability (at least not like a lot of
> people think it does).

That depends on your particular definition of multimaster.

> Databases are a bottleneck when you get a bunch of so called web 2.0
> developers thinking they know an inch about databases.

That's a fairly bizarre and disrespectful blanket statement that
includes some good friends of mine, and probably many of the people on
this list. I wonder what you mean by it; regardless of the type of
application, a single machine has a very real, very finite number of
transactions it can process per second, and a finite number of
spindles you can attach to it. No amount of handwaving about the
incompetence of others changes that.

Note that I am not talking about multimaster replication here, just
scalability. Like Andrew Sullivan I think multimaster replication is
infeasible by design, at least with a system such as PostgreSQL.

> What you are basically saying below is... web 2.0 developers such as
> rails developers have so fundamentally broken the way it is supposed to
> be done, we should too...

I don't know if I said that, but I would love to hear how they have
broken it, and what you propose the solution to be.

The eBay architecture is one interesting example, demonstrating a
situation where no single box could possibly handle an entire
database. Their requirements are extreme, but there are much smaller
datasets and applications with similar performance characteristics
that easily saturate a single box, so at a high level the same
principles apply.

  http://glinden.blogspot.com/2006/12/talk-on-ebay-architecture.html

The MySpace scaling story is similarly interesting, if mostly because
of the egregious blunders made underway:

  http://www.baselinemag.com/article2/0,1540,2082921,00.asp

You could argue that they solved the problems in an inelegant,
irrelational way, but it seems that they solved it, and this is
reflected on their balance sheets, which in the end is probably the
most appropriate metric of success.

Alexander.

Re: multimaster

From
"Alexander Staubo"
Date:
On 6/2/07, Guy Rouillier <guyr-ml1@burntmail.com> wrote:
> Have you looked at raidb?  http://c-jdbc.objectweb.org.  Narrow niche,
> but if it happens to be the one you are in, then it's an option.  I took
> a quick look at the user's page, and both of them were using PostgreSQL.

Looked at it briefly. C-JDBC is called Sequoia now. It's probably
pretty good, but it's Java, and using JDBC from Ruby is a bit awkward;
we're not really ready to use JRuby.

Alexander.

Re: Slightly OT.

From
Jeff Davis
Date:
On Sat, 2007-06-02 at 01:44 +0200, Alexander Staubo wrote:
> On 6/2/07, Jeff Davis <pgsql@j-davis.com> wrote:
> > Here is some work going on that looks like what you want:
> >
> > http://archives.postgresql.org/pgsql-hackers/2007-03/msg00050.php
>
> I had no idea someone was working on WAL-log-based replication; I saw
> the TODO entry a while ago, but I missed the thread. I think WAL
> replication is a beautiful idea, so I'll gladly throw my support
> behind this. Thanks for the pointer.
>

I think that project would add some great functionality to postgres.

> > You might also seriously consider PgPool-II.
>
> pgpool-II seems like a decent idea. I'm not sure if the partitioning
> can support referential integrity though -- would they have to be
> declared as CHECK constraints that used dblink()?

You shouldn't use a volatile function in a check constraint. Use a
trigger instead, but even that is unlikely to work for enforcing
constraints correctly.

In general, for partitioning, you have to make some sacrifices. It's
very challenging (and/or expensive) to ensure uniqueness across
partitions.

> Also, it doesn't seem capable of planning a query intelligently, which
> means that a query such as "select * from foo where id = 123" is going
> to be aggregated across all nodes even though only one node has the
> partition covering the id range [0, 1000], say.

Take a closer look. There is quite a lot of intelligence in there. The
query you mention will be rewritten to send simple WHERE clauses to the
underlying partitions. That, combined with constraint exclusion (CE),
will ensure that the only partitions scanned are those not excluded by
the partition's predicate (i.e. it will only scan one partition if it's
partitioned based on "id"). Actually, I'm not sure whether it relies on
CE or not, but the point is that it won't scan all the partitions.

Also, this means it could use an index scan on that underlying
partition, which is crucial.
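For illustration, the routing described above amounts to something like
this toy sketch (the partition map here is made up; pgpool-II derives
the real routing from its partitioning rules and the partitions' CHECK
predicates):

```python
# Toy sketch of range-based partition routing: a query carrying
# "WHERE id = <n>" is sent only to the partition whose range covers n,
# instead of being broadcast to every node.

import re

PARTITIONS = [                       # (node, lower bound incl., upper excl.)
    ("node0", 0, 1000),
    ("node1", 1000, 2000),
    ("node2", 2000, 3000),
]

def route(query):
    """Return the list of nodes the query must be sent to."""
    m = re.search(r"\bid\s*=\s*(\d+)", query)
    if not m:                        # no usable predicate: scan everything
        return [name for name, _, _ in PARTITIONS]
    n = int(m.group(1))
    return [name for name, lo, hi in PARTITIONS if lo <= n < hi]
```

So "select * from foo where id = 123" reaches only the node holding the
[0, 1000) range, and that node is then free to use an index scan.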

Regards,
    Jeff Davis


Re: multimaster

From
"Alexander Staubo"
Date:
On 6/1/07, Chris Browne <cbbrowne@acm.org> wrote:
> There would be *some* scalability gains to be had, but the primary
> reason for looking for multimaster replication is that you need high
> availability so badly that you are willing to give up performance to
> get it.

...dependent on some specific definition of "multimaster". I note that
the MySQL people happily use "multimaster replication" to mean a
master/slave setup where masters are also slaves, and conflicts are
supposedly handled by assuming that the user always assigns unique IDs
to new rows. That's not necessarily my definition.

> > As it stands today, horizontally partitioning a database into multiple
> > separate "shards" is incredibly invasive on the application
> > architecture, and typically relies on brittle and non-obvious hacks
> > such as configuring sequence generators with staggered starting
> > numbers, omitting referential integrity constraints, sacrificing
> > transactional semantics, and moving query aggregation into the app
> > level. On top of this, dumb caches such as Memcached are typically
> > layered to avoid hitting the database in the first place.
>
> Question: In what way would you expect an attempt to do
> mostly-trying-to-be-transparent multimaster replication to help with
> these issues you're bringing up?

I don't expect anything. What I said was: "I think we could be more
productive by rephrasing the question 'how/when we can implement
multimaster replication?' as 'how/when can we implement horizontal
scaling?'". Which means we're in agreement when you say:

> Partitioning isn't multimaster replication; it's something worthy of
> having a discussion independent of anything about MMR.

Alexander.

Re: One last Slony question (was Re: Slightly OT.)

From
Ron Johnson
Date:
On 06/01/07 19:17, Joshua D. Drake wrote:
> Ron Johnson wrote:
>> On 06/01/07 18:35, Joshua D. Drake wrote:
>>>>
>>>> Since DDL is infrequent, is that bottleneck an acceptable trade-off?
>>>
>>> Define infrequent? I have customers that do it every day in prod.
>>> They do it willingly and refuse to change that habit.
>>
>> Even 2 or 3 ALTER TABLE or CREATE INDEX or CREATE TABLE statements per
>> day is a drop in the bucket compared to the number of I/U/D
>> statements, no?
>
> True.

So Alexander Staubo's idea of synchronous DDL replication via 2PC
has some merit?

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


Partitioning (was Re: Slightly OT.)

From
Ron Johnson
Date:
On 06/01/07 19:29, Jeff Davis wrote:
[snip]
> You shouldn't use a volatile function in a check constraint. Use a
> trigger instead, but even that is unlikely to work for enforcing
> constraints correctly.
>
> In general, for partitioning, you have to make some sacrifices. It's
> very challenging (and/or expensive) to ensure uniqueness across
> partitions.

Are partitioned databases the same as federated databases?

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


Re: Continuous PITR (was Re: multimaster)

From
Greg Smith
Date:
On Fri, 1 Jun 2007, Ron Johnson wrote:

> How difficult would it be to modify the process (the postmaster?) that writes
> the xlogs(?) to tee them to a listening process across the cloud on the DR
> machine, which then applies them to the DR database?

On an 8.2 server, you can practically do this right now by aggressively
using the pg_xlogfile_name_offset() function and piping the data over to
somewhere else.  See section 23.4.4 of
http://www.postgresql.org/docs/8.2/static/warm-standby.html

It's not quite putting a "tee" in the engine itself, but there's not a lot
of practical difference if you're copying over the data only a moment after
it appears with the offset function instead; it's just a different set of
efficiency trade-offs (more polling the way it's implemented, but the base
engine never gets held up waiting for network I/O to the DR box).

P.S. it's not "the cloud" anymore, it's "the tubes".

> This then begs the question: are CREATE|ALTER TABLESPACE commands stored in
> the xlogs?

See section 23.3.5 of
http://www.postgresql.org/docs/8.2/static/continuous-archiving.html for
information about what goes into the WAL when you do that.  You'd need to
implement a process that made sure to sync changes in the underlying
filesystem before modifying a tablespace.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: multimaster

From
PFC
Date:
> Have you looked at raidb?  http://c-jdbc.objectweb.org.  Narrow niche,
> but if it happens to be the one you are in, then it's an option.  I took
> a quick look at the user's page, and both of them were using PostgreSQL.

    I just love those Java guys.
    The world starts and ends with Java.
    How do you execute a psql script with that thing ? How do you actually
use any non-java stuff with it ?
    The same features could be implemented in a connection pool like
pgpool2...



Re: multimaster

From
Martijn van Oosterhout
Date:
On Sat, Jun 02, 2007 at 02:27:06AM +0200, Alexander Staubo wrote:
> >What you are basically saying below is... web 2.0 developers such as
> >rails developers have so fundamentally broken the way it is supposed to
> >be done, we should too...
>
> I don't know if I said that, but I would love to hear how they have
> broken it, and what you propose the solution to be.

I don't know if it's a general problem, but I've been involved in a
project using Rails, and it appears to have its own way of declaring
the database. It presumes to handle referential integrity and
uniqueness in the application code (!).

Lo and behold, there are now some uniqueness violations and no-one knows
why. There seems to be a general unwillingness to let the database
check this because it's "slow".

My proposal is: let databases do what they're good at: checking
uniqueness, referential integrity and constraints, and let the user
code deal with the actual work.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


Re: multimaster

From
Rodrigo Gonzalez
Date:
Alexander Staubo wrote:
> On 6/2/07, Guy Rouillier <guyr-ml1@burntmail.com> wrote:
>> Have you looked at raidb?  http://c-jdbc.objectweb.org.  Narrow niche,
>> but if it happens to be the one you are in, then it's an option.  I took
>> a quick look at the user's page, and both of them were using PostgreSQL.
>
> Looked at it briefly. C-JDBC is called Sequoia now. It's probably
> pretty good, but it's Java, and using JDBC from Ruby is a bit awkward;
> we're not really ready to use JRuby.
>
> Alexander.
>
>
Check Carob; it includes an ODBC driver to connect to Sequoia.

Re: multimaster

From
"Alexander Staubo"
Date:
On 6/2/07, Martijn van Oosterhout <kleptog@svana.org> wrote:
> I don't know if it's a general problem, but I've been involved in a
> project using Rails, and it appears to have its own way of declaring
> the database. It presumes to handle referential integrity and
> uniqueness in the application code (!).

I think you've been misled. True, Rails/ActiveRecord does bear the
signs of having been designed for MySQL/MyISAM, which has neither
transactions nor referential integrity, but this does not mean that
Rails does not support these constructs, or that Rails users don't use
them. I value my data integrity, so all my relations have RI, unique
constraints, null constraints, etc. as in any well-designed schema.

Alexander.

Re: One last Slony question (was Re: Slightly OT.)

From
Andrew Sullivan
Date:
On Fri, Jun 01, 2007 at 06:15:40PM -0500, Ron Johnson wrote:
>
> Since DDL is infrequent, is that bottleneck an acceptable trade-off?

I don't know.  We'd have to do the analysis.  But it could be a
problem.  Look at it this way: if you have a replica that is, for
instance, _always_ 30 minutes behind, as a sort of poor-person's
fast-recovery PITR, then you lose that functionality if you have to
perform DDL on the replica at the same time as on the origin, because
you have to catch up first.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
"The year's penultimate month" is not in truth a good way of saying
November.
        --H.W. Fowler

Re: Slightly OT.

From
Andrew Sullivan
Date:
On Sat, Jun 02, 2007 at 01:30:53AM +0200, Alexander Staubo wrote:

> There needs to be a point of synchronization when a DDL transaction
> appears that blocks further write transactions from running. As far as
> I can tell, the slaves themselves can continue to receive pending
> events, but perhaps not.

In order to do it automatically, you have to lock everyone, get all
the events through, and then perform the DDL, and then come out of
lock.  Otherwise, what happens when you do DROP COLUMN?  If it goes
through ahead of data that ought to go into that column, you have
just broken your cluster.  I suppose you could figure out a way to
work around this, but pretty soon you are building an artificial
intelligence expert system with event-predicting capabilities.  Such
systems are not well known for their simplicity and ease of
maintenance.
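In outline, the lock-drain-apply discipline looks like this toy sketch
(the event representation is made up, not the actual Slony event
format):

```python
# Toy sketch of "lock, drain, then apply DDL": queued data events that
# precede the DDL must reach the replica first, otherwise e.g. a DROP
# COLUMN would destroy the column that queued INSERTs still target.

from collections import deque

class Replica:
    def __init__(self):
        self.columns = {"id", "payload"}
        self.rows = []

    def apply(self, event):
        kind, arg = event
        if kind == "insert":
            if not arg.keys() <= self.columns:
                raise RuntimeError("insert targets a missing column")
            self.rows.append(arg)
        elif kind == "drop_column":
            self.columns.discard(arg)

def sync_point(queue, replica, ddl_event):
    """Drain all pending data events, then apply the DDL."""
    while queue:
        replica.apply(queue.popleft())
    replica.apply(ddl_event)
```

Reverse the order (DDL first, queued inserts after) and the replica
breaks, which is exactly the DROP COLUMN hazard described above.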

> Last I checked, nobody was actually terribly *happy* about having to
> pipe schema changes through slonik.

Nobody would suggest it's the friendliest arrangement.  But this is a
field where the details really count, and therefore proposals to make
it more friendly have to account for how that friendliness in a lot
of cases doesn't lead to complete breakage in others.  (I had to be
exposed to the multimaster MS SQL stuff, years ago, and I have to say
that it was great when it worked; but when things went south, boy did
your life suck.  Whether it is better now, I don't know.)

A
--
Andrew Sullivan  | ajs@crankycanuck.ca
Unfortunately reformatting the Internet is a little more painful
than reformatting your hard drive when it gets out of whack.
        --Scott Morris

Re: multimaster

From
Guy Rouillier
Date:
PFC wrote:
>
>> Have you looked at raidb?  http://c-jdbc.objectweb.org.  Narrow niche,
>> but if it happens to be the one you are in, then it's an option.  I
>> took a quick look at the user's page, and both of them were using
>> PostgreSQL.
>
>     I just love those Java guys.
>     The world starts and ends with Java.
>     How do you execute a psql script with that thing ? How do you
> actually use any non-java stuff with it ?

That's why I said "Narrow niche, but if it happens to be the one you're
in ...".  I do find your response rather peculiar.  The objectweb guys
saw a need and filled it in the domain they work in - Java.  Should the
Java folks complain because Perl scripts on CPAN are not accessible from
Java?

--
Guy Rouillier

Re: multimaster

From
"Alexander Staubo"
Date:
On 6/3/07, Ian Harding <harding.ian@gmail.com> wrote:
> Rails propaganda explicitly proposes not "repeating yourself" and
[...]
> The creator of RoR explicitly discourages use of RI, rules, triggers,
> etc in the database as unnecessary.  His disciples take this as
> gospel.

The creator of Rails is not the only developer working on Rails. There
are Rails developers who disagree with him on these issues, and "his
disciples" does not equate to "the community". On the other hand, I admit
that this mindset has a negative impact on the development of Rails as
a whole.

I consider myself a moderate pragmatist, and I think both sides are
slightly wrong; the database side not pragmatic enough, and the Rails
side pragmatic at the cost of moving too much database logic into the
app.

For example, part of the point of having validations declared on the
model is so that you can raise user-friendly errors (and pipe them
through gettext for localization) such as "Your password must be at
least 4 characters long and contain only letters and digits".
Databases don't support this, and so applications end up having to
duplicate data-validation logic in order to find out what kind of user
input is invalid.

There might be hoops you could jump through to reduce the duplication.
You could, perhaps, introspect the schema and see that the password
column has a "check" constraint with a certain expression (eg.,
"password ~ '^\w+$'). On insertion failure, you use the expression
string to generate a select -- eg., "select password ~ '^\w+$' from
(select 'foobar'::text as password) as t" -- on every column you have
to see what failed. I don't think PostgreSQL had full support for ANSI
SQLSTATE column context information yet, but even if it had, I think
you would not get information about *all* failing columns, only the
first one (anyone know?), so you would still needt run the values
through the database with a select.

At this point you don't have an error message, but you could store
these in a separate table (table_name, column_name, message) or
perhaps (table_name, constraint_name, message) and look them up on
failure.
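Such a lookup could be sketched in plain Ruby like so (the catalog is
hard-coded here for illustration; in practice it would be loaded from the
(table_name, constraint_name, message) table, and all names are invented):

```ruby
# Hypothetical message catalog, normally loaded from a
# (table_name, constraint_name, message) table in the database.
CONSTRAINT_MESSAGES = {
  ["users", "users_password_check"] =>
    "Your password must contain only letters and digits",
  ["users", "users_email_key"] =>
    "That e-mail address is already registered",
}

# Translate a failed constraint into a user-friendly message,
# falling back to a generic one when no entry exists.
def friendly_message(table, constraint)
  CONSTRAINT_MESSAGES.fetch([table, constraint],
    "Invalid value (constraint #{constraint} on #{table})")
end
```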

Another option might be to use triggers that call "raise" -- which may
be acceptable for "check" constraints, but breaks the idiom for the
other types of constraints; at best you will be repeating yourself.
Another idea: I believe SQLSTATE (as implemented by PostgreSQL)
currently lacks context information about which columns failed a
constraint, but if you had this, you could correlate each violation
with the offending column directly.

None of this is terribly idiomatic, and involves a bunch of glue
between application and database which needs to be incorporated into a
database layer.

This may be a case for letting constraints have an optional
description; this way the schema would also be self-documenting, eg.:

  create table foo (
    id serial,
    name text check (name ~ '^(\w|\s)+$')
      or raise error 'Name must contain letters, digits and spaces only'
  );

This still isn't enough for the app side -- if multiple columns fail
to validate, the app needs to know about them all, not just the first
one:

# create table persons (name text check (name != '') not null unique,
age int check (age > 0));
# insert into persons (name, age) values ('', 0);
ERROR:  new row for relation "persons" violates check constraint
"persons_name_check"

> I have used rails for a couple projects and I repeated myself
> religiously in database constraints.  This revealed a couple
> disturbing properties of rails that made me not want to use it,
> notably, assuming an empty input box equates to an empty string vice
> null for text data types, even if the table is defined as accepting
> nulls.

An empty string is not null! Null means the value is missing, which is
clearly not the case here. I would say Rails is exactly in the right
here. When an HTML form is posted, empty input boxes are declared as
empty strings, which is what the user entered. The problem is not with
Rails/ActiveRecord but with your form handling. If you want empty
boxes to become null, add some client-side JavaScript logic that sets
the "disabled" attribute on empty input elements before form is
submitted; this will prevent the client from sending the value.
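Alternatively, if you really do want empty inputs stored as NULL, the
normalization can live on the server side; a sketch of the idea in plain
Ruby (in a Rails app this would typically run over the record's attributes
in a before_save callback -- the function here is just an illustration):

```ruby
# Convert empty (or whitespace-only) strings to nil so that
# nullable text columns receive NULL instead of ''.
def blank_to_nil(attributes)
  attributes.transform_values do |value|
    value.is_a?(String) && value.strip.empty? ? nil : value
  end
end
```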

Alexander.

Re: multimaster

From
"Ian Harding"
Date:
On 6/2/07, Alexander Staubo <alex@purefiction.net> wrote:
> On 6/2/07, Martijn van Oosterhout <kleptog@svana.org> wrote:
> > I don't know if it's a general problem, but I've been involved in a
> > project using rails and it appears to have its own way of declaring the
> > database. It presumes to handle referential integrity and uniqueness in
> > the application code (!).
>
> I think you've been misled. True, Rails/ActiveRecord does bear the
> signs of having been designed for MySQL/MyISAM, which has neither
> transactions nor referential integrity, but this does not mean that
> Rails does not support these constructs, or that Rails users don't use
> them. I value my data integrity, so all my relations have RI, unique
> constraints, null constraints, etc. as in any well-designed schema.
>

Rails propaganda explicitly proposes not "repeating yourself" and
since the RI constraints are defined in the rails models, and they are
enforced in the framework with a graceful feedback mechanism for
users, they implicitly denigrate defining constraints in the DB as
"Repeating yourself."

The creator of RoR explicitly discourages use of RI, rules, triggers,
etc in the database as unnecessary.  His disciples take this as
gospel.

I have used rails for a couple projects and I repeated myself
religiously in database constraints.  This revealed a couple
disturbing properties of rails that made me not want to use it,
notably, assuming an empty input box equates to an empty string vice
null for text data types, even if the table is defined as accepting
nulls.

- Ian

Re: multimaster

From
Martijn van Oosterhout
Date:
On Sun, Jun 03, 2007 at 07:47:04PM +0200, Alexander Staubo wrote:
> The creator of Rails is not the only developer working on Rails. There
> are Rails developers who disagree with him on these issues, and "his
> disciples" does not equate to "the community". On the other hand, I admit
> that this mindset has a negative impact on the development of Rails as
> a whole.

Indeed, it certainly left me with a bad taste in my mouth.

> For example, part of the point of having validations declared on the
> model is so that you can raise user-friendly errors (and pipe them
> through gettext for localization) such as "Your password must be at
> least 4 characters long and contain only letters and digits".
> Databases don't support this, and so applications end up having to
> duplicate data-validation logic in order to find out what kind of user
> input is invalid.

I think you're confusing validation and integrity constraints. The
example you're giving could be implemented in either the DB or the app.
Personally I'd do it in the app since it's something that doesn't
affect the integrity of the data. If I go in and manually change
someone's password to something not following that rule it's not going
to affect anything.

Integrity constraints are different: if you violate them your data has
serious problem. They are the assumptions of your model upon which
everything depends. We're talking referential integrity and uniqueness.
These things *cannot* be checked reliably in the app, and you shouldn't
try. You assume the constraints are valid and feel confident they are,
because the database has checked them for you.

> This still isn't enough for the app side -- if multiple columns fail
> to validate, the app needs to know about them all, not just the first
> one:

Basically, validation in this case is completely orthogonal to
integrity checks. If the data being validated isn't crucial to the
operation of the app, I wouldn't bother putting it in the database. But
for fundamental integrity constraints, the database is the only place.

I wish you success in your efforts to make rails a little more sane
in this area.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


Re: multimaster

From
"Alexander Staubo"
Date:
On 6/3/07, Martijn van Oosterhout <kleptog@svana.org> wrote:
> > For example, part of the point of having validations declared on the
> > model is so that you can raise user-friendly errors (and pipe them
> > through gettext for localization) such as "Your password must be at
> > least 4 characters long and contain only letters and digits".
> > Databases don't support this, and so applications end up having to
> > duplicate data-validation logic in order to find out what kind of user
> > input is invalid.
>
> I think you're confusing validation and integrity constraints. The
> example you're giving could be implemented in either the DB or the app.

No, I was pointing out that Rails supports uniqueness and referential
integrity, but that it implemented validations as a general construct
in order to (among other things) provide user-friendly messages.

But what I said also applies to uniqueness and foreign key
constraints. Databases, including PostgreSQL, make it hard for an
application to determine what part of the data failed when it did. You
get an error for some arbitrary column, but not all columns; and the
error does not (as far as I know) actually contain the column that
failed.

> Personally I'd do it in the app since it's something that doesn't
> affect the integrity of the data. If I go in and manually change
> someone's password to something not following that rule it's not going
> to affect anything.

I agree with you and I don't; as it stands now, it's too hard to
implement validation in the database alone, for the reasons I stated
earlier. But I would love for it to be possible, so that I can be sure
that not even plain SQL can screw up the data.

Alexander.

Re: SQL Manager 2007 for PostgreSQL released

From
Tino Wildenhain
Date:
L. Berger wrote:
> On May 23, 8:47 am, "EMS Database Management Solutions
> (SQLManager.net)" <sqlmana...@gmail.com> wrote:
>> We, here at EMS Database Management Solutions, are pleased to announce
>> SQL Manager 2007 for PostgreSQL - the new major version of the
>> powerful PostgreSQL administration and development tool!
>>
>> You can download SQL Manager 2007 for PostgreSQL at:http://www.sqlmanager.net/products/postgresql/manager/download
>>
>> You can purchase SQL Manager 2007 for PostgreSQL at:http://www.sqlmanager.net/products/postgresql/manager/buy
>>
>> What's new in SQL Manager 2007 for PostgreSQL?
>
> <snip>
>
>
> Thanks for this, but is there any plan to launch something like this
> for use on Linux admin servers? Something that I could install on a
> server, and perhaps work with a web interface? I would love some
> recommendations.

Maybe I'm a bit old fashioned, but I would never ever consider
graphical frontends for a server. I mean, ssh -L and
a local GUI client work very well here. Even web-based tools
seem dangerous (unless used just for educational purposes).

Regards
Tino

Re: Continuous PITR (was Re: multimaster)

From
Scott Ribe
Date:
> P.S. it's not the "the cloud" anymore, it's "the tubes".

It was always tubes. The cloud was just a convenient simplification for the
technically declined ;-)

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice



Re: multimaster

From
Jeff Davis
Date:
On Sun, 2007-06-03 at 22:54 +0200, Alexander Staubo wrote:
> I agree with you and I don't; as it stands now, it's too hard to
> implement validation in the database alone, for the reasons I stated
> earlier. But I would love for it to be possible, so that I can be sure
> that not even plain SQL can screw up the data.

You're blurring the line between an RDBMS and an application.
Application errors and database errors do not have a one-to-one
mapping, although they do usually overlap.

There are times when one database error maps onto several possible
user-level errors; and when many database errors map onto the same
user-level error; and when one database error does not cause any
user-level error; and when something that is a user-level error might
not have a matching constraint in the database at all. Trying to equate
the two concepts is a bad idea.

The application has much more information about the user and the context
of the error that the database shouldn't have. For instance, the
language that the user speaks might affect the error message. Or, there
may be two possible user interface actions that result in the same
constraint violation. For instance, if you have a two-column unique
constraint, perhaps there is one interface to change one column and
another to change the other. But you might want to return a different
error to the user that makes sense in the context of which value they
tried to change.
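A sketch of what that translation might look like on the application side
(pure Ruby; the constraint name and UI contexts are invented):

```ruby
# One database error, two user-level messages, chosen by the
# UI context in which the violation occurred.
USER_MESSAGES = {
  edit_username: "That user name is already taken in this domain",
  edit_domain:   "The new domain already has a user with that name",
}

def user_error_for(constraint, context)
  return "Something went wrong" unless constraint == "users_domain_name_key"
  USER_MESSAGES.fetch(context, "Duplicate user/domain combination")
end
```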

A database error doesn't even always need to be propagated back to the
user. If it always did, there would be no need for SAVEPOINTs and
nobody would use ISOLATION LEVEL SERIALIZABLE (not directly related to
constraints, but it can cause an error just the same).

Some user errors don't have a corresponding database constraint at all.
For instance, how about a "re-type your password here" field? That
should cause an error if it doesn't match the "password" field, but the
database would have no matching constraint.

Regards,
    Jeff Davis


Re: multimaster

From
"Alexander Staubo"
Date:
On 6/4/07, Jeff Davis <pgsql@j-davis.com> wrote:
> On Sun, 2007-06-03 at 22:54 +0200, Alexander Staubo wrote:
> > I agree with you and I don't; as it stands now, it's too hard to
> > implement validation in the database alone, for the reasons I stated
> > earlier. But I would love for it to be possible, so that I can be sure
> > that not even plain SQL can screw up the data.
>
> You're blurring the line between an RDBMS and an application.
> Applications errors and database errors do not have a one-to-one
> mapping, although they do usually overlap.

True, and when they overlap you tend to want to describe the
validation errors in one place, not two -- either the database or the
app, not both. Relational databases have traditionally argued that
these rules should be in the former, so that there's one layer through
which every single change has to go.

> There are times when one database error maps onto several possible
> user-level errors; and when many database errors map onto the same
> user-level error; and when one database error does not cause any
> user-level error; and when something that is a user-level error might
> not have a matching constraint in the database at all. Trying to equate
> the two concepts is a bad idea.

I agree. In my experience, however, the best kind of data model is the
one that is immediately mappable to user-level concepts -- to human
concepts. A "user" relation has attributes like "name", "birth_date",
etc. If you manage to keep the model flat and friendly enough, you can
map the attributes to forms and translate attribute-level errors
directly to form error messages.

In the cases where a user-level attribute is represented by a set of
columns, or a referenced relation, or similar, you provide simple
shims that translate between them. For example, you probably want to
store date-time attributes as a single "timestamp with timezone"
column, but offer two fields to the user, one for the date and one for
the time. With Rails this kind of shim is simple:

class User < ActiveRecord::Base
  ...
  validates_each :human_birth_date do |record, attr, value|
    record.errors.add(attr, "Bad date") unless MyDateParser.valid?(value)
  end

  def human_birth_date
    birth_datetime.strftime("%Y-%m-%d")
  end

  def human_birth_date=(date)
    year, month, day = MyDateParser.parse(date)
    self.birth_datetime = Time.local(year, month, day,
      birth_datetime.hour, birth_datetime.minute)
  end
end

With a well-designed, normalized schema, mapping relations and their
attributes to user input is very easy. I would argue that if mapping
is a problem, your schema is probably to blame.

> The application has much more information about the user and the context
> of the error that the database shouldn't have. For instance, the
> language that the user speaks might affect the error message.

Localization is easily accomplished by piping the error message through gettext.

> Some user errors don't have a corresponding database constraint at all.
> For instance, how about a "re-type your password here" field? That
> should cause an error if it doesn't match the "password" field, but the
> database would have no matching constraint.

That's a user-interface detail, and not a data model detail; a
re-typed password has no database counterpart. I am speaking purely
about invariant constraints on the data itself.

Alexander.

Re: multimaster

From
Jeff Davis
Date:
On Mon, 2007-06-04 at 00:51 +0200, Alexander Staubo wrote:
> True, and when they overlap you tend to want to describe the
> validation errors in one place, not two -- either the database or the
> app, not both. Relational databases have traditionally argued that
> these rules should be in the former, so that there's one layer through
> which every single change has to go.
>

I disagree here. You often _do_ want to describe some types of errors
twice. You check the same thing in different ways at different points in
the code, and that isolates errors and assures developers that certain
assumptions are safe.

In the database world, it's particularly important to use declarative
constraints. If developer A inserts bad data and developer B uses that
bad data, it could compound the problem and yet remain invisible until
the problem is almost impossible to debug. Constraints assure the
developers that they are starting with some known state.

Applications should check for inconsistencies when it makes sense, as
well. Every important API that I can think of checks the input, and
reports some kind of useful error when the assumptions are violated.
Every system call has all sorts of useful error codes. For example,
read() can return EBADF. Nobody thinks "Hey, I'll send a random integer
for the file descriptor", but I'd be willing to bet that the error
condition has been reached by accident before, and probably saved
someone a lot of time versus just filling the target buffer with random
bytes and returning success.
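The same defensive style can be sketched with a toy API in Ruby (an
invented function, not read() itself): validate the input at the boundary
and fail loudly instead of returning garbage.

```ruby
# A toy lookup API that, like read(), checks its input instead of
# trusting the caller; the raised error is the EBADF analogue.
def fetch_record(records, index)
  unless index.is_a?(Integer) && index >= 0 && index < records.size
    raise ArgumentError, "bad index #{index.inspect}"
  end
  records[index]
end
```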

I would argue it's more important in a database, because the error
conditions can persist for a longer period of time and cause more damage
in the process, but the idea is the same.

> I agree. In my experience, however, the best kind of data model is the
> one that is immediately mappable to user-level concepts -- to human
> concepts. A "user" relation has attributes like "name", "birth_date",
> etc. If you manage to keep the model flat and friendly enough, you can
> map the attributes to forms and translate attribute-level errors
> directly to form error messages.
>

I think you're oversimplifying. What you say works when user input is a
separate, contained, isolated chunk of data. In that case, any error is
only related to the self-consistency of the input, and can easily be
mapped back to a user-level error.

However, it breaks down when you have constraints on the
interrelationships between pieces of data. These interrelationships are
what can be broken from multiple points in the application code, and
there is no way to map backwards from the constraint violation to a
specific user error. Hence, the application must translate.

Try to imagine some of the complexities in a scheduling system, and what
kind of constraints that might involve. Then, think about how some of
the same constraints might be broken in very different ways. Time
conflicts could come about either by participants overscheduling
themselves, or by the event itself shifting in time such that some
participants are overscheduled. Perhaps someone tries to sign up for an
event that's already full, or perhaps the venue moves to a new location
with a lower capacity. I can't think of any way to map backwards from
the constraint violation to the user level error.

You could probably imagine similar problems with an inventory system.

> > The application has much more information about the user and the context
> > of the error that the database shouldn't have. For instance, the
> > language that the user speaks might affect the error message.
>
> Localization is easily accomplished by piping the error message through gettext.
>

And what about the two-column unique index that can be violated from two
different aspects of the UI? You only get one database error, but you
really should present two different errors to the user.

Any time that a constraint can be violated through two completely
different paths, your one-to-one constraint-to-application-error breaks
down.

The root of the issue is that the database knows that an inconsistent
state has been reached, it does not know _how_, nor should it. The how
might change as new code is added or perhaps as new bugs are introduced.
It is very important that constraints in an RDBMS be declarative: you
don't need to prove that a column is always unique, you just look, and
it says it is unique. You can add procedural code to an RDBMS, but that
way you can end up making your database your application.

User-level errors are heavily dependent on _how_ the error occurred. The
database doesn't know this, so the constraint violation message
shouldn't presume to know how it happened.

I'll use the analogy to read() again: who passes EBADF back to the user
directly? Does that mean it's a bad API? No, it just means it had no
idea why you passed it a bad file descriptor, but it knows it's bad, and
it tells the caller. Similarly with exceptions in java/ruby/python: you
shouldn't pass those exceptions back to the user.

Regards,
    Jeff Davis


Re: multimaster

From
Tino Wildenhain
Date:
Alexander Staubo schrieb:
> On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
>> These are all different solutions to different problems, so it's not
>> surprising that they look different.  This was the reason I asked,
>> "What is the problem you are trying to solve?"
>
> You mean aside from the obvious one, scalability?
>
> The databases is becoming a bottleneck for a lot of so-called "Web
> 2.0" apps which use a shared-nothing architecture (such as Rails,
> Django or PHP) in conjunction with a database. Lots of ad-hoc database
> queries that come not just from web hits but also from somewhat
> awkwardly fitting an object model onto a relational database.

...

> the single server, but I would hope that there would, at some point,
> appear a solution that could enable a database to scale horizontally
> with minimal impact on the application. In light of this need, I think
> we could be more productive by rephrasing the question "how/when we
> can implement multimaster replication?" as "how/when can we implement
> horizontal scaling?".
>
> As it stands today, horizontally partitioning a database into multiple
> separate "shards" is incredibly invasive on the application
> architecture, and typically relies on brittle and non-obvious hacks
> such as configuring sequence generators with staggered starting
> numbers, omitting referential integrity constraints, sacrificing
> transactional semantics, and moving query aggregation into the app
> level. On top of this, dumb caches such as Memcached are typically

Did you have a look at BizgresMPP?

Especially for your shared-nothing approach it seems to be a better
solution than just replicating everything.

Regards
Tino

Re: multimaster

From
"Alexander Staubo"
Date:
On 6/4/07, Tino Wildenhain <tino@wildenhain.de> wrote:
> Did you have a look at BizgresMPP?
>
> Especially for your shared-nothing approach it seems to be a better
> solution than just replicating everything.

I had completely forgotten about that one. Bizgres.org seems down at
the moment, but looking at their whitepaper, the architecture looks
similar to that of pgpool-II; from what I can see, you connect through
a proxy that transparently partitions data across multiple PostgreSQL
database instances, and then queries them in parallel and merges the
query results. Looks promising.

Alexander.

Re: Partitioning (was Re: Slightly OT.)

From
Jeff Davis
Date:
On Fri, 2007-06-01 at 22:13 -0500, Ron Johnson wrote:
> On 06/01/07 19:29, Jeff Davis wrote:
> [snip]
> > You shouldn't use a volatile function in a check constraint. Use a
> > trigger instead, but even that is unlikely to work for enforcing
> > constraints correctly.
> >
> > In general, for partitioning, you have to make some sacrifices. It's
> > very challenging (and/or expensive) to ensure uniqueness across
> > partitions.
>
> Are partitioned databases the same as federated databases?
>

I think that usually people refer to a table that is split to be
partitioned (whether across servers or within a single server). I think
federated databases are where various parts of the database are split
across servers, but tables may be intact.

That's my own understanding of the terminology.

Regards,
    Jeff Davis


Re: multimaster

From
Karsten Hilbert
Date:
On Sun, Jun 03, 2007 at 07:47:04PM +0200, Alexander Staubo wrote:

> For example, part of the point of having validations declared on the
> model is so that you can raise user-friendly errors (and pipe them
> through gettext for localization) such as "Your password must be at
> least 4 characters long and contain only letters and digits".

If anyone is interested, we have sort of re-implemented gettext in SQL:

    http://cvs.savannah.gnu.org/viewvc/gnumed/gnumed/server/sql/?root=gnumed

(see the gmI18n-*.sql stuff)

In essence it enables you to write queries like so

    select pk, name, _(name) as l10n_name from states

which will give you a localized name for states.name in l10n_name.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346