Re: embedded/"serverless" (Re: serverless postgresql) - Mailing list pgsql-general

From Steve Atkins
Subject Re: embedded/"serverless" (Re: serverless postgresql)
Date
Msg-id 20040122172127.GB10078@gp.word-to-the-wise.com
Whole thread Raw
In response to Re: embedded/"serverless" (Re: serverless postgresql)  (Jeff Bowden <jlb@houseofdistraction.com>)
List pgsql-general
On Thu, Jan 22, 2004 at 09:03:13AM -0800, Jeff Bowden wrote:
> >>Why can't you just run a postgres instance for this?  What is magic
> >>about "embedded" for this sort of application?  Sounds like a clever
> >>wrapper script is all that's necessary for something like that, no?
> >
> >
> >The "magic" seems to be that the application appears to be managing
> >the database by its own, without the help of any separate database
> >product. I've seen commercial product that uses embedded Firebird and
> >changes the default extension of the database file from *.fdb into
> >*.dat or something.
> >
> >Of course, with the current cluster/database directory layout, it's
> >still easy to spot PostgreSQL footprints all over the place
> >(pg_hba.conf, pg_xlog/, PG_VERSION, etc).
> >
>
> It's not so important to hide the fact that postgres is being used as it
> is to spare the user from having to know anything about general database
> administration when the all the app really needs from the db is for it
> to be a more powerful way of storing and retreiving information from
> user-owned files.

Installation and upgrades are another issue.

To 'install' a typical embedded database requires no action at all -
all the infrastructure is embedded within the application, or at most
a single external dynamica library.

I deploy and support applications that use PostgreSQL as a backend
database and a huge amount of support effort is consumed by PostgreSQL
issues. (We're just upgrading from 7.2.* to 7.4.1, and walking even
very unix literate users through the pain of a dump, upgrade, restore
process is very painful).

Other differences between an embedded database and a typical RDBMS is
that the embedded database is likely to be between twice as fast and
ten times as fast, depending on the details of the application. One
tradeoff, of course, is that the embedded database is likely to be
single-user, or at most single-writer, multiple-reader. The other
advantage that the embedded DB likely has is that it probably accepts
queries and returns results as language native types via direct
calls, avoiding all the overhead of translating queries into SQL
and translating the result sets back again.

Another difference between the two is that an RDBMS like PostgreSQL
is that PostgreSQL will make more requirements of the underlying OS
(for instance shared-memory and semaphores) - meaning that the kernel
may need to be reconfigured and the machine rebooted.

If what you need is a single-user database then, nice as it is to
have a full RDBMS, you really don't need all that infrastructure,
and the users will (likely) be far happier without it. If it'd
be nice to have SQL support, then there are embedded, single-user,
databases with some level of SQL support.

All of this explains why an embedded PostgreSQL isn't a great idea. It
being a true multi-user database means that even if you went though
all the work needed to turn it into an embedded database you wouldn't
get most of the advantages.

Short term, if you want embedded SQL, use something like SQLite that's
designed to work that way. If you want to share code between an
embedded DB and PostgreSQL then use a database agnostic API (such as
DBI or libdbi) and restrict yourself to fairly vanilla SQL. If you
must have PostgreSQL used by a non DB aware user then you'll have to
do what I've done, which is to build some automated maintenance code
around PostgreSQL to handle installation and day-to-day maintenance
and monitoring automagically, or patch something similar together
based on the vacuum daemon and so on.

Now, while I think that an embedded fork of PostgreSQL is completely
missing the point I do think that a low maintenance fork or
configuration option would be a very useful feature. I'd love to
be able to ship an application that would

  o Have a private installation of PostgreSQL

  o That would run semi-persistently - if the DB isn't running, the
    application will transparently start it, and if the DB is idle
    for some length of time it gracefully shuts down

  o Is zero-maintenance - all vacuuming, analysing etc is handled
    automatically. So are database version upgrades.

  o That runs under the permissions of the user running the application

  o And that could, by tweaking an application configuration variable
    could swap out the private PostgreSQL installation and instead
    access a standard installation

But (apart from the easy DB version upgrade bit) I'd far rather have
point-in-time or incremental backups, easy replication, richer SQL
support, easier reporting and easier to use replication. So, much as
I'd like it, a low maintenance build is a long way down my list of
priorities.

Cheers,
  Steve



pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Lost plpgsql function
Next
From: Tom Lane
Date:
Subject: Re: Lost plpgsql function