Thread: Multiple logical databases

Multiple logical databases

From

"Mark Woodward"

Date:

02 February 2006, 10:59:03

I am working on an issue that I deal with a lot, there is of course a
standard answer, but maybe it is something to think about for PostgreSQL
9.0 or something. I think I finally understand what I have been fighting
for a number of years. When I have been grousing about postgresql
configuration, this has been what I have been fighting.

One of the problems with the current PostgreSQL design is that all the
databases operated by one postmaster server process are interlinked at
some core level. They all share the same system tables. If one database
becomes corrupt because of disk or something, the whole cluster is
affected. If one db is REALLY REALLY huge and doesn't change, and a few
others are small and change often, pg_dumpall will spend most of its time
dumping the unchanging data.

Now, the answer, obviously, is to create multiple postgresql database
clusters and run postmaster for each logical group of databases, right?
That really is a fine idea, but....

Say, in pgsql, I do this: "\c newdb" It will only find the database that I
have in that logical group. If another postmaster is running, obviously,
psql doesn't know anything about it.

From the DB admin perspective, maybe there should be some heirarchical
structure to this. What if there were a program, maybe a special parent
"postmaster" process, I don't know, that started a list of child
postmasters based on some site config? The parent postmaster would hold
all the configuration parameters of the child postmaster processes, so
there would only be on postgresql.conf.

This also answers "how do we get postgresql options in a database,"
because the parent postmaster only needs to bootstrap the others, it can
be configured to run lean and mean, and the "real" settings can be
inspected and changed at will. A trigger will send a HUP to child
postmasters when their settings change. The parent postmaster only needs
one connection for each child and one admin, right?

Does anyone see this as useful?

Re: Multiple logical databases

From

Andrew Dunstan

Date:

02 February 2006, 11:13:49

On Thu, 2006-02-02 at 10:23 -0500, Mark Woodward wrote:
> If one db is REALLY REALLY huge and doesn't change, and a few
> others are small and change often, pg_dumpall will spend most of its time
> dumping the unchanging data.
> 

My usual backup strategy does pg_dumpall -g to get the (tiny) global
data, and then pg_dump for each individual database. Quite apart from
anything else I prefer to have custom format dumps anyway, but I think
this should meet your need for less frequent dumping of some constant
database.

cheers

andrew

Re: Multiple logical databases

From

Tom Lane

Date:

02 February 2006, 11:19:05

"Mark Woodward" <pgsql@mohawksoft.com> writes:
> One of the problems with the current PostgreSQL design is that all the
> databases operated by one postmaster server process are interlinked at
> some core level. They all share the same system tables. If one database
> becomes corrupt because of disk or something, the whole cluster is
> affected.

This problem is not as large as you paint it, because most of the system
catalogs are *not* shared.

> Does anyone see this as useful?

No...
        regards, tom lane

Re: Multiple logical databases

From

"Mark Woodward"

Date:

02 February 2006, 11:32:26

> "Mark Woodward" <pgsql@mohawksoft.com> writes:
>> One of the problems with the current PostgreSQL design is that all the
>> databases operated by one postmaster server process are interlinked at
>> some core level. They all share the same system tables. If one database
>> becomes corrupt because of disk or something, the whole cluster is
>> affected.
>
> This problem is not as large as you paint it, because most of the system
> catalogs are *not* shared.
>
>> Does anyone see this as useful?

Seriously? No use at all? You don't see any purpose in controlling and
managing multiple postgresql postmaster processes from one central point?
Sure you don't want to think about this a little?

Re: Multiple logical databases

From

Andreas Pflug

Date:

02 February 2006, 12:12:57

Mark Woodward wrote:
>>"Mark Woodward" <pgsql@mohawksoft.com> writes:
>>
>>>One of the problems with the current PostgreSQL design is that all the
>>>databases operated by one postmaster server process are interlinked at
>>>some core level. They all share the same system tables. If one database
>>>becomes corrupt because of disk or something, the whole cluster is
>>>affected.
>>
>>This problem is not as large as you paint it, because most of the system
>>catalogs are *not* shared.
>>
>>
>>>Does anyone see this as useful?
> 
> 
> Seriously? No use at all? You don't see any purpose in controlling and
> managing multiple postgresql postmaster processes from one central point?

pgAdmin does so. IMHO it's totally sufficient to handle this on a client 
side level.

Regards,
Andreas

Re: Multiple logical databases

From

Alvaro Herrera

Date:

02 February 2006, 12:21:25

Mark Woodward wrote:

> Seriously? No use at all? You don't see any purpose in controlling and
> managing multiple postgresql postmaster processes from one central point?

I'd rather spend effort in fixing the problems that arise from big
databases; for example Hannu's patch for concurrent vacuum attacks one
of the problems that IMHO are important.  More elaborate partitioning
does too.

Anyway, if you're very excited about it, I don't think it's impossible
to code a super-postmaster that would redirect a client to the real
postmaster.  I even think it can be done without modifying the regular
postmaster.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Multiple logical databases

From

Stephan Szabo

Date:

02 February 2006, 13:40:33

On Thu, 2 Feb 2006, Mark Woodward wrote:

> Now, the answer, obviously, is to create multiple postgresql database
> clusters and run postmaster for each logical group of databases, right?
> That really is a fine idea, but....
>
> Say, in pgsql, I do this: "\c newdb" It will only find the database that I
> have in that logical group. If another postmaster is running, obviously,
> psql doesn't know anything about it.

> >From the DB admin perspective, maybe there should be some heirarchical
> structure to this. What if there were a program, maybe a special parent
> "postmaster" process, I don't know, that started a list of child
> postmasters based on some site config? The parent postmaster would hold
> all the configuration parameters of the child postmaster processes, so
> there would only be on postgresql.conf.

>
> This also answers "how do we get postgresql options in a database,"
> because the parent postmaster only needs to bootstrap the others, it can
> be configured to run lean and mean, and the "real" settings can be
> inspected and changed at will. A trigger will send a HUP to child
> postmasters when their settings change. The parent postmaster only needs
> one connection for each child and one admin, right?
>
> Does anyone see this as useful?

Not as described above, no.  Perhaps with a more concrete plan that
actually talks about these things in more details. For example, you posit
the \c thing as an issue, I don't personally agree, but you also don't
address it with a solution.

Re: Multiple logical databases

From

"Mark Woodward"

Date:

02 February 2006, 14:40:31

> On Thu, 2 Feb 2006, Mark Woodward wrote:
>
>> Now, the answer, obviously, is to create multiple postgresql database
>> clusters and run postmaster for each logical group of databases, right?
>> That really is a fine idea, but....
>>
>> Say, in pgsql, I do this: "\c newdb" It will only find the database that
>> I
>> have in that logical group. If another postmaster is running, obviously,
>> psql doesn't know anything about it.
>
>> >From the DB admin perspective, maybe there should be some heirarchical
>> structure to this. What if there were a program, maybe a special parent
>> "postmaster" process, I don't know, that started a list of child
>> postmasters based on some site config? The parent postmaster would hold
>> all the configuration parameters of the child postmaster processes, so
>> there would only be on postgresql.conf.
>
>>
>> This also answers "how do we get postgresql options in a database,"
>> because the parent postmaster only needs to bootstrap the others, it can
>> be configured to run lean and mean, and the "real" settings can be
>> inspected and changed at will. A trigger will send a HUP to child
>> postmasters when their settings change. The parent postmaster only needs
>> one connection for each child and one admin, right?
>>
>> Does anyone see this as useful?
>
> Not as described above, no.  Perhaps with a more concrete plan that
> actually talks about these things in more details. For example, you posit
> the \c thing as an issue, I don't personally agree, but you also don't
> address it with a solution.

While I understand that it is quite a vague suggestion, I guess I was
brainstorming more than detailing an actual set of features.

My issue is this, (and this is NOT a slam on PostgreSQL), I have a number
of physical databases on one machine on ports 5432, 5433, 5434. All
running the same version and in fact, installation of PostgreSQL.

Even though they run on the same machine, run the same version of the
software, and are used by the same applications, they have NO
interoperability. For now, lets just accept that they need to be on
separate physical clusters because some need to be able to started and
stopped while others need to remain running, there are other reasons, but
one reason will suffice for the discussion.

From an administration perspective, a single point of admin would seem
like a logical and valuable objective, no?

Beyond just the admin advanatges, the utilities could be modified to
handle a root server that redirects to child servers. The psql program,
when handling a "\c" command, queries the root server to find the child
server and then connects to that.

libpq could also be modified to handle this without changing the
applications.

The child postmasters will query the root postmaster when a DB is created
and deleted to keep it up to date. Conflicts between two children can be
managed by either some sort of first come first serve or disallow creating
of a duplicate name, or some other method.

So, conn = connect("host=localhost dbname=mydb"); Will connect to the root
server, find the actual server, and then connect to it, completely hiding
the different physical databases, and creating one very large logical
install.

Perhaps this can even be written to include large scale clustering. Who
knows?

Re: Multiple logical databases

From

Martijn van Oosterhout

Date:

02 February 2006, 15:16:55

On Thu, Feb 02, 2006 at 02:05:03PM -0500, Mark Woodward wrote:
> My issue is this, (and this is NOT a slam on PostgreSQL), I have a number
> of physical databases on one machine on ports 5432, 5433, 5434. All
> running the same version and in fact, installation of PostgreSQL.

One way of acheiving this would be to allow the PGHOST and/or PGPORT
variables to be lists and when you connect it tries each combination
until it finds on that works. Maybe not as clean but a lot easier to
implement.

Unless ofcourse you want "psql -l" to list all databases in all
clusters...

I think it would be better to put the intelligence into libpq rather
than trying to create more servers...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: Multiple logical databases

From

Josh Berkus

Date:

02 February 2006, 18:16:13

Mark,

> Even though they run on the same machine, run the same version of the
> software, and are used by the same applications, they have NO
> interoperability. For now, lets just accept that they need to be on
> separate physical clusters because some need to be able to started and
> stopped while others need to remain running, there are other reasons,
> but one reason will suffice for the discussion.

Well, to answer your original question, I personally would not see your 
general idea as useful at all.  I admin 9 or 10 PostgreSQL servers 
currently and have never run across a need, or even a desire, to do what 
you are doing.

In fact, if there's any general demand, it's to go the opposite way: 
patches to lock down the system tables and prevent switching databases to 
support ISPs and other shared-hosting situations.

For an immediate solution to what you are encountering, have you looked at 
pgPool?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

Re: Multiple logical databases

From

Mark Kirkwood

Date:

02 February 2006, 19:12:34

Mark Woodward wrote:

> My issue is this, (and this is NOT a slam on PostgreSQL), I have a number
> of physical databases on one machine on ports 5432, 5433, 5434. All
> running the same version and in fact, installation of PostgreSQL.
> 
> Even though they run on the same machine, run the same version of the
> software, and are used by the same applications, they have NO
> interoperability. For now, lets just accept that they need to be on
> separate physical clusters because some need to be able to started and
> stopped while others need to remain running, there are other reasons, but
> one reason will suffice for the discussion.
> 

Hmmm - do you really need to start and stop them? or are you just doing 
that to forbid user access whilst doing data loads etc?

If so, then you might get more buy-in by requesting enhancements that 
work with the design of Pg a little more (or I hope they do anyway....) e.g:

1/ Enable/disable (temporarily) user access to individual databases via 
a simple admin command (tho 'ALTER DATABASE xxx CONNECTION LIMIT 0' will 
suffice if you do loads with a superuser role).

2/ Restrict certain users to certain databases via simple admin commands 
(editing pg_hba.conf is not always convenient or possible).

3/ Make cross db relation references a little more transparent (e.g 
maybe introduce SYNONYM for this).

Other related possibilities come to mind, like being able to segment the  buffer cache on a database level (e.g: bigdb
gets90% of the shared 

buffers.... not 100%, as I want to keep smalldb's tables cached always....).

Cheers

Mark

Re: Multiple logical databases

From

Peter Eisentraut

Date:

02 February 2006, 19:41:23

Mark Woodward wrote:
> From an administration perspective, a single point of admin would
> seem like a logical and valuable objective, no?

I don't understand why you are going out of your way to separate your 
databases (for misinformed reasons, it appears) and then want to design 
a way to centrally control them so they can all fail together.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Multiple logical databases

From

"Mark Woodward"

Date:

02 February 2006, 22:49:30

> Mark Woodward wrote:
>> From an administration perspective, a single point of admin would
>> seem like a logical and valuable objective, no?
>
> I don't understand why you are going out of your way to separate your
> databases (for misinformed reasons, it appears) and then want to design
> a way to centrally control them so they can all fail together.
>

Oh come on, "misinformed?" is that really called for?

Think about a website that (and I have one) has the U.S.A. Streetmap
database, the freedb CD database, and a slew of sites based on phpbb and
drupal.

Maybe one should put them all in one database cluster, but...

The street database is typically generated and QAed in the lab. It is then
uploaded to the server. It has many millions of rows and about a half
dozen indexes. To dump and reload takes almost a day.

Compressing the DB and uploading it into the site, uncompressing it,
stoping the current postgresql process, swapping the data directory, and
restarting it can be done in about an hour. One can not do this if the
street map database is part of the standard database cluster. The same
thing happens with the freedb database.

Unless you can tell me how to insert live data and indexes to a cluster
without having to reload the data and recreate the indexes, then I hardly
think I am "misinformed." The ad hominem attack wasn't nessisary.

I have no problem with disagreement, but I take exception to insult.

If no one sees a way to manage multiple physical database clusters as one
logical cluster as something worth doing, then so be it. I have a
practical example of a valid reason how this would make PostgreSQL easier
to work with. Yes there are work arounds. Yes it is not currently
unworkable.

It is just that it could be better. As I mentioned earlier, I have been
dealing with this sort of problem for a number of years now, and I think
this is the "cool" solution to the problem.

Re: Multiple logical databases

From

Tino Wildenhain

Date:

03 February 2006, 03:16:50

Mark Woodward schrieb:
...
> Unless you can tell me how to insert live data and indexes to a cluster
> without having to reload the data and recreate the indexes, then I hardly
> think I am "misinformed." The ad hominem attack wasn't nessisary.

I see you had a usecase for something like pg_diff and pg_patch ;)
...
> If no one sees a way to manage multiple physical database clusters as one
> logical cluster as something worth doing, then so be it. I have a
> practical example of a valid reason how this would make PostgreSQL easier
> to work with. Yes there are work arounds. Yes it is not currently
> unworkable.

I dont see your problem, really ;)

1) if you have very big and very workloaded databases, you often have
them on different physically boxes anyway
2) you can run any number of postmasters on the same box - just put  them to listen on different ip:port.

Now to the management - you say cddb and geodb are managed off host.
So they are not managed on the life server and so you dont need to
switch your psql console to them.

And yeah, its really not a problem, to quit psql and connect
to a different server anyway :-)

If you dont like to type -p otherport, you can either create
aliases with all the arguments or use something like pgadmin3
which enables you to easy switch from database to database,
from host to host as you like.

Now is there any usecase I have missed which you still would
like to have addressed?

Kind regards
Tino Wildenhain

Re: Multiple logical databases

From

Richard Huxton

Date:

03 February 2006, 04:12:10

Josh Berkus wrote:
> Mark,
> 
>> Even though they run on the same machine, run the same version of the
>> software, and are used by the same applications, they have NO
>> interoperability. For now, lets just accept that they need to be on
>> separate physical clusters because some need to be able to started and
>> stopped while others need to remain running, there are other reasons,
>> but one reason will suffice for the discussion.
> 
> For an immediate solution to what you are encountering, have you looked at 
> pgPool?

I agree with Josh - pgpool sounds like the place to start with this.

That's got to be the easiest place to add some sort of "listall"/"switch 
todb" functionality. It also means you're not *forced* to have only one 
version of PG, or have them all on the same machine.

--   Richard Huxton  Archonet Ltd

Re: Multiple logical databases

From

"Mark Woodward"

Date:

03 February 2006, 08:41:03

> Mark Woodward schrieb:
> ...
>> Unless you can tell me how to insert live data and indexes to a cluster
>> without having to reload the data and recreate the indexes, then I
>> hardly
>> think I am "misinformed." The ad hominem attack wasn't nessisary.
>
> I see you had a usecase for something like pg_diff and pg_patch ;)
> ...
>> If no one sees a way to manage multiple physical database clusters as
>> one
>> logical cluster as something worth doing, then so be it. I have a
>> practical example of a valid reason how this would make PostgreSQL
>> easier
>> to work with. Yes there are work arounds. Yes it is not currently
>> unworkable.
>
> I dont see your problem, really ;)
>
> 1) if you have very big and very workloaded databases, you often have
> them on different physically boxes anyway
> 2) you can run any number of postmasters on the same box - just put
>    them to listen on different ip:port.
>
> Now to the management - you say cddb and geodb are managed off host.
> So they are not managed on the life server and so you dont need to
> switch your psql console to them.
>
> And yeah, its really not a problem, to quit psql and connect
> to a different server anyway :-)
>
> If you dont like to type -p otherport, you can either create
> aliases with all the arguments or use something like pgadmin3
> which enables you to easy switch from database to database,
> from host to host as you like.
>
> Now is there any usecase I have missed which you still would
> like to have addressed?

I don't, as it happens, have these databases on different machines, but
come to think about it, maybe it doesn't matter.

The "port" aspect is troubling, it isn't really self documenting. The
application isn't psql, the applications are custom code written in PHP
and C/C++.

Like I said, in this thread of posts, yes there are ways of doing this,
and I've been doing it for years. It is just one of the rough eges that I
think could be smoother.

(in php)
pg_connect("dbname=geo host=dbserver");

Could connect and query the dbserver, if the db is not on it, connect to a
database of known servers, find geo, and use that information to connect.
It sounds like a simple thing, for sure, but to be useful, there needs to
be buy in from the group otherwise it is just some esoteric hack.

The point is, that I have been working with this sort of "use case" for a
number of years, and being able to represent multiple physical databases
as one logical db server would make life easier. It was a brainstorm I had
while I was setting this sort of system for the [n]th time.

For my part, I have tried to maintain my own change list for PostgreSQL in
the past, but it is a pain. The main source changes too frequently to keep
up and in the end is just another project to maintain.

Using the "/etc/hosts" file or DNS to maintain host locations for is a
fairly common and well known practice, but there is no such mechanism for
"ports." The problem now becomes a code issue, not a system administration
issue.

If one writes the code to their website to use a generic host name, say,
"dbserver," then one can easily test system changes locally and push the
code to a live site. The only difference is the host name. When a port is
involved, there is no systemic way to represent that to the operating
system, and must therefor be part of the code. As part of the code, it
must reside in a place where code has access, and must NOT be pushed with
the rest of the site.

Having some mechanism to deal with this would be cleaner IMHO.

Re: Multiple logical databases

From

Doug McNaught

Date:

03 February 2006, 09:24:29

"Mark Woodward" <pgsql@mohawksoft.com> writes:

> The point is, that I have been working with this sort of "use case" for a
> number of years, and being able to represent multiple physical databases
> as one logical db server would make life easier. It was a brainstorm I had
> while I was setting this sort of system for the [n]th time.

It sounds like all that would be needed is a kind of "smart
proxy"--has a list of database clusters on the machine and the
databases they contain, and speaks enough of the protocol to recognize
the startup packet and reroute it internally to the right cluster.
I've heard 'pgpool' mentioned here; from a quick look at the docs it
looks similar but not quite what you want.

So your databases would listen on 5433, 5434, etc and the proxy would
listen on 5432 and route everything properly.  If a particular cluster
is not up, the proxy could just error out the connection.

Hmm, that'd be fun to write if I ever find the time...

-Doug

Re: Multiple logical databases

From

Martijn van Oosterhout

Date:

03 February 2006, 09:34:50

On Fri, Feb 03, 2006 at 08:05:48AM -0500, Mark Woodward wrote:
> Using the "/etc/hosts" file or DNS to maintain host locations for is a
> fairly common and well known practice, but there is no such mechanism for
> "ports." The problem now becomes a code issue, not a system administration
> issue.

Actually, there is, it's in /etc/services and the functions are
getservbyname and getservbyport. I wonder if it'd be possible to have
psql use this if you put a string in the port part of the connect
string.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: Multiple logical databases

From

Chris Campbell

Date:

03 February 2006, 09:47:06

On Feb 3, 2006, at 08:05, Mark Woodward wrote:

> Using the "/etc/hosts" file or DNS to maintain host locations for is a
> fairly common and well known practice, but there is no such  
> mechanism for
> "ports." The problem now becomes a code issue, not a system  
> administration
> issue.

What if you assigned multiple IPs to a machine, then used ipfw (or  
something) to forward connections to port 5432 for each IP to the  
proper IP and port?

You could use /etc/hosts or DNS to give each IP a host name, and use  
it in your code.

For example (this only does forwarding for clients on localhost, but  
you get the idea), you could set up:

Host      IP:port          Forwards to
--------  ---------------  -----------------
db_one    127.0.1.1:5432   192.168.1.5:5432
db_two    127.0.1.2:5432   192.168.1.6:5432
db_three  127.0.1.3:5432   192.168.1.6:5433
fb_four   127.0.1.4:5432   16.51.209.8:8865

You could reconfigure the redirection by changing the ipfw  
configuration -- you wouldn't change your client code at all. It  
would continue to use a connection string of "... host=db_one", but  
you'd change 127.0.1.1:5432 to forward to the new IP/port.

Or use pgpool. :)

- Chris

Re: Multiple logical databases

From

"Mark Woodward"

Date:

03 February 2006, 10:58:20

> "Mark Woodward" <pgsql@mohawksoft.com> writes:
>
>> The point is, that I have been working with this sort of "use case" for
>> a
>> number of years, and being able to represent multiple physical databases
>> as one logical db server would make life easier. It was a brainstorm I
>> had
>> while I was setting this sort of system for the [n]th time.
>
> It sounds like all that would be needed is a kind of "smart
> proxy"--has a list of database clusters on the machine and the
> databases they contain, and speaks enough of the protocol to recognize
> the startup packet and reroute it internally to the right cluster.
> I've heard 'pgpool' mentioned here; from a quick look at the docs it
> looks similar but not quite what you want.
>
> So your databases would listen on 5433, 5434, etc and the proxy would
> listen on 5432 and route everything properly.  If a particular cluster
> is not up, the proxy could just error out the connection.
>
> Hmm, that'd be fun to write if I ever find the time...

It is similar to a proxy, yes, but that is just part of it. The setup and
running of these systems should all be managed.

Re: Multiple logical databases

From

Peter Eisentraut

Date:

03 February 2006, 11:06:22

Mark Woodward wrote:
> Oh come on, "misinformed?" is that really called for?

Claiming that all databases share the same system tables is misinformed, 
with no judgement passed.

> The street database is typically generated and QAed in the lab. It is
> then uploaded to the server. It has many millions of rows and about a
> half dozen indexes. To dump and reload takes almost a day.

There is work happening on speeding up bulk loads.

> Unless you can tell me how to insert live data and indexes to a
> cluster without having to reload the data and recreate the indexes,

I think this sort of thing can be worked on.  VACUUM FREEZE and some 
tool support could make this happen.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Multiple logical databases

From

Doug McNaught

Date:

03 February 2006, 11:29:36

"Mark Woodward" <pgsql@mohawksoft.com> writes:

> It is similar to a proxy, yes, but that is just part of it. The setup and
> running of these systems should all be managed.

All that requires is some scripts that wrap pg_ctl and bring the right
instances up and down, perhaps with a web interface on top of them.  I
don't see any need to put that functionality in the proxy.

-Doug

Re: Multiple logical databases

From

Chris Browne

Date:

03 February 2006, 13:04:09

pgsql@mohawksoft.com ("Mark Woodward") writes:
> The "port" aspect is troubling, it isn't really self
> documenting. The application isn't psql, the applications are custom
> code written in PHP and C/C++.

Nonsense.  See /etc/services

> Using the "/etc/hosts" file or DNS to maintain host locations for is
> a fairly common and well known practice, but there is no such
> mechanism for "ports." The problem now becomes a code issue, not a
> system administration issue.

Nonsense.  See /etc/services

> If one writes the code to their website to use a generic host name,
> say, "dbserver," then one can easily test system changes locally and
> push the code to a live site. The only difference is the host
> name. When a port is involved, there is no systemic way to represent
> that to the operating system, and must therefor be part of the
> code. As part of the code, it must reside in a place where code has
> access, and must NOT be pushed with the rest of the site.
>
> Having some mechanism to deal with this would be cleaner IMHO.

I'm sure it would be, that's why there has been one, which has been in
use since the issuance of RFC 349 by Jon Postel back in May of 1972.
The mechanism is nearly 34 years old.

Note that RFCs are no longer used to issue port listings, as per RFC
3232, back in 2002.  Now, IANA manages a repository of standard port
numbers, commonly populated into /etc/services.
 <http://www.iana.org/assignments/port-numbers>

For customizations, see:

% man 5 services
-- 
(format nil "~S@~S" "cbbrowne" "acm.org")
http://www.ntlug.org/~cbbrowne/sgml.html
"Motto for a research laboratory: What we work on today, others will
first think of tomorrow." -- Alan J. Perlis

Re: Multiple logical databases

From

Rick Gigger

Date:

03 February 2006, 13:42:08

On Feb 3, 2006, at 6:47 AM, Chris Campbell wrote:

> On Feb 3, 2006, at 08:05, Mark Woodward wrote:
>
>> Using the "/etc/hosts" file or DNS to maintain host locations for  
>> is a
>> fairly common and well known practice, but there is no such  
>> mechanism for
>> "ports." The problem now becomes a code issue, not a system  
>> administration
>> issue.
>
> What if you assigned multiple IPs to a machine, then used ipfw (or  
> something) to forward connections to port 5432 for each IP to the  
> proper IP and port?

If he had multiple ips couldn't he just make them all listen only on  
one specific ip (instead of '*') and just use the default port?

Re: Multiple logical databases

From

Chris Campbell

Date:

03 February 2006, 13:44:07

On Feb 3, 2006, at 12:43, Rick Gigger wrote:

> If he had multiple ips couldn't he just make them all listen only  
> on one specific ip (instead of '*') and just use the default port?

Yeah, but the main idea here is that you could use ipfw to forward  
connections *to other hosts* if you wanted to. Basically working like  
a proxy.

- Chris

Re: Multiple logical databases

From

Josh Berkus

Date:

03 February 2006, 14:23:04

Mark, all:

> > So your databases would listen on 5433, 5434, etc and the proxy would
> > listen on 5432 and route everything properly.  If a particular cluster
> > is not up, the proxy could just error out the connection.
> >
> > Hmm, that'd be fun to write if I ever find the time...
>
> It is similar to a proxy, yes, but that is just part of it. The setup
> and running of these systems should all be managed.

Per my earlier comment, this really seems like an obvious extension of 
pgPool, or Sequoia if you're a java geek.  No need to re-invent the wheel.

In terms of the PostgreSQL Core, though, Mark, it sounds like you're 
treating the symptoms and not the causes.   What you really need is a way 
to load a large database very quickly (in binary form or otherwise) 
without downing the cluster.  This is a generally desired feature that has 
been discussed several times on this list, and you could get general 
agreement on easily.

The feature you proposed is a way to make your idiosyncratic setup easier 
to manage, but doesn't apply to anyone else's problems on this list, so 
you're going to have a hard time drumming up enthusiasm.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

Re: Multiple logical databases

From

Jeremy Drake

Date:

03 February 2006, 15:12:37

On Fri, 3 Feb 2006, Josh Berkus wrote:

> The feature you proposed is a way to make your idiosyncratic setup easier
> to manage, but doesn't apply to anyone else's problems on this list, so
> you're going to have a hard time drumming up enthusiasm.

I am somewhat reluctant to interject into this discussion, but the
particular "idiosyncratic setup" referred to is not the only one where
this may be useful.

The immediate use I thought of was being able to have what appeared to be
multiple databases on the same server with different locale settings,
which cannot be changed post-initdb.  I could see having different
databases in different locales being a useful feature, perhaps in a
wikipedia type setup so that the english, chinese, and arabic wikis could
each provide the correct sort order and other locale-specific properties
while still providing a single logical database "server" for connection
strings.  This just being the first example I could think of in which such
a setup could be useful.

-- 
In the beginning, I was made.  I didn't ask to be made.  No one consulted
with me or considered my feelings in this matter.  But if it brought some
passing fancy to some lowly humans as they haphazardly pranced their way
through life's mournful jungle, then so be it.
- Marvin the Paranoid Android, From Douglas Adams' Hitchiker's Guide to the
Galaxy Radio Scripts

Re: Multiple logical databases

From

Josh Berkus

Date:

03 February 2006, 15:20:02

Jeremy,

> The immediate use I thought of was being able to have what appeared to
> be multiple databases on the same server with different locale settings,
> which cannot be changed post-initdb. 

Again, this is patching the symtoms instead of going after the cause.  The 
real issue you're trying to address is not being able to set locale per 
database, which is what we really want.

Not that symptomatic cures are out of the question for add-ons, like pgPool 
(and I could see a lot of uses for a pgPool that could obscure the fact 
that it was connecting to multiple servers).  But they aren't the way to 
go for the core code.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

Re: Multiple logical databases

From

"Mark Woodward"

Date:

03 February 2006, 16:01:56

> On Feb 3, 2006, at 12:43, Rick Gigger wrote:
>
>> If he had multiple ips couldn't he just make them all listen only
>> on one specific ip (instead of '*') and just use the default port?
>
> Yeah, but the main idea here is that you could use ipfw to forward
> connections *to other hosts* if you wanted to. Basically working like
> a proxy.

I certainly hope that is not the idea, ipfw would be a whole level of
complexity to be avoided.

Re: Multiple logical databases

From

"Mark Woodward"

Date:

03 February 2006, 16:04:07

>
> On Feb 3, 2006, at 6:47 AM, Chris Campbell wrote:
>
>> On Feb 3, 2006, at 08:05, Mark Woodward wrote:
>>
>>> Using the "/etc/hosts" file or DNS to maintain host locations for
>>> is a
>>> fairly common and well known practice, but there is no such
>>> mechanism for
>>> "ports." The problem now becomes a code issue, not a system
>>> administration
>>> issue.
>>
>> What if you assigned multiple IPs to a machine, then used ipfw (or
>> something) to forward connections to port 5432 for each IP to the
>> proper IP and port?
>
> If he had multiple ips couldn't he just make them all listen only on
> one specific ip (instead of '*') and just use the default port?

That is a good idea, and yes it would work, as do a lot of other
scenarios, but shouldn't PostgreSQL take care of "PostgreSQL?"

Re: Multiple logical databases

From

Bruce Momjian

Date:

03 February 2006, 20:37:37

Mark Woodward wrote:
> >
> > On Feb 3, 2006, at 6:47 AM, Chris Campbell wrote:
> >
> >> On Feb 3, 2006, at 08:05, Mark Woodward wrote:
> >>
> >>> Using the "/etc/hosts" file or DNS to maintain host locations for
> >>> is a
> >>> fairly common and well known practice, but there is no such
> >>> mechanism for
> >>> "ports." The problem now becomes a code issue, not a system
> >>> administration
> >>> issue.
> >>
> >> What if you assigned multiple IPs to a machine, then used ipfw (or
> >> something) to forward connections to port 5432 for each IP to the
> >> proper IP and port?
> >
> > If he had multiple ips couldn't he just make them all listen only on
> > one specific ip (instead of '*') and just use the default port?
> 
> That is a good idea, and yes it would work, as do a lot of other
> scenarios, but shouldn't PostgreSQL take care of "PostgreSQL?"

PostgreSQL takes care of PostgreSQL only if it is best at doing it ---
in thise case, it is not.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Multiple logical databases

From

Martijn van Oosterhout

Date:

18 February 2006, 17:38:17

On Fri, Feb 03, 2006 at 08:05:48AM -0500, Mark Woodward wrote:
> Like I said, in this thread of posts, yes there are ways of doing this,
> and I've been doing it for years. It is just one of the rough eges that I
> think could be smoother.
>
> (in php)
> pg_connect("dbname=geo host=dbserver");
>
> Could connect and query the dbserver, if the db is not on it, connect to a
> database of known servers, find geo, and use that information to connect.
> It sounds like a simple thing, for sure, but to be useful, there needs to
> be buy in from the group otherwise it is just some esoteric hack.

It turns out what you like actually exists, lookup the "service"
parameter in the connectdb string. It will read the values for the
server, port, etc from a pg_service.conf file.

There is an example in the tree but it looks something like the following:

[servicename]
dbname=blah
user=blah
pass=blah

So all you need to specify is "service=servicename" and it will grab
the parameters. This allows you to change the connection without
changeing the code.

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

pg_service.conf

From

"Mark Woodward"

Date:

19 February 2006, 10:32:11

> On Fri, Feb 03, 2006 at 08:05:48AM -0500, Mark Woodward wrote:
>> Like I said, in this thread of posts, yes there are ways of doing this,
>> and I've been doing it for years. It is just one of the rough eges that
>> I
>> think could be smoother.
>>
>> (in php)
>> pg_connect("dbname=geo host=dbserver");
>>
>> Could connect and query the dbserver, if the db is not on it, connect to
>> a
>> database of known servers, find geo, and use that information to
>> connect.
>> It sounds like a simple thing, for sure, but to be useful, there needs
>> to
>> be buy in from the group otherwise it is just some esoteric hack.
>
> It turns out what you like actually exists, lookup the "service"
> parameter in the connectdb string. It will read the values for the
> server, port, etc from a pg_service.conf file.
>
> There is an example in the tree but it looks something like the following:
>
> [servicename]
> dbname=blah
> user=blah
> pass=blah
>
> So all you need to specify is "service=servicename" and it will grab
> the parameters. This allows you to change the connection without
> changeing the code.
>

This is a great feature!!

It doesn't seem to be documented in the administrators guide. Its
mentioned in the libpq section, and only a reference to
pg_service.conf.sample

IMHO we should push for this to be the mainstream connection
methodology!!! The variables: host, port, and dbname are very problematic
for admins and developers who often live in different worlds.

The developers "should" just use the "servicename" of a database, and the
admins should maintain pg_service.conf. This moves the responsibility of
the wheres and hows of connecting to the database to the admin away from
the developer.

Should there be a section of the administration manual for this?

Re: pg_service.conf

From

Simon Riggs

Date:

19 February 2006, 10:50:37

On Sun, 2006-02-19 at 10:00 -0500, Mark Woodward wrote:
> > On Fri, Feb 03, 2006 at 08:05:48AM -0500, Mark Woodward wrote:
> >> Like I said, in this thread of posts, yes there are ways of doing this,
> >> and I've been doing it for years. It is just one of the rough eges that
> >> I
> >> think could be smoother.
> >>
> >> (in php)
> >> pg_connect("dbname=geo host=dbserver");
> >>
> >> Could connect and query the dbserver, if the db is not on it, connect to
> >> a
> >> database of known servers, find geo, and use that information to
> >> connect.
> >> It sounds like a simple thing, for sure, but to be useful, there needs
> >> to
> >> be buy in from the group otherwise it is just some esoteric hack.
> >
> > It turns out what you like actually exists, lookup the "service"
> > parameter in the connectdb string. It will read the values for the
> > server, port, etc from a pg_service.conf file.
> >
> > There is an example in the tree but it looks something like the following:
> >
> > [servicename]
> > dbname=blah
> > user=blah
> > pass=blah
> >
> > So all you need to specify is "service=servicename" and it will grab
> > the parameters. This allows you to change the connection without
> > changeing the code.
> >
> 
> This is a great feature!!

Yes, it is, but there is a distinct difference between what you asked
for and what have been described as solutions (good though they are).

Both services and pg_service.conf are client-side solutions. So if you
have 20,000 clients to worry about you have some problems. What was
proposed was a central naming service (described as a database of known
servers) that would allow a server-side name to service mapping.

A server-side (i.e. centrally managed) name server seems like an
improvement over the client-side solutions described, IMHO, but I'd
leave it to others to describe how that might work. (e.g. DNS is a
better solution than multiple distributed /etc/hosts files).

Best Regards, Simon Riggs

Re: pg_service.conf

From

Martijn van Oosterhout

Date:

19 February 2006, 10:53:44

On Sun, Feb 19, 2006 at 10:00:01AM -0500, Mark Woodward wrote:
> > It turns out what you like actually exists, lookup the "service"
> > parameter in the connectdb string. It will read the values for the
> > server, port, etc from a pg_service.conf file.
> >
> > There is an example in the tree but it looks something like the following:
> >
> > [servicename]
> > dbname=blah
> > user=blah
> > pass=blah
> >
> > So all you need to specify is "service=servicename" and it will grab
> > the parameters. This allows you to change the connection without
> > changeing the code.
> >
>
> This is a great feature!!
>
> It doesn't seem to be documented in the administrators guide. Its
> mentioned in the libpq section, and only a reference to
> pg_service.conf.sample

Indeed, I only just found out about it yesterday. It's a very little
known feature that needs some advertisement. Right now we need to work
up some documentation patches so people come across it easier.

Where do you think it should be mentioned?
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: pg_service.conf

From

Douglas McNaught

Date:

19 February 2006, 10:58:02

Simon Riggs <simon@2ndquadrant.com> writes:

> A server-side (i.e. centrally managed) name server seems like an
> improvement over the client-side solutions described, IMHO, but I'd
> leave it to others to describe how that might work. (e.g. DNS is a
> better solution than multiple distributed /etc/hosts files).

Funnily enough, you could *use* DNS for this--you could define a
custom RR type containing hostname, port, database etc and have
entries in DNS for each "service" (e.g. 'production-db.mycorp.com').
I think HESIOD used this mechanism.

Of course, you'd need an internal DNS server that you had full control
over...

-Doug

Re: pg_service.conf

From

Martijn van Oosterhout

Date:

19 February 2006, 11:10:34

On Sun, Feb 19, 2006 at 09:58:01AM -0500, Douglas McNaught wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
>
> > A server-side (i.e. centrally managed) name server seems like an
> > improvement over the client-side solutions described, IMHO, but I'd
> > leave it to others to describe how that might work. (e.g. DNS is a
> > better solution than multiple distributed /etc/hosts files).
>
> Funnily enough, you could *use* DNS for this--you could define a
> custom RR type containing hostname, port, database etc and have
> entries in DNS for each "service" (e.g. 'production-db.mycorp.com').
> I think HESIOD used this mechanism.

Well, there exist such things as SRV records already for describing how
to find services. In theory you could create an entry like:

_postgres._tcp.example.com  SRV  10 5 5432 db1.example.com

So that if you typed "psql example.com" it would lookup the server and
port number. You may be able to put a dbname after that, not sure. And
you can always put whatever you like into a TXT record.

In any case, someone still needs to write the code for it.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: pg_service.conf

From

"Mark Woodward"

Date:

19 February 2006, 11:13:14

> On Sun, Feb 19, 2006 at 10:00:01AM -0500, Mark Woodward wrote:
>> > It turns out what you like actually exists, lookup the "service"
>> > parameter in the connectdb string. It will read the values for the
>> > server, port, etc from a pg_service.conf file.
>> >
>> > There is an example in the tree but it looks something like the
>> following:
>> >
>> > [servicename]
>> > dbname=blah
>> > user=blah
>> > pass=blah
>> >
>> > So all you need to specify is "service=servicename" and it will grab
>> > the parameters. This allows you to change the connection without
>> > changeing the code.
>> >
>>
>> This is a great feature!!
>>
>> It doesn't seem to be documented in the administrators guide. Its
>> mentioned in the libpq section, and only a reference to
>> pg_service.conf.sample
>
> Indeed, I only just found out about it yesterday. It's a very little
> known feature that needs some advertisement. Right now we need to work
> up some documentation patches so people come across it easier.
>
> Where do you think it should be mentioned?

As it was mentioned in another reply, this is not "everything" I wanted,
but it is a big step closer that makes the rest managable.

As for the "central" administration issue, yes, it is not a central
administration solution, but files like these fall into the category of
one to many "push" strategies, something like "bulkcopy -f targets
pg_service.conf /usr/local/etc"

I think it should be clearly in the administration section of the manual.
A DBA is not going to look at the libpq section, similarly, PHP or Java
developers won't either. I use libpq all the time, the last time I looked
at pq_connect was years ago.

Like I said, this is a REALLY USEFULL feature that should be presented as
the "best method" for specifying databases, in the administration manual.
It should also be mentioned in the PHP API as well.

Re: pg_service.conf

From

"Mark Woodward"

Date:

19 February 2006, 11:24:29

> On Sun, 2006-02-19 at 10:00 -0500, Mark Woodward wrote:
>> > On Fri, Feb 03, 2006 at 08:05:48AM -0500, Mark Woodward wrote:
>> >> Like I said, in this thread of posts, yes there are ways of doing
>> this,
>> >> and I've been doing it for years. It is just one of the rough eges
>> that
>> >> I
>> >> think could be smoother.
>> >>
>> >> (in php)
>> >> pg_connect("dbname=geo host=dbserver");
>> >>
>> >> Could connect and query the dbserver, if the db is not on it, connect
>> to
>> >> a
>> >> database of known servers, find geo, and use that information to
>> >> connect.
>> >> It sounds like a simple thing, for sure, but to be useful, there
>> needs
>> >> to
>> >> be buy in from the group otherwise it is just some esoteric hack.
>> >
>> > It turns out what you like actually exists, lookup the "service"
>> > parameter in the connectdb string. It will read the values for the
>> > server, port, etc from a pg_service.conf file.
>> >
>> > There is an example in the tree but it looks something like the
>> following:
>> >
>> > [servicename]
>> > dbname=blah
>> > user=blah
>> > pass=blah
>> >
>> > So all you need to specify is "service=servicename" and it will grab
>> > the parameters. This allows you to change the connection without
>> > changeing the code.
>> >
>>
>> This is a great feature!!
>
> Yes, it is, but there is a distinct difference between what you asked
> for and what have been described as solutions (good though they are).

Well, true, it isn't what I want, but it makes a big step.
>
> Both services and pg_service.conf are client-side solutions. So if you
> have 20,000 clients to worry about you have some problems. What was
> proposed was a central naming service (described as a database of known
> servers) that would allow a server-side name to service mapping.

True, but the one to many cluster push solution has been dealt with so
many times that as a datacenter solution isn't too troubling.

>
> A server-side (i.e. centrally managed) name server seems like an
> improvement over the client-side solutions described, IMHO, but I'd
> leave it to others to describe how that might work. (e.g. DNS is a
> better solution than multiple distributed /etc/hosts files).

DNS isn't always a better solution than /etc/hosts, both have their pros
and cons. The /etc/hosts file is very useful for "instantaneous,"
reliable, and redundent name lookups. DNS services, espcially in a large
service environment can get bogged down. 20,000 hosts doing a lot of
lookups can require a dedicated single point of failure. OK, so you add
two DNS machines and load balance across them with a fault tollerant load
balancer, how many thousands of dollars? For how much information? A
simple "clustercpy -f targets pg_service.conf /etc" would save thousands
of dollars, increase efficiency, increase reliability, decrease electrical
costs, etc.

Don't get me wrong, DNS, as it is designed, is PERFECT for the distributed
nature of the internet, but replication of fairly static data under the
control of a central authority (the admin) is better.
>
> Best Regards, Simon Riggs
>

Re: pg_service.conf

From

Peter Eisentraut

Date:

19 February 2006, 11:56:17

Mark Woodward wrote:
> Don't get me wrong, DNS, as it is designed, is PERFECT for the
> distributed nature of the internet, but replication of fairly static
> data under the control of a central authority (the admin) is better.

What about this zeroconf/bonjour stuff?  I'm not familiar with it, but 
it sounds like it could tie into this discussion.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: pg_service.conf

From

Martijn van Oosterhout

Date:

19 February 2006, 12:10:50

On Sun, Feb 19, 2006 at 04:56:11PM +0100, Peter Eisentraut wrote:
> Mark Woodward wrote:
> > Don't get me wrong, DNS, as it is designed, is PERFECT for the
> > distributed nature of the internet, but replication of fairly static
> > data under the control of a central authority (the admin) is better.
>
> What about this zeroconf/bonjour stuff?  I'm not familiar with it, but
> it sounds like it could tie into this discussion.

I think the major issue is that most such systems (like RFC2782) deal
only with finding the hostname:port of the service and don't deal with
usernames/passwords/dbname. What we want is a system that not only
finds the service, but tells you enough to connect. You can't connect
to a postgres server without a dbname and these discovery protocols
don't generally provide that.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: pg_service.conf

From

Douglas McNaught

Date:

19 February 2006, 12:12:20

Peter Eisentraut <peter_e@gmx.net> writes:

> Mark Woodward wrote:
>> Don't get me wrong, DNS, as it is designed, is PERFECT for the
>> distributed nature of the internet, but replication of fairly static
>> data under the control of a central authority (the admin) is better.
>
> What about this zeroconf/bonjour stuff?  I'm not familiar with it, but 
> it sounds like it could tie into this discussion.

That's a possibility, but I think it's hard to make it work outside a
single LAN (as in, it's not zero-conf anymore :) because it relies on
broadcasts. 

-Doug

Re: pg_service.conf

From

Douglas McNaught

Date:

19 February 2006, 12:15:50

"Mark Woodward" <pgsql@mohawksoft.com> writes:

> DNS isn't always a better solution than /etc/hosts, both have their pros
> and cons. The /etc/hosts file is very useful for "instantaneous,"
> reliable, and redundent name lookups. DNS services, espcially in a large
> service environment can get bogged down. 20,000 hosts doing a lot of
> lookups can require a dedicated single point of failure. OK, so you add
> two DNS machines and load balance across them with a fault tollerant load
> balancer, how many thousands of dollars? For how much information? A
> simple "clustercpy -f targets pg_service.conf /etc" would save thousands
> of dollars, increase efficiency, increase reliability, decrease electrical
> costs, etc.

Um, is there something wrong with having multiple DNS servers in
resolv.conf?  Other than having to time out on #1 before you try #2?
I'm genuinely curious.

> Don't get me wrong, DNS, as it is designed, is PERFECT for the distributed
> nature of the internet, but replication of fairly static data under the
> control of a central authority (the admin) is better.

You're probably right; clustercpy or rsync would work better if you
have admin access to all the machines in question.  The nice thing
about the DNS method is that you wouldn't necessarily have to have
that access on an ongoing basis.

-Doug

Re: pg_service.conf

From

Tom Lane

Date:

19 February 2006, 13:41:36

Martijn van Oosterhout <kleptog@svana.org> writes:
> I think the major issue is that most such systems (like RFC2782) deal
> only with finding the hostname:port of the service and don't deal with
> usernames/passwords/dbname. What we want is a system that not only
> finds the service, but tells you enough to connect.

In other words, anyone on the LAN who asks nicely can get a database
password?  No thank you.

I don't actually believe that a server-side substitute for pg_service
would be worth anything at all.  First, it just begs the question of 
how you find the server.  Second, pg_service is only really interesting
if there are multiple servers you want to connect to.  It's not
reasonable to assume that one of them will know about any (let alone
all) of the others.  Once you start to think about security it's even
worse: you've got that one storing passwords and so on for the other
servers.

My complaint about pg_service is actually that it should have been
designed to support per-user values more easily.  It's a takeoff on
the ODBC ini file concept, but we forgot the per-user ~/.odbc.ini part.
        regards, tom lane

Re: pg_service.conf

From

"Mark Woodward"

Date:

19 February 2006, 14:27:36

> Martijn van Oosterhout <kleptog@svana.org> writes:
>> I think the major issue is that most such systems (like RFC2782) deal
>> only with finding the hostname:port of the service and don't deal with
>> usernames/passwords/dbname. What we want is a system that not only
>> finds the service, but tells you enough to connect.
>
> In other words, anyone on the LAN who asks nicely can get a database
> password?  No thank you.
>
> I don't actually believe that a server-side substitute for pg_service
> would be worth anything at all.  First, it just begs the question of
> how you find the server.  Second, pg_service is only really interesting
> if there are multiple servers you want to connect to.  It's not
> reasonable to assume that one of them will know about any (let alone
> all) of the others.  Once you start to think about security it's even
> worse: you've got that one storing passwords and so on for the other
> servers.

Tom, mark your calendar, I think in this one instance, we are in 100%
total agreement. I'm not sure what this means, does one of have to change
our opinion?

Actually, pg_service.conf, as I think more about it, is more than just
"pg_service is only really interesting if there are multiple servers you
want to connect to," it even abstracts the physical database name, which
is interesting as well.

>
> My complaint about pg_service is actually that it should have been
> designed to support per-user values more easily.  It's a takeoff on
> the ODBC ini file concept, but we forgot the per-user ~/.odbc.ini part.

I can certainly see that application, and it should be trivial to add any
that code. Do you think it is worth doing?

Re: pg_service.conf

From

"Mark Woodward"

Date:

19 February 2006, 14:29:43

> Mark Woodward wrote:
>> Don't get me wrong, DNS, as it is designed, is PERFECT for the
>> distributed nature of the internet, but replication of fairly static
>> data under the control of a central authority (the admin) is better.
>
> What about this zeroconf/bonjour stuff?  I'm not familiar with it, but
> it sounds like it could tie into this discussion.
>

Perhaps zeroconf is useful for stuff like thin clients, but I'm not sure
that it introduces anything into this discussion.

Re: pg_service.conf

From

"Mark Woodward"

Date:

19 February 2006, 14:32:05

> "Mark Woodward" <pgsql@mohawksoft.com> writes:
>
>> DNS isn't always a better solution than /etc/hosts, both have their pros
>> and cons. The /etc/hosts file is very useful for "instantaneous,"
>> reliable, and redundent name lookups. DNS services, espcially in a large
>> service environment can get bogged down. 20,000 hosts doing a lot of
>> lookups can require a dedicated single point of failure. OK, so you add
>> two DNS machines and load balance across them with a fault tollerant
>> load
>> balancer, how many thousands of dollars? For how much information? A
>> simple "clustercpy -f targets pg_service.conf /etc" would save thousands
>> of dollars, increase efficiency, increase reliability, decrease
>> electrical
>> costs, etc.
>
> Um, is there something wrong with having multiple DNS servers in
> resolv.conf?  Other than having to time out on #1 before you try #2?
> I'm genuinely curious.

What is the "timeout" of that DNS lookup, before it goes to the second DNS
server?



>
>> Don't get me wrong, DNS, as it is designed, is PERFECT for the
>> distributed
>> nature of the internet, but replication of fairly static data under the
>> control of a central authority (the admin) is better.
>
> You're probably right; clustercpy or rsync would work better if you
> have admin access to all the machines in question.  The nice thing
> about the DNS method is that you wouldn't necessarily have to have
> that access on an ongoing basis.

That is, of course, one of DNS' pros, but in an environment where that is
not nessisary, why bother?

Re: pg_service.conf

From

Douglas McNaught

Date:

19 February 2006, 16:51:00

"Mark Woodward" <pgsql@mohawksoft.com> writes:

>> Um, is there something wrong with having multiple DNS servers in
>> resolv.conf?  Other than having to time out on #1 before you try #2?
>> I'm genuinely curious.
>
> What is the "timeout" of that DNS lookup, before it goes to the second DNS
> server?

I think on the order of 20-30 seconds, which may or may not be an
issue.

-Doug

Re: pg_service.conf

From

Steve Atkins

Date:

19 February 2006, 18:14:52

On Feb 19, 2006, at 10:59 AM, Mark Woodward wrote:

>> "Mark Woodward" <pgsql@mohawksoft.com> writes:
>>
>>> DNS isn't always a better solution than /etc/hosts, both have  
>>> their pros
>>> and cons. The /etc/hosts file is very useful for "instantaneous,"
>>> reliable, and redundent name lookups. DNS services, espcially in  
>>> a large
>>> service environment can get bogged down. 20,000 hosts doing a lot of
>>> lookups can require a dedicated single point of failure. OK, so  
>>> you add
>>> two DNS machines and load balance across them with a fault tollerant
>>> load
>>> balancer, how many thousands of dollars? For how much information? A
>>> simple "clustercpy -f targets pg_service.conf /etc" would save  
>>> thousands
>>> of dollars, increase efficiency, increase reliability, decrease
>>> electrical
>>> costs, etc.
>>
>> Um, is there something wrong with having multiple DNS servers in
>> resolv.conf?  Other than having to time out on #1 before you try #2?
>> I'm genuinely curious.
>
> What is the "timeout" of that DNS lookup, before it goes to the  
> second DNS
> server?

Depends on the resolver you use. Often the "timeout" is zero. Other  
times
it's adaptive, depending on history of response time from the servers.

Except in the case of horrible misconfiguration, it's rarely a problem.

Cheers,  Steve

Re: pg_service.conf

From

Bruce Momjian

Date:

24 February 2006, 22:28:38

Mark Woodward wrote:
> > It turns out what you like actually exists, lookup the "service"
> > parameter in the connectdb string. It will read the values for the
> > server, port, etc from a pg_service.conf file.
> >
> > There is an example in the tree but it looks something like the following:
> >
> > [servicename]
> > dbname=blah
> > user=blah
> > pass=blah
> >
> > So all you need to specify is "service=servicename" and it will grab
> > the parameters. This allows you to change the connection without
> > changeing the code.
> >
> 
> This is a great feature!!
> 
> It doesn't seem to be documented in the administrators guide. Its
> mentioned in the libpq section, and only a reference to
> pg_service.conf.sample
> 
> IMHO we should push for this to be the mainstream connection
> methodology!!! The variables: host, port, and dbname are very problematic
> for admins and developers who often live in different worlds.

The documenation is sparse because at the time it was added, there was
little interest in it from the community, so a single mention was added
and documentation was pushed into the config file.  We can adjust that
now that there is interest.

--  Bruce Momjian   http://candle.pha.pa.us SRA OSS, Inc.   http://www.sraoss.com
 + If your life is a hard drive, Christ can be your backup. +