Thread: Conservation of OIDs
I run three instances of a database in a typical change-control scenario: Production, QAT, and DEV. The Production database holds the "real" data, and we periodically take a backup from Prod and re-instantiate QAT and DEV by dropping them and then restoring from the Prod backup.

The development (DEV) instance is used by application developers and DBAs to "play" with when initially trying out or developing new features. It gives them completely realistic sample data to work with, and I don't care if they screw up the data or even accidentally delete all of it, because I can easily re-create DEV from PROD. QAT is somewhat similar to DEV, but it is intended to be the proving ground for newly implemented features: we start with QAT in the same (presumably stable) state as Prod, apply any required changes, test the client application and data, repeat as necessary until it works right, then apply the same changes to Prod.

My question relates to how to avoid expending OIDs unnecessarily. I have declared all my tables WITHOUT OIDS, but my understanding is that this applies only to the actual table data, and that database objects like tables, views, procedures, etc. still have a unique OID assigned when they are created. So every time I refresh QAT and DEV from the production database, I burn up lots of OIDs. Not that OIDs are in short supply, but I'm anal retentive about these things, and if there is a straightforward way to avoid unnecessary OID consumption it would help me sleep better.

~Berend Tober
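[For context, the refresh cycle described above might look roughly like the sketch below. The dump file, database names, and exact pg_dump options are illustrative assumptions, not taken verbatim from the thread. Every CREATE TABLE, VIEW, FUNCTION, etc. replayed during the reload is assigned a new OID from the cluster-wide counter, which is the consumption being asked about.]

    $ pg_dump mydb > my_prod_backup.sql       # dump the Production database
    $ dropdb mydbqat                          # throw away the old QAT copy
    $ createdb mydbqat                        # re-create it empty
    $ psql -f my_prod_backup.sql mydbqat      # reload it from the Prod dump
    $ dropdb mydbdev                          # same again for DEV
    $ createdb mydbdev
    $ psql -f my_prod_backup.sql mydbdev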
On Fri, 14 Nov 2003 10:01:51 -0500 (EST), <btober@seaworthysys.com> wrote:

>The Production database holds the "real" data, and we periodically take a
>backup from Prod and re-instantiate QAT and DEV by dropping them and
>then restoring from the Prod backup.

>Not that OIDs are in short supply,
>but I'm anal retentive about these things, and if there is a
>straightforward way to avoid unnecessary OID consumption it would help me
>sleep better.

OIDs are unsigned 32 bit. Something like 1000 or 10000 are reserved for
system use. So you still have more than 4,000,000,000 for your objects.
How many objects (tables, indices, operators, functions, ...) are there,
and how often is "periodically"?

If you're really concerned, you can initdb separate clusters for QAT and
DEV and run three postmasters using three different ports.

Servus
 Manfred
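[As a rough sketch of the suggestion above, assuming example paths and port numbers rather than anything stated in the thread, the separate clusters would be created and started along these lines. In practice the port is usually set in each cluster's postgresql.conf instead of on the command line, as later replies discuss.]

    $ initdb -D /usr/local/pgsql/qat
    $ initdb -D /usr/local/pgsql/dev
    $ postmaster -D /usr/local/pgsql/data -p 5432 &   # existing Prod cluster, default port
    $ postmaster -D /usr/local/pgsql/qat  -p 5433 &   # QAT cluster on its own port
    $ postmaster -D /usr/local/pgsql/dev  -p 5434 &   # DEV cluster on its own port

    $ psql -p 5433 mydb    # clients then pick the instance by port (or via PGPORT)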
> On Fri, 14 Nov 2003 10:01:51 -0500 (EST), <btober@seaworthysys.com> wrote:
>>The Production database holds the "real" data, and we periodically take a
>> backup from Prod and re-instantiate QAT and DEV by dropping them and
>> then restoring from the Prod backup.
>
>> Not that OIDs are in short supply,
>>but I'm anal retentive about these things, and if there is a
>>straightforward way to avoid unnecessary OID consumption it would help
>> me sleep better.
>
> OIDs are unsigned 32 bit...you still have more than 4000000000 for your
> objects. How many objects (tables, indices, operators, functions, ...)

Only several thousand, so as a practical matter I realize this realistically
is not a problem in this case, but for the sake of completeness in
understanding how I *could* do this, I'd like to learn.

> ...and how often is "periodically"?

I guess periodically isn't really the right word. We may not refresh QAT or
DEV at all for weeks. But when a change is required, there may be many, say
dozens, of iterations on some days when testing is in full operation, because
we want to be sure to have QAT in the same state as PROD before applying, for
QA testing, a sequence of changes that are proposed for PROD. So an iterative
cycle happens where a proposed change is implemented in QAT, problems are
discovered in QAT, we try to fix the problems, then refresh QAT, re-apply,
test again, etc., until we get it right and are confident that we need do it
only once in PROD.

Another reason we may do a refresh, even against PROD, is to change a column
width, which, as numerous list participants here have realized, is not really
very easy otherwise if you want to be sure that all dependent views, rules,
and triggers don't break.

> ...you can initdb separate clusters for QAT
> and DEV and run three postmasters using three different ports.

This is what I was thinking, but I didn't really understand fully what I
would have to do. So your suggestion involves

initdb -D my_QAT_path

then

psql -f my_PROD_backup.sql ... [?]

Can you help me out with a little more detail and tell me where the
"different port" assignment comes into play? I know I have the default port
5432 in my postgresql.conf file, but how do I associate different ports with
the different postmasters? I guess the -h option on the start command line,
but my RH Linux machine is set up to start the postmaster automatically on
startup, using the /etc/init.d/postgresql script file. How would I get the
postmaster to auto-start multiple instances using the different port
configurations?

~Berend Tober
> If you're really concerned, you can initdb separate clusters for QAT
> and DEV and run three postmasters using three different ports.

Follow-up question: Are different ports really necessary?

I currently have the three different databases defined all in the same
cluster, differentiated by name, e.g., mydb, mydbqat, and mydbdev. If I have
the postmaster start these three instances in separate clusters,
respectively, using three different directories, such as

$ postmaster -D /usr/local/pgsql/data
$ postmaster -D /usr/local/pgsql/qat
$ postmaster -D /usr/local/pgsql/dev

and make no change regarding which port the daemon listens on, but DO have a
single, unique database name, one per cluster (mydb, mydbqat, mydbdev) in the
respective clusters, will connection requests to any one of these databases
on that same port find their way to the right place (and back)?

Or maybe it would be better not to use different database names at all to
differentiate PROD, QAT, and DEV, and instead differentiate solely which
cluster is to be used by means of your suggestion of using different ports?
I dunno. What do you think?

~Berend Tober
btober@seaworthysys.com writes:

> Follow-up question: Are different ports really necessary?

Yes.

> I currently have the three different databases defined all in the same
> cluster, differentiated by name, e.g., mydb, mydbqat, and mydbdev.
> If I have the postmaster start these three instances in separate
> clusters, respectively, using three different directories, such as
>
> $ postmaster -D /usr/local/pgsql/data
> $ postmaster -D /usr/local/pgsql/qat
> $ postmaster -D /usr/local/pgsql/dev

... then the second and third command will fail.

--
Peter Eisentraut   peter_e@gmx.net
On Fri, Nov 14, 2003 at 01:37:50PM -0500, btober@seaworthysys.com wrote:

> > If you're really concerned, you can initdb separate clusters for QAT
> > and DEV and run three postmasters using three different ports.
>
> Follow-up question: Are different ports really necessary?

No, and it would probably be a waste of shared_buffers too. IMHO you should
keep them under a single postmaster and forget about the OID wraparound
problem. It won't be a problem for you anyway, because if you create the
whole DB from scratch you won't have OID collisions, even in the year 2030
when your OIDs really do wrap around. Maybe by then we will have 8-byte OIDs
anyway, and you will surely initdb way before that.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Women are like slingshots: the more resistance they have, the farther you
can get with them" (Jonas Nightingale, Leap of Faith)
On Fri, 14 Nov 2003, btober@seaworthysys.com wrote:

> > On Fri, 14 Nov 2003 10:01:51 -0500 (EST), <btober@seaworthysys.com> wrote:
> >>The Production database holds the "real" data, and we periodically take a
> >> backup from Prod and re-instantiate QAT and DEV by dropping them and
> >> then restoring from the Prod backup.
> >
> > OIDs are unsigned 32 bit...you still have more than 4000000000 for your
> > objects. How many objects (tables, indices, operators, functions, ...)
>
> Only several thousand, so as a practical matter I realize this realistically
> is not a problem in this case, but for the sake of completeness in
> understanding how I *could* do this, I'd like to learn.
>
> ...
>
> > ...you can initdb separate clusters for QAT
> > and DEV and run three postmasters using three different ports.
>
> This is what I was thinking, but I didn't really understand fully what I
> would have to do. So your suggestion involves
>
> initdb -D my_QAT_path
>
> then
>
> psql -f my_PROD_backup.sql ... [?]
>
> Can you help me out with a little more detail and tell me where the
> "different port" assignment comes into play? I know I have the default
> port 5432 in my postgresql.conf file, but how do I associate different
> ports with the different postmasters? I guess the -h option on the start
> command line, but my RH Linux machine is set up to start the postmaster
> automatically on startup, using the /etc/init.d/postgresql script file.
> How would I get the postmaster to auto-start multiple instances using the
> different port configurations?

Whoa! You mean these aren't already separate database clusters, or even
separate systems? I am very shocked; you can't do a proper Dev --> QAT -->
Prod environment if all three systems are run by the same postmaster, or even
on the same host, IMO. But maybe I'm just over-cautious, or have worked on
systems where access to production systems is controlled.

I can see the advantage that the Dev and QAT environments are automatically
the same as Prod, but in general Dev can be almost a law unto itself, while
QAT reflects the environment of Prod: e.g., if Prod is Solaris 5.9 then QAT
is Solaris 5.9, with the only differences being changes applied to QAT that
have not yet been applied to Prod, and Dev could be Windows if that can
provide everything needed to develop for the end product.

At the very least I think your three databases should be run as separate
clusters; indeed, reading the section I edited out of your email about the
usage pattern on QAT and Dev, my first thought was "Well, if you think OID
wraparound would be a problem, just throw an initdb into your rebuild cycle."

I've seen some useful replies on how to run these separately, but am I the
only one shocked that the whole process is happening on a production system?

--
Nigel Andrews
> Whoa! You mean these aren't already separate database clusters, or even
> separate systems? I am very shocked; you can't do a proper Dev --> QAT
> --> Prod environment if all three systems are run by the same
> postmaster, or even on the same host, IMO.

I can see that having separate clusters would save me the unnecessary
expenditure of OIDs, and I can see that restricting access to the production
data (from developers?) is a good thing in most cases. But for this
environment the set of end users and developers is actually quite small, they
are in close proximity and have ready communication, so I don't see why it
would be a problem to have the same postmaster running all three databases in
this case. I would be interested in hearing more detail as to why it is a bad
idea in general.

~Berend Tober
>Whoa! You mean these aren't already separate database clusters or even
>separate systems? I am very shocked, you can't do a proper Dev --> QAT
>--> Prod environment if all three systems are run by the same postmaster,
>or on the same host imo. But maybe I'm just over-cautious, or have worked
>on systems where access to production systems is controlled.

I second this. Use different databases for each. You can run them on the
same machine (there are some real advantages to this), but create a separate
initdb for each, then run PostgreSQL on its own port for each. If you really
want to make it structured, create virtual IP addresses for each so that you
never have to think about it:

dev.database.com
qat.database.com
prod.database.com

>I can see the advantage that the Dev and QAT environments are automatically
>the same as Prod, but in general Dev can be almost a law unto itself, while
>QAT reflects the environment of Prod: e.g., if Prod is Solaris 5.9 then QAT
>is Solaris 5.9, with the only differences being changes applied to QAT that
>have not yet been applied to Prod, and Dev could be Windows if that can
>provide everything needed to develop for the end product.
>
>At the very least I think your three databases should be run as separate
>clusters; indeed, reading the section I edited out of your email about the
>usage pattern on QAT and Dev, my first thought was "Well, if you think OID
>wraparound would be a problem, just throw an initdb into your rebuild cycle."
>
>I've seen some useful replies on how to run these separately, but am I the
>only one shocked that the whole process is happening on a production system?
>
>--
>Nigel Andrews

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC - S/JDBC
PostgreSQL support, programming, shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL.Org - Editor-N-Chief - http://www.postgresql.org
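[A possible way to wire up the per-environment host names suggested above, sketched with invented IP addresses and the port numbers used elsewhere in the thread; the names dev/qat/prod.database.com come from the previous message, everything else is an assumption. On PostgreSQL releases of that era, TCP connections also had to be enabled, e.g. with tcpip_socket = true in postgresql.conf (later releases use listen_addresses).]

    # /etc/hosts (or DNS) - addresses are examples only
    192.168.1.10   prod.database.com
    192.168.1.10   qat.database.com
    192.168.1.10   dev.database.com

    $ psql -h prod.database.com -p 5432 mydb    # talk to the Prod cluster
    $ psql -h qat.database.com  -p 5433 mydb    # talk to the QAT cluster
    $ psql -h dev.database.com  -p 5434 mydb    # talk to the DEV cluster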
On Sun, Nov 16, 2003 at 10:47:22AM -0800, Joshua D. Drake wrote:

> >Whoa! You mean these aren't already separate database clusters or
> >even separate systems? I am very shocked, you can't do a proper Dev
> >--> QAT --> Prod environment if all three systems are run by the same
> >postmaster, or on the same host imo. But maybe I'm just over-cautious,
> >or have worked on systems where access to production systems is
> >controlled.
>
> I second this. Use different databases for each. You can run them
> on the same machine (there are some real advantages to this) but
> create a separate initdb for each...

What's the point? You can keep them separate through pg_hba.conf if it's
really needed. I don't see how having several clusters, one database each,
buys you any security.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"When tomorrow comes, we will fight according to what tomorrow demands"
(Mowgli)
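[For what it's worth, keeping the databases separated within one cluster via pg_hba.conf might look something like this minimal sketch. The database, role names, and authentication method are invented for illustration, and the exact column layout differs slightly between PostgreSQL releases.]

    # TYPE   DATABASE   USER        METHOD
    local    mydb       produser    md5      # only the production role may reach Prod
    local    mydbqat    qatuser     md5
    local    mydbdev    devuser     md5
    local    all        all         reject   # everything else is refused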
On Sun, 16 Nov 2003, Alvaro Herrera wrote: > On Sun, Nov 16, 2003 at 10:47:22AM -0800, Joshua D. Drake wrote: > > > > >Whoa! You mean these aren't already separate database clusters or > > >even separate systems? I am very shocked, you can't do a proper Dev > > >--> QAT --> Prod environment if all three systems are run by the same > > >postmaster, or on the same host imo. But maybe I'm just over > > >cautious, or worked on systems where access to production systems is > > >controlled. > > > > I second this. Use different databases for each. You can run them > > on the same machine (there are some real advantages to this) but > > create a separate initdb for each... > > What's the point? You can keep them separate through pg_hba.conf if > it's really needed. I don't see how having several clusters, one > database each, buys you any security. I don't think security isn't the issue but instead it's reliability. If you have a bug in your dev system that say causes backends to die and forcing full postmaster restarts, do you really want to be mucking up your production system as well? This isn't hard to imagine if you have extension functions that you're loading as part of your dev system that are different from production.
> On Sun, 16 Nov 2003, Alvaro Herrera wrote:
>
>> On Sun, Nov 16, 2003 at 10:47:22AM -0800, Joshua D. Drake wrote:
>>
>> > I second this. Use different databases for each. You can run them
>> > on the same machine (there are some real advantages to this) but
>> > create a separate initdb for each...
>>
>> What's the point? You can keep them separate through pg_hba.conf if
>> it's really needed. I don't see how having several clusters, one
>> database each, buys you any security.

Using a separate cluster for each, or at least one cluster for PROD and one
for both QAT and DEV, does address the original question posed about burning
through OIDs during the refresh->modify->test->fix iteration.

> I don't think security is the issue; it's reliability. If you have a bug
> in your dev system that, say, causes backends to die and forces full
> postmaster restarts, do you really want to be mucking up your production
> system as well? This isn't hard to imagine ...

I can see the reliability thing being a big plus, too. While I don't care if
developers accidentally drop an important table in DEV, if they hang the
production postmaster or otherwise screw up the server, then that is
undesirable.

The complications (real or imagined at this point) of using different port
numbers have me kind of stuck, however. So, help me out with some specifics
of the multi-cluster route, and let's pretend I can't afford another machine
(actually the hardware acquisition is not a problem, but I don't want to
incur the additional server setup and maintenance responsibility).

I create a new cluster with

initdb -D /path/to/my/qat/cluster
initdb -D /path/to/my/dev/cluster

right? Then that means that I need to maintain separate pg_hba.conf and
postgresql.conf files for each (in the respective cluster directories),
right?

Then I restore my backup from PROD into the new QAT and DEV clusters, but
exactly how is that done? I guess I have to employ the -p option with psql,
something like

psql -p 5433 -U postgres -f my_prod_backup.sql myqatdb
psql -p 5433 -U postgres -f my_prod_backup.sql mydevdb

(I guess I can use the same database name if I'm using separate clusters?)
assuming I've modified the QAT and DEV postgresql.conf files to specify that
port, or possibly started the postmaster with

postmaster -p 5433

for QAT and maybe

postmaster -p 5434

for DEV. Actually, the two postmaster commands above are probably wrong.
Instead, is what I need

postmaster -D /path/to/my/qat/cluster

and

postmaster -D /path/to/my/dev/cluster

so that the proper port is picked up from the respectively-modified
postgresql.conf files in the separate cluster directories?

And how would I automate the QAT and DEV postmaster startup with server
startup, the way the current postmaster is started with all three databases
in a single cluster, on my RH Linux host that uses the rc.d/init.d/postgresql
script? Would it be necessary to make qat and dev versions of that startup
script, appropriately modified to use the different cluster directories, or
what?

~Berend Tober
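[Putting those pieces together, a rough sketch of the sequence being proposed here, which the next reply largely confirms, might be the following for one extra cluster; the path, port, log file, and database name are placeholders, and the same steps would be repeated for DEV on another port.]

    $ initdb -D /path/to/my/qat/cluster
    $ echo "port = 5433" >> /path/to/my/qat/cluster/postgresql.conf   # or edit the port line by hand
    $ pg_ctl start -D /path/to/my/qat/cluster -l /path/to/my/qat/logfile
    $ createdb -p 5433 myqatdb                                        # target database must exist before a plain psql restore
    $ psql -p 5433 -U postgres -f my_prod_backup.sql myqatdb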
<btober@seaworthysys.com> writes:

> The complications (real or imagined at this point) of using different
> port numbers have me kind of stuck, however.

It's really not that hard.

> initdb -D /path/to/my/qat/cluster
> initdb -D /path/to/my/dev/cluster

Right.

> right? Then that means that I need to maintain separate pg_hba.conf and
> postgresql.conf files for each (in the respective cluster directories),
> right?

Exactly. Setting the 'port' parameter in postgresql.conf is probably
preferable to doing it on the command line.

> Then I restore my backup from PROD into the new QAT and DEV clusters, but
> exactly how is that done? I guess I have to employ the -p option with
> psql, something like
>
> psql -p 5433 -U postgres -f my_prod_backup.sql myqatdb
> psql -p 5433 -U postgres -f my_prod_backup.sql mydevdb
          ^^^^5434

Yup.

> And how would I automate the QAT and DEV postmaster startup with server
> startup, the way the current postmaster is started with all three
> databases in a single cluster, on my RH Linux host that uses the
> rc.d/init.d/postgresql script? Would it be necessary to make qat and
> dev versions of that startup script, appropriately modified to use the
> different cluster directories, or what?

This is a possible approach, but there is a lot of hair in the RH startup
script that makes it less than conducive to running multiple postmasters.
In particular, it looks for (or used to; I haven't looked at it recently,
since RH 7.x) a process named 'postmaster' to see whether the server is
currently running, which doesn't work very well when you have multiple
postmasters. I'd probably replace the existing init.d script with three
copies of something simpler that just calls

pg_ctl <start|stop> -D /the/proper/place -l /log/file/name

-Doug
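[To illustrate that last suggestion, a stripped-down wrapper in the spirit described above might look like the sketch below. The data directory, log file location, and the 'postgres' service account are assumptions for the example, not details from the thread; one copy per cluster, each pointing at its own -D directory, then plugs into the usual init mechanism.]

    #!/bin/sh
    # Minimal start/stop wrapper for one cluster (here: QAT); paths are examples.
    PGDATA=/path/to/my/qat/cluster
    PGLOG=/var/log/pgsql-qat.log

    case "$1" in
      start)
        # assumes pg_ctl is on the postgres user's PATH
        su -l postgres -c "pg_ctl start -D $PGDATA -l $PGLOG"
        ;;
      stop)
        su -l postgres -c "pg_ctl stop -D $PGDATA -m fast"
        ;;
      *)
        echo "Usage: $0 {start|stop}"
        exit 1
        ;;
    esac

[A second copy with PGDATA pointed at the DEV cluster, and a third for the production data directory, then gives each postmaster its own independent start/stop control.]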