Thread: is PG able to handle a >500 GB Database?

is PG able to handle a >500 GB Database?

From
Alvar Freude
Date:
Hi,

recently I got a request for a database with a size of ~550 GB!

So, I think Postgres should be able to handle this -- shouldn't it?


I guess there will be about 100 million to 1000 million table entries,
and I need at least two tables (one referencing the other), better
three or more.


Can it handle such things, given sufficient hard disks? ;)


Thanks & Ciao
  Alvar


--
Alvar C.H. Freude  |  alvar.freude@merz-akademie.de

    Demo: http://www.online-demonstration.org/  |  Mach mit!
Blast-DE: http://www.assoziations-blaster.de/   |  Blast-Dich-Fit
Blast-EN: http://www.a-blast.org/               |  Blast/english

Re: is PG able to handle a >500 GB Database?

From
Tom Lane
Date:
Alvar Freude <alvar.freude@merz-akademie.de> writes:
> I guess, that there are about 100 million to 1000 million table entries

You'd probably find the OID counter wrapping around before too long.
However, as long as you don't assume that OIDs are unique in your data
tables, that shouldn't bother you a whole lot.  AFAIK you should be
able to make it work.

            regards, tom lane

Re: is PG able to handle a >500 GB Database?

From
Alvar Freude
Date:
Hi,

Tom Lane wrote:
>
> You'd probably find the OID counter wrapping around before too long.
> However, as long as you don't assume that OIDs are unique in your data
> tables, that shouldn't bother you a whole lot.  AFAIK you should be
> able to make it work.

OK -- the OID counter is (still) a 4-byte int?

As far as I can tell I don't need a unique OID -- as long as Postgres
doesn't need it!

I only need two unique string fields in one table, so this might be a good
test of whether Postgres gets slower with lots of data ... ;-)


Ciao
  Alvar


--
Alvar C.H. Freude  |  alvar.freude@merz-akademie.de

    Demo: http://www.online-demonstration.org/  |  Mach mit!
Blast-DE: http://www.assoziations-blaster.de/   |  Blast-Dich-Fit
Blast-EN: http://www.a-blast.org/               |  Blast/english

Re: is PG able to handle a >500 GB Database?

From
Tom Lane
Date:
Alvar Freude <alvar.freude@merz-akademie.de> writes:
> Tom Lane wrote:
>> You'd probably find the OID counter wrapping around before too long.
>> However, as long as you don't assume that OIDs are unique in your data
>> tables, that shouldn't bother you a whole lot.  AFAIK you should be
>> able to make it work.

> OK -- the OID counter is (still) a 4-byte int?

Right.

> As far as I can tell I don't need a unique OID -- as long as Postgres
> doesn't need it!

Unless your application logic tries to use OIDs as row identifiers,
duplicate OIDs in user tables are not a problem.

The system does assume that OIDs are unique within certain system tables
--- for example, two tables (pg_class rows) can't have the same OID.
This is enforced by unique indexes on those tables, however.  If you
were really unlucky, then after OID wraparound you might see "can't
insert duplicate key" failures while doing create table or some such.
This could be dealt with just by retrying till it works, since each
try will generate new OIDs.  But the odds of a conflict are pretty tiny,
so I mention this mainly for completeness.

We do have a TODO item to allow OID to be 8-byte, but in the real world
I doubt it's a big deal.

I am more concerned about the 4-byte transaction ID generator ---
wraparound of that counter would be much nastier.  Don't insert those
billion rows in a billion separate transactions ;-)

            regards, tom lane

Re: is PG able to handle a >500 GB Database?

From
Florent Guillaume
Date:
> Unless your application logic tries to use OIDs as row identifiers,
> duplicate OIDs in user tables are not a problem.

Hmmm, that means that the following, which I use, is not strictly correct:

create table t1 (recordid SERIAL PRIMARY KEY, val INT4, name TEXT);
... much later ...
insert into t1 (val, name) values (3465, 'blah');
-- PHP application gets $lastoid for this insert
select recordid from t1 where oid = $lastoid;
-- PHP application now has the recordid that was created.

Because the last select could return several rows.

What would be the correct idiom? Explicitly use nextval? That pretty
much defeats the point of SERIAL, no?


Florent


Re: Re: is PG able to handle a >500 GB Database?

From
"Oliver Elphick"
Date:
Florent Guillaume wrote:
  >> Unless your application logic tries to use OIDs as row identifiers,
  >> duplicate OIDs in user tables are not a problem.
  >
  >Hmmm, that means that the following, which I use, is not strictly correct :
  >
  >create table t1 (recordid SERIAL PRIMARY KEY, val INT4, name TEXT);
  >... much later ...
  >insert into t1 (val, name) values (3465, 'blah');
  >-- PHP application gets $lastoid for this insert
  >select recordid from t1 where oid = $lastoid;
  >-- PHP application now hast the recordid that was created.
  >
  >Because the last select could return several lines.
  >
  >What would be the correct idiom ? Explicitly use nextval ? That pretty
  >defeats the point of SERIAL, no ?

SELECT currval('t1_recordid_seq');

currval is guaranteed to return the value that the sequence has just put
into recordid.
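
In full, the idiom would look like this (a sketch using the t1 table quoted
above; t1_recordid_seq is the sequence name that SERIAL generates):

```sql
-- The SERIAL default calls nextval() behind the scenes.
insert into t1 (val, name) values (3465, 'blah');

-- currval() is local to this session, so this is safe even with
-- many concurrent clients inserting into t1.
select currval('t1_recordid_seq');
```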

--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
PGP: 1024R/32B8FAA1: 97 EA 1D 47 72 3F 28 47  6B 7E 39 CC 56 E4 C1 47
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
                 ========================================
     "The LORD is my strength and song, and he is become my
      salvation; he is my God, and I will prepare him an
      habitation; my father's God, and I will exalt him."
                       Exodus 15:2



Re: Re: is PG able to handle a >500 GB Database?

From
"rob"
Date:
currval returns an error unless nextval has been called at least once in the
session.

I use <seq>.last_value

Perhaps I'm fooling myself that the latter is doing what I think, but it
seems to work fine and doesn't require an initial call to nextval.

I too thought that OIDs were always unique.  Is that not true?

--rob




----- Original Message -----
From: "Oliver Elphick" <olly@lfix.co.uk>
To: "Florent Guillaume" <efgeor@noos.fr>
Cc: <pgsql-general@postgresql.org>
Sent: Friday, January 19, 2001 5:27 PM
Subject: Re: Re: is PG able to handle a >500 GB Database?


> Florent Guillaume wrote:
>   >> Unless your application logic tries to use OIDs as row identifiers,
>   >> duplicate OIDs in user tables are not a problem.
>   >
>   >Hmmm, that means that the following, which I use, is not strictly
correct :
>   >
>   >create table t1 (recordid SERIAL PRIMARY KEY, val INT4, name TEXT);
>   >... much later ...
>   >insert into t1 (val, name) values (3465, 'blah');
>   >-- PHP application gets $lastoid for this insert
>   >select recordid from t1 where oid = $lastoid;
>   >-- PHP application now hast the recordid that was created.
>   >
>   >Because the last select could return several lines.
>   >
>   >What would be the correct idiom ? Explicitly use nextval ? That pretty
>   >defeats the point of SERIAL, no ?
>
> SELECT currval('t1_recordid_seq');
>
> currval is guaranteed to return the value that the sequence has just put
> into recordid.
>
> --
> Oliver Elphick                                Oliver.Elphick@lfix.co.uk
> Isle of Wight                              http://www.lfix.co.uk/oliver
> PGP: 1024R/32B8FAA1: 97 EA 1D 47 72 3F 28 47  6B 7E 39 CC 56 E4 C1 47
> GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
>                  ========================================
>      "The LORD is my strength and song, and he is become my
>       salvation; he is my God, and I will prepare him an
>       habitation; my father's God, and I will exalt him."
>                        Exodus 15:2
>
>
>


Re: is PG able to handle a >500 GB Database?

From
Florent Guillaume
Date:
> currval returns error unless nextval has been called at least once in the
> session.

But in my example, which is

> > > create table t1 (recordid SERIAL PRIMARY KEY, val INT4, name TEXT);
> > > ... much later ...
> > > insert into t1 (val, name) values (3465, 'blah');
> > > -- now we want the recordid just created

it's OK, because an insert that omits the SERIAL column implicitly calls
nextval via the column's default.


Florent

Re: Re: Re: is PG able to handle a >500 GB Database?

From
Tom Lane
Date:
"rob" <rob@cabrion.com> writes:
> currval returns error unless nextval has been called at least once in the
> session.

> I use <seq>.last_value

> Perhaps I'm fooling myself

Yes, you are, unless you never have more than one client attached to
your database.  last_value will return whatever value was last assigned
by any backend, therefore you might not get the value that was inserted
into your tuple, but someone else's.

The point about currval being initially undefined seems moot if what
you're using it for is to recover the serial number that was assigned to
a tuple you just inserted ...

            regards, tom lane

Re: Re: Re: is PG able to handle a >500 GB Database?

From
"Brett W. McCoy"
Date:
On Sat, 20 Jan 2001, Tom Lane wrote:

> > currval returns error unless nextval has been called at least once in the
> > session.
>
> > I use <seq>.last_value
>
> > Perhaps I'm fooling myself
>
> Yes, you are, unless you never have more than one client attached to
> your database.  last_value will return whatever value was last assigned
> by any backend, therefore you might not get the value that was inserted
> into your tuple, but someone else's.

In that case you would call nextval *before* you insert and use that
value in the INSERT statement.
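
Sketched out against the t1 table from earlier in the thread (the returned
number here is hypothetical):

```sql
-- Grab the next value explicitly ...
select nextval('t1_recordid_seq');  -- suppose this returns 42

-- ... and supply it yourself, instead of relying on the SERIAL default.
insert into t1 (recordid, val, name) values (42, 3465, 'blah');
```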

-- Brett
                                     http://www.chapelperilous.net/~bmccoy/
---------------------------------------------------------------------------
Military intelligence is a contradiction in terms.
        -- Groucho Marx


Re: Re: Re: is PG able to handle a >500 GB Database?

From
Tom Lane
Date:
"Brett W. McCoy" <bmccoy@chapelperilous.net> writes:
>> last_value will return whatever value was last assigned
>> by any backend, therefore you might not get the value that was inserted
>> into your tuple, but someone else's.

> In that case you would call nextval *before* you insert and use that
> value in the INSERT statement.

Yup, that works too.  Which one you use is a matter of style, I think.
(Actually I prefer the nextval-first approach myself, just because it
seems simpler and more obviously correct.  But currval-after does work.)

To bring this discussion back to the original topic: sequences are also
4-byte counters, at present.  But there's still some value in using a
sequence to label rows in a huge table, rather than OIDs.  Namely, you
can use a separate sequence for each large table.  That way, you only
get into trouble when you exceed 4G rows entered into a particular
table, not 4G rows created in the entire database cluster.
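
A sketch of that layout (table and sequence names invented for illustration):

```sql
-- One 4-byte counter per large table, instead of the cluster-wide
-- OID counter: each table can take ~4G rows before its counter wraps.
create sequence big_a_id_seq;
create sequence big_b_id_seq;

create table big_a (id int4 default nextval('big_a_id_seq') primary key,
                    payload text);
create table big_b (id int4 default nextval('big_b_id_seq') primary key,
                    a_id int4 references big_a);
```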

            regards, tom lane

currval was (Re: Re: Re: is PG able to handle a >500 GB Database? )

From
"rob"
Date:
I use sequences for row versioning, not just numbering.  I'm glad you
pointed this out.  I'll need to revisit that code.

Is there a correct way to "initialize" currval without incrementing the
sequence?

--rob


----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "rob" <rob@cabrion.com>
Cc: "Florent Guillaume" <efgeor@noos.fr>; "Oliver Elphick"
<olly@lfix.co.uk>; <pgsql-general@postgresql.org>
Sent: Saturday, January 20, 2001 11:22 AM
Subject: Re: [GENERAL] Re: Re: is PG able to handle a >500 GB Database?


> "rob" <rob@cabrion.com> writes:
> > currval returns error unless nextval has been called at least once in
the
> > session.
>
> > I use <seq>.last_value
>
> > Perhaps I'm fooling myself
>
> Yes, you are, unless you never have more than one client attached to
> your database.  last_value will return whatever value was last assigned
> by any backend, therefore you might not get the value that was inserted
> into your tuple, but someone else's.
>
> The point about currval being initially undefined seems moot if what
> you're using it for is to recover the serial number that was assigned to
> a tuple you just inserted ...
>
> regards, tom lane
>


Re: currval was (Re: Re: Re: is PG able to handle a >500 GB Database? )

From
Tom Lane
Date:
"rob" <rob@cabrion.com> writes:
> Is there a correct way to "initialize" currval without incrementing the
> sequence?

No, but for the purpose that was being discussed here, I don't see why
you'd need to.  Maybe you should explain the application logic that
seems to require such a thing.

            regards, tom lane

Re: currval was (Re: Re: Re: is PG able to handle a >500 GB Database? )

From
"rob"
Date:

Sorry for switching gears on you without notice.  The logic is something like
this:

When a remote node attaches to the DB for synchronization, update a
common-to-all-tables version sequence.  (ONLY in this case do we increment
the version sequence.)  This is currently handled by an external Perl script
between the client and the backend, in which I increment the sequence's value
via nextval(version).

Any other time the server is accessed (i.e. normal backends with no Perl
script), Ins/Upd/Del triggers are called on tables that are "marked" for
synchronization.  The triggers set the affected rows' "rowver" column to
version.last_value.  Here I do not want to increment the version sequence
every time a row is changed, so I can't just put nextval(version) into the
trigger function.  That would blow the synchronization algorithm.

The problem is that "normal" db sessions don't have an "on client connect
trigger" (for lack of a better term: one that would fire only once when any
client connection was made, before any ins/upd/dels were processed) that
I could use to initialize the current value of the sequence.  The only thing
available was version.last_value, which I now know is unreliable
between backends.

In the triggers could I perhaps do . . .

1) check some flag to see if I need to initialize the version sequence, or
some flag that tells me that this is the first query executed by the
backend.

2) Initialize the version sequence by doing currval(nextval(version)-1) when
step one is true.

without running into problems across backends?  It would increase overhead a
lot, but I'll live with that.

Am I just fooling myself again?  Perhaps I'm just a fool.

--rob


----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "rob" <rob@cabrion.com>
Cc: <pgsql-general@postgresql.org>; <bmccoy@chapelperilous.net>
Sent: Saturday, January 20, 2001 11:51 AM
Subject: Re: currval was (Re: [GENERAL] Re: Re: is PG able to handle a >500
GB Database? )


> "rob" <rob@cabrion.com> writes:
> > Is there a correct way to "initialize" currval without incrementing the
> > sequence?
>
> No, but for the purpose that was being discussed here, I don't see why
> you'd need to.  Maybe you should explain the application logic that
> seems to require such a thing.
>
> regards, tom lane
>


curval was Re: is PG able to handle a >500 GB Database?

From
"rob"
Date:
I just read this again.  On the first read I thought last_value would give
what was in the current backend and not "someone else's".  When I read it
the second time, I realized last_value is actually the behavior I wanted for
that program I described.  Sorry to be so dense.

On a different note: OIDs are not guaranteed to be unique within a table?

--rob

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "rob" <rob@cabrion.com>
Cc: "Florent Guillaume" <efgeor@noos.fr>; "Oliver Elphick"
<olly@lfix.co.uk>; <pgsql-general@postgresql.org>
Sent: Saturday, January 20, 2001 11:22 AM
Subject: Re: Re: Re: is PG able to handle a >500 GB Database?


> "rob" <rob@cabrion.com> writes:
> > currval returns error unless nextval has been called at least once in
the
> > session.
>
> > I use <seq>.last_value
>
> > Perhaps I'm fooling myself
>
> Yes, you are, unless you never have more than one client attached to
> your database.  last_value will return whatever value was last assigned
> by any backend, therefore you might not get the value that was inserted
> into your tuple, but someone else's.
>
> The point about currval being initially undefined seems moot if what
> you're using it for is to recover the serial number that was assigned to
> a tuple you just inserted ...
>
> regards, tom lane
>


Re: curval was Re: is PG able to handle a >500 GB Database?

From
Tom Lane
Date:
"rob" <rob@cabrion.com> writes:
> On a different note.  OID's are not guaranteed to be unique within a table?

Not unless you put a unique index on its OID column.  Of course, you
won't actually see a conflict until you've been running your database
long enough to wrap around the OID counter...

            regards, tom lane

Re: curval was Re: is PG able to handle a >500 GB Database?

From
Florent Guillaume
Date:
> > On a different note.  OID's are not guaranteed to be unique within a table?
>
> Not unless you put a unique index on its OID column.  Of course, you
> won't actually see a conflict until you've been running your database
> long enough to wrap around the OID counter...


I think it would be a good thing to update the documentation, because
it's misleading at best:

 oid
  stands for the **unique** identifier of an instance which is added
  by Postgres to all instances automatically. Oids **are not reused**
  and are 32 bit quantities.

Bruce's book even starts a section with:

 Every row in POSTGRESQL is assigned a **unique**, normally invisible
 number called an object identification number (OID). When the software
 is initialized with initdb, a counter is created and set to
 approximately seventeen-thousand. The counter is used to **uniquely**
 number every row. Although databases may be created and destroyed, the
 counter continues to increase. It is used by all databases, so
 identification numbers are always **unique**. **No two rows in any table or
 in any database will ever have the same object ID.**



(emphasis mine)

Florent
--
<florent.guillaume@mail.com>

Re: is PG able to handle a >500 GB Database?

From
"Martin A. Marques"
Date:
Sorry if I missed something.

On Fri 19 Jan 2001 11:08, Florent Guillaume wrote:
> > Unless your application logic tries to use OIDs as row identifiers,
> > duplicate OIDs in user tables are not a problem.
>
> Hmmm, that means that the following, which I use, is not strictly correct :
>
> create table t1 (recordid SERIAL PRIMARY KEY, val INT4, name TEXT);
> ... much later ...
> insert into t1 (val, name) values (3465, 'blah');
> -- PHP application gets $lastoid for this insert

pg_getlastoid?

> select recordid from t1 where oid = $lastoid;
> -- PHP application now hast the recordid that was created.
>
> Because the last select could return several lines.

Shouldn't!  OIDs are unique over the whole database server (am I right on
this?), so you should get only one row or none.

> What would be the correct idiom ? Explicitly use nextval ? That pretty
> defeats the point of SERIAL, no ?

SERIAL in Postgres is a primary key plus a sequence plus a default on the
column calling nextval.  Pretty different from Informix.
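
Roughly, the SERIAL column from the example upthread expands to something
like this (a sketch; the exact details depend on the version):

```sql
create sequence t1_recordid_seq;
create table t1 (recordid int4 default nextval('t1_recordid_seq') primary key,
                 val int4,
                 name text);
```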

Saludos... :-)

--
System Administration: It's a dirty job,
but someone told I had to do it.
-----------------------------------------------------------------
Martín Marqués            email:     martin@math.unl.edu.ar
Santa Fe - Argentina        http://math.unl.edu.ar/~martin/
Administrador de sistemas en math.unl.edu.ar
-----------------------------------------------------------------

Re: is PG able to handle a >500 GB Database?

From
Florent Guillaume
Date:
> > select recordid from t1 where oid = $lastoid;
> > Because the last select could return several lines.
>
> Shouldn't! oids are unique over the whole database server (I'm right on
> this?) So you should have only one or non.

Well, no, that's the point of this whole discussion.  If you give it
enough time, OIDs can wrap around, which apparently doesn't affect the
functioning of the database, but applications that depend on unique OIDs
could get hosed.

Yes, the documentation is wrong.


Florent

Re: is PG able to handle a >500 GB Database?

From
"Martin A. Marques"
Date:
On Mon 22 Jan 2001 17:56, Florent Guillaume wrote:
> > > select recordid from t1 where oid = $lastoid;
> > > Because the last select could return several lines.
> >
> > Shouldn't! oids are unique over the whole database server (I'm right on
> > this?) So you should have only one or non.
>
> Well, no, that's the point of this whole discussion. If you give it
> enough time, oids can wrap around, which apparently doesn't affect the
> functionning of the database but application that depend on unique oids
> could get hosed.
>
> Yes, the documentation is wrong.

Wow.  I guess we need to hear what the developers have to say.  Until now I
thought OIDs were unique.  This paints a new panorama for me. :-(  (not a
very good one)

Saludos... :-)

--
System Administration: It's a dirty job,
but someone told I had to do it.
-----------------------------------------------------------------
Martín Marqués            email:     martin@math.unl.edu.ar
Santa Fe - Argentina        http://math.unl.edu.ar/~martin/
Administrador de sistemas en math.unl.edu.ar
-----------------------------------------------------------------

Re: is PG able to handle a >500 GB Database?

From
Florent Guillaume
Date:
> > Well, no, that's the point of this whole discussion. If you give it
> > enough time, oids can wrap around, which apparently doesn't affect the
> > functionning of the database but application that depend on unique oids
> > could get hosed.
> >
> > Yes, the documentation is wrong.
>
> Wow. I guess we need to hear what the developers have to say. AFAITIK oids
> were unique. This paints a new panorama for me. :-( (not a very good one).

Of course if you have a reasonably behaved database, you won't see a
wraparound in your lifetime.

If, however, like someone else on this list, you have an application
that does 200 requests per second, and each request generates two
updates, that's 1,440,000 OIDs consumed per hour, 34,560,000 per day,
12,614,400,000 per year -- oops, that's more than 2^32, so OIDs have
wrapped around nearly three times in the year...
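
As a back-of-the-envelope check, that rate against 2^32 can be computed
right in psql:

```sql
select (200 * 2 * 3600.0 * 24 * 365) / (2 ^ 32);  -- just under 3 wraparounds
```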

Florent


Re: Re: is PG able to handle a >500 GB Database?

From
Tom Lane
Date:
Florent Guillaume <efgeor@noos.fr> writes:
> If, however, like someone else on this list, you have an application
> that does 200 requests per second, and that each request generates two
> updates, that's 1440000 oids consumed per hour, 34560000 per day,
> 12614400000 per year, oops that's more than 2^32, oids have wrapped
> around nearly three times in the year...

Updates do not consume OIDs ... only insertions of *new* rows consume
OIDs ... that may or may not matter here ...

            regards, tom lane

Re: Re: is PG able to handle a >500 GB Database?

From
Florent Guillaume
Date:
> > If, however, like someone else on this list, you have an application
> > that does 200 requests per second, and that each request generates two
> > updates, that's 1440000 oids consumed per hour, 34560000 per day,
> > 12614400000 per year, oops that's more than 2^32, oids have wrapped
> > around nearly three times in the year...
>
> Updates do not consume OIDs ... only insertions of *new* rows consume
> OIDs ... that may or may not matter here ...

Yes, I meant insert there, sorry for being confusing.

Florent

Re: Re: is PG able to handle a >500 GB Database?

From
Bruce Momjian
Date:
> Florent Guillaume <efgeor@noos.fr> writes:
> > If, however, like someone else on this list, you have an application
> > that does 200 requests per second, and that each request generates two
> > updates, that's 1440000 oids consumed per hour, 34560000 per day,
> > 12614400000 per year, oops that's more than 2^32, oids have wrapped
> > around nearly three times in the year...
>
> Updates do not consume OIDs ... only insertions of *new* rows consume
> OIDs ... that may or may not matter here ...

What about pre-fetching of OID's.  Does that still happen for every
backend?  What about XID's?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

OID/XID allocation (was Re: is PG able to handle a >500 GB Database?)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> What about pre-fetching of OID's.  Does that still happen for every
> backend?

Only ones that actually allocate some OIDs, I think.

> What about XID's?

XIDs are wasted on a postmaster restart, but not per-backend, because
they are cached in shared memory instead of locally.  I've been thinking
about changing the code so that OIDs are allocated in the same fashion.
That would mean an extra spinlock grab per OID allocation, but so what?
We grab several spinlocks per row creation already.  And we could
increase the number of OIDs allocated per pg_variable file update,
which would save some time.

Haven't got round to it yet though, and I'm not sure but what Vadim
might be planning to throw out all that code anyway ...

            regards, tom lane

Re: OID/XID allocation (was Re: is PG able to handle a >500 GB Database?)

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > What about pre-fetching of OID's.  Does that still happen for every
> > backend?
>
> Only ones that actually allocate some OIDs, I think.
>
> > What about XID's?
>
> XIDs are wasted on a postmaster restart, but not per-backend, because
> they are cached in shared memory instead of locally.  I've been thinking
> about changing the code so that OIDs are allocated in the same fashion.
> That would mean an extra spinlock grab per OID allocation, but so what?
> We grab several spinlocks per row creation already.  And we could
> increase the number of OIDs allocated per pg_variable file update,
> which would save some time.
>
> Haven't got round to it yet though, and I'm not sure but what Vadim
> might be planning to throw out all that code anyway ...

Added to TODO:

  * Move OID retrieval into shared memory to prevent loss of unused OIDs

Also, currently the OID can _not_ be used to determine the order in which
rows were inserted, because one backend can grab its block of 50 and
another backend can start later and insert a row first.

If we could change this with little risk, it would be nice to have in
7.1, but I am sure someone will object.  :-)

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: OID/XID allocation (was Re: is PG able to handle a >500 GB Database?)

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > What about pre-fetching of OID's.  Does that still happen for every
> > backend?
>
> Only ones that actually allocate some OIDs, I think.
>
> > What about XID's?
>
> XIDs are wasted on a postmaster restart, but not per-backend, because
> they are cached in shared memory instead of locally.  I've been thinking
> about changing the code so that OIDs are allocated in the same fashion.
> That would mean an extra spinlock grab per OID allocation, but so what?
> We grab several spinlocks per row creation already.  And we could
> increase the number of OIDs allocated per pg_variable file update,
> which would save some time.

Yes, I think the leak of 50 OIDs for every backend that asks for an OID
is really a waste.  I can see this change buying us a few more years
without OID problems.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026