Re: changeset generation v5-01 - Patches & git tree - Mailing list pgsql-hackers

From Robert Haas
Subject Re: changeset generation v5-01 - Patches & git tree
Msg-id CA+TgmobxqMj6r-REqq36xAgpvFOBPLzTdWfgZKeg4wWgzF3Mew@mail.gmail.com
In response to Re: changeset generation v5-01 - Patches & git tree  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, Jun 28, 2013 at 11:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I'm just talking out of my rear end here because I don't know what the
>> real numbers are, but it's far from obvious to me that there's any
>> free lunch here.  That having been said, just because indexing
>> relfilenode or adding relfilenodes to WAL records is expensive doesn't
>> mean we shouldn't do it.  But I think we need to know the price tag
>> before we can judge whether to make the purchase.
>
> Certainly, any of these solutions are going to cost us somewhere ---
> either up-front cost or more expensive (and less reliable?) changeset
> extraction, take your choice.  I will note that somehow tablespaces got
> put in despite having to add 4 bytes to every WAL record for that
> feature, which was probably of less use than logical changeset
> extraction will be.

Right.  I actually think we booted that one.  The database ID is a
constant for most people.  The tablespace ID is not technically
redundant, but in 99.99% of cases you could figure it out from the
database ID + relation ID.  The relation ID is where 99% of the
entropy is, but it probably only has 8-16 bits of entropy in most
real-world use cases.  If we were doing this over we might want to
think about storing a proxy for the relfilenode rather than the
relfilenode itself, but there's not much good crying over it now.
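To put rough numbers on the paragraph above, here is a minimal sketch. It assumes the full identifier is three 32-bit OIDs (tablespace, database, relation) and that a "proxy" would be a small integer handed out by an interning table. The `ProxyTable` class, its method names, and the 16-bit width are hypothetical illustrations, not anything PostgreSQL implements.

```python
import struct

# Assumption: the full identifier is three 32-bit OIDs
# (tablespace, database, relation).
FULL = struct.Struct("<III")
# Hypothetical proxy: one 16-bit integer, matching the upper end of the
# "8-16 bits of entropy" estimate.
PROXY = struct.Struct("<H")


class ProxyTable:
    """Hypothetical interning table: full triple <-> small integer."""

    def __init__(self):
        self._to_proxy = {}   # (spc, db, rel) -> proxy
        self._to_full = []    # proxy -> (spc, db, rel)

    def proxy_for(self, spc, db, rel):
        """Return the proxy for a triple, assigning a new one if needed."""
        key = (spc, db, rel)
        if key not in self._to_proxy:
            if len(self._to_full) >= 2 ** 16:
                raise OverflowError("out of 16-bit proxy values")
            self._to_proxy[key] = len(self._to_full)
            self._to_full.append(key)
        return self._to_proxy[key]

    def resolve(self, proxy):
        """Map a proxy back to the full triple."""
        return self._to_full[proxy]


table = ProxyTable()
p = table.proxy_for(1663, 16384, 16390)
print(FULL.size, "bytes for the full triple,", PROXY.size, "bytes for the proxy")
print(table.resolve(p))
```

The space saving comes at the cost of maintaining the mapping durably and consistently, which this sketch ignores entirely.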

> But to tell the truth, I'm mostly exercised about the non-unique
> syscache.  I think that's simply a *bad* idea.

+1.

I don't think the extra index on pg_class is going to hurt that much,
even if we create it always, as long as we use a purpose-built caching
mechanism for it rather than forcing it through catcache.  The people
who are going to suffer are the ones who create and drop a lot of
temporary tables, but even there I'm not sure how visible the overhead
will be on real-world workloads, and maybe the solution is to work
towards not having permanent catalog entries for temporary tables in
the first place.  In any case, hurting people who use temporary tables
heavily seems better than adding overhead to every
insert/update/delete operation, which will hit all users who are not
read-only.
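As a rough illustration of what a purpose-built cache for this lookup might look like, here is a minimal sketch. It assumes the cache maps a (tablespace, relfilenode) pair to a relation OID, falls back to a caller-supplied index lookup on a miss, and is flushed by an invalidation callback. The `RelfilenodeMap` name, its methods, and the OID values are hypothetical; this is not the proposed patch.

```python
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[int, int]  # (tablespace OID, relfilenode)


class RelfilenodeMap:
    """Hypothetical reverse-lookup cache: (tablespace, relfilenode) -> relation OID."""

    def __init__(self, index_lookup: Callable[[int, int], Optional[int]]) -> None:
        # index_lookup stands in for a scan of the extra pg_class index.
        self._index_lookup = index_lookup
        self._cache: Dict[Key, Optional[int]] = {}
        self.index_probes = 0  # how often we had to go to the index

    def relid_for(self, tablespace: int, relfilenode: int) -> Optional[int]:
        """Return the relation OID, or None if no relation matches."""
        key = (tablespace, relfilenode)
        if key not in self._cache:
            self.index_probes += 1
            # Misses are cached too (as None) so repeated lookups stay cheap.
            self._cache[key] = self._index_lookup(tablespace, relfilenode)
        return self._cache[key]

    def invalidate(self, relid: Optional[int] = None) -> None:
        """Drop entries for one relation, or everything if relid is None."""
        if relid is None:
            self._cache.clear()
            return
        # Negative entries are dropped as well: a catalog change may have
        # made a previously unknown relfilenode valid.
        stale = [k for k, v in self._cache.items() if v == relid or v is None]
        for k in stale:
            del self._cache[k]


# Usage with a toy "catalog" standing in for pg_class.
catalog = {(1663, 16390): 16384}
relmap = RelfilenodeMap(lambda spc, node: catalog.get((spc, node)))
relmap.relid_for(1663, 16390)   # first call probes the index
relmap.relid_for(1663, 16390)   # second call is served from the cache
print(relmap.index_probes)
```

The point of the sketch is that each key resolves to at most one relation, so the structure can stay a plain hash table with its own invalidation hook rather than going through a general-purpose catalog cache.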

On the other hand, I can't entirely shake the feeling that adding the
information into WAL would be more reliable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


