Re: Add primary key/unique constraint using prefix columns of an index - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Add primary key/unique constraint using prefix columns of an index
Date
Msg-id CA+Tgmobz8aTFzfVoAKposUNXrwboSoovmr8xCYXJ=-Cb2TFj=w@mail.gmail.com
Whole thread Raw
In response to Re: Add primary key/unique constraint using prefix columns of an index  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Add primary key/unique constraint using prefix columns of an index
List pgsql-hackers
On Wed, May 23, 2012 at 2:00 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Tue, May 22, 2012 at 11:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, May 22, 2012 at 1:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Jeff Janes <jeff.janes@gmail.com> writes:
>>>> Now that there are index only scans, there is a use case for having a
>>>> composite index which has the primary key or a unique key as the
>>>> prefix column(s) but with extra columns after that.  Currently you
>>>> would also need another index with exactly the primary/unique key,
>>>> which seems like a waste of storage and maintenance.
>>>
>>>> Should there be a way to declare a "unique" index with the unique
>>>> property applying to a prefix of the indexed columns/expression?  And
>>>> having that, a way to turn that prefix into a primary key constraint?
>>>
>>>> Of course this is easier said then done, but is there some reason for
>>>> it not to be a to-do item?
>>>
>>> Um ... other than it being ugly as sin?  I can't say that I can get
>>> excited about this concept.  It'd be better to work on index-organized
>>> tables, which is really more or less what you're wishing for here.
>>> Duplicating most of a table into an index is always going to be a loser
>>> in the end because of the redundant storage.
>>
>> An index on pgbench_accounts (aid, abalance) is the same size as an
>> index on pgbench_accounts (aid), but even if it were larger, there's
>> no theoretical reason it couldn't have enough utility to justify its
>> existence.   A bigger problem is that creating such an index turns all
>> of pgbench's write traffic from HOT updates into non-HOT updates,
>> which means this is probably only going to be a win if the write
>> volume is miniscule.
>
> That seems overly pessimistic to me.  pgbench_accounts only has one
> index on it, and that index is already being used to find the row in
> the first place, so the relevant leaf block is already in memory.  If
> you have a table with 12 indexes on it, then the cost of non-HOT would
> be much higher.  But then again, with that number of indexes it is
> probably already non-HOT anyway.
>
> Since pgbench does not actually attempt to violate the PK constraint,
> I can drop it without altering the behavior of the system.  This
> neglects the overhead of checking the "prefix" constraint were that to
> be possible, but that overhead should be almost entirely CPU, and so
> is negligible to this IO bound workload.
>
> I'm running some tests where I mix the work load of pgbench by doing
> "TPC-B (sort of)" transaction mixed in with a variable number of
> SELECT-only transactions, at a ratio varying between 1:0 to 1:10.
>
> It is often said that the default pgbench is an unrealistically
> write-heavy workload.  So mixing in some SELECT-only is probably only
> going to improve its real-world alignment.  In fact I wondering if it
> would make sense to add a feature to pgbench to make such admixture
> easy to do, rather than the current pain of creating multiple sql
> files, specifying a bunch of -f switches in various ratios, and
> remembering to always specify the correct -s flag.
>
> From preliminary test it looks like an index on (aid, abalance) wins
> at all ratios from 1:1 to 1:10, and at 1:0 it is mostly a toss up.  I
> still want to do a few overnight runs to see how the decay in the
> visibility map, and perhaps autovacuum kicking in or failing to kick
> in, effect things.

Interesting!  If that holds up under more careful testing, it would be
a great result.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Add primary key/unique constraint using prefix columns of an index
Next
From: Tom Lane
Date:
Subject: Re: spgist metapage