Thread: Multicolumn Primary Key

Multicolumn Primary Key

From
Thomas F.O'Connell
Date:
We've got a table that has a definition as follows:


CREATE TABLE linking_table (

    fk        int8 REFERENCES source_table( pk1 ),

    value    int8,

    PRIMARY KEY( fk1, value )

);


I would've thought that the multicolumn primary key would behave as a
multicolumn index is supposed to behave per


http://www.postgresql.org/docs/7.4/static/indexes-multicolumn.html


where the behavior of the index cascades from the left rightward
across any columns specified in WHERE.


But a query like


SELECT COUNT( * ) FROM linking_table WHERE fk = '42';


yields a sequential scan.


If I add an index to fk, then the same query yields an index scan, as
I would expect. Is this because, according to the docs, a primary key
"<bigger><bigger>is merely a combination of
</bigger></bigger><fontfamily><param>Courier</param><x-tad-bigger>UNIQUE</x-tad-bigger></fontfamily><bigger><bigger>
and
</bigger></bigger><fontfamily><param>Courier</param><x-tad-bigger>NOT
NULL"</x-tad-bigger></fontfamily>?


If so, then why do primary keys afford index scans of single columns
specified as primary keys?


This is in postgres 7.4.5, btw.


-tfo
We've got a table that has a definition as follows:

CREATE TABLE linking_table (
    fk        int8 REFERENCES source_table( pk1 ),
    value    int8,
    PRIMARY KEY( fk1, value )
);

I would've thought that the multicolumn primary key would behave as a
multicolumn index is supposed to behave per

http://www.postgresql.org/docs/7.4/static/indexes-multicolumn.html

where the behavior of the index cascades from the left rightward across
any columns specified in WHERE.

But a query like

SELECT COUNT( * ) FROM linking_table WHERE fk = '42';

yields a sequential scan.

If I add an index to fk, then the same query yields an index scan, as I
would expect. Is this because, according to the docs, a primary key "is
merely a combination of UNIQUE and NOT NULL"?

If so, then why do primary keys afford index scans of single columns
specified as primary keys?

This is in postgres 7.4.5, btw.

-tfo

Re: Multicolumn Primary Key

From
Martijn van Oosterhout
Date:
How many rows in the table? If it's a small table PostgreSQL won't
consider using the index.

Otherwise, have you VACUUM ANALYZEd recently?

On Tue, Aug 31, 2004 at 12:00:11PM -0500, Thomas F. O'Connell wrote:
> We've got a table that has a definition as follows:
>
> CREATE TABLE linking_table (
>     fk        int8 REFERENCES source_table( pk1 ),
>     value    int8,
>     PRIMARY KEY( fk1, value )
> );
>
> I would've thought that the multicolumn primary key would behave as a
> multicolumn index is supposed to behave per
>
> http://www.postgresql.org/docs/7.4/static/indexes-multicolumn.html
>
> where the behavior of the index cascades from the left rightward across
> any columns specified in WHERE.
>
> But a query like
>
> SELECT COUNT( * ) FROM linking_table WHERE fk = '42';
>
> yields a sequential scan.
>
> If I add an index to fk, then the same query yields an index scan, as I
> would expect. Is this because, according to the docs, a primary key "is
> merely a combination of UNIQUE and NOT NULL"?
>
> If so, then why do primary keys afford index scans of single columns
> specified as primary keys?
>
> This is in postgres 7.4.5, btw.
>
> -tfo

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: Multicolumn Primary Key

From
Thomas F.O'Connell
Date:
On Aug 31, 2004, at 4:34 PM, Martijn van Oosterhout wrote:

> How many rows in the table? If it's a small table PostgreSQL won't
> consider using the index.

300,000+

> Otherwise, have you VACUUM ANALYZEd recently?

Argh! That seems to have been it. I have pg_autovacuum running, so it
didn't occur to me.

Thanks.

-tfo


Re: Multicolumn Primary Key

From
"Matthew T. O'Connor"
Date:
Thomas F.O'Connell wrote:
> On Aug 31, 2004, at 4:34 PM, Martijn van Oosterhout wrote:
>> Otherwise, have you VACUUM ANALYZEd recently?
>
> Argh! That seems to have been it. I have pg_autovacuum running, so it
> didn't occur to me.

pg_autovacuum only knows about table activity that has taken place while
it's running.  Example:  Table A has an analyze threshold of 1000, if
you perform 999 updates on table A, then kill and restart pg_autovacuum,
you will have to perform another 1000 updates before pg_autovacuum
performs an ANALYZE.  This is a large limitation of the current
implementation.

I tried to fix this for 8.0, but my autovacuum improvements didn't make
the cut.

Matthew


Re: Multicolumn Primary Key

From
Steve Crawford
Date:
On Tuesday 31 August 2004 3:49 pm, Matthew T. O'Connor wrote:
> Thomas F.O'Connell wrote:
> > On Aug 31, 2004, at 4:34 PM, Martijn van Oosterhout wrote:
> >> Otherwise, have you VACUUM ANALYZEd recently?
> >
> > Argh! That seems to have been it. I have pg_autovacuum running,
> > so it didn't occur to me.
>
> pg_autovacuum only knows about table activity that has taken place
> while it's running.  Example:  Table A has an analyze threshold of
> 1000, if you perform 999 updates on table A, then kill and restart
> pg_autovacuum, you will have to perform another 1000 updates before
> pg_autovacuum performs an ANALYZE.  This is a large limitation of
> the current implementation.
>
> I tried to fix this for 8.0, but my autovacuum improvements didn't
> make the cut.

As a quick-'n'-dirty "fix", what about a command-line option to
autovacuum that tells it to vacuum analyze all tables on startup? At
least it would start in a somewhat known condition.

Cheers,
Steve

Re: Multicolumn Primary Key

From
"Matthew T. O'Connor"
Date:
Steve Crawford wrote:
> On Tuesday 31 August 2004 3:49 pm, Matthew T. O'Connor wrote:
>>pg_autovacuum only knows about table activity that has taken place
>>while it's running.  Example:  Table A has an analyze threshold of
>>1000, if you perform 999 updates on table A, then kill and restart
>>pg_autovacuum, you will have to perform another 1000 updates before
>>pg_autovacuum performs an ANALYZE.  This is a large limitation of
>>the current implementation.
>>
>>I tried to fix this for 8.0, but my autovacuum improvements didn't
>>make the cut.
>
> As a quick-'n'-dirty "fix", what about a command-line option to
> autovacuum that tells it to vacuum analyze all tables on startup? At
> least it would start in a somewhat known condition.

I thought having pg_autovacuum do that, but seemed like overkill, but
perhaps a command line option wouldn't be a bad thing.  However, you
could also just as easily do a manual vacuum analyze whenever you like.

Matthew