Thread: Multicolumn Primary Key
We've got a table that has a definition as follows: CREATE TABLE linking_table ( fk int8 REFERENCES source_table( pk1 ), value int8, PRIMARY KEY( fk1, value ) ); I would've thought that the multicolumn primary key would behave as a multicolumn index is supposed to behave per http://www.postgresql.org/docs/7.4/static/indexes-multicolumn.html where the behavior of the index cascades from the left rightward across any columns specified in WHERE. But a query like SELECT COUNT( * ) FROM linking_table WHERE fk = '42'; yields a sequential scan. If I add an index to fk, then the same query yields an index scan, as I would expect. Is this because, according to the docs, a primary key "<bigger><bigger>is merely a combination of </bigger></bigger><fontfamily><param>Courier</param><x-tad-bigger>UNIQUE</x-tad-bigger></fontfamily><bigger><bigger> and </bigger></bigger><fontfamily><param>Courier</param><x-tad-bigger>NOT NULL"</x-tad-bigger></fontfamily>? If so, then why do primary keys afford index scans of single columns specified as primary keys? This is in postgres 7.4.5, btw. -tfo We've got a table that has a definition as follows: CREATE TABLE linking_table ( fk int8 REFERENCES source_table( pk1 ), value int8, PRIMARY KEY( fk1, value ) ); I would've thought that the multicolumn primary key would behave as a multicolumn index is supposed to behave per http://www.postgresql.org/docs/7.4/static/indexes-multicolumn.html where the behavior of the index cascades from the left rightward across any columns specified in WHERE. But a query like SELECT COUNT( * ) FROM linking_table WHERE fk = '42'; yields a sequential scan. If I add an index to fk, then the same query yields an index scan, as I would expect. Is this because, according to the docs, a primary key "is merely a combination of UNIQUE and NOT NULL"? If so, then why do primary keys afford index scans of single columns specified as primary keys? This is in postgres 7.4.5, btw. -tfo
How many rows in the table? If it's a small table PostgreSQL won't consider using the index. Otherwise, have you VACUUM ANALYZEd recently? On Tue, Aug 31, 2004 at 12:00:11PM -0500, Thomas F. O'Connell wrote: > We've got a table that has a definition as follows: > > CREATE TABLE linking_table ( > fk int8 REFERENCES source_table( pk1 ), > value int8, > PRIMARY KEY( fk1, value ) > ); > > I would've thought that the multicolumn primary key would behave as a > multicolumn index is supposed to behave per > > http://www.postgresql.org/docs/7.4/static/indexes-multicolumn.html > > where the behavior of the index cascades from the left rightward across > any columns specified in WHERE. > > But a query like > > SELECT COUNT( * ) FROM linking_table WHERE fk = '42'; > > yields a sequential scan. > > If I add an index to fk, then the same query yields an index scan, as I > would expect. Is this because, according to the docs, a primary key "is > merely a combination of UNIQUE and NOT NULL"? > > If so, then why do primary keys afford index scans of single columns > specified as primary keys? > > This is in postgres 7.4.5, btw. > > -tfo -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
On Aug 31, 2004, at 4:34 PM, Martijn van Oosterhout wrote: > How many rows in the table? If it's a small table PostgreSQL won't > consider using the index. 300,000+ > Otherwise, have you VACUUM ANALYZEd recently? Argh! That seems to have been it. I have pg_autovacuum running, so it didn't occur to me. Thanks. -tfo
Thomas F.O'Connell wrote: > On Aug 31, 2004, at 4:34 PM, Martijn van Oosterhout wrote: >> Otherwise, have you VACUUM ANALYZEd recently? > > Argh! That seems to have been it. I have pg_autovacuum running, so it > didn't occur to me. pg_autovacuum only knows about table activity that has taken place while it's running. Example: Table A has an analyze threshold of 1000, if you perform 999 updates on table A, then kill and restart pg_autovacuum, you will have to perform another 1000 updates before pg_autovacuum performs an ANALYZE. This is a large limitation of the current implementation. I tried to fix this for 8.0, but my autovacuum improvements didn't make the cut. Matthew
On Tuesday 31 August 2004 3:49 pm, Matthew T. O'Connor wrote: > Thomas F.O'Connell wrote: > > On Aug 31, 2004, at 4:34 PM, Martijn van Oosterhout wrote: > >> Otherwise, have you VACUUM ANALYZEd recently? > > > > Argh! That seems to have been it. I have pg_autovacuum running, > > so it didn't occur to me. > > pg_autovacuum only knows about table activity that has taken place > while it's running. Example: Table A has an analyze threshold of > 1000, if you perform 999 updates on table A, then kill and restart > pg_autovacuum, you will have to perform another 1000 updates before > pg_autovacuum performs an ANALYZE. This is a large limitation of > the current implementation. > > I tried to fix this for 8.0, but my autovacuum improvements didn't > make the cut. As a quick-'n'-dirty "fix", what about a command-line option to autovacuum that tells it to vacuum analyze all tables on startup? At least it would start in a somewhat known condition. Cheers, Steve
Steve Crawford wrote: > On Tuesday 31 August 2004 3:49 pm, Matthew T. O'Connor wrote: >>pg_autovacuum only knows about table activity that has taken place >>while it's running. Example: Table A has an analyze threshold of >>1000, if you perform 999 updates on table A, then kill and restart >>pg_autovacuum, you will have to perform another 1000 updates before >>pg_autovacuum performs an ANALYZE. This is a large limitation of >>the current implementation. >> >>I tried to fix this for 8.0, but my autovacuum improvements didn't >>make the cut. > > As a quick-'n'-dirty "fix", what about a command-line option to > autovacuum that tells it to vacuum analyze all tables on startup? At > least it would start in a somewhat known condition. I thought having pg_autovacuum do that, but seemed like overkill, but perhaps a command line option wouldn't be a bad thing. However, you could also just as easily do a manual vacuum analyze whenever you like. Matthew