Thread: make a unique index for foreign keys?
All- I am inheriting a MySQL database and they have created indices for all columns that should be foreign keys (since the version of MySQL that is used doesn't support FKs). Yes-I know-that is why I am trying to migrate to pgsql. This has made me start wondering about the feasibility of always creating indices on foreign keys. I was taught it wasn't necessary in Oracle (I don't know why). I have started wondering if this is a good idea....in most cases they would have to be non-unique indices...but would this help to speed up queries with a join? So searching the archives I found a really nice description from Josh Berkus about rules for creating indices (http://groups.google.com/groups?hl=en&lr=&selm=web-115943%40davinci.ethosme dia.com&rnum=2). One of his rules was to set unique indices on all unique columns. In my case most of the time FK would not be unique....but there are a few cases where they are. From the CREATE TABLE idocs for 7.2 it suggests that an index on a foreign key will help for updates (" If primary key column is updated frequently, it may be wise to add an index to the REFERENCES column so that NO ACTION and CASCADE actions associated with the REFERENCES column can be more efficiently performed"). Can anyone give me an explanation of why or why not I would want an index on the foreign key? TIA- Beth
Beth, > So searching the archives I found a really nice description from Josh Berkus > about rules for creating indices > (http://groups.google.com/groups?hl=en&lr=&selm=web-115943%40davinci.ethosme > dia.com&rnum=2). Thank you! Can you send that article back to me, privately? I'll turn it into an Adventures in PostgreSQL column. > One of his rules was to set unique indices on all unique > columns. In my case most of the time FK would not be unique....but there > are a few cases where they are. From the CREATE TABLE idocs for 7.2 it > suggests that an index on a foreign key will help for updates (" If primary > key column is updated frequently, it may be wise to add an index to the > REFERENCES column so that NO ACTION and CASCADE actions associated with the > REFERENCES column can be more efficiently performed"). > Can anyone give me an explanation of why or why not I would want an index on > the foreign key? Actually, I can only think of one case where you would *not* need and index on a foriegn key: Where the list of referenced values is very small (like 2-5 choices) and the referenceing data table is very large (thousands to millions). In this case (and only in this case) Seq Scans are the fastest way to find referenced values and as such an index is superfluous. For example, say you had a table containing every ice cream retailer in the US: retailers (id int primary keyname varchar address textretail_type int references retail_types(type_id) ); ... and this table had 280,000 records, but you have only 3 retail types: 1 Ice Cream Parlor 2 Supermarket 3 Corner Market In this case, the index on retail_type would almost never be used because any selection of retail type would involve 1/3 of the retailers table and thus a Seq Scan is faster. However, in every other case, you very definitely want every Foriegn Key column indexed. In fact, in the above case, you should probably index it anyway because you might change the data population someday and the overhead of a single INT index is quite small. -- -Josh Berkus
Beth, > So one quick question while I have you ;-) So those indexes would be > non-unique indexes....I thought that I read somewhere that one shouldn't > bother with those types of indexes (was that a total mis-statement...it > kinda seems like it would be) That's incorrect. It's only in the extreme case of non-uniqueness -- for example, 3 possible values in a table of 10,000 records -- that indexes are useless. In all other cases, indexes are a Good Thing. Questions for "Shall I create an index?" 1. Will the column be part of JOIN statementents in common queries? 2. Will the column often be used in a WHERE clause? If so, how? 3. Will the column be used in an ORDER BY clause? If the answer to any of the above is yes, you want an index. What *kind* of index is a more complex topic. But as a reference, about 50% of the columns in most of my databases are indexed. -- -Josh Berkus
Hey Josh- Thanks for the response and guidelines. I think that I figured out something concerning indices and foreign keys...It seems like a non-unique index is automatically getting created for my foreign keys. Does this make sense? I got this hunch from this function I found while peeking through the archives today... pg_get_indexdef(oid); // oid should be the oid of the index you are investigating Using this I discovered some indices on foreign keys that I didn't explicitly create. Is this true? Thanks-Beth -----Original Message----- From: Josh Berkus [mailto:josh@agliodbs.com] Sent: Wednesday, June 12, 2002 4:35 PM To: Beth Gatewood Cc: pgsql-sql@postgresql.org Subject: Re: [SQL] make a unique index for foreign keys? Beth, > So one quick question while I have you ;-) So those indexes would be > non-unique indexes....I thought that I read somewhere that one shouldn't > bother with those types of indexes (was that a total mis-statement...it > kinda seems like it would be) That's incorrect. It's only in the extreme case of non-uniqueness -- for example, 3 possible values in a table of 10,000 records -- that indexes are useless. In all other cases, indexes are a Good Thing. Questions for "Shall I create an index?" 1. Will the column be part of JOIN statementents in common queries? 2. Will the column often be used in a WHERE clause? If so, how? 3. Will the column be used in an ORDER BY clause? If the answer to any of the above is yes, you want an index. What *kind* of index is a more complex topic. But as a reference, about 50% of the columns in most of my databases are indexed. -- -Josh Berkus
> Thanks for the response and guidelines. I think that I figured out > something concerning indices and foreign keys...It seems like a non-unique > index is automatically getting created for my foreign keys. Does > this make > sense? Hmm...nope. Creating a foreign key doesn't create an index. However, creating a primary key does - sure you're not getting mixed up there? > I got this hunch from this function I found while peeking through the > archives today... > pg_get_indexdef(oid); // oid should be the oid of the index you are > investigating > > Using this I discovered some indices on foreign keys that I didn't > explicitly create. Postgres doesn't create indices for you, unless as part of a primary key or unique constraint. Chris
Chris/ Josh- OK-response to Chris below. Synopsis here....simply by creating a foreign key will not create an index. On large tables I SHOULD put a non-unique index on the foreign key (right?) Hmm...nope. Creating a foreign key doesn't create an index. However, creating a primary key does - sure you're not getting mixed up there? You are absolutely right! I am confused!!!! I did the actual test.... //create the tables beth=> create table parent(pk INTEGER primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index 'parent_pkey' for table 'parent' CREATE beth=> create table child (fk INTEGER references parent(pk) ); NOTICE: CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s) CREATE // now get the index oid for the parent table (there should only be 1 index for the primary key) beth=> select pg_index.* from pg_index, pg_class where pg_class.oid = pg_index. indrelid and pg_class.relname='parent'; indexrelid | indrelid | indproc | indkey | indclass | indisclustered | indisunique | indisprimary | indreference | ------------+----------+---------+--------+----------+-------------- --+-------------+--------------+--------------+--------- 26271 | 26269 | - | 1 | 1978 | f | t | t | 0 | (1 row) // get the index for the parent table--and it is the index for the primary key beth=> select pg_get_indexdef(26271); pg_get_indexdef ------------------ ------------------------------------------ CREATE UNIQUE INDEX parent_pkey ON parent USING btree (pk) (1 row) //now try that same thing for the child table and there are NO indices (also confirmed by \di) beth=> select pg_index.* from pg_index, pg_class where pg_class.oid = pg_index. indrelid and pg_class.relname='child'; indexrelid | indrelid | indproc | indkey | indclass | indisclustered | indisunique | indisprimary | indreference | ------------+----------+---------+--------+----------+-------------- --+-------------+--------------+--------------+--------- (0 rows) Postgres doesn't create indices for you, unless as part of a primary key or unique constraint. OK-I've got it now!!! Thank you so much!!!! Beth
Beth Gatewood wrote: > > Chris/ Josh- > > OK-response to Chris below. Synopsis here....simply by creating a foreign > key will not create an index. On large tables I SHOULD put a non-unique > index on the foreign key (right?) Right, because on DELETE or UPDATE to the primary key, the RI triggers have to check that this doesn't leave orphaned references or do the required referential action (CASCADE). Sometimes this works faster with an index... Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
> Chris/ Josh- > > OK-response to Chris below. Synopsis here....simply by creating a foreign > key will not create an index. On large tables I SHOULD put a non-unique > index on the foreign key (right?) For large tables, I guess you may as well. You can be more scientific about it if you you unhash this in your postgresql.conf: stats_command_string = true stats_row_level = true stats_block_level = true Then you can just use the pg_stat views to see how many sequential scans are being run over your tables and how expensive they are, etc. Chris