Thread: Index on two columns not used
Hi list ! I have two table with a 2-column index on both of them. In the first table, the first colum of the index is the primary key, the second one is an integer field. In the second table, the two columns are the primary key. When I join these two tables, the 2-column index of the first table is not used. Why does the query planner think that this plan is better ? ALTER TABLE geo.subcities_names ADD CONSTRAINT subcities_names_pkey PRIMARY KEY(subcity_gid, language_id); CREATE INDEX subcities_gid_language_id ON geo.subcities USING btree (gid, official_language_id); EXPLAIN ANALYZE SELECT * FROM geo.subcities sc, geo.subcities_names scn WHERE sc.gid = scn.subcity_gid AND sc.official_language_id = scn.language_id; Result : Merge Join (cost=0.00..4867.91 rows=37917 width=240) (actual time=0.037..149.022 rows=39323 loops=1) Merge Cond: ("outer".gid = "inner".subcity_gid) Join Filter: ("outer".official_language_id = "inner".language_id) -> Index Scan using subcities_pkey on subcities sc (cost=0.00..1893.19 rows=39357 width=200) (actual time=0.015..43.430 rows=39357 loops=1) -> Index Scan using subcities_names_pkey on subcities_names scn (cost=0.00..2269.39 rows=40517 width=40) (actual time=0.012..35.465 rows=40517 loops=1) Total runtime: 157.389 ms (6 rows) Thanks for your suggestions ! Regards -- Arnaud
Arnaud Lesauvage wrote: > I have two table with a 2-column index on both of them. > In the first table, the first colum of the index is the primary key, the > second one is an integer field. > In the second table, the two columns are the primary key. > When I join these two tables, the 2-column index of the first table is > not used. > Why does the query planner think that this plan is better ? > > ALTER TABLE geo.subcities_names > ADD CONSTRAINT subcities_names_pkey PRIMARY KEY(subcity_gid, > language_id); > > CREATE INDEX subcities_gid_language_id > ON geo.subcities > USING btree > (gid, official_language_id); > > EXPLAIN ANALYZE > SELECT * FROM geo.subcities sc, geo.subcities_names scn > WHERE sc.gid = scn.subcity_gid AND sc.official_language_id = > scn.language_id; My theory: There's no additional restrictions besides the join condition, so the system has to scan both tables completely. It chooses to use a full index scan instead of a seq scan to be able to do a merge join. Because it's going to have to scan the indexes completely anyway, it chooses the smallest index which is subcities_pkey. You'd think that the system could do the merge using just the indexes, and only fetch the heap tuples for matches. If that were the case, using the 2-column index would indeed be a good idea. However, PostgreSQL can't use the values stored in the index to check the join condition, so all the heap tuples are fetched anyway. There was just recently discussion about this on this list: http://archives.postgresql.org/pgsql-performance/2006-09/msg00080.php. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas a écrit : > Arnaud Lesauvage wrote: >> I have two table with a 2-column index on both of them. >> In the first table, the first colum of the index is the primary key, the >> second one is an integer field. >> In the second table, the two columns are the primary key. >> When I join these two tables, the 2-column index of the first table is >> not used. >> Why does the query planner think that this plan is better ? > > You'd think that the system could do the merge using just the indexes, > and only fetch the heap tuples for matches. If that were the case, using > the 2-column index would indeed be a good idea. However, PostgreSQL > can't use the values stored in the index to check the join condition, so > all the heap tuples are fetched anyway. There was just recently > discussion about this on this list: > http://archives.postgresql.org/pgsql-performance/2006-09/msg00080.php. > Thanks for your answer Heikki. I did not know that joins were not using index values, and that PostgreSQL had to fecth the heap tuples anyway. Does this mean that this 2-column index is useless ? (I created it for the join, I don't often filter on both columns otherwise) This query was taken from my "adminsitrative areas" model (continents, countries, etc...). Whenever I query this model, I have to join many tables. I don't really know what the overhead of reading the heap-tuples is, but would it be a good idea to add data-redundancy in my tables to avoid joins ? (adding country_id, continent_id, etc... in the "cities" table) Regards -- Arnaud
Arnaud Lesauvage wrote: > I did not know that joins were not using index values, and that > PostgreSQL had to fecth the heap tuples anyway. > Does this mean that this 2-column index is useless ? (I created it for > the join, I don't often filter on both columns otherwise) Well, if no-one is using the index, it is useless.. > This query was taken from my "adminsitrative areas" model (continents, > countries, etc...). Whenever I query this model, I have to join many > tables. > I don't really know what the overhead of reading the heap-tuples is, but > would it be a good idea to add data-redundancy in my tables to avoid > joins ? (adding country_id, continent_id, etc... in the "cities" table) It depends. I would advise not to denormalize unless you really really have to. It's hard to say without more knowledge of the application. Is the query you showed a typical one? It ran in about 160 ms, is that good enough? It's doesn't sound too bad, considering that it returned almost 40000 rows. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas a écrit : > Arnaud Lesauvage wrote: >> This query was taken from my "adminsitrative areas" model (continents, >> countries, etc...). Whenever I query this model, I have to join many >> tables. >> I don't really know what the overhead of reading the heap-tuples is, but >> would it be a good idea to add data-redundancy in my tables to avoid >> joins ? (adding country_id, continent_id, etc... in the "cities" table) > > It depends. I would advise not to denormalize unless you really really > have to. It's hard to say without more knowledge of the application. > > Is the query you showed a typical one? It ran in about 160 ms, is that > good enough? It's doesn't sound too bad, considering that it returned > almost 40000 rows. It is quite typical, yes. It is the base query of a view. In fact, most views have a lot more joins (they join with all the upper-level tables). But 150ms is OK, indeed.
Arnaud Lesauvage wrote: > It is quite typical, yes. It is the base query of a view. In fact, most > views have a lot more joins (they join with all the upper-level tables). > But 150ms is OK, indeed. If the query using the view does anything more than a "SELECT * FROM view", you should do an explain analyze of the query instead of the definition of the view. The access plan might look very different. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas a écrit : > Arnaud Lesauvage wrote: >> It is quite typical, yes. It is the base query of a view. In fact, most >> views have a lot more joins (they join with all the upper-level tables). >> But 150ms is OK, indeed. > > If the query using the view does anything more than a "SELECT * FROM > view", you should do an explain analyze of the query instead of the > definition of the view. The access plan might look very different. The views are used as linked tables in an Access Frontend. Some accesses are "select * from view", others might filter on a country_id or something similar. For the moment performance is good, so I think I'll keep a normalized database as long as it is possible ! Thanks for your help Heikki !
Arnaud Lesauvage <thewild@freesurf.fr> writes: > When I join these two tables, the 2-column index of the first table is > not used. > Why does the query planner think that this plan is better ? Hm, is gid by itself nearly unique in these tables? If so, the merge join would get only marginally more efficient by using both columns as merge conditions. Heikki's probably right to guess that the planner thinks it's better to use the smaller index. However, if there are lots of duplicate gids, then it ought to have preferred the two-column merge ... regards, tom lane
Tom Lane a écrit : > Arnaud Lesauvage <thewild@freesurf.fr> writes: >> When I join these two tables, the 2-column index of the first table is >> not used. >> Why does the query planner think that this plan is better ? > > Hm, is gid by itself nearly unique in these tables? If so, the merge > join would get only marginally more efficient by using both columns as > merge conditions. Heikki's probably right to guess that the planner > thinks it's better to use the smaller index. > > However, if there are lots of duplicate gids, then it ought to have > preferred the two-column merge ... gid is the primary key of the first table, so absolutely unique. Thanks for the information Tom ! -- Arnaud
Sorry for the amateurish question, but what are "heap tuples"? Also, my understanding is that the following statement applies only for composite indexes: "PostgreSQL can't use the values stored in the index to check the join condition". I assume that PostgreSQL will be able to use single-column-indexes for join conditions. Is this correct? Thank you, Peter Heikki Linnakangas wrote: > Arnaud Lesauvage wrote: >> I have two table with a 2-column index on both of them. >> In the first table, the first colum of the index is the primary key, >> the second one is an integer field. >> In the second table, the two columns are the primary key. >> When I join these two tables, the 2-column index of the first table >> is not used. >> Why does the query planner think that this plan is better ? >> >> ALTER TABLE geo.subcities_names >> ADD CONSTRAINT subcities_names_pkey PRIMARY KEY(subcity_gid, >> language_id); >> >> CREATE INDEX subcities_gid_language_id >> ON geo.subcities >> USING btree >> (gid, official_language_id); >> >> EXPLAIN ANALYZE >> SELECT * FROM geo.subcities sc, geo.subcities_names scn >> WHERE sc.gid = scn.subcity_gid AND sc.official_language_id = >> scn.language_id; > > My theory: > > There's no additional restrictions besides the join condition, so the > system has to scan both tables completely. It chooses to use a full > index scan instead of a seq scan to be able to do a merge join. > Because it's going to have to scan the indexes completely anyway, it > chooses the smallest index which is subcities_pkey. > > You'd think that the system could do the merge using just the indexes, > and only fetch the heap tuples for matches. If that were the case, > using the 2-column index would indeed be a good idea. However, > PostgreSQL can't use the values stored in the index to check the join > condition, so all the heap tuples are fetched anyway. There was just > recently discussion about this on this list: > http://archives.postgresql.org/pgsql-performance/2006-09/msg00080.php. >
Hi, Peter, Péter Kovács wrote: > Sorry for the amateurish question, but what are "heap tuples"? > > Also, my understanding is that the following statement applies only for > composite indexes: "PostgreSQL can't use the values stored in the index > to check the join condition". I assume that PostgreSQL will be able to > use single-column-indexes for join conditions. Is this correct? Both questions are tightly related: First, the "heap" is the part of the table where the actual tuples are stored. PostgreSQL uses an MVCC system, that means that multiple versions (with their transaction information) of a single row can coexist in the heap. This allows for higher concurrency in the backend. Now, the index basically stores pointers like "pages 23 and 42 contain rows with value 'foo'", but version information is not replicated to the index pages, this keeps the index' size requirements low. Additionally, in most UPDATE cases, the new row version will fit into the same page as the old version. In this case, the index does not have to be changed, which is an additional speed improvement. But when accessing the data via the index, it can only give a preselection of pages that contain interesting data, and PostgreSQL has to look into the actual heap pages to check whether there really are row versions that are visible in the current transaction. A further problem is that some GIST index types are lossy, that means the index does not retain the full information, but only an approximation, for efficiency reasons. A prominent example are the PostGIS geometry indices, they only store the bounding box (4 float values) instead of the whole geometry (may be millions of double precision coordinates). So it may be necessary to re-check the condition with the real data, using the lossy index for preselection. HTH, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
Markus Schaber wrote: > Additionally, in most UPDATE cases, the new row version will fit into > the same page as the old version. In this case, the index does not have > to be changed, which is an additional speed improvement. Actually, when the UPDATE puts a new row version in the same heap page, the index must be updated anyway. All the rest of what you said is correct. There is another reason not to put visibility info in the index, which is that it would be extremely complex to update all indexes to contain the right visibility (and maybe impossible without risking deadlocks). Updating only the heap is very simple. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Hi, Alvaro, Alvaro Herrera wrote: >> Additionally, in most UPDATE cases, the new row version will fit into >> the same page as the old version. In this case, the index does not have >> to be changed, which is an additional speed improvement. > Actually, when the UPDATE puts a new row version in the same heap page, > the index must be updated anyway. AFAICS only, when the index covers (directly or via function) a column that's actually changed. Changing columns the index does not depend on should not need any write access to that index. Correct me if I'm wrong. Thanks, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org
Markus Schaber <schabi@logix-tt.com> writes: > Alvaro Herrera wrote: >> Actually, when the UPDATE puts a new row version in the same heap page, >> the index must be updated anyway. > AFAICS only, when the index covers (directly or via function) a column > that's actually changed. > Changing columns the index does not depend on should not need any write > access to that index. > Correct me if I'm wrong. You're wrong. An UPDATE always writes a new version of the row (if it overwrote the row in-place, it wouldn't be rollback-able). The new version has a different TID and therefore the index entry must change. To support MVCC, our approach is to always insert a new index entry pointing at the new TID --- the old one remains in place so that the old version can still be found by transactions that need it. Once the old row version is entirely dead, VACUUM is responsible for removing both it and the index entry pointing at it. Other DBMSes use other approaches that shift the overhead to other places, but that's how Postgres does it. regards, tom lane
Markus, Thank you for your kind explanation. Peter Markus Schaber wrote: > Hi, Peter, > > Péter Kovács wrote: > >> Sorry for the amateurish question, but what are "heap tuples"? >> >> Also, my understanding is that the following statement applies only for >> composite indexes: "PostgreSQL can't use the values stored in the index >> to check the join condition". I assume that PostgreSQL will be able to >> use single-column-indexes for join conditions. Is this correct? >> > > Both questions are tightly related: > > First, the "heap" is the part of the table where the actual tuples are > stored. > > PostgreSQL uses an MVCC system, that means that multiple versions (with > their transaction information) of a single row can coexist in the heap. > This allows for higher concurrency in the backend. > > Now, the index basically stores pointers like "pages 23 and 42 contain > rows with value 'foo'", but version information is not replicated to the > index pages, this keeps the index' size requirements low. > > Additionally, in most UPDATE cases, the new row version will fit into > the same page as the old version. In this case, the index does not have > to be changed, which is an additional speed improvement. > > But when accessing the data via the index, it can only give a > preselection of pages that contain interesting data, and PostgreSQL has > to look into the actual heap pages to check whether there really are row > versions that are visible in the current transaction. > > > A further problem is that some GIST index types are lossy, that means > the index does not retain the full information, but only an > approximation, for efficiency reasons. > > A prominent example are the PostGIS geometry indices, they only store > the bounding box (4 float values) instead of the whole geometry (may be > millions of double precision coordinates). So it may be necessary to > re-check the condition with the real data, using the lossy index for > preselection. > > HTH, > Markus >
Hi, Tom, Tom Lane wrote: > You're wrong. An UPDATE always writes a new version of the row (if it > overwrote the row in-place, it wouldn't be rollback-able). The new > version has a different TID and therefore the index entry must change. > To support MVCC, our approach is to always insert a new index entry > pointing at the new TID --- the old one remains in place so that the old > version can still be found by transactions that need it. OK, good you corrected me. I had the weird impression that both row versions have the same tuple ID (as they are different versions of the same tuple), and so an index change is not necessary when both versions fit on the same page. Thanks, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in Europe! www.ffii.org www.nosoftwarepatents.org