Re: Normal case or bad query plan? - Mailing list pgsql-performance
From | Mischa Sandberg |
---|---|
Subject | Re: Normal case or bad query plan? |
Date | |
Msg-id | p1Uad.14359$Ia5.5691@edtnps89 Whole thread Raw |
In response to | Normal case or bad query plan? (Gabriele Bartolini <angusgb@tin.it>) |
List | pgsql-performance |
This may sound more elaborate than it's worth, but I don't know of a better way to avoid a table scan. You want to index on a computed value that is a common prefix of your FROM and TO fields. The next step is to search on a fixed SET of prefixes of different lengths. For example, some of your ranges might be common in the first 3 bytes of ipaddr, some in two, some in only one. You create and index on one common prefix of either 1,2 or 3 bytes, for each row. Your query then looks something like (pardon my ignorance in PGSQL) select * from ip2location where ip2prefix in ( network(:myaddr || '/8'), network(:myaddr || '/16'), network(:myaddr || '/24'), :myaddr --- assuming single-address ranges are possible ) and :myaddr between ip_address_from and ip_address_to Although this looks a little gross, it hits very few records. It also adapts cleanly to a join between ip2location and a table of ip addrs. Gabriele Bartolini wrote: > Hi guys, > > please consider this scenario. I have this table: > > CREATE TABLE ip2location ( > ip_address_from BIGINT NOT NULL, > ip_address_to BIGINT NOT NULL, > id_location BIGINT NOT NULL, > PRIMARY KEY (ip_address_from, ip_address_to) > ); > > I created a cluster on its primary key, by running: > CLUSTER ip2location_ip_address_from_key ON ip2location; > > This allowed me to organise data in a more efficient way: the data that > is contained are ranges of IP addresses with empty intersections; for > every IP class there is a related location's ID. The total number of > entries is 1392443. > > For every IP address I have, an application retrieves the corresponding > location's id from the above table, by running a query like: > > SELECT id_location FROM ip2location WHERE '11020000111' >= > ip_address_from AND '11020000111' <= ip_address_to; > > For instance, by running the 'EXPLAIN ANALYSE' command, I get this > "funny" result: > > > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------- > > Seq Scan on ip2location (cost=0.00..30490.65 rows=124781 width=8) > (actual time=5338.120..40237.283 rows=1 loops=1) > Filter: ((1040878301::bigint >= ip_address_from) AND > (1040878301::bigint <= ip_address_to)) > Total runtime: 40237.424 ms > > > With other data, that returns an empty set, I get: > > explain SELECT id_location FROM ip2location WHERE '11020000111' >= > ip_address_from AND '11020000111' <= ip_address_to; > > QUERY PLAN > ------------------------------------------------------------------------------------------------------- > > Index Scan using ip2location_ip_address_from_key on ip2location > (cost=0.00..419.16 rows=140 width=8) > Index Cond: ((11020000111::bigint >= ip_address_from) AND > (11020000111::bigint <= ip_address_to)) > > > I guess the planner chooses the best of the available options for the > first case, the sequential scan. This is not confirmed though by the > fact that, after I ran "SET enable_scan TO off", I got this: > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------ > > Index Scan using ip2location_ip_address_from_key on ip2location > (cost=0.00..31505.73 rows=124781 width=8) (actual > time=2780.172..2780.185 rows=1 loops=1) > Index Cond: ((1040878301::bigint >= ip_address_from) AND > (1040878301::bigint <= ip_address_to)) > Total runtime: 2780.359 ms > > > Is this a normal case or should I worry? What am I missing? Do you have > any suggestion or comment to do (that would be extremely appreciated)? > Is the CLUSTER I created worthwhile or not? > > Thank you, > -Gabriele > > -- > Gabriele Bartolini: Web Programmer, ht://Dig & IWA/HWG Member, > ht://Check maintainer > Current Location: Prato, Toscana, Italia > angusgb@tin.it | http://www.prato.linux.it/~gbartolini | ICQ#129221447 > > "Leave every hope, ye who enter!", Dante Alighieri, Divine Comedy, > The Inferno > > > ------------------------------------------------------------------------ > > > --- > Outgoing mail is certified Virus Free. > Checked by AVG anti-virus system (http://www.grisoft.com). > Version: 6.0.773 / Virus Database: 520 - Release Date: 05/10/2004 > > > ------------------------------------------------------------------------ > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
pgsql-performance by date: