Thread: Optimizing a query
I've currently got this table: ,---- | n=# \d nanpa | Table "public.nanpa" | Column | Type | Modifiers | ------------+--------------+----------- | state | character(2) | | npa | character(3) | not null | nxx | character(3) | not null | ocn | character(4) | | company | text | | ratecenter | text | | switch | text | | effective | date | | use | character(2) | not null | assign | date | | ig | character(1) | | Indexes: | "nanpa_pkey" PRIMARY KEY, btree (npa, nxx) CLUSTER `---- and was doing queries of the form: ,---- | select * from nanpa where npa=775 and nxx=413; `---- where were quite slow. Explain showed that it was doing sequential scans even though the primary key contained the two term I was selecting on. Today, looking at it again in prep to this post, I noticed that the numbers were being converted to ::text, and a quick test showed that queries of the form: ,---- | select * from nanpa where npa=775::bpchar and nxx=413::bpchar; `---- used the index. I specified char(3) when I created the table simple because npa and nxx are defined as three-character strings. Tagging the queies is a pain, especially as I often do queries of that form in psql(1). (Incidently, there are multiple similar tables, also keyed on (npa,nxx), which show the same problem. The nanpa table above is just a good example.) Should I convert the columns to text? Or create an additional index that expects ::text args? (If so, how?) Or is there some other way to ensure the indices get used w/o having to tag data in the queries? Thanks, -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
Have you run vacuum/analyze on the table? -- Husam -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of James Cloos Sent: Wednesday, December 13, 2006 10:48 AM To: pgsql-performance@postgresql.org Subject: [PERFORM] Optimizing a query I've currently got this table: ,---- | n=# \d nanpa | Table "public.nanpa" | Column | Type | Modifiers | ------------+--------------+----------- | state | character(2) | | npa | character(3) | not null | nxx | character(3) | not null | ocn | character(4) | | company | text | | ratecenter | text | | switch | text | | effective | date | | use | character(2) | not null | assign | date | | ig | character(1) | | Indexes: | "nanpa_pkey" PRIMARY KEY, btree (npa, nxx) CLUSTER `---- and was doing queries of the form: ,---- | select * from nanpa where npa=775 and nxx=413; `---- where were quite slow. Explain showed that it was doing sequential scans even though the primary key contained the two term I was selecting on. Today, looking at it again in prep to this post, I noticed that the numbers were being converted to ::text, and a quick test showed that queries of the form: ,---- | select * from nanpa where npa=775::bpchar and nxx=413::bpchar; `---- used the index. I specified char(3) when I created the table simple because npa and nxx are defined as three-character strings. Tagging the queies is a pain, especially as I often do queries of that form in psql(1). (Incidently, there are multiple similar tables, also keyed on (npa,nxx), which show the same problem. The nanpa table above is just a good example.) Should I convert the columns to text? Or create an additional index that expects ::text args? (If so, how?) Or is there some other way to ensure the indices get used w/o having to tag data in the queries? Thanks, -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6 ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster ********************************************************************** This message contains confidential information intended only for the use of the addressee(s) named above and may containinformation that is legally privileged. If you are not the addressee, or the person responsible for delivering itto the addressee, you are hereby notified that reading, disseminating, distributing or copying this message is strictlyprohibited. If you have received this message by mistake, please immediately notify us by replying to the messageand delete the original message immediately thereafter. Thank you. FADLD Tag **********************************************************************
James Cloos <cloos@jhcloos.com> writes: > ... and was doing queries of the form: > | select * from nanpa where npa=775 and nxx=413; If those are char(3) columns, shouldn't you be quoting the constants? select * from nanpa where npa='775' and nxx='413'; Any non-numeric input will fail entirely without the quotes, and I'm also not too confident that inputs of less than three digits will work as you expect (the blank-padding might not match what's in the table). Leading zeroes would be troublesome too. OTOH, if the keys are and always will be small integers, it's just stupid not to be storing them as integers --- integer comparison is far faster than string. Postgres' data type capabilities are exceptional. Learn to work with them, not against them --- that means thinking about what the data really is and declaring it appropriately. regards, tom lane
>>>>> "Husam" == Tomeh, Husam <htomeh@firstam.com> writes: Husam> Have you run vacuum/analyze on the table? Yes, back when I first noticed how slow it was. It did not make any difference. explain analyze says: ,---- | n=# explain analyse select * from nanpa where npa=775 and nxx=473; | QUERY PLAN | -------------------------------------------------------------------------------------------------------- | Seq Scan on nanpa (cost=0.00..5344.60 rows=4 width=105) (actual time=371.718..516.816 rows=1 loops=1) | Filter: (((npa)::text = '775'::text) AND ((nxx)::text = '473'::text)) | Total runtime: 516.909 ms | (3 rows) `---- vs: ,---- | n=# explain analyse select * from nanpa where npa=775::char and nxx=473::char; | QUERY PLAN | ---------------------------------------------------------------------------------------------------------------------- | Index Scan using nanpa_pkey on nanpa (cost=0.00..4.33 rows=1 width=105) (actual time=64.831..64.831 rows=0 loops=1) | Index Cond: ((npa = '7'::bpchar) AND (nxx = '4'::bpchar)) | Total runtime: 64.927 ms | (3 rows) `---- BTW, I forgot to mention I'm at 8.1.4 on that box. -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
Your nap and nxx columns have character datatype, so you should use quotes. Try: explain analyze select * from nanpa where npa='775' and nxx='473'; If that does not work, you could try to influence the planner's execution plan to favor index scans over sequential scan by tweaking a couple of the postgres parameters, particularly, the effective_cache_size. This parameter primarily set the planner's estimates of the relative likelihood of a particular table or index being in memory, and will thus have a significant effect on whether the planner chooses indexes over seqscans. Tweaking such parameters are usually done as a last resort. -- Husam -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of James Cloos Sent: Wednesday, December 13, 2006 2:35 PM To: Tomeh, Husam Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Optimizing a query >>>>> "Husam" == Tomeh, Husam <htomeh@firstam.com> writes: Husam> Have you run vacuum/analyze on the table? Yes, back when I first noticed how slow it was. It did not make any difference. explain analyze says: ,---- | n=# explain analyse select * from nanpa where npa=775 and nxx=473; | QUERY PLAN | ------------------------------------------------------------------------ -------------------------------- | Seq Scan on nanpa (cost=0.00..5344.60 rows=4 width=105) (actual time=371.718..516.816 rows=1 loops=1) | Filter: (((npa)::text = '775'::text) AND ((nxx)::text = '473'::text)) | Total runtime: 516.909 ms | (3 rows) `---- vs: ,---- | n=# explain analyse select * from nanpa where npa=775::char and nxx=473::char; | QUERY PLAN | ------------------------------------------------------------------------ ---------------------------------------------- | Index Scan using nanpa_pkey on nanpa (cost=0.00..4.33 rows=1 width=105) (actual time=64.831..64.831 rows=0 loops=1) | Index Cond: ((npa = '7'::bpchar) AND (nxx = '4'::bpchar)) | Total runtime: 64.927 ms | (3 rows) `---- BTW, I forgot to mention I'm at 8.1.4 on that box. -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6 ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ********************************************************************** This message contains confidential information intended only for the use of the addressee(s) named above and may containinformation that is legally privileged. If you are not the addressee, or the person responsible for delivering itto the addressee, you are hereby notified that reading, disseminating, distributing or copying this message is strictlyprohibited. If you have received this message by mistake, please immediately notify us by replying to the messageand delete the original message immediately thereafter. Thank you. FADLD Tag **********************************************************************